
SPX Volatility Term Structure Analysis

In this exercise we analyze the SPX volatility term structure based on daily close prices over the period 1970-2018. We rely on the Black-Scholes framework and the Dupire formula to calibrate our volatility curve parameters so that they fit the historical SPX return distribution.

1. Theoretical Discussion

We start off by assuming that the local volatility is a function of the log-moneyness $x$ for a fixed maturity window of $T$ business days.

$$ \sigma(x) = g(x) \\ \text{ where } x=\log(K/S_0)$$

Under the Black-Scholes framework with risk-free rate $r=0$, a call option premium for $S_0=1$ is:

$$ C(K, T) = \Phi(d_1) - K\Phi(d_2) \\ d_{1,2} = \dfrac{\log(1/K) \pm (\sigma^2/2)T}{\sigma\sqrt{T}} $$

If we assume that there exists a probability density function $f_{S_T}$, we have:

$$ C(K, T) = \int^{\infty}_K(S_T-K)f_{S_T}(S_T)dS_T $$

Taking partial derivatives with respect to $K$ twice, we get the Dupire formula:

$$ \dfrac{\partial C}{\partial K} = \int^{\infty}_K -f_{S_T}(S_T)dS_T \\ \dfrac{\partial^2 C}{\partial K^2} = f_{S_T}(K) $$

We can approximate the second-order derivative numerically:

$$ f_{S_T}(K) \approx \dfrac{C(K + \Delta K) - 2C(K) + C(K - \Delta K)}{(\Delta K)^2} $$
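
As a quick numerical sanity check of this approximation (a minimal sketch that is not part of the original notebook): for a constant volatility, the central difference should recover the exact Black-Scholes lognormal density.

import numpy as np
from scipy.stats import norm

def bs_call(K, sigma, T):
    # Black-Scholes call with S0 = 1 and r = 0, as above
    d1 = (np.log(1 / K) + 0.5 * sigma**2 * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return norm.cdf(d1) - K * norm.cdf(d2)

sigma, T, K, dK = 0.2, 20 / 260, 1.0, 1e-4  # annualized vol and T in years, for concreteness
# central second difference in strike approximates the density f_{S_T}(K)
f_fd = (bs_call(K + dK, sigma, T) - 2 * bs_call(K, sigma, T) + bs_call(K - dK, sigma, T)) / dK**2
# exact lognormal density of S_T under constant volatility, for comparison
f_exact = norm.pdf(np.log(K), loc=-0.5 * sigma**2 * T, scale=sigma * np.sqrt(T)) / K
print(f_fd, f_exact)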

We previously defined $x = \log(K)$ (recall $S_0 = 1$). Given any $\Delta K$, let us define $\Delta x$ such that:

$$ \Delta x = \dfrac{\Delta K}{K} \\ \text{therefore, } \Delta K = e^{x}\Delta x $$

We define $r_T$ as the $T$-period log return, so that $r_T=\log(S_T)$ (again using $S_0=1$). By the change-of-variables rule:

$$ f_{r_T}(x) = f_{S_T}(K) \times K $$

So in conclusion we have:

$$ f_{r_T}(x) = f_{S_T}(K) \times K \approx \dfrac{C(e^x + e^x\Delta x) - 2C(e^x) + C(e^x - e^x\Delta x)}{e^x(\Delta x)^2} \\ \text{where } C(K) = \Phi\left(\dfrac{-\log K + (\sigma(x)^2/2)T}{\sigma(x)\sqrt{T}}\right) - K\,\Phi\left(\dfrac{-\log K - (\sigma(x)^2/2)T}{\sigma(x)\sqrt{T}}\right) \text{, with } x = \log K \text{ and } \sigma(x) = g(x) $$

The goal is to find the optimal volatility function $g(x)$ such that $f_{r_T}(x)$ fits the historical distribution of $r_T$.
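
Putting the pieces together, the sketch below evaluates the model density $f_{r_T}(x)$ for a candidate volatility function $g(x)$ using the finite-difference formula above. The linear $g$ here is purely hypothetical; the parameterizations actually fitted appear in the later sections. Annualized vol and $T$ in years are used for concreteness.

import numpy as np
from scipy.stats import norm

def bs_call_x(x, sigma, T):
    # Black-Scholes call written in log-moneyness x = log(K), with S0 = 1 and r = 0
    d1 = (-x + 0.5 * sigma**2 * T) / (sigma * np.sqrt(T))
    d2 = (-x - 0.5 * sigma**2 * T) / (sigma * np.sqrt(T))
    return norm.cdf(d1) - np.exp(x) * norm.cdf(d2)

def model_density(x, g, T, dx=1e-3):
    # f_{r_T}(x) via the central-difference formula, with sigma = g(.) at each shifted strike
    K = np.exp(x)
    dK = K * dx
    up = bs_call_x(np.log(K + dK), g(np.log(K + dK)), T)
    mid = bs_call_x(x, g(x), T)
    dn = bs_call_x(np.log(K - dK), g(np.log(K - dK)), T)
    return (up - 2 * mid + dn) / (np.exp(x) * dx**2)

# hypothetical downward-sloping smile: higher vol for lower strikes
g = lambda x: 0.2 - 0.3 * x
x_grid = np.linspace(-0.2, 0.2, 9)
print(model_density(x_grid, g, T=20 / 260))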


2. Data and Limitations

2.1 Data

Plotting the daily log return over the entire data period.

In [132]:
# too many windows, only display six
# `terms` (window label -> day count) and `df` (with the *_ret columns) are defined in earlier cells
cm = {'g': '#60854B', 'r': '#C75B56', 'b': '#6C8DBF'}
fig = plt.figure(figsize=(16, 32))

ax = fig.add_subplot(4, 1, 1)

# overlay the rolling-window log returns for each displayed term
for term in terms.keys():
    ax.plot(df[term + "_ret"], color=cm['r'], label=term + " log return", alpha=.8)

plt.title("SPX rolling-window log returns")
plt.xlabel('Year')
plt.ylabel('log return')
plt.legend(frameon=False, prop={'size': 12})
plt.tight_layout()

3. Term Structure Fitting

3.1 Ordinary Least Squares

3.1.1 Using the Entire Dataset

We fit the realized return distribution using the least squares method.

In [40]:
sampled_ols_result = full_model_plot({'20D': 20, '260D': 260}, df, method='ols')

Plotting the term structure of skewness with the following definition:

$$ Skewness = \dfrac{\sigma_{20\Delta put} - \sigma_{20\Delta call}}{\text{log-moneyness}_{20\Delta put} - \text{log-moneyness}_{20\Delta call}} $$
In [17]:
plot_skew_term_structure(ols_result.iloc[:-5, :], legend=['ols_fit'], ylim=(-0.5, 0))
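
For illustration, the 20-delta skew defined above could be computed from any fitted smile $g(x)$ roughly as follows (a sketch with our own helper names; the notebook's plotting helpers are defined in cells not shown here):

import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def call_delta(x, sigma, T):
    # Black-Scholes call delta with S0 = 1, r = 0 and strike K = e^x
    d1 = (-x + 0.5 * sigma**2 * T) / (sigma * np.sqrt(T))
    return norm.cdf(d1)

def skew_20delta(g, T):
    # (sigma_20d_put - sigma_20d_call) / (x_20d_put - x_20d_call) for a smile g(x)
    x_call = brentq(lambda x: call_delta(x, g(x), T) - 0.20, -0.3, 0.3)
    x_put = brentq(lambda x: call_delta(x, g(x), T) - 0.80, -0.3, 0.3)  # a 20-delta put is an 80-delta call
    return (g(x_put) - g(x_call)) / (x_put - x_call)

# hypothetical linear smile; the fitted g would come from the OLS fit above (T in years)
print(skew_20delta(lambda x: 0.2 - 0.5 * x, T=20 / 260))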

3.1.2 Bootstrapping (Resampling)

We choose the lower and upper bounds randomly, covering both central values and extreme values (a sketch of this resampling scheme follows the list):

  • $k$: uniformly distributed between 0.5 and 1.2
  • $r$: equidistant samples from $[\mu - 5 k \sigma, \mu + 5 k \sigma]$
  • In each sample, 200 points are selected.
  • 40 different sets of samples are used for fitting the curve.
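
A minimal sketch of this resampling scheme (the function name and defaults below are ours, not the notebook's):

import numpy as np

rng = np.random.default_rng(0)

def resample_grids(returns, n_sets=40, n_points=200):
    # for each set, draw k ~ U(0.5, 1.2) and build an equidistant grid of
    # return levels over [mu - 5*k*sigma, mu + 5*k*sigma]
    mu, sigma = np.mean(returns), np.std(returns)
    return [np.linspace(mu - 5 * k * sigma, mu + 5 * k * sigma, n_points)
            for k in rng.uniform(0.5, 1.2, size=n_sets)]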


We then plot the term structure of skewness with resampling.

In [21]:
plot_skew_term_structure(ols_params.iloc[:-5, :], ols_param_std.iloc[:-5, :], delta_skew=True, ylim=(-0.5, 0))

3.1.3 Volatility Regime

We create three different regimes over the entire time horizon by looking at the one-year trailing volatility of daily log returns, with cut-off volatility values of 0.12 and 0.22.

In [7]:
df1 = df.copy()
# one-year (260-day) trailing volatility of daily log returns, annualized
mov_vol = df1['r'].rolling(260).std() * 260**0.5
fig = plt.figure(figsize=(10, 8))
mov_vol.plot(color='#C75B56')
plt.axhline(y=0.12, color='black')
plt.axhline(y=0.22, color='black')
plt.text(-800, 0.12, "0.12", bbox=dict(facecolor="w",alpha=0.5),fontsize=15)
plt.text(-800, 0.22, "0.22", bbox=dict(facecolor="w",alpha=0.5),fontsize=15)
plt.title('Volatility Regime')
plt.xlabel('day');
plt.ylabel('volatility');
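
The regime labels themselves are not produced in the cell above; a minimal sketch of turning the 0.12 / 0.22 cut-offs into labels (assuming `mov_vol` from the cell above and the usual `numpy`/`pandas` imports):

import numpy as np
import pandas as pd

# map the trailing volatility to regime labels using the 0.12 / 0.22 cut-offs
regime = pd.cut(mov_vol, bins=[0, 0.12, 0.22, np.inf], labels=['low', 'medium', 'high'])
print(regime.value_counts())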


We then plot the term structure of skewness by regimes.

In [44]:
plot_skew_term_structure(ols_params, ols_params_std, legend=['low','medium','high'],delta_skew=True)

3.2 Maximum Likelihood Estimation

To improve the quality of the fit, we use the SVI parameterization to model volatility. Specifically, the SVI model can be stated as:

$$\operatorname{Var}(x) = a + b\left(\rho (x - m) + \sqrt{(x-m)^2 + \sigma^2}\right)$$

We would like to find the optimal set of ($a$, $b$, $\rho$, $m$, $\sigma$) that best fits the empirical return distribution.

Furthermore, we would like to add additional constraints to ensure that neither the volatility nor the probability density becomes negative:

$$a+b\sigma \sqrt{1-\rho ^2}\geq 0$$

$$b(1+\left | \rho \right |)\leq \frac{2}{T}$$

$$b(1+\left | \rho \right |)\leq 2$$
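
A minimal sketch of the SVI variance function and the constraint check stated above (parameter values are hypothetical; the notebook's `full_model` performs the actual MLE fit):

import numpy as np

def svi_var(x, a, b, rho, m, sig):
    # SVI variance as a function of log-moneyness x
    return a + b * (rho * (x - m) + np.sqrt((x - m)**2 + sig**2))

def svi_constraints_ok(a, b, rho, m, sig, T):
    # the three constraints listed above, with the same parameter names
    return (a + b * sig * np.sqrt(1 - rho**2) >= 0
            and b * (1 + abs(rho)) <= 2 / T
            and b * (1 + abs(rho)) <= 2)

# hypothetical parameter set for a 20-business-day window (T in years, for concreteness)
print(svi_var(np.linspace(-0.3, 0.3, 5), a=0.01, b=0.1, rho=-0.4, m=0.0, sig=0.1))
print(svi_constraints_ok(a=0.01, b=0.1, rho=-0.4, m=0.0, sig=0.1, T=20 / 260))
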
In [52]:
result_partial=full_model({'20D':20,'260D':260}, df, plot=True, method='mle')

We plot the term structure of skewness with the same definition:

$$ Skewness = \dfrac{\sigma_{20\Delta put} - \sigma_{20\Delta call}}{\text{log-moneyness}_{20\Delta put} - \text{log-moneyness}_{20\Delta call}} $$
In [47]:
plot_svi_skew(mle_params, mle_param_std, legend=['mle_fit'],delta_skew=True)

4. Further Study

As an additional analysis, we examine volatility mean reversion.

In [85]:
fig = plt.figure(figsize=(18, len(terms) * 6))
cm = {'g':'#60854B', 'r':'#C75B56', 'b':'#6C8DBF'}
index = 1

for term_str, term_day in terms.items():
    # historical volatility series for this window and its sample moments
    vol_raw = df['vol_' + term_str].dropna()
    vol_mean, vol_std = vol_raw.mean(), vol_raw.std()

    # normalize and take the absolute distance from the mean, in units of std
    vol_norm = (vol_raw - vol_mean) / vol_std
    vol_norm = abs(vol_norm)
    #vol_raw_abs = abs(vol_raw)
    #vol_mean_abs = vol_raw_abs.mean()
    #vol_mean_abs = (vol_mean_abs - vol_mean) / vol_std

    temp = pd.DataFrame(vol_norm)
    temp.index.name = "Date"
    temp.columns = ["Normalized Volatility"]
    # treat volatility as having reverted once it is within 0.1 std of its historical mean
    convergence = 0.1
    temp = temp[temp["Normalized Volatility"] < convergence]

    # bucket the normalized volatility into 0.1-std bins covering 0 to 4 std
    bins = {round(b, 1):[] for b in np.arange(0, 4.1, 0.1)}

    for time, vol in vol_norm.items():
        if vol < 0.1:
            bins[0.0].append(0)
        elif vol > 4.0:
            continue
        else:
            # first subsequent date on which volatility is back inside the convergence band
            tempp = temp[temp.index > time]
            if tempp.empty:
                break
            else:
                days = (tempp.index[0] - time).days
                bins[round(vol//0.1*0.1, 1)].append(days)
    # average number of days to revert, per starting bucket
    for i in bins.keys():
        bins[i] = np.mean(bins[i])

    avg_conv = pd.DataFrame(bins, index=["Average Convergence"]).T
    avg_conv.index.name = "bins"

    # plotting results
    ax = fig.add_subplot(len(terms), 3, 3*index-2)
    vol_raw.plot(ax=ax, color=cm['r'])
    ax.axhline(y=vol_mean, color='black')
    ax.legend(['historical volatility', 'historical mean'], loc='upper left', frameon=False)
    ax.set_xlabel('time')
    ax.set_ylabel('volatility')
    ax.set_title(term_str + " annualized volatility")

    ax = fig.add_subplot(len(terms), 3, 3*index-1)
    vol_norm.plot(ax=ax, color=cm['b'])
    #ax.axhline(y=vol_mean_abs, color='g')
    #ax.legend(['normalized volatility', 'vol_mean_abs'], loc='upper left', frameon=False)
    ax.legend(['normalized volatility'], loc='upper left', frameon=False)
    ax.set_xlabel('time')
    ax.set_ylabel('volatility')
    ax.set_title(term_str + " normalized volatility")

    ax = fig.add_subplot(len(terms), 3, 3*index)
    avg_conv.plot(ax=ax, color=cm['g'])
    ax.legend(['average convergence'], loc='upper left', frameon=False)
    ax.set_xlabel('bins in std')
    ax.set_ylabel('convergence in days')
    ax.set_title(term_str + " average convergence to historical level")

    index += 1

Next Steps:

  • Include the risk-free rate in the Dupire formula and the Black-Scholes model.
  • Explore the Heston stochastic volatility model.
  • Create regimes based on interest rates.