Category: Quantitative Finance

Risk-Neutral Probability Distributions: CLK2020

Colton Smith & Kevin Schneider

Risk-neutral probability distributions (RND) are used to compute the fair value of an asset as a discounted conditional expectation of its future payoff. In 1978, Breeden and Litzenberger presented a method to derive this distribution for an underlying asset from observable option prices [1]. The derivation of the relationship is well presented in A Simple and Reliable Way to Compute Option-Based Risk-Neutral Distributions by Allan Malz which is summarized below [2].

In the absence of arbitrage, the European call option value can be related to the discounted expected terminal value under the risk-neutral distribution.

c(t, K, \tau) = e^{-r_t \tau} \tilde{\mathbb{E}}_t\left[\max(S_T - K, 0)\right] = e^{-r_t \tau} \int_K^\infty (s-K) \tilde{\pi}_t(s) ds


\begin{aligned} t &= \text{Time of call option value observation} \\ T &= \text{Expiration} \\ \tau &= \text{Time till expiration in years (} T-t \text{)} \\ K &= \text{Strike price} \\ S_T &= \text{Underlying price at expiration} \\ S_t &= \text{Time-}t \text{ underlying price} \\ r_t &= \text{Time-}t \text{ risk-free interest rate} \\ \tilde{\mathbb{E}}_t[\cdot] &= \text{The expectation from the risk-neutral probability distribution taken at time-}t \\ \tilde{\pi}_t(\cdot) &= \text{The risk-neutral probability density function of }S_T\text{ at time-}t  \end{aligned}

Differentiating the call option value with respect to the strike price gives what Malz refers to as the “exercise-price delta”.

\frac{\partial}{\partial K}c(t, K, \tau) = e^{-r_t \tau}\left[\int_0^K \tilde{\pi}_t(s) ds - 1\right]

This result contains the risk-neutral cumulative distribution function \tilde{\Pi}_t(K) , the probability that the underlying price at expiration will be K or lower.

\tilde{\Pi}_t(K) = \int_0^K \tilde{\pi}_t(s) ds = 1 + e^{r_t \tau}\frac{\partial}{\partial K}c(t, K, \tau)

To arrive at the risk-neutral probability density function, one more derivative with respect to the strike price is needed.

\tilde{\pi}_t(K) = e^{r_t \tau}\frac{\partial^2}{\partial K^2}c(t, K, \tau)

Now we will present an overview of deriving these distributions numerically using the infamous May 2020 WTI Crude Oil contract that went negative in April. Albeit these are American options, the analysis is interesting nonetheless.

To approximate the partial derivatives, we need option prices for a fine grid of strike prices. In fact, Breeden and Litzenberger simply assume that options are traded at every positive strike price. The first step is thus to fit a smooth function to the Black-Scholes implied volatility smile of the option prices. We interpolate implied volatilities rather than prices because the former tend to be smoother and better behaved. To ensure that the prices and implied volatilities are clean, open interest can be used to weight or exclude certain strikes. We constructed the smile using out-of-the-money calls and puts as these options tend to have more open interest than their in-the-money equivalents. Numpy’s polyfit is then used to interpolate and extrapolate the smile as needed.


The second step is to calculate the call prices using the Black-Scholes formula for constant rates with your fitted implied volatility function, underlying price, time to expiration, and observed risk-free rate.


The third step, following the mathematical derivation above, is to calculate \tilde{\Pi}_t(K) using the first derivative of the call price with respect to the strike price. Numpy’s diff function will numerically difference your call price function. This effectively corresponds to using a forward difference to approximate the partial derivative.


To compare these distributions across strike prices, using moneyness, \frac{K}{S_t} , as the x-axis is helpful. The fourth and final step, is to calculate \tilde{\pi}_t(K) using the second derivative of the call price with respect to the strike price and scaling it by e^{r_t\tau} .


It can be difficult to generate clean RNDs and through this process it became clear how cumbersome it would be to generate these over an extended time series. To evaluate the quality of the fit, we need to check a few conditions.

– The cumulative distribution needs to be bounded between 0 and 1
– The probability density function needs to integrate to 1 and remain positive
– The exercise-price delta needs to be monotone and bounded by -e^{-r_t\tau} and 0
– The strike weighted probability density function needs to integrate to the underlying futures price

There are additional arbitrage conditions to consider on the fitted implied volatility smile but our distributions meet the above conditions nicely so will be sufficient for our analysis. In fact, the absence of arbitrage is one of the few assumptions needed for the above mathematical derivation to hold. Further implicit assumptions include constant interest rates, that the call option price is twice differentiable and that a (smooth) probability density function of the price of the underlying asset exists to start with. Importantly, there are no further restrictions on the probabilistic nature of the underlying asset price process.

The above technique was repeated on the CLK2020 option chain for the four dates shown in the figure below to see how the option implied volatility RNDs reacted to the developing macro landscape including Covid-19, OPEC+, and crude oil physical storage capacity. The analysis was not extended into April as the possibility of negative prices violate some of the fundamental assumptions used.


The resulting risk-neutral densities below tell quite the colorful story. In January, the distribution has a slight negative skew but high kurtosis. In February and March, the tail risk begins to emerge as the kurtosis decreases. Finally in mid-March the distribution flattens, presenting a very expressive RND.


The corresponding fitted implied volatility smiles are shown below.


At the time, all eyes were on the calendar spreads and bear spreading the futures, short a nearby contract expiration month and long a deferred contact expiration month, was a popular trade. Based on the RND, this trade made sense if you believed that the bulk of the tail risk was in the nearby contract. Using the QuikStrike Commitment of Traders tool on the CME website we can look at the positioning of traders through this event [3]. The plots below show the positioning of the two groups that make up the “Non-Commercial” group of traders which includes hedge funds and large speculators.



The “Managed Money” began the year with a larger spreading position (likely bear?) which decreased into February as a more concentrated short position grew. In April and May, they capitalized on the series of events, quickly building a fervent long position to ride the rally back up.

The “Other Reportables” held a large spreading positioning from the end of February until May, likely capturing the historical calendar spread move nicely.

It is important to emphasize that risk-neutral distributions are different from the real-world distributions which govern the likelihood of events in financial markets. A RND is an artificial probability distribution which allows the computation of option prices without specifying the risk aversion of investors (this insight is the reason for Scholes’ and Merton’s Nobel Prize in 1997). Thus, they are a very convenient mathematical tool but not a description of the real world.

Intuitively, RNDs merge real-world probabilities with investor’s attitudes toward risk. They tend to inflate the probability of economically bad states (which are rare but investors are very fearful of them) and understate the probability of booms. Hence, RNDs tend to be more negatively skewed than real-world distributions are.

In fact, the above idea can be formulated more rigorously. Let’s use a discrete setting for simplicity. Suppose we want to price an asset which generates a random cash flow of X tomorrow (e.g. payoff of a call option). Finance theory suggests that today’s price, c , should equal

c = \mathbb{E}[XM],

where the random variable M takes the associated risk into account. Riskier payoffs should yield lower prices. M is called the stochastic discount factor. In a discrete setting, the expectation is just the sum of outcomes weighted by the corresponding probabilities. Thus,

\begin{aligned} c&=\sum_i x_i M_i p_i \\ &= \frac{1}{1+r} \sum_i x_iq_i \\ &= \frac{1}{1+r} \tilde{E}[X], \end{aligned}

where q_i = \frac{M_ip_i}{\sum M_ip_i} . Hence, \sum q_i=1 and q_i\in[0,1] making them valid pseudo-(risk-neutral)-probabilities. Furthermore, denoting the risk-free rate by r , and because 1=\sum (1+r) M_i p_i , we have \frac{1}{\sum M_ip_i}=1+r , explaining the second equality. The final line is simply the discrete analogue to the equation at the beginning of this blog post: prices are discounted risk-neutral expectations of future payoffs.

This short derivation illustrates how merging the risk aversion (in form of M ) and the real life probabilities (p ) yield the risk-neutral probabilities (q ). This however distorts and dilutes the probabilities. We can recover the risk-neutral probabilities q from option prices, but this does not tell us what the real probabilities p are because we do not know the right stochastic discount factor.

Changes in the RND can be attributed to changes in investor’s preferences, real-world probabilities, or both. A priori, it is impossible to know which of the three cases is true. Disentangling real-world probabilities from risk aversion typically requires a general equilibrium model of the economy and remains an open question for current research, see Ross (2015) [4] for a recent attempt.






Using the Dynamic Mode Decomposition (DMD) to Rotate Long-Short Exposure Between Stock Market Sectors

Co-Author: Eric Kammers

Part 1 – Theoretical Background

The Dynamic Mode Decomposition (DMD) was originally developed for its application in fluid dynamics where it could decompose complex flows into simpler low-rank spatio-temporal features. The power of this method lies in the fact that it does not depend on any principle equations of the dynamic system it is analyzing and is thus equation-free [1]. Also, unlike other low-rank reconstruction algorithms like the Singular Value Decomposition (SVD), the DMD can be used to make short-term future state predictions.

The algorithm is implemented as follows [2].

1. We begin with a {N}x{M} matrix, {\mathbf{X}}, containing data collected from {N} sources over {M} evenly spaced time periods, from the system of interest.

2. From this matrix two sub-matrices are constructed, {\mathbf{X}_1^{M-1}} and {\mathbf{X}_2^M}, which are defined below.

\displaystyle \mathbf{X}_j^k = \begin{bmatrix} \mathbf{x}(t_j) & \mathbf{x}(t_{j+1}) & ... & \mathbf{x}(t_k) \end{bmatrix}
\displaystyle \mathbf{X}_1^{M-1} = \begin{bmatrix} \mathbf{x}_1 & \mathbf{x}_2 & ... & \mathbf{x}_{M-1} \end{bmatrix}
\displaystyle \mathbf{X}_2^{M} = \begin{bmatrix} \mathbf{x}_2 & \mathbf{x}_3 & ... & \mathbf{x}_{M} \end{bmatrix}

We can consider a Koopman operator {\mathbf{A}} such that {\mathbf{x}_{j+1} = \mathbf{Ax}_j} and rewrite {\mathbf{X}_1^{M-1}} as

\displaystyle \mathbf{X}_1^{M-1} = \begin{bmatrix} \mathbf{x}_1 & \mathbf{A}\mathbf{x}_1 & ... & \mathbf{A}^{M-2}\mathbf{x}_1 \end{bmatrix}

whose columns now are elements in a Krylov space.

3. The SVD decomposition of {\mathbf{X}_1^{M-1}} is computed.

\displaystyle \mathbf{X}_1^{M-1} = \mathbf{U \Sigma V^*}

Then based on the variance captured by the singular values and the application of the algorithm, the number of desired reconstructions ranks can be chosen.

4. The matrix {\mathbf{A}} is constructed such that it is the best mapping between the two sub-matrices.

\displaystyle \mathbf{A}\mathbf{X}_1^{M-1} \approx \mathbf{X}_2^M

{\mathbf{A}} can be approximated with {\mathbf{\tilde{A}}} from evaluating the expression

\displaystyle \mathbf{\tilde{A}} = \mathbf{U^*X_2^M V\Sigma^{-1}}

where {\mathbf{U}}, {\mathbf{\Sigma}}, and {\mathbf{V}} are the truncated matrices from the SVD reduction of {\mathbf{X}_1^{M-1}}. The eigenvalue problem associated with {\mathbf{\tilde{A}}} is

\displaystyle \mathbf{\tilde{A}}\mathbf{y}_k = \mu_k \mathbf{y}_k \qquad k = 1,2,..,K

where {K} is the rank of approximation that was chosen previously. The eigenvalues {\mu_k} contain information on the time dynamics of our system and the eigenvectors can be used to construct the DMD modes.

\displaystyle \mathbf{\psi}_k = \mathbf{Uy}_k

5. The approximated solution for all future times, {\mathbf{x}_{DMD}(t)}, can now be written as

\displaystyle \mathbf{x}_{DMD}(t) = \sum_{k=1}^K b_k(0) \mathbf{\psi}_k(\mathbf{x}) \exp{(\omega_k t)} = \mathbf{\Psi} \text{diag}(\exp{(\omega t)}) \mathbf{b}

where {\omega_k = \ln(\mu_k)/\Delta t}, {b_k(0)} is the initial amplitude of each mode, {\mathbf{\Psi}} is the matrix whose columns are eigenvectors {\mathbf{\psi}_k}, and {\mathbf{b}} is the vector of coefficients. Finally, all that needs to be computed is the initial coefficient values {b_k(0)} which can be found by looking at time zero and solving for {\mathbf{b}} via a pseudo-inverse in the equation

\displaystyle \mathbf{x_0} = \mathbf{\Psi b}

To summarize the algorithm, we will “train” a matrix {\mathbf{\tilde{A}}} on a subset of the data whose eigenvalues and eigenvectors contain necessary information to make future state predictions for a given time horizon.

Part 2 – Basic Demonstration

We begin with a basic example to demonstrate how to use the pyDMD package. First, we construct a matrix \mathbf{X} where each row is a snapshot in time and each column can be thought of as a different location in our system being sampled.

\displaystyle \mathbf{X} = \begin{bmatrix} -2 & 6 & 1 & 1 & -1 \\ -1 & 5 & 1 & 2 & -1 \\ 0 & 4 & 2 & 1 & -1 \\ 1 & 3 & 2 & 2 & -1 \\ 2 & 2 & 3 & 1 & -1 \\ 3 & 1 & 3 & 2 & -1 \\ \end{bmatrix}

Now we will attempt to predict the predict the 6th row using a future state prediction from the DMD fitted on the first 5 rows.

import numpy as np
from pydmd import DMD
df = np.array([[-2,6,1,1,-1],

dmd = DMD(svd_rank = 2) # Specify desired truncation
train = df[:5,:] # Fit the model on the first 5 rows
dmd.dmd_time['tend'] *= (1+1/6) # Predict one additional time step
recon = dmd.reconstructed_data.real.T # Make prediction

print('Actual :',df[5,:])
print('Predicted :',recon[5,:])

Two SVD ranks were used for the reconstruction and the result is pleasantly accurate for how easily it was implemented.

\displaystyle \mathbf{x_{True}}(6) = \begin{bmatrix} 3 & 1 & 3 & 2 & -1 \end{bmatrix}

\displaystyle \mathbf{x_{DMD}}(6) = \begin{bmatrix} 2.8 & 0.8 & 2.6 & 2.0 & -0.9 \end{bmatrix}

Part 3 – Sector Rotation Strategy

We will now attempt to model the stock market as a dynamic system broken down by sectors and use the DMD to predict which sectors to be long and short in over time. This is commonly known as a sector rotation strategy. To ensure that we have adequate historical data we will use 9 sector ETFs: XLY, XLP, XLE, XLF, XLV, XLI, XLB, XLK, and XLU from 2000-2019 and rebalance monthly. The strategy is implemented as follows:

  1. Fit a DMD model using the last N months of monthly returns. The SVD rank reconstruction number can be chosen as desired.
  2. Use the DMD model to predict the next month’s snapshot which are the returns of each ETF.
  3. Construct the portfolio by taking long positions in the top 5 ETFs and short positions in the bottom 4 ETFs. Thus, we are remaining very close to market neutral.
  4. Continue this over time by refitting the model monthly and making a new prediction for the next month.

Though the results are quite sensitive to changes in the model parameters, some of the best parameters achieve Sharpe ratios superior to the long only portfolio while remaining roughly market neutral which is very encouraging and warrants further exploration with a proper, robust backtest procedure.


The code and functions used to produce this plot can found here. There are also many additional features of the pyDMD package that we did not explore that could potentially improve the results. If you have any questions, feel free to reach out by email at


[1] N. Kutz, S. Brunton, B. Brunton, and J. Proctor, Dynamic Mode Decomposition: Data-Driven Modeling of Complex Systems. 2016.

[2] Mann, Jordan & Nathan Kutz, J. Dynamic Mode Decomposition for Financial Trading Strategies. Quantitative Finance. 16. 10.1080/14697688.2016.1170194. 2015.