📖 Portfolio Theory

Portfolio Diversification

Two-Asset Portfolio

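Consider an investment portfolio on two assets with returns $r_1$ and $r_2$, holding weight $w$ in the first asset and $1-w$ in the second:

$$r_p = w r_1 + (1-w) r_2$$

We can calculate the mean and variance of the return on the portfolio, based on the mean and variance of the return on each asset, where $\rho$ denotes the correlation between the two returns:

$$\mu_p = w\mu_1 + (1-w)\mu_2$$

$$\sigma_p^2 = w^2\sigma_1^2 + (1-w)^2\sigma_2^2 + 2w(1-w)\rho\sigma_1\sigma_2$$

We can see that if $\rho < 1$, we have diversification, where $\mu_p$ is linear in the portfolio allocation while the standard deviation $\sigma_p$ is convex.

When $\rho = -1$, the portfolio variance can be as small as desired. If we set $w = \sigma_2/(\sigma_1+\sigma_2)$ then $\sigma_p = 0$ and the portfolio becomes riskless.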

In the following chart we can see the return volatility of a two-asset portfolio plotted against different weights and correlations.

img1.png

We see that as long as the correlation between the two assets is not perfect, there exists a weight that minimizes the portfolio variance (not yet taking portfolio return into consideration).
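The minimizing weight can be sketched numerically. The snippet below is a minimal illustration, assuming the standard two-asset variance formula with weight `w` in asset 1 and `1 - w` in asset 2; the function names and the sample volatilities are hypothetical, and the weight is unconstrained (it may imply a short position).

```python
import math

def portfolio_vol(w, sigma1, sigma2, rho):
    """Volatility of a portfolio with weight w in asset 1 and 1 - w in asset 2."""
    var = ((w * sigma1) ** 2 + ((1 - w) * sigma2) ** 2
           + 2 * w * (1 - w) * rho * sigma1 * sigma2)
    return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding errors

def min_variance_weight(sigma1, sigma2, rho):
    """Weight in asset 1 that minimizes variance (from the first-order condition)."""
    cov = rho * sigma1 * sigma2
    denom = sigma1 ** 2 + sigma2 ** 2 - 2 * cov
    if denom == 0:  # only when rho = 1 and both volatilities are equal
        raise ValueError("every weight gives the same variance")
    return (sigma2 ** 2 - cov) / denom

sigma1, sigma2 = 0.20, 0.30  # illustrative volatilities (assumption)
for rho in (-1.0, 0.0, 0.5):
    w_star = min_variance_weight(sigma1, sigma2, rho)
    print(rho, round(w_star, 4), round(portfolio_vol(w_star, sigma1, sigma2, rho), 4))
```

With a correlation of -1 the minimizing weight reduces to the second volatility divided by the sum of the two volatilities, and the resulting portfolio volatility is zero up to rounding.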

Multi-Asset Portfolio

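Consider $n$ assets with return volatility $\sigma_i$ and covariance $\sigma_{ij}$. Let $w_i$ denote the allocation to asset $i$. Then the variance of portfolio return becomes:

$$\sigma_p^2 = \sum_{i=1}^{n} w_i^2\sigma_i^2 + \sum_{i=1}^{n}\sum_{j \neq i} w_i w_j \sigma_{ij}$$

In the case of an equally-weighted portfolio with $w_i = 1/n$:

$$\sigma_p^2 = \frac{1}{n^2}\sum_{i=1}^{n}\sigma_i^2 + \frac{1}{n^2}\sum_{i=1}^{n}\sum_{j \neq i}\sigma_{ij}$$

We define:

$$\overline{\sigma^2} = \frac{1}{n}\sum_{i=1}^{n}\sigma_i^2, \qquad \overline{\sigma_{ij}} = \frac{1}{n(n-1)}\sum_{i=1}^{n}\sum_{j \neq i}\sigma_{ij}$$

Therefore,

$$\sigma_p^2 = \frac{1}{n}\,\overline{\sigma^2} + \frac{n-1}{n}\,\overline{\sigma_{ij}}$$

We conclude that in an equally weighted portfolio (or a diversified portfolio where every weight $w_i$ is small) with a large number of assets,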

  • the individual asset return variances become unimportant to the portfolio return variance
  • the portfolio variance instead now depends on the average covariances between the assets.

Here, the average return covariance is the systematic risk that cannot be eliminated through diversification, whereas the average return volatility is the idiosyncratic risk that is diversifiable.
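The limiting behaviour can be checked numerically. This is a small sketch, assuming every asset has the same variance and every pair the same covariance (so the averages are trivial); the helper names and the numbers 0.09 and 0.02 are illustrative only.

```python
def portfolio_variance(weights, cov):
    """Brute-force w' Sigma w for a list of weights and a covariance matrix."""
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))

def equal_weight_variance(n, avg_var, avg_cov):
    """Closed form: (1/n) * average variance + ((n-1)/n) * average covariance."""
    return avg_var / n + (n - 1) / n * avg_cov

avg_var, avg_cov = 0.09, 0.02  # illustrative inputs (assumption)
for n in (1, 5, 50, 500):
    cov = [[avg_var if i == j else avg_cov for j in range(n)] for i in range(n)]
    brute = portfolio_variance([1.0 / n] * n, cov)
    print(n, round(brute, 6), round(equal_weight_variance(n, avg_var, avg_cov), 6))
```

As the number of assets grows, the printed variance approaches the average covariance, which is the part that diversification cannot remove.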

Note that complete diversification ($\sigma_p^2 = 0$) is achieved when:

  • $n \to \infty$ and the average covariance is zero, for a multi-asset portfolio
  • the correlation is $-1$, for a two-asset portfolio

Mean-Variance Frontier

In a mean-variance space, the set of all possible portfolios with $n$ assets forms a convex set. The boundary of this set is known as the mean-variance frontier and forms a parabola.

img2.png

The top half of the MV frontier is the set of efficient MV portfolios which maximize mean return given return variance.

Let us define as the random variable vector of asset returns on assets:

A particular portfolio is defined by the weights assigned to various assets, and we denote the weights vector . The portfolio return is also a random variable, where:

Also,

The GMV and Tangent Portfolios

The Global Minimum Variance (GMV) portfolio has the lowest return variance among all possible portfolios, characterized by the leftmost point on the MV frontier. It can be constructed with weight which minimizes the objective function under the constraint of :

The Tangent portfolio is a portfolio with the highest mean/variance ratio among all possible portfolios, characterized by the point that is tangent to the MV frontier and going through the origin:

img3.png

The MV Portfolio

It turns out that any portfolio on the efficient MV frontier can be constructed as a linear combination of the GMV and tangent portfolios, which solves the following optimization:

Thus a portfolio is a MV portfolio if and only if there exists such that:

Consider MV investors who focus only on the mean and variance of a portfolio. Such investors will hold only MV portfolios (which are linear combinations of just two funds).

Excess Return with Risk-Free Asset

Consider the existence of a risk-free asset with return that has zero variance and zero correlation with other assets. The mean excess return is defined as:

And the mean excess return of a portfolio with weight is:

Since the risk-free asset has no variance, the return variance of the portfolio is still

A MV portfolio with a risk-free asset is a vector which solves the following optimization:

Note that the constraint that weights sum to $1$ is now dropped, with the inclusion of the risk-free asset.

Thus a portfolio with mean excess return is a MV portfolio with a risk-free asset if:

Where,

This result shows that with a risk-free asset, any MV portfolio simply contains a position in the tangency portfolio and a position in the risk-less asset (a.k.a. Two-Fund Separation).

Interesting facts w.r.t. the tangency portfolio :

  • is the unique portfolio that is on both the risky and risk-less MV frontiers
  • is the point on the risky MV frontier at which the tangency line goes through point (0, risk-free rate).

img4.png

Sharpe Ratio

We define the Sharpe ratio (SR) of a portfolio as:

Therefore, the tangency portfolio is the portfolio on the risky MV frontier with the maximum Sharpe ratio.

On the risk-free efficient MV frontier (a.k.a. the Capital Market Line), all portfolios have the same SR as the tangency portfolio, since the frontier itself is a straight line.

Sortino Ratio

The Sortino ratio improves upon the Sharpe ratio by penalizing the down-side volatility only.
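Both ratios can be estimated from a sample of returns. The sketch below assumes one common convention: the Sharpe ratio divides the mean excess return by the sample standard deviation, and the Sortino ratio divides it by the root-mean-square of the negative excess returns only. Other conventions (different targets or denominators) exist, and the sample returns are made up.

```python
import math

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return over the sample standard deviation of excess returns."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(var)

def sortino_ratio(returns, risk_free=0.0):
    """Mean excess return over downside deviation (only negative excess returns count)."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    downside_var = sum(min(x, 0.0) ** 2 for x in excess) / len(excess)
    return mean / math.sqrt(downside_var)

sample = [0.02, -0.01, 0.03, 0.01, -0.02]  # hypothetical periodic returns
print(round(sharpe_ratio(sample), 4), round(sortino_ratio(sample), 4))
```

Because upside moves do not enter the Sortino denominator, a return series with large gains and small losses scores higher on Sortino than on Sharpe.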

Linear Factor Model

Based on the First Fundamental Theorem of Asset Pricing, given no arbitrage there exists a risk neutral probability measure and a change of measure (R-N derivative) such that for any tradable asset ,

proposition is a linear function of :

Such that all portfolio returns have a factor-beta representation w.r.t. the tangency portfolio,

By mathematical identity, this will hold in sample exactly.

proof Consider the tangency portfolio from the risk-free MV frontier:

Therefore,

And since,

We can show that,

In addition, the covariance can be shown as,

Thus,

Generalization

The factor-beta representation is not unique to the tangency portfolio. In fact, it holds for any arbitrary MV portfolio,

We will focus on the tangency portfolio, W.L.O.G.

Practical Consideration

The factor-beta representation seems to provide a way to estimate the mean return for any given portfolio. However, it is difficult to calculate the tangency weight in practice, due to either circularity (direct estimation) or imprecision (inverting

The Linear Factor Model makes an assumption regarding the identity of the tangency portfolio, which avoids the issues stated above.

The LFM assumed tangency portfolio is only used for the pricing of expected returns. However, additional assumptions can be made regarding investor’s MV preference, such that the assumed tangency portfolio will also be used in actual asset allocation.

CAPM

The most famous LFM is the Capital Asset Pricing Model, which assumes a value-weighted market portfolio of all available assets as the tangency portfolio.

The CAPM is a relative pricing formula, which states that the expected return of any asset can be expressed as the sum of the risk-free rate and a portion of the market risk premium. In other words, it says that the expected excess return/risk premium of an asset is proportional to the market risk premium. The factor $\beta$ is estimated by regression. CAPM also asserts that market beta is the only risk associated with higher average returns, and that volatility, skewness, and other covariances do not matter in determining the risk premium.

We can also rewrite the formula as follows:

This shows that the Sharpe ratio earned on an asset depends only on the correlation between asset return and market returns.

There are two ways to derive CAPM:

  1. If we assume that returns are jointly normal, then the mean and variance are the sufficient statistics for the return distribution, and thus every investor holds a portfolio on the risk-less MV frontier, which is a combination of the tangency portfolio and the risk-free asset. Therefore aggregating across all investors, the market portfolio of all investments is the tangency portfolio.
  2. If we do not assume jointly normal returns, but instead assume that investors care only about the mean and variance of returns, then all investors will also choose MV portfolios, and therefore CAPM holds.

Treynor’s Ratio

Fama-French Model

The Fama-French 3-factor model is a well-known multi-factor model:

Where is the excess market return as in CAPM, is a portfolio that goes long small stocks and shorts large stocks, and is a portfolio that goes long value stocks (low market price per fundamental) and shorts growth stocks.

The FF model states that it is the beta to the value and small-stock factors that earns a premium, NOT being a value or small stock itself. In other words, the premium is earned on how a stock acts, not how it is classified.



📖 Option Theory

This is a study note on the fundamental theory of the pricing of a financial derivative, whose payoff is defined in terms of an underlying asset. We hereby try to compute a consistent price of the derivative in relative terms to the market price of the underlying asset.

Option Pricing Theory

We make our first assumption that the market is frictionless, by which we mean that:

  • no transaction cost (commission, bid-ask spread, taxes)
  • can hold negative asset (shorting) and there is no margin constraint
  • can hold fractional asset
  • no market impact from trading

Arbitrage (Static Portfolio)

We assume that the market lives in a probability space and it includes tradable assets with non-random time- prices and random time- prices:

A static portfolio is a vector of quantities, where each is non-random and constant in time:

Thus the time- value of the static portfolio is:

A static portfolio is an arbitrage if its value satisfies that:

Suppose portfolio super-replicates portfolio , which means that . Then , otherwise arbitrage exists. The same goes if it is a sub-replication. Therefore, if replicates , which means that , then . This is called the law of one price.

Assets

Discount Bond

A discount bond pays at maturity . Given non-random interest rate , the no-arbitrage price of the discount bond is:

Forward Contract

A forward contract on with non-random delivery price obligates its holder to pay and receive at time . The time- value of the forward contract is .

A forward price is delivery price such that the value of forward contract at time- is zero.

European Call Option

A European call option gives its holder the right at time to pay and receive . A call has payoff , and it is in the money if at time .

The time- price of a call option satisfies:

For strike :

European Put Option

A European put option gives its holder the right at time to pay and receive . A put has payoff , and it is in the money if at time .

The time- price of a put option satisfies:

For strike :

In addition,

Put-Call Parity

Binomial Tree

We can create a replicating portfolio to calculate the value of a call option under a simple binomial tree:

Where,

And,

Plugging in and :

We can interpret and as probabilities that construct a risk-neutral measure and that:
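The equivalence between replication and risk-neutral pricing in a one-step tree can be sketched as follows. This is a minimal illustration under assumed notation that is not from the original notes: the stock moves from `s0` to `s0 * u` or `s0 * d`, the bond grows by `exp(r * dt)`, and the up-move risk-neutral probability is `q`.

```python
import math

def one_step_binomial_call(s0, k, u, d, r, dt):
    """Price a call on a one-step tree by replication and by risk-neutral expectation."""
    growth = math.exp(r * dt)
    if not d < growth < u:
        raise ValueError("need d < exp(r*dt) < u, otherwise the tree admits arbitrage")
    c_up, c_down = max(s0 * u - k, 0.0), max(s0 * d - k, 0.0)
    # Replicating portfolio: `shares` of stock plus `bond` (present value) in the bond.
    shares = (c_up - c_down) / (s0 * (u - d))
    bond = (u * c_down - d * c_up) / ((u - d) * growth)
    price_by_replication = shares * s0 + bond
    # Risk-neutral probability of the up move and the discounted expected payoff.
    q = (growth - d) / (u - d)
    price_by_expectation = (q * c_up + (1 - q) * c_down) / growth
    return price_by_replication, price_by_expectation, q

print(one_step_binomial_call(s0=100.0, k=100.0, u=1.2, d=0.8, r=0.0, dt=1.0))
```

Both prices agree for any admissible inputs, and the physical probability of the up move never enters the calculation.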

The Fundamental Theorem

The fundamental theorem of asset pricing states that:

no arbitrage

if and only if:

there exists a probability measure equivalent to P such that the discounted prices of all tradable assets are martingales w.r.t.

The proof can be summarized as two ideas:

  • :

    A martingale is the cumulative P&L from betting on zero-expectation games, which always has zero expected value no matter how you vary your bet size across games and time. You cannot risklessly make something from nothing.

  • :

    The probability of an event is simply the price of an asset that pays 1 unit of B iff that event happens.

Risk-Neutral Measure

The physical probability is not accurate in evaluating a payoff’s true market price. Considering a 50/50 coin flip worth or nothing. Using physical probability the price will be .

However, the actual market price would be different. If the market is risk-averse, the price would be lower, say . We can view this market as representing a risk-neutral measure where the down move has a higher risk-neutral probability than the up move.

We can see that the risk-neutral probability is price, that the risk-neutral probability of an event is the price of one-unit payout contingent on the event. Taking a risk-neutral expectation is the same as pricing by replication.

Radon-Nikodym derivative

In a discrete setting with outcomes , the relationship between the risk-neutral measure and physical measure can be expressed by the Radon-Nikodym Derivative, or likelihood ratio:

The LR is typically larger in bad states than good states, reflecting the price margin on adverse events.

The Second Fundamental Theorem

A market is said to be complete if every random variable can be replicated by a static portfolio .

The second fundamental theorem of asset pricing states that:

a no arbitrage market is complete

if and only if:

there exists a unique measure equivalent to P such that the discounted prices of all tradable assets are martingales w.r.t.

Trading Strategy

A filtration represents all information revealed at or before time . A stochastic process is adapted to if is -measurable for each , meaning that the value of is determined by the information in .

A trading strategy is a sequence of static portfolios adapted to . A trading strategy is self-financing if for all :

This implies that the change in the portfolio value is fully attributable to gains and losses in asset prices:

Therefore,

We say that a trading strategy replicates a time-T payoff if it is self-financing and the value . By the law of one price, at any time , the no-arbitrage price of an asset paying must have the same value as the replicating portfolio.

Arbitrage (Trading Strategy)

We now expand on the previous definition of arbitrage: an arbitrage is a self-financing trading strategy whose value satisfies:

Ito Process

We define an Ito process to be a stochastic process that:

The existence and uniqueness of a solution of can be guaranteed by Lipschitz-type technical condition on and

Ito’s Rule

Ito's rule states that given an Ito process , and a sufficiently smooth function :

With two processes and , and :

In a special case where , the formula becomes:

Note that Ito’s Rule applies under any probability measure; it is purely mathematical.

Black-Scholes Model

Assumptions Consider two basic assets and in continuous time, where:

And follows GBM dynamics,

Conclusion Then by no-arbitrage and Ito's rule, the time- price of a call option with payoff satisfies the Black-Scholes PDE for

We can solve the call price analytically with the Black-Scholes formula:
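As a numerical companion, the snippet below evaluates the standard Black-Scholes call formula for a non-dividend stock. The parameter names (`s` for spot, `k` for strike, `r` for the rate, `sigma` for volatility, `tau` for time to maturity) are this sketch's own choice and may differ from the notation of the notes.

```python
import math

def norm_cdf(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, sigma, tau):
    """Black-Scholes price of a European call with time to maturity tau > 0."""
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)

# Compare the price with the no-arbitrage lower bound max(S - K e^{-r tau}, 0).
for spot in (80.0, 100.0, 120.0):
    lower = max(spot - 100.0 * math.exp(-0.05), 0.0)
    print(spot, round(bs_call(spot, 100.0, 0.05, 0.2, 1.0), 4), round(lower, 4))
```

The call price stays above the lower bound for every spot level, and the gap narrows as the option moves deep in the money.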

Here we plotted the BS call price , the intrinsic value and the lower bound against the current underlying price , with parameters , , and

img.png

The Greeks

Delta

Suppose an asset has a time t value , then its Delta at time is . Delta can be interpreted as:

  • the slope of the asset value , plotted as a function of $S_t$.
  • how much the asset value moves per unit move in
  • number of needed to replicate this asset.

If the asset is a call option on and we assume the Black-Scholes assumptions on , then:

The Delta of a call option is strictly between 0 and 1. As the time-to-maturity decreases, the Delta increases faster as the option becomes more ITM. Here we plotted the BS Delta for equals and against the current underlying price .

img2.png

Gamma

For a call option in a B-S model,

In this case, the Gamma can be interpreted as:

  • the convexity of w.r.t. $S_t$
  • how much the Delta moves, per unit move in
  • how much rebalancing of the replicating portfolio is needed, per unit move in

The Gamma of a call option is strictly positive. As the time-to-maturity decreases, the Gamma increases for ATM options. Here we plotted the BS Gamma for equals and against the current underlying price .

img4.png

Theta

For a call in B-S model,

The Theta of a call option is strictly negative. As the time-to-maturity decreases, the Theta decreases for ATM options (faster time-decay). Here we plotted the BS Theta for equals and against the current underlying price .
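The three Greeks discussed above can be cross-checked against finite differences of the call price. This is a sketch using the standard closed forms for a non-dividend Black-Scholes call; Theta is taken here as the derivative with respect to calendar time, i.e. minus the derivative with respect to time to maturity `tau`, and all names are illustrative.

```python
import math

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def d1_d2(s, k, r, sigma, tau):
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    return d1, d1 - sigma * math.sqrt(tau)

def call_price(s, k, r, sigma, tau):
    d1, d2 = d1_d2(s, k, r, sigma, tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)

def call_delta(s, k, r, sigma, tau):
    return norm_cdf(d1_d2(s, k, r, sigma, tau)[0])

def call_gamma(s, k, r, sigma, tau):
    d1, _ = d1_d2(s, k, r, sigma, tau)
    return norm_pdf(d1) / (s * sigma * math.sqrt(tau))

def call_theta(s, k, r, sigma, tau):
    d1, d2 = d1_d2(s, k, r, sigma, tau)
    return (-s * norm_pdf(d1) * sigma / (2.0 * math.sqrt(tau))
            - r * k * math.exp(-r * tau) * norm_cdf(d2))

args = (100.0, 100.0, 0.05, 0.2, 1.0)  # illustrative s, k, r, sigma, tau
h = 1e-2
fd_gamma = (call_price(100.0 + h, *args[1:]) - 2 * call_price(*args)
            + call_price(100.0 - h, *args[1:])) / h ** 2
print(call_delta(*args), call_gamma(*args), fd_gamma, call_theta(*args))
```

The closed forms keep Delta between 0 and 1, Gamma positive, and Theta negative for a call when the rate is non-negative, in line with the statements above.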

img3.png

Discrete Delta Hedge and Gamma Scalping

A discretely Delta-hedged portfolio could buy and short . In this case it is a Delta neutral and long Gamma/Gamma scalping portfolio:

  • Delta of the portfolio is
  • Gamma of the portfolio is positive
  • achieve net profit only if the realized volatility of is high enough to overcome time decay, otherwise portfolio loss happens. This is the opposite from a short Gamma position, e.g. sell and long Delta

We can visualize the P&L of a long Gamma portfolio in the following graph, where the green area indicates profits and the red area indicates losses. The curved line is and the straight line is . As increases, shifts downwards due to time-decay.

img5.png

In addition, we can show that the P&L of such portfolio does not depend on the drift of the stock:


Numerical Methods

The Taylor series of a real or complex value function that is differentiable at is:

Implied Volatility

Given the time- price of a European call option on a non-dividend stock , the time- Black Scholes implied volatility is the unique solution to .

Uniqueness holds because is strictly increasing in , and existence holds because covers the full range of arbitrage-free prices of the European option.
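Because the Black-Scholes call price is strictly increasing in volatility, the implied volatility can be found by simple bisection. The following is a minimal sketch, not a production solver: the bracket `[1e-6, 5.0]`, the tolerance, and the function names are assumptions of this example.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(s, k, r, sigma, tau):
    d1 = (math.log(s / k) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    return s * norm_cdf(d1) - k * math.exp(-r * tau) * norm_cdf(d2)

def implied_vol(price, s, k, r, tau, lo=1e-6, hi=5.0, tol=1e-10):
    """Bisection on sigma; relies on bs_call being increasing in sigma."""
    if not bs_call(s, k, r, lo, tau) <= price <= bs_call(s, k, r, hi, tau):
        raise ValueError("price is outside the range attainable within the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(s, k, r, mid, tau) > price:
            hi = mid  # model price too high, so volatility must be lower
        else:
            lo = mid
    return 0.5 * (lo + hi)

quoted = bs_call(100.0, 110.0, 0.03, 0.25, 0.5)  # synthetic quote from sigma = 0.25
print(implied_vol(quoted, 100.0, 110.0, 0.03, 0.5))
```

Bisection is slower than Newton's method but cannot overshoot, which makes it a safe default when Vega is small (deep in or out of the money).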

If follows the SDE dynamic , where a non-random function of , then we can first find the implied volatility given call prices with different maturity , and use the equation below to find (not uniquely) the true function :

Volatility Smile, Skew and Surface

If truly follows GBM with constant volatility , then . However, empirically the is lower when (volatility smile), possibly because

  • the market prices options using a risk-neutral distribution of log-returns with fatter tails than Normal

Note that is also higher when (volatility skew), possibly due to:

  • instantaneous volatility increases as price decreases
  • possibility of severe crash fuels demand for downside protection

In addition, the has a term structure and varies for different . The function is called the implied volatility surface.

Tree Model

Binomial Tree

European Option

Given option price at the -th node , we can induct backward to find :

American Option - Put

Given option price at the -th node , we can induct backward to find :

American Option - Call

Given option price at the -th node . If and stock dividend , then it is never optimal to exercise early on an American call option. Therefore

Argument 1 At all , the American call is worth more than the exercise payoff :

Argument 2 If then construct portfolio . Then V is an arbitrage as and .
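The backward induction for European and American options can be sketched on a Cox-Ross-Rubinstein tree. The CRR parameterization (`u = exp(sigma * sqrt(dt))`, `d = 1 / u`) is an assumption of this example, since the notes do not state which tree parameters they use; the function name and inputs are illustrative.

```python
import math

def crr_price(s0, k, r, sigma, maturity, steps, kind="put", american=False):
    """Binomial-tree price of a call or put, optionally with early exercise."""
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    disc = math.exp(-r * dt)
    q = (math.exp(r * dt) - d) / (u - d)  # risk-neutral up probability

    def payoff(spot):
        return max(spot - k, 0.0) if kind == "call" else max(k - spot, 0.0)

    # Terminal layer: node j has j up moves and steps - j down moves.
    values = [payoff(s0 * u ** j * d ** (steps - j)) for j in range(steps + 1)]
    for n in range(steps - 1, -1, -1):
        for j in range(n + 1):
            cont = disc * (q * values[j + 1] + (1 - q) * values[j])
            if american:
                # American: compare continuation with immediate exercise.
                values[j] = max(cont, payoff(s0 * u ** j * d ** (n - j)))
            else:
                values[j] = cont
    return values[0]

common = dict(s0=100.0, k=100.0, r=0.05, sigma=0.2, maturity=1.0, steps=500)
print("European put ", crr_price(kind="put", american=False, **common))
print("American put ", crr_price(kind="put", american=True, **common))
print("European call", crr_price(kind="call", american=False, **common))
print("American call", crr_price(kind="call", american=True, **common))
```

With a positive rate and no dividends, the two call prices coincide while the American put is worth more than the European put, matching the early-exercise arguments above.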

Trinomial Tree

Let and choose to improve accuracy.

Finite Difference Model

Explicit Scheme

Inducting backward from to :

Solving for the B-S PDE: where , we get:

Where:

Note that are trinomial tree probabilities.

Implicit Scheme

Inducting backward from to :

Solving this requires the solution of a system of equations with unknowns.

Crank-Nicolson Scheme

Inducting backward from to :

If given terminal conditions, then we know ‘s and can solve for .

Monte Carlo Model

Given be a discounted payoff and the time- price of the payoff . The Monte Carlo estimator of :

By the strong law of large numbers, the sample average converges almost surely to the expected value as . By the central limit theorem:

Often times we need to estimate with sample estimator for the variance of :

The standard error , and a confidence interval for is
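The estimator, its standard error, and a confidence interval can be sketched for a European call under GBM. This example assumes risk-neutral GBM dynamics with a single exact step to maturity and uses 1.96 standard errors for an approximate 95% interval; names, parameters, and the seed are illustrative.

```python
import math
import random

def mc_call(s0, k, r, sigma, maturity, n_paths, seed=0):
    """Monte Carlo price of a European call with standard error and 95% CI."""
    rng = random.Random(seed)
    disc = math.exp(-r * maturity)
    drift = (r - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)
    # Discounted payoff for each simulated terminal price.
    samples = [disc * max(s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0)) - k, 0.0)
               for _ in range(n_paths)]
    mean = sum(samples) / n_paths
    sample_var = sum((x - mean) ** 2 for x in samples) / (n_paths - 1)
    std_err = math.sqrt(sample_var / n_paths)
    return mean, std_err, (mean - 1.96 * std_err, mean + 1.96 * std_err)

estimate, std_err, interval = mc_call(100.0, 100.0, 0.05, 0.2, 1.0, n_paths=50_000, seed=42)
print(estimate, std_err, interval)
```

The standard error shrinks like one over the square root of the number of paths, so halving it requires four times as many simulations.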

Variance Reduction Techniques

Antithetic Variate

Let . The antithetic variate estimator :
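A small experiment illustrates the variance reduction. The sketch assumes the antithetic pair is formed by evaluating the discounted call payoff at a standard normal draw `z` and at `-z`, then averaging the two; both estimators use the same total number of payoff evaluations. Parameters and names are illustrative.

```python
import math
import random

def call_estimate(n_draws, antithetic, seed=0,
                  s0=100.0, k=100.0, r=0.05, sigma=0.2, maturity=1.0):
    """Return (estimate, standard error) for a European call under GBM."""
    rng = random.Random(seed)
    disc = math.exp(-r * maturity)
    drift = (r - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)

    def payoff(z):
        return disc * max(s0 * math.exp(drift + vol * z) - k, 0.0)

    if antithetic:
        # Each sample averages the payoff at z and at its mirror image -z.
        samples = []
        for _ in range(n_draws // 2):
            z = rng.gauss(0.0, 1.0)
            samples.append(0.5 * (payoff(z) + payoff(-z)))
    else:
        samples = [payoff(rng.gauss(0.0, 1.0)) for _ in range(n_draws)]

    m = len(samples)
    mean = sum(samples) / m
    var = sum((x - mean) ** 2 for x in samples) / (m - 1)
    return mean, math.sqrt(var / m)

print("plain     ", call_estimate(40_000, antithetic=False, seed=1))
print("antithetic", call_estimate(40_000, antithetic=True, seed=1))
```

The gain comes from the negative correlation between the two payoffs in each pair, which holds here because the call payoff is monotone in `z`.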

Control Variate

A control variate is a random variable, correlated to such that has an explicit formula.

Example Let be the discounted payoff on a call on where . We can choose to be the discounted payoff on a call on where , in which case can be calculated explicitly through the B-S formula given constant close to .

The control variate estimator estimates by simulating .

Choose to minimize , we get:

Note that when using sample estimate , the estimated is biased, only when is small.

Importance Sampling

Suppose are IID draws from density , and . Ordinary Monte Carlo estimator provides:

With importance sampling, find s.t. iff . Then re-draw from density and the importance sampling estimator is:

Conditional Monte Carlo

Given a random variable :

The conditional Monte Carlo estimator:

Fourier Transform Model

Given be integrable, meaning . The Fourier transform of is the function defined by:

Theorem If is also integrable, then the inversion formula holds:

Characteristic Function

The complex conjugate of a complex number is given by . so .

The characteristic function of any random variable is the function defined by:

Therefore if has density , then . A characteristic function uniquely identifies a distribution. For example, , if

  • To calculate the moments of using CF, take the -derivatives of w.r.t. :
  • To calculate the CDF of using CF:
  • To calculate asset-or-nothing call price using CF, given be the asset share price, define the share measure with likelihood ratio .

Therefore for any , the asset-or-nothing call price:

  • To calculate a vanilla European call price on struck at with :

Heston Model

Provided that:

Where and are BM with correlation , is the rate of mean-reversion, is the long-term mean, and is the volatility of volatility.

We want to find the CF of in order to price options on . The time- conditional Heston CF provides an answer:


📖 Stochastic Calculus

Discrete Time Martingales

Conditional expectation

Definition A Borel set is any set in a topological space that can be formed from open sets through the operations of:

  • complement
  • countable union
  • countable intersection

Definition Let be a random vector and be an integrable random variable with . The conditional expectation of given is the unique measurable function such that for every Borel set :

We denote as

Example 1 Suppose random variable and are discrete.

Example 2 Suppose random variable and are continuous, with joint probability density function and marginal density and .

Here are some basic properties of conditional expectation:

  • Linearity:
  • Constant: if , then
  • Independence: if is independent of , then
  • Tower Property: if then
  • Factorization Property: if Z is -measurable then
  • Monotonicity: if , then a.s.

Theory

Definition A -algebra is a collection of subsets of a Borel set , that is closed under:

  • complement, e.g. if , then
  • countable unions, e.g. if , then

Definition is the set of all -measurable square-integrable random variable , with finite 2nd moment .

Definition A real Hilbert space is a real vector space with an inner product , such that is a complete metric space w.r.t. to the metric , where:

Hilbert space examples: , with inner product . Or, , with inner product . The reason we are interested in rather than for other is that the inner product gives rise to orthogonality.

Proposition If , then for any -algebra , the conditional expectation is the orthogonal projection of X onto , such that:

Also, can be interpreted as a -measurable random variable that minimizes the mean square error .

Martingales

Definition A filtration is an increasing sequence of -algebra , where is the -algebra of all events.

Definition A martingale is a sequence of measurable integrable random variable such that:

The tower property implies that .

Example 1 Given I.I.D. random variable with and variance .

  • Sequence , and
  • Sequence

are both martingales.

Example 2 Let be any random variable and be any filtration. Then the sequence is a closed martingale.

Note that the St. Petersburg martingale is not closed, where and and . This is because .

Example 3 Given I.I.D. random variable with moment generating function . Then the exponential martingale is a positive martingale defined by:

Doob’s Identity

Definition A sequence of random variables is predictable with respect to filtration if is measurable with respect to

Definition A sequence of random variables is adapted to filtration if is measurable with respect to

Proposition If is a martingale with and is a predictable sequence of bounded random variables, then the martingale transform is a martingale:

Definition A stopping time with respect to filtration is a random variable such that

Lemma Let be a stopping time, then the sequence is predictable.

Theorem Let be a martingale and be a stopping time. For all , the Doob’s Identity states that . Note that if is bounded for all , DCT shows that .

Proof. is a martingale:

Theorem Let be a sequence of functions on measure space that converge point-wise to a function $f$. For ,

  • The Dominated Convergence Theorem (DCT) requires to be dominated by an integrable function :

  • The Monotone Convergence Theorem (MCT) requires to be monotone (increasing or decreasing): or

Example 1 Let be a simple random walk with . Let stopping time , where .

We know that is a martingale and . Applying Doob’s Identity and DCT, we have:

We know that is a martingale. Applying Doob’s Identity, we have . Since is bounded by and is monotone, applying DCT on the RHS and MCT on the LHS, we get:

Combining both results, we get some interesting results for the Gambler’s Ruin problem:
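The classical Gambler's Ruin results can be checked by simulation. This sketch assumes the walk starts at 0 and stops on first reaching `+a` or `-b` for positive integers `a` and `b` (the notes' own barrier notation is not shown here); under that setup the martingale arguments give a probability `b / (a + b)` of stopping at `+a` and an expected stopping time of `a * b`.

```python
import random

def gamblers_ruin(a, b, n_trials, seed=0):
    """Simulate a simple symmetric random walk from 0 until it hits +a or -b.

    Returns (fraction of trials stopped at +a, average number of steps).
    """
    rng = random.Random(seed)
    hits_upper = 0
    total_steps = 0
    for _ in range(n_trials):
        position, steps = 0, 0
        while -b < position < a:
            position += 1 if rng.random() < 0.5 else -1
            steps += 1
        hits_upper += position == a
        total_steps += steps
    return hits_upper / n_trials, total_steps / n_trials

a, b = 3, 2
prob_upper, mean_steps = gamblers_ruin(a, b, n_trials=20_000, seed=1)
print(prob_upper, b / (a + b))   # simulated vs. theoretical hitting probability
print(mean_steps, a * b)         # simulated vs. theoretical expected stopping time
```

The simulated frequencies should sit close to the theoretical values, with the gap shrinking as the number of trials grows.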

Example 2 Let be a simple random walk. Let stopping time , where . Note that now DCT fails as is not bounded. Hence .

In fact, because :

Doob’s Maximal Inequality

Definition An adapted sequence of random variable is a:

  • sub-martingale if
  • super-martingale if

Proposition If is a convex function and is a martingale, then:

  • The Jensen’s Inequality holds:
  • the sequence is a sub-martingale.

Proposition If is a martingale with and is a predictable sequence of bounded, non-negative random variables, then the martingale transform is a sub-martingale:

Proposition If is a martingale with and is a predictable sequence of random variables such that , then

Corollary If is a non-negative sub-martingale with initial term , then Doob’s Maximal Inequality claims that for any :

and that:

Note that this is a big improvement on the Chebyshev Inequality, which claims that given -bounded random variable and for any :

Martingale Convergence Theorem

Definition a sequence of real numbers is called a Cauchy sequence if for every positive real number , there is a positive integer such that for all natural numbers such that

Definition martingales have orthogonal increments. Given a martingale with increments and , then:

  • , , and

Theorem Suppose is -bounded martingale, then there exists a -bounded random variable such that:

Theorem Suppose is -bounded martingale, then there exists a -bounded random variable such that:

(1)
(2)

Change Of Measure

Proposition Given a probability measure and is a non-negative random variable satisfying , then there exist a probability measure such that for any bounded or non-negative random variable that . Z is called the likelihood ratio of probability measure w.r.t. , written as and that:

Proposition If the outcome space is finite, then for each outcome ,

Example 1 In a -period market with finite set of outcomes and tradable assets. Let denote the risk-neutral measure for USD and EUR investors. Let denote the USD and EUR price of the risk-less (w.r.t. its own measure) asset at time t. Then

Proof. By fundamental theorem, , and , so:

Theorem Let and be two probability measures on the same measurable space, and let be a filtration such that for all n is absolutely continuous w.r.t. on . Then the sequence of likelihood ratios is a martingale:

Brownian Motion

Standard Brownian Motion

Definition A standard Brownian motion (SBM) is a continuous-time random process such that and:
(a) has stationary increments.
(b) has independent increments.
(c) The sample paths are continuous.

Note that (a), (b), and (c) imply that for some constant the distribution of is

Definition Given a SBM , is a Brownian motion with drift and variance .

Proposition Given a SBM , its reflection is also a SBM.

Proposition Given a SBM , then for any , is a SBM

Quadratic Variation

Definition The nth level quadratic variation of a function is the sum of squares of the increments across intervals of length :

Theorem Given a SBM with drift and variance , then for all with probability :
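The convergence of quadratic variation can be illustrated by simulation. The sketch assumes dyadic intervals of length `2 ** -level` and a Brownian motion with drift `mu` and volatility `sigma`, for which the limit over `[0, t]` is `sigma ** 2 * t` regardless of the drift; all names and parameter values are illustrative.

```python
import math
import random

def quadratic_variation(t, level, mu=0.0, sigma=1.0, seed=0):
    """Sum of squared increments of a drifted Brownian path over [0, t]."""
    rng = random.Random(seed)
    dt = 2.0 ** (-level)
    n_intervals = int(round(t / dt))
    total = 0.0
    for _ in range(n_intervals):
        # Exact increment of a Brownian motion with drift over one interval.
        increment = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += increment * increment
    return total

for level in (4, 8, 12, 16):
    print(level, quadratic_variation(1.0, level, mu=0.5, sigma=2.0, seed=3))
```

As the level increases, the printed sums settle near `sigma ** 2 * t = 4`, and the drift contributes nothing in the limit.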

Strong Markov Property

Definition Given a SBM , a stopping time is a non-negative random variable such that for every fixed , the event depends only on the path

Theorem If is a Brownian motion and is a stopping time then the strong Markov property holds:
(a) the process is a Brownian motion, and
(b) the process is independent of the path

Theorem Run Brownian motion , at the first time that , reflect the path in the line , by the reflection principle the new process is another Brownian motion:

  • for ,
  • for ,

Corollary

Corollary has the same distribution as

Corollary has the same distribution as . Hence . Consequently, for every with probability 1 and . Therefore for every , the Brownian path crosses the t-axis infinitely many times by time

Martingales In Continuous Times

Definition A filtration is a nested family of -algebra indexed by time .

Definition The natural filtration for a Brownian motion is the filtration with -the collection of all events determined by Brownian path up to time .

Definition A continuous-time stochastic process $X_t$ is a martingale relative to a filtration if:
(a) each random variable is measurable w.r.t. and
(b) for any ,

Proposition Given a SBM then each of these is a martingale relative to the natural filtration:
(a)
(b)
(c)

Theorem Define to be the probability measure with likelihood ratio . The Cameron-Martin theorem states that the SBM under is a Brownian motion with drift and variance under .

Corollary For any real value and

Corollary For any stopping time and ,

Ito Calculus

Ito Integral

Definition If is a uniformly bounded process with continuous paths adapted to then we can define an Ito Integral , where is truncated at :

Property The Ito Integral satisfies the following properties:
(1) Linearity:
(2) Continuity: the paths are continuous.
(3) Mean Zero:
(4) Variance, a.k.a. Ito Isometry:

Definition Define the quadratic variation of the Ito Integral:

Proposition
(a) The process is a martingale
(b) The process is a martingale

Example

Example For any stopping time and any :

Theorem Let be a SBM and let be the −algebra of all events determined by the path . If is any random variable with mean 0 and finite variance that is measurable with respect to , for some , then the Ito representation theorem claims that there exists an adapted process such that:

This theorem is of importance in finance because it implies that in the Black-Scholes setting, every contingent claim can be hedged.

Ito Formula

Theorem Let be a SBM, and let be a twice-continuously differentiable function such that are all bounded (or at most have exponential growth). Then for any :

Theorem Let be a SBM, and let be a twice-continuously differentiable function whose partial derivatives are all bounded. Then for any :

Proposition Assume is nonrandom and continuously differentiable. Then:

Ito Process

Definition An Ito process is a stochastic process that satisfies a stochastic differential equation of the form:

Equivalently, satisfies the stochastic integral equation:

Definition For any adapted process define:

Theorem Let be an Ito process, and let be a twice-continuously differentiable function whose partial derivatives are all bounded. Then:

The Ornstein-Uhlenbeck Process

Definition The Ornstein-Uhlenbeck SDE:
(a) This SDE describes a process Xt that has a proportional tendency to return to an “equilibrium” position 0.
(b) In finance, the OU process is often called the Vasicek model.
(c) Solving the SDE:
(d) The Ornstein-Uhlenbeck process is Gaussian.
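The Gaussian property makes the process easy to simulate exactly. The sketch below assumes the SDE has the form dX = -alpha X dt + sigma dW with mean-reversion speed `alpha > 0` and equilibrium 0 (consistent with point (a), but the parameter names are this example's own); under that form the transition over a step `dt` is normal with mean `X * exp(-alpha * dt)` and variance `sigma**2 * (1 - exp(-2 * alpha * dt)) / (2 * alpha)`.

```python
import math
import random

def simulate_ou(x0, alpha, sigma, horizon, n_steps, rng):
    """Simulate one OU path on a uniform grid using the exact Gaussian transition."""
    dt = horizon / n_steps
    decay = math.exp(-alpha * dt)
    step_sd = sigma * math.sqrt((1.0 - decay ** 2) / (2.0 * alpha))
    path = [x0]
    for _ in range(n_steps):
        path.append(path[-1] * decay + step_sd * rng.gauss(0.0, 1.0))
    return path

rng = random.Random(7)
x0, alpha, sigma, horizon = 1.0, 2.0, 0.5, 1.0  # illustrative parameters
terminal = [simulate_ou(x0, alpha, sigma, horizon, 10, rng)[-1] for _ in range(20_000)]
mean = sum(terminal) / len(terminal)
var = sum((x - mean) ** 2 for x in terminal) / (len(terminal) - 1)
print(mean, x0 * math.exp(-alpha * horizon))
print(var, sigma ** 2 * (1.0 - math.exp(-2.0 * alpha * horizon)) / (2.0 * alpha))
```

Because the transition is exact, the terminal mean and variance match the theoretical values up to sampling noise for any number of grid steps, unlike an Euler discretization.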

The Exponential Martingale

Definition The Exponential Martingale SDE:
(a) Solving the SDE:

The Diffusion Process

Definition The Diffusion SDE:

Definition The Harmonic Function is a function that satisfies the ODE:

Example Let be a solution of the diffusion SDE with initial value , and for any real numbers let . Find

We first apply the Ito Formula to and observe that a harmonic function will force the term to vanish. Therefore is a martingale and that

We can solve for :

The Diffusion Process - Bessel Process

Definition The Diffusion SDE:

Example Similar problem as above:

Note that if and then will never reach .

Ito Formula - Multi-Variable

Theorem Let be a K−dimensional SBM, and let be a function with bounded first and second partial derivatives. Then the Ito Formula states:

Where:

Corollary If is a stopping time for the SBM then Dynkin’s Formula shows that for any fixed time :

And that is a martingale

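Definition A function $u$ is said to be a Harmonic Function in a region if, at every point of the region,

$$\Delta u = \sum_{i=1}^{K} u_{x_i x_i} = 0$$

(a) 2D Harmonic Function Example (away from the origin): $u(x, y) = \log \sqrt{x^2 + y^2}$
(b) 3D Harmonic Function Example (away from the origin): $u(x, y, z) = 1 / \sqrt{x^2 + y^2 + z^2}$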

Corollary Let be harmonic in an open region with compact support, and assume that and its partials extend continuously to the boundary . Define to be the first exit time of Brownian motion from , then:

(a) the process is a martingale, and
(b) for every ,

Example If a 2D SBM starts at a point on the circle of radius 1, find out the probability that it hits concentric circles before .

Let be harmonic. Then is a martingale and that .

Example If a 3D SBM starts at a point on the sphere of radius 1, find out the probability that it hits concentric sphere before .

Let be harmonic. Then is a martingale and that .

Ito Process - Multi-Variable

Definition An Ito process is a continuous-time stochastic process of the form:

Where the quadratic variation

Let be a vector of Ito processes. For any function with bounded first and second partial derivatives, then:

Theorem Let be a K −dimensional SBM, and let be an adapted, K−dimensional process satisfying . Then the Knight’s Theorem states that the 1-dimensional Ito process is a SBM:

Proposition Let be a K −dimensional SBM. Define be the radial part of . Then is a Bessel process with parameter :

Barrier Option

Pricing

Definition A barrier option at time pays:
(a) \$1 if $\max_{0 \leq t \leq T} S_t \geq AS_0$,
(b) \$0 otherwise.

Assume that follows GBM:

The no-arbitrage price of the barrier option at is the expected payoff:

At time , there are two possibilities:
(a) if , then
(b) if , then is the same as the time- value of a barrier option with time-to-maturity and
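
The time-0 price can be sketched in closed form. The sketch below assumes risk-neutral GBM with constant rate `r` and volatility `sigma`, a barrier at `A * S0` with `A >= 1`, and a payoff of \$1 paid at maturity `T`; the function name is hypothetical. It uses the reflection-principle formula for the maximum of a Brownian motion with drift.

```python
import numpy as np
from scipy.stats import norm

def barrier_digital_price(A, r, sigma, T):
    """Discounted risk-neutral probability that max_{t<=T} S_t >= A * S0."""
    if A <= 1.0:
        return float(np.exp(-r * T))       # barrier already reached at time 0
    nu = r - 0.5 * sigma**2                # drift of log S_t under the risk-neutral measure
    b = np.log(A)                          # barrier level in log-return space
    sd = sigma * np.sqrt(T)
    hit_prob = (norm.cdf((nu * T - b) / sd)
                + np.exp(2.0 * nu * b / sigma**2) * norm.cdf((-nu * T - b) / sd))
    return float(np.exp(-r * T) * hit_prob)

for A in (1.0, 1.1, 1.2, 1.5):
    print(A, barrier_digital_price(A, r=0.03, sigma=0.2, T=1.0))
```

The same function gives the time-$t$ value in case (b) when `T` is replaced by the remaining time-to-maturity and `A` by the ratio of the barrier to the current stock price.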

Hedging

Let $v(t, S_t)$ be the value of the barrier option at time $t$. The Fundamental Theorem and Ito Formula show that $v(t, S_t)$ satisfies the Black-Scholes PDE:

A replicating portfolio for the barrier option holds
(a) share of stock
(b) share of cash

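provided that the barrier has not yet been hit, i.e. $\max_{0 \leq s \leq t} S_s < AS_0$. Once the barrier is hit, the portfolio converts all holdings to cash and holds the cash until maturity.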

The Black-Scholes

The Black-Scholes Formula

Theorem Under a risk-neutral , the Fundamental Theorem asserts that discounted share price is a martingale, where:

Therefore :

Definition A European contingent claim with expiration date and payoff function is a tradeable asset with:
(a) share price at time :
(b) discounted share price at time :

Proposition Let be a standard Brownian motion and is a function such that . Then for every :

Corollary Given , the Black Scholes Formula shows:

Under risk-neutral , the time option price is a martingale. With the Ito Formula we can set the drift of to be zero and therefore derive the Black Scholes PDE:
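
As an illustration, the Black-Scholes formula for a European call and put on a non-dividend-paying stock can be sketched as follows. The function names and the parameter values are chosen only for this example.

```python
from math import exp, log, sqrt
from scipy.stats import norm

def bs_call(S0, K, r, sigma, T):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S0 * norm.cdf(d1) - K * exp(-r * T) * norm.cdf(d2)

def bs_put(S0, K, r, sigma, T):
    """Black-Scholes price of a European put (no dividends)."""
    d1 = (log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return K * exp(-r * T) * norm.cdf(-d2) - S0 * norm.cdf(-d1)

call = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
put = bs_put(100.0, 100.0, 0.05, 0.2, 1.0)
print("call:", round(call, 4), "put:", round(put, 4))
```

A quick sanity check is put-call parity: the call price minus the put price should equal $S_0 - K e^{-rT}$.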

Hedging In Continuous Times

Definition A portfolio is self-financing if for all

Proposition A portfolio is self-financing if and only if its discounted value is a martingale and satisfies:

Definition A replicating portfolio for a payoff function is a self-financing portfolio such that

Theorem A replicating portfolio for contingent claims is given by:
(a) cash, and
(b) shares of stock

where u is the solution of the Black Scholes PDE satisfying

The Girsanov Theorem

Proposition The exponential process is a positive martingale.

Applying Ito Formula and therefore

Theorem Given a SBM under -measure and the likelihood ratio , define the -measure where . Then the Girsanov’s Theorem states that under the -measure:
(a) is a SBM
(b) is a BM with time-dependent drift

Example 1 Given a Brownian motion with , define the measure to be the conditional probability measure on the event . Therefore is a BM with drift .

Proof. We know that , therefore by change of measure:

Therefore Girsanov’s Theorem implies that under , is a SBM.

Example 2 Given currency and their respective bank account and . Define exchange rate (# B per A) that

Theorem If is a SBM under measure then .

Proof. is a martingale only if

Theorem

Levy Process

Poisson Process

Definition A Levy process is a continuous-time random process such that and:
(a) has stationary increments;
(b) has independent increments;
(c) the sample paths $X_t$ are right-continuous.

Note that Brownian motion and Poisson process are both Levy processes and the basic building blocks of Levy processes. Brownian motion is the only Levy process with continuous paths.

Example Let be a SBM and for , the random variable is a Levy process.

Note that:
(a) has stationary, independent increments
(b) has the same distribution as

Definition A Poisson process with rate is a Levy process such that for all the random variable follows Poisson distribution with mean :

Proposition If $X, Y$ are independent Poisson random variables with means $\lambda, \mu$, then $X + Y \sim \text{Poisson}(\lambda + \mu)$.

Proof.

Corollary If $N_t, M_t$ are independent Poisson processes with rates $\lambda, \mu$, then the superposition $N_t + M_t$ is a Poisson process with rate $\lambda + \mu$.

Proposition Every discontinuity of a Poisson process is of size 1.

Proposition Let be a Poisson process of rate , and let be an independent sequence of i.i.d. Bernoulli− random variables. Then the Thinning Theorem states that are independent Poisson processes with rates :

Theorem If $n \to \infty$ and $p_n \to 0$ in such a way that $n p_n \to \lambda$, then the Law of Small Numbers states that the Binomial$(n, p_n)$ distribution converges to the Poisson$(\lambda)$ distribution.
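
The convergence can be checked numerically. The sketch below compares the Binomial$(n, \lambda/n)$ and Poisson$(\lambda)$ probability mass functions for increasing $n$; the value `lam = 3.0` and the grid of counts are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import binom, poisson

lam = 3.0
ks = np.arange(0, 21)   # counts at which the two pmfs are compared

def max_pmf_gap(n):
    """Largest absolute pmf difference between Binomial(n, lam/n) and Poisson(lam)."""
    return float(np.max(np.abs(binom.pmf(ks, n, lam / n) - poisson.pmf(ks, lam))))

gaps = {n: max_pmf_gap(n) for n in (10, 100, 1000, 10000)}
for n, gap in gaps.items():
    print(n, gap)
```

The gap shrinks roughly in proportion to $1/n$, which is the behaviour the theorem describes.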

Proposition If $N_t$ is a rate-$\lambda$ Poisson process, then for any real number $\theta$ the process $Z_t = \exp\{\theta N_t - (e^{\theta} - 1)\lambda t\}$ is a martingale.

Theorem Define with likelihood ratio such that . Then under the process is a rate- Poisson process.

Compound Poisson Process

Definition A compound Poisson process is a Levy process of the form:

Where is rate- Poisson process and are i.i.d. random variable independent of . The distribution is the compounding distribution and the measure is the Levy measure.

At each , a random is drawn from . is the sum of all draws made by time

Proposition If , then , and , is an exponential martingale.

Poisson Point Process

Definition Let be a $\sigma$-finite Borel measure on . A Poisson point process with intensity measure is a collection of extended nonnegative integer-valued random variables such that
(A) If then a.s.
(B) If then
(C) If are pairwise disjoint, then the r.v.s are independent, and

Proposition The point process associated with a CPP is a Poisson point process with intensity measure , where is the Levy measure for the CPP.

Theorem Let be any Levy process, and let be the random set of points such that the Levy process has a jump discontinuity of size at time , i.e.,

Then is a Poisson point process with intensity measure where is a $\sigma$-finite measure called the Levy measure of the process.


📖 Credit Risk Model

Standard Simulation Model on Credit Portfolio

Credit Risk

Lenders, such as banks, are subject to many kinds of risk, among which credit risk is the most likely to cause bank failure.

  • Credit risk
  • Market risk
  • Operational risk
  • Reputation risk


Each loan is part of a legal agreement that requires the borrower to pay interest and repay principal on schedule, while some borrowers are also required to obey specified covenants, such as maintaining earnings above a certain threshold.

If the borrower fails to follow the agreement, the lender holds the borrower to be in default, which can be a money default or a covenant default. Purchasers of public bonds experience only money default.

At default, the loan agreement calls for a fee to be paid by the borrower, gives the bank power to seize collateral (for secured loans), and has a cross-default provision (where all loans are in default once one loan is in default).

In the 20th century, most banks did not define default until they discovered a model that could help them manage credit risk.


Rating Agencies

There are 3 major Nationally Recognized Statistical Rating Organizations (NRSROs) that firms pay to rate their bonds in order to increase liquidity.

  • Standard & Poor’s
  • Moody’s
  • Fitch

Under S&P ratings, the grades are:

  • Investment grade: AAA, AA, A, BBB
  • Non-investment grade: BB, B, CCC, CC
  • Selectively defaulted: SD
  • Defaulted: D


D and PD

Let D be the default indicator of a loan, taking only two values: 0 and 1. PD is the probability of default annually.

By mathematical identity:

$$E[D] = 1 \times PD + 0 \times (1 - PD) = PD$$

  • Knowing PD, we can simulate D by a Bernoulli Distribution with parameter as PD.
  • Given data on D, we can calculate the implied PD.

In a portfolio of N firms, the portfolio default rate, DR, equals:

$$DR = \frac{1}{N} \sum_{i=1}^{N} D_i$$


Exposure, Recovery and LGD

Exposure is the amount that the borrower owes to the lender. Recovery is measured in either of two ways:

  • Market price of the loan at the time of default
  • Discounted future cash flows back to the time of default


LGD (Loss Given Default) is a random variable with values usually between 0 and 1:

$$LGD = 1 - \frac{Recovery}{Exposure}$$

For a defaulted loan, there are two ways to measure recovery/LGD. For a current loan, there is a distribution for LGD. The expectation is written as:

$$ELGD = E[LGD]$$

US investment grade bond LGD is about 0.20%, while non-investment grade is about 3.60%. Bank loans are almost always senior to bonds and have lower LGD.

Loss and EL

Loss is measured as a fraction of exposure:

$$Loss = D \times LGD$$

EL is the expected loss. Because D and LGD are independent:

$$EL = E[D \times LGD] = E[D] \times E[LGD] = PD \times ELGD$$

Lenders often need to estimate EL and include it in the spread they charge.


Change Of Variable

Note the LGD is often measured in fractions. To change the measure to dollar amount, we need to use the Chain Rule.

Given the pdf of LGD:

We define the function g such that:

Hence the function g-inverse is:

The partial derivative can be expressed as:


By definition:

Taking derivative on both sides and with chain rule:

Finally:


Simulate Portfolio Loss On One Single Loan

We know that:

To simulate loss, we first simulate D:

Draw x ~ Uniform[0, 1]
If x < PD, then D = 1
Else D = 0

Then simulate LGD based on the pdf of LGD. Multiply each D by the corresponding LGD to get Loss. Repeat the process to produce a distribution of Loss.
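
The steps above can be sketched in a few lines. The Beta(2, 3) distribution for LGD and the value `PD = 0.05` are assumptions made only for this illustration; any pdf on [0, 1] could be substituted.

```python
import numpy as np

rng = np.random.default_rng(42)
PD = 0.05            # hypothetical annual probability of default
n_sims = 100_000

# Step 1: simulate the default indicator D with a uniform draw.
x = rng.uniform(0.0, 1.0, n_sims)
D = (x < PD).astype(int)

# Step 2: simulate LGD from an assumed Beta(2, 3) pdf (mean 0.4).
LGD = rng.beta(2.0, 3.0, n_sims)

# Step 3: loss rate per simulation, as a fraction of exposure.
loss = D * LGD

print("simulated default rate:", D.mean())
print("simulated expected loss:", loss.mean(), "vs PD x ELGD:", PD * 0.4)
```

Most simulated losses are exactly zero, so the loss distribution has a point mass at 0 and a Beta-shaped tail for the defaulted scenarios.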


Simulate Portfolio Loss On N Independent Loans

Assume the defaults of the N loans are independent and each loan has the same probability of default, PD:

$$D_i \sim \text{Bernoulli}(PD), \quad i = 1, \dots, N$$

Then the total number of defaults follows a binomial distribution:

$$\sum_{i=1}^{N} D_i \sim \text{Binomial}(N, PD)$$

However, based on historical data, the variance of the default rate is much higher than that of the binomial distribution. Hence default correlation needs to be introduced.


Simulate Portfolio Loss On N Correlated Loans

Assume that there is a latent unobserved variable $z_i$ that is responsible for the default of firm i, i.e. firm i defaults if:

$$z_i < \Phi^{-1}(PD_i)$$

Assume the latent variables of any two firms i and j are jointly normal. Denote the correlation between $z_i$ and $z_j$:

$$\rho_{i,j} = corr[z_i, z_j]$$

Let $r_{i,j}$ be the correlation between the asset returns of firms i and j. We know that almost certainly:

$$\rho_{i,j} \neq r_{i,j}$$

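Denote PDJ as the probability that both firm i and j default, where $\Phi_2$ is the bivariate standard normal CDF with correlation $\rho_{i,j}$:

$$PDJ = \Pr\left[z_i < \Phi^{-1}(PD_i),\ z_j < \Phi^{-1}(PD_j)\right] = \Phi_2\left(\Phi^{-1}(PD_i),\ \Phi^{-1}(PD_j);\ \rho_{i,j}\right)$$

To calculate PDJ with Python: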

import numpy as np
from scipy.stats import norm
from scipy.stats import multivariate_normal


PD1, PD2 = 0.1, 0.2
mean = [0, 0]
cov = [[1, .5], [.5, 1]]

result = multivariate_normal(mean, cov)
PDJ = round(result.cdf(np.array([norm.ppf(PD1),
norm.ppf(PD2)])), 4)
print("Pr[D1=1, D2=1]:", PDJ)

Returns:

Pr[D1=1, D2=1]: 0.0515

Now that we have the Di, we can simulate portfolio loss rate, given the LGD distribution and exposures for each firm.

Denote Dcorr to be the correlation between Di and Dj:

$$Dcorr = \frac{PDJ - PD_i \, PD_j}{\sqrt{PD_i (1 - PD_i) \, PD_j (1 - PD_j)}}$$


Note that holding PDi, PDj fixed:

  • greater Dcorr => greater PDJ
  • greater ρ => greater PDJ
    • ρ between -1 and 1 => PDJ between 0 and min[PDi, PDj]
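
A short sketch of these relationships, reusing the inputs of the PDJ calculation above (PD of 0.1 and 0.2, latent correlation 0.5). The helper names `joint_pd` and `default_corr` are hypothetical; the default correlation is the ordinary correlation of two Bernoulli indicators.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def joint_pd(pd_i, pd_j, rho):
    """Probability that both firms default under jointly normal latent variables."""
    cov = [[1.0, rho], [rho, 1.0]]
    thresholds = [norm.ppf(pd_i), norm.ppf(pd_j)]
    return float(multivariate_normal(mean=[0.0, 0.0], cov=cov).cdf(thresholds))

def default_corr(pd_i, pd_j, pdj):
    """Correlation between the two default indicators D_i and D_j."""
    return (pdj - pd_i * pd_j) / np.sqrt(pd_i * (1 - pd_i) * pd_j * (1 - pd_j))

pdj = joint_pd(0.1, 0.2, 0.5)
dcorr = default_corr(0.1, 0.2, pdj)
print("PDJ:", round(pdj, 4), "Dcorr:", round(dcorr, 4))

# PDJ rises with rho while PD_i and PD_j are held fixed.
pdj_by_rho = [joint_pd(0.1, 0.2, rho) for rho in (-0.5, 0.0, 0.5, 0.9)]
print(pdj_by_rho)
```

Note that the default correlation is noticeably smaller than the latent correlation of 0.5 that produces it.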


Copula

When we model three or more firms, pair-wise correlation is not enough to determine the entire distribution of outcomes. For example, there are N PD’s and N(N-1)/2 pair-wise correlations, while we want to calculate the probabilities of 2^N outcomes. Hence we introduce the Gauss copula, which helps describe the group-wise correlations.

Consider a set of multivariate normals:

The quantiles of the set are uniformly distributed by definition:

The copula of the set (Z1, Z2, …, ZN) is defined as the joint cumulative distribution function of (Φ(Z1), Φ(Z2), …, Φ(ZN)):

The Gauss copula is as follows. Note that among all possible copulas, the Gauss copula is the one supported by the Central Limit Theorem:


In fact, the copula does not contain any information on the marginal distribution. Here we set the marginal distribution FZ to follow standard normal only as an example, but it can be anything continuous such that:

And so:



In the context of default modeling, we assume that each company’s default follows Bernoulli and simulate with standard normal distribution:

The probability that all firms default at the same time is by definition:

Note that given a pair-wise correlation matrix Σ, this probability can take any values between 0 and the lowest single firm default probability.

Now we assume the z’s of all firms are connected by the Gauss copula, which implies a single value for the probability that all firms default.


With Python we can either numerically evaluate the integral or use simulation to calculate the probability that all firms default at the same time.

import numpy as np
from scipy.stats import norm
from scipy.stats import multivariate_normal

np.random.seed(9999)
PD = [.5, .4, .3, .2, .1]
mean = [0, 0, 0, 0, 0]
cov = [[1, .05, .1, .15, .2],
[.05, 1, .25, .3, .35],
[.1, .25, 1, .4, .45],
[.15, .3, .4, 1, .5],
[.2, .35, .45, .5, 1]]

result = multivariate_normal(mean, cov)
PDA = round(result.cdf(np.array(norm.ppf(PD))), 4)
print('Probability Of All Default:', PDA)

N = 10000
simulation = norm.cdf(np.random.multivariate_normal(mean, cov, N))
D = np.array(np.sum((simulation < PD), axis=1).tolist())

PDA_simulated = round(np.count_nonzero(D == 5)/N, 4)
print('Probability Of All Default (Simulated):', PDA_simulated)

DR = round(sum(D)/(5 * N), 4)
print('Average DR:', DR)

Returns:

Probability Of All Default: 0.017
Probability Of All Default (Simulated): 0.0168
Average DR: 0.3014

Note that, compared to other copulas, the Gauss copula requires only a pair-wise correlation matrix and the PDs to convey a lot of information. In most applications the Gauss copula itself has not been shown to be invalid, whereas the calibration of the marginals and of the correlation matrix has often proved erroneous.


Simulate Rating Transitions

The default model only has two states, 0 and 1:

To simulate rating transitions, we require two matrices:

  • Transition Matrix: $P[i \rightarrow j], \forall i, j$
  • Cost Matrix, e.g. the loss due to deterioration of borrowers: $cost[i \rightarrow j], \forall i, j$
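
A minimal sketch of such a simulation, using a made-up three-state rating scale. The `ratings`, the transition matrix `P` and the `cost` matrix are purely hypothetical numbers chosen for illustration, with default treated as an absorbing state.

```python
import numpy as np

ratings = ["A", "B", "D"]                 # hypothetical scale; "D" is default
P = np.array([[0.90, 0.08, 0.02],         # P[i -> j]: one-year transition probabilities
              [0.10, 0.80, 0.10],
              [0.00, 0.00, 1.00]])        # default is absorbing
cost = np.array([[0.00, 0.05, 0.60],      # cost[i -> j]: loss per unit of exposure
                 [-0.03, 0.00, 0.55],     # an upgrade produces a gain (negative cost)
                 [0.00, 0.00, 0.00]])

rng = np.random.default_rng(7)

def simulate_path(start, n_years):
    """Simulate yearly rating transitions and accumulate the transition costs."""
    state, path, total_cost = start, [start], 0.0
    for _ in range(n_years):
        nxt = int(rng.choice(len(ratings), p=P[state]))
        total_cost += cost[state, nxt]
        state = nxt
        path.append(state)
    return path, total_cost

path, total_cost = simulate_path(start=0, n_years=10)
print([ratings[s] for s in path], round(total_cost, 4))
```

The two-state default model is the special case where the only ratings are "performing" and "default".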


Factor Model

Single Factor Model

We construct the single risk factor model with latent variable $Z_i$:

$$Z_i = -\sqrt{\rho}\, Z + \sqrt{1 - \rho}\, X_i$$

The pair-wise correlation between two firms i and j’s latent variables is:

$$corr[Z_i, Z_j] = \rho$$

Where:

  • Z and Xi are Independent
  • Z is the systematic factor that affects all firms. If Z increases, all Zi decrease and every firm becomes more likely to default. Z summarizes the effects of all observable macroeconomic factors plus the effects of unobservable factors.
  • Xi is the idiosyncratic factor that affects only firm i’s latent variable
  • Zi ~ N(0, 1) by construction
  • {Zi} are jointly normal and connected by a Gauss copula

cDR and Vasicek

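Define Conditional (Expected) Default Rate (cDR) as:

$$cDR_i(z) = \Pr[D_i = 1 \mid Z = z] = \Pr\left[-\sqrt{\rho}\, z + \sqrt{1 - \rho}\, X_i < \Phi^{-1}(PD_i)\right] = \Phi\left(\frac{\Phi^{-1}(PD_i) + \sqrt{\rho}\, z}{\sqrt{1 - \rho}}\right)$$

This gives the final form of cDR, which is called the Vasicek formula, named after Oldrich Vasicek. Note that the Vasicek formula is monotonic in z and in PD, i.e., the higher z or PD, the higher the cDR.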

The expected default rate for firm i is always PDi, since:

However, when Z is known, the expected default rate is cDRi. Firms are now conditionally independent, because the only source of correlation, Z, is known:

If there are large numbers of identical firms with uniform PD and ρ, the default rate of such asymptotic portfolio follows the unconditional Vasicek distribution.

The unconditional Vasicek pdf can be derived with change-of-variable technique. Note that we eliminate z and the pdf only has parameter PD and ρ:

The mean of cDR is PD:
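
This can be verified numerically. The sketch below assumes the sign convention in which a higher systematic factor z means worse conditions, so that $cDR(z) = \Phi\big((\Phi^{-1}(PD) + \sqrt{\rho}\,z)/\sqrt{1-\rho}\big)$; the values of `pd_` and `rho` are arbitrary.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def cdr(z, pd, rho):
    """Vasicek conditional default rate given systematic factor z (higher z = worse)."""
    return norm.cdf((norm.ppf(pd) + np.sqrt(rho) * z) / np.sqrt(1.0 - rho))

pd_, rho = 0.02, 0.15

# Integrate cDR(z) against the standard normal density of z.
mean_cdr, _ = quad(lambda z: cdr(z, pd_, rho) * norm.pdf(z), -10.0, 10.0)
print("E[cDR]:", mean_cdr, "PD:", pd_)

# A stressed scenario (z at its 99.9th percentile) versus a benign one.
print("cDR at z = 3.09:", cdr(norm.ppf(0.999), pd_, rho))
print("cDR at z = -1:", cdr(-1.0, pd_, rho))
```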

Multi-factor Model

Suppose that there are two jointly normal systematic risk factors ψ and ω, and that there are two group of firms depending on each of the factors:

Between the two groups:

Note that:

  • If corr[ψ, ω] = 1, this becomes the single factor model and that:
  • If corr[ψ, ω] < 1, the cross-correlations are less than those in the single factor case. This is called diversification.
  • With a multi-factor model, risk becomes sub-additive, as opposed to additive in the single factor model. This means that the risk in the portfolio is less than the sum of the cDRs.
  • The Moody's Factor Model attributes each Zi to about 250 factors, along with a firm-specific idiosyncratic factor.

Basel II Capital formula

The Bank for International Settlements is in Basel, Switzerland. The Basel Committee on Banking Supervision drafts standards requiring banks to hold minimum capital, e.g. Basel II, Basel III, etc.

The Basel II formula is an Asymptotic Single Risk Factor model, where the portfolio is large enough for the Law of Large Numbers to work. It generalizes the Vasicek distribution to allow a diverse choice of PD and ρ within the portfolio. The core of the capital requirement for credit risk is the inverse CDF of the Vasicek distribution.

Inverse Vasicek (with parameters PD and ρ), evaluated at the 99.9th percentile and combined with LGD and the maturity adjustment:

$$K = \left[ LGD \times \Phi\left( \frac{\Phi^{-1}(PD) + \sqrt{R}\, \Phi^{-1}(0.999)}{\sqrt{1 - R}} \right) - PD \times LGD \right] \times \frac{1 + (M - 2.5)\, b}{1 - 1.5\, b}$$

Note:

  • K is the capital requirement per dollar of wholesale loan.
  • LGD is the average LGD in historical downturn conditions
  • R (correlation) = 0.12 + 0.12 × exp(-50 × PD)
  • b = [ 0.11852 - 0.05478 × log (PD) ]^2
  • M is maturity

Making sense of the Basel II formula:

  • The capital requirement is for loss, as opposed to only default, hence the formula multiplies by LGD.
  • The capital requirement is for unexpected loss, hence the formula subtracts the expected loss LGD × PD. The expected portion is handled by bank reserves.
  • Loans might deteriorate without defaulting, hence a maturity adjustment is added to impose higher capital on longer-maturity loans.
  • The estimation of PD and LGD is performed by the banks and supervised by bank supervisors.
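
The pieces listed above can be assembled into a short sketch. It uses the simplified correlation function R = 0.12 + 0.12 × exp(-50 × PD) quoted in these notes and the 99.9th percentile of the systematic factor, which is the confidence level used by the Basel II IRB approach. The function name and the inputs are hypothetical, and this is an illustration, not a regulatory implementation.

```python
import numpy as np
from scipy.stats import norm

def basel2_capital(pd, lgd, maturity):
    """Capital requirement K per dollar of wholesale exposure (illustrative sketch)."""
    R = 0.12 + 0.12 * np.exp(-50.0 * pd)                # correlation, as quoted in the notes
    b = (0.11852 - 0.05478 * np.log(pd)) ** 2            # maturity adjustment slope
    # Inverse Vasicek: conditional default rate at the 99.9th percentile.
    stressed_dr = norm.cdf((norm.ppf(pd) + np.sqrt(R) * norm.ppf(0.999))
                           / np.sqrt(1.0 - R))
    maturity_adj = (1.0 + (maturity - 2.5) * b) / (1.0 - 1.5 * b)
    # Unexpected loss only: stressed loss minus expected loss, scaled by maturity.
    return float(lgd * (stressed_dr - pd) * maturity_adj)

K = basel2_capital(pd=0.01, lgd=0.45, maturity=2.5)
print("K per dollar of exposure:", round(K, 4))
```

Doubling LGD doubles K, while a longer maturity raises K through the adjustment factor.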


Estimation, Statistical Test and Overfit

Estimating PD

Firms differ widely in their credit quality, and PDs tend to change over time as well. So a firm’s PD is neither known nor fixed. We analyze analogous firms with identical credit ratings to estimate PD.

Method 1, for all A-rated firms in the dataset:

Method 2, for all A-rated firms in the dataset:

Method 3, estimate PD as a parameter in a pdf describing A-rated firms. This tries to find a distribution that best fits the data. We will focus on this method.

Method Of Moments

Given a dataset {Xi}N, we set the moments of the Vasicek distribution equal to the moments of the data.

First moment:

Second moment (unbiased, using N-1 in denominator):

Note:

  • The method of moments matches the broad features of the distribution with the data
  • The solution is not unique. Choices can be made between central moment/raw moment, lower moment/higher moment.
  • By Jensen’s Inequality, functions of moments are not moments of functions

Maximum Likelihood Estimation

The MLE method chooses parameter values that make the data most likely under the assumed distribution. MLE matches the distribution to the data as a whole, as opposed to M.o.M., which only matches the moments. The MLE fits the PDF to the dataset.

When the data are not highly dispersed, however, the MLE estimates tend to be close to the M.o.M. estimates.

MLE estimates can be biased in finite samples; the method simply chooses the parameters that maximize the likelihood function. Given a dataset {Xi}N, we assume the true default rates follow the Vasicek distribution. The likelihood function is:

Often we try to maximize the log-likelihood function, i.e. find PD and ρ such that:
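
A sketch of the maximization on synthetic data. The log-density used is that of the Vasicek distribution, $\ln f(x) = \tfrac{1}{2}\ln\frac{1-\rho}{\rho} + \tfrac{1}{2}\big(\Phi^{-1}(x)\big)^2 - \frac{\big(\sqrt{1-\rho}\,\Phi^{-1}(x) - \Phi^{-1}(PD)\big)^2}{2\rho}$. The optimizer choice (Nelder-Mead on logit-transformed parameters), the function names and the "true" parameters are assumptions made for this illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, logit
from scipy.stats import norm

def vasicek_logpdf(x, pd, rho):
    """Log-density of the Vasicek distribution at default rates x in (0, 1)."""
    q = norm.ppf(x)
    return (0.5 * np.log((1.0 - rho) / rho)
            + 0.5 * q**2
            - (np.sqrt(1.0 - rho) * q - norm.ppf(pd))**2 / (2.0 * rho))

def fit_vasicek_mle(x):
    """Maximize the log-likelihood over (PD, rho) using a logit reparameterization."""
    def neg_loglik(params):
        pd, rho = expit(params)            # maps the real line into (0, 1)
        return -np.sum(vasicek_logpdf(x, pd, rho))
    start = logit([0.05, 0.2])             # arbitrary starting guess
    res = minimize(neg_loglik, start, method="Nelder-Mead")
    return tuple(expit(res.x))

# Synthetic default rates drawn from a Vasicek distribution (hypothetical parameters).
rng = np.random.default_rng(1)
true_pd, true_rho = 0.03, 0.10
z = rng.standard_normal(2000)
rates = norm.cdf((norm.ppf(true_pd) + np.sqrt(true_rho) * z) / np.sqrt(1.0 - true_rho))

pd_hat, rho_hat = fit_vasicek_mle(rates)
print("PD estimate:", pd_hat, "rho estimate:", rho_hat)
```

The logit transform keeps both parameters inside (0, 1) without explicit bounds, which is one simple way to avoid invalid values during the search.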

Hypothesis Testing & Wilks’ Theorem

We do not assert truth, as truth is often unknown. With a given set of data, we can only assert that some models are better at predicting the future behavior of similar data.

We call the simpler model the null hypothesis and the more complicated one the alternative hypothesis. The null generally nests under the alternative, i.e. the alternative becomes the null when some parameters are set to certain values.

We prefer the null, because it is simpler, and by doing so we avoid Type 1 error, which is the rejection of a true null.

Hence we only reject the null if the alternative fits the data significantly better through a statistical test.

Wilks’ Theorem asserts that if:

  • There is an asymptotic amount of data
  • The null hypothesis is true

Then D has a distribution that approaches the χ² distribution (with df = number of extra parameters in the alternative), given dataset {Xi}N:

$$D = -2 \ln \Lambda \ \xrightarrow{d}\ \chi^2_{df}$$

The likelihood ratio is defined as follows. It is less than or equal to 1 because the alternative is more flexible and therefore attains at least as high a maximized likelihood on any given data:

$$\Lambda = \frac{\max_{\theta \in H_0} L(\theta \mid \{X_i\})}{\max_{\theta \in H_1} L(\theta \mid \{X_i\})}$$


We reject the null hypothesis if the D statistic is a tail observation, which means that either the null is not true, or the null is true and something unlikely has happened (a Type 1 error). We reject the null when:

$$D > \chi^2_{df,\,1-\alpha}$$

For example, when df = 1 the critical value is 3.84, and we reject the null with 95% confidence when:

$$D > 3.84$$
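
The test can be sketched as a small helper that takes the two maximized log-likelihoods. The function name and the log-likelihood values in the usage lines are hypothetical.

```python
from scipy.stats import chi2

def likelihood_ratio_test(loglik_null, loglik_alt, extra_params, alpha=0.05):
    """Wilks-style likelihood ratio test for a null model nested in an alternative."""
    D = 2.0 * (loglik_alt - loglik_null)          # equals -2 ln(likelihood ratio)
    critical = chi2.ppf(1.0 - alpha, df=extra_params)
    p_value = chi2.sf(D, df=extra_params)
    return D, critical, p_value, bool(D > critical)

# Hypothetical maximized log-likelihoods with one extra parameter in the alternative.
D1, crit1, p1, reject1 = likelihood_ratio_test(-100.0, -97.0, extra_params=1)
D2, crit2, p2, reject2 = likelihood_ratio_test(-100.0, -99.0, extra_params=1)
print(D1, round(crit1, 2), round(p1, 4), reject1)
print(D2, round(crit2, 2), round(p2, 4), reject2)
```

In the first case the alternative improves the log-likelihood by 3 (D = 6), enough to reject the null; in the second the improvement of 1 (D = 2) is not.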

Overfit

An overfit model makes worse forecast than a simpler model.

We assume the population data (X, Y) follows bivariate normal distribution:

Given ρ, the population regression line is:

The sample regression line is:

From a sample of 30 observations of (X, Y), ordinary least squares (OLS) is performed to find the in-sample p-value for the coefficient and R². MSE is used to evaluate forecast error.

  • When ρ = 0.8, the sample regression line (yellow) is close to the population regression line (red):

image1.png

  • When ρ = 0.2, the sample regression line does NOT match well.

image2.png

This shows that when the population has a weak relationship (ρ = 0.2), estimates of the slope are more dispersed.



Now we look at the relationship between statistical significance and MSE. The population Mean-Squared Error (MSE) is an out-of-sample measure of forecast errors. The population MSE does NOT depend on any in-sample data:

By taking partial derivatives, we can see that the population regression (b = ρ, a = 0) minimizes MSE. We can also see that the higher the ρ, the lower the MSE.


A regression is significant (at 95% confidence) if the p-value for the coefficient b is less than 0.05.

We have observed that when population has a weak relationship (ρ = 0.2):

  • Forecasts by significant regressions tend to have greater MSE.
  • Forecasts by regressions with higher R-square tend to have greater MSE.

This is because the strong relationship suggested by the regression does NOT forecast the weak population relationship well.

When population has a strong relation (ρ = 0.8), however, the significant regression/high R-square holds out-of-sample.


Conditional LGD Risk

cLGD

The history of bond LGD shows that LGD is elevated when default rate is elevated. The elevation is shown to be moderate and similar across different debt types:

It is important to model LGD appropriately in different economic conditions. Like cDR, we define cLGD:

Note that:

There are two ways to calculate ELGD:

Furthermore,

Where:

  • EcLGD is the average LGD over conditions
  • ELGD is the average LGD over different loans
  • ELGD is higher than EcLGD because when cLGD is higher, cDR/PD is also higher, which increases the probability weight on the higher cLGDs, while in EcLGD a higher cLGD does not receive a higher weight.

Frye-Jacobs

Modeling cLGD separately from cDR introduces complexity and potential overfit to the cLoss model. Instead, the Frye-Jacobs LGD function assumes that both cDR and cLoss follow Vasicek distribution, and infers cLGD as a function of cDR.

Frye-Jacobs assumptions:

  1. cDR and cLoss are comonotonic.

    • If cDR goes up, cLoss must go up.
    • If cDR is in its qth quantile, then cLoss must also be in its qth quantile. This implies that there is a cLGD function of cDR:

  2. cDR follows Vasicek distribution, which stems from the simplest portfolio structure:

    • Large number of firms
    • Each firm has the same PD
    • Each pair-wise ρ is the same (same PDJ)
    • Gauss copula

  3. The distribution of cLoss does NOT depend on the definition of default.

    • This implies the distribution of cLoss does not have separate parameters for PD and ELGD. It does have a parameter EL.

  4. cLoss follows Vasicek distribution.

  5. cLoss and cDR have the same ρ parameter.

    • This ensures that the LGD function is monotonic.
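Finally,

$$cLGD = \frac{\Phi\left(\Phi^{-1}(cDR) - k\right)}{cDR}, \qquad k = \frac{\Phi^{-1}(PD) - \Phi^{-1}(EL)}{\sqrt{1 - \rho}}$$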

Observations:

  1. cLGD is strictly monotonic with range (0, 1), for all k
  2. cLGD increases slowly, and similarly for all k, at low cDR
  3. Elasticity is greatest for loans with low LGD.
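
These observations can be checked with a small sketch of the Frye-Jacobs LGD function, $cLGD = \Phi\big(\Phi^{-1}(cDR) - k\big) / cDR$ with $k = \big(\Phi^{-1}(PD) - \Phi^{-1}(EL)\big) / \sqrt{1-\rho}$ and $EL = PD \times ELGD$. The parameter values below are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def frye_jacobs_clgd(cdr, pd, elgd, rho):
    """Frye-Jacobs conditional LGD as a function of the conditional default rate."""
    el = pd * elgd                                            # expected loss
    k = (norm.ppf(pd) - norm.ppf(el)) / np.sqrt(1.0 - rho)    # LGD risk index
    return norm.cdf(norm.ppf(cdr) - k) / cdr

cdr_grid = np.linspace(0.001, 0.5, 200)
clgd = frye_jacobs_clgd(cdr_grid, pd=0.03, elgd=0.4, rho=0.10)

print("cLGD at cDR = 1%:", frye_jacobs_clgd(0.01, 0.03, 0.4, 0.10))
print("cLGD at cDR = 10%:", frye_jacobs_clgd(0.10, 0.03, 0.4, 0.10))
```

The function needs no parameters beyond those already required for the default and loss distributions (PD, ELGD and ρ), which is the point of the approach.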

Frye-Jacobs: Develop Alternative Hypothesis

Introduce an additional sensitivity parameter to test the slope of the LGD function.

We know that:

In integration form:

Bring in the Frye-Jacobs cLGD function:

Note that EL is in both lhs and rhs, divide both EL by ELGDa:

Note that we have identified a new LGD function:

Analyzing the choice of a:

  • When a = 0, the cLGD function is the Frye-Jacobs formula.
  • When a = 1, cLGD = ELGD, which implies cLGD does not depend on conditions:

Frye-Jacobs: Hypothesis Test

We introduce a finite portfolio, which brings randomness into the D’s and LGD^{dollar}s.

  • We assume the finite portfolio is uniform and all N loans have the same PD and ρ
  • We assume that given portfolio cDR, the number of defaults is binomial:
  • We assume that LGD is normally distributed around cLGD, with σ = 0.2. Note that under this assumption ELGD = cLGD, which corresponds to a = 1.

Under finite portfolio, the probability of 0 defaults is:

When conditional on cDR and Σ D > 0, the average portfolio LGD rate is normal:

Let Y ~ N(0, 1) be a standard normal variable, then LGD becomes:

Now calculate Loss based on DR and LGD:

Use change-of-variable technique to calculate the pdf for Loss:

Where:

Finally, the pdf of loss conditional on Σ D and cDR:

Removing the conditional, the distribution of loss in a uniform portfolio, with N loans, same PD and ρ and the cLGD function, becomes:

Here is a plot of the unconditional loss density in a finite (N = 10) portfolio in red and the loss density in an infinite portfolio (Vasicek) in blue. (Note that the plot uses D to denote Σ D):

image3.png

Now that we have the pdf for loss, we can test the hypothesis:

  • H0: a = 0
  • H1: a = MLE Based On Moody’s Loss data

As a result, MLE(a) = 0.01 based on all loan data, and the test fails to reject the null. The same holds for bond data and for combined bond/loan data. We conclude that the Frye-Jacobs model is consistent with Moody’s data.

Vendor Estimation

Distance-To-Default and EDF

Robert Merton argues that:

  • the default of firm i depends on its asset return
    • Merton asserts that a firm defaults if and only if the value of its asset drops below the value of its liability, i.e. its asset return is too low
  • joint default of firm i and j depends on PD and asset return correlation

Moody’s suggests that loan contains the option to default, and attempts to use risk-neutral probability to estimate the probability of default. In the context of a put:

Under Moody’s assumption, the firm has an option to default on its assets once it drops below its liability. Here, liability is the strike price, for which Moody’s uses D, or “default point”, to denote short term debt plus half of long term debt to represent liability. DD stands for Distance-To-Default, suggested by Merton. So the probability of default is:

Moody’s then estimates the value and volatility of the assets (unobservable) based on the value and volatility of the market capitalization (observable).

However, since Φ(-DD) gave very poor estimates of the default probability, Moody’s sets the EDF (Expected Default Frequency) of a firm equal to the average historical default rate of firms with the same Distance-To-Default. An EDF uses DD to find historical analogs of current firms.

Correlation

Merton assumes that the correlation ρ between the latent variable Z’s is equal to the asset return correlation r.

However, data suggests that correlation estimated from credit data is less than the correlation based on asset returns. Hence a credit portfolio model that uses asset correlation to estimate ρ overstates credit risk.


📖 Foreign Exchange

Theoretical Pricing

FX Spot Contract

The spot price is the observable market price of a unit of foreign currency. Let denote foreign currency and denote domestic currency:

A FX spot contract is an agreement where the buyer purchases units of foreign currency at a fixed rate at current time .

The contract value to the buyer is:

FX Forward Contract

Denote domestic interest rate = . The price of domestic zero-coupon bond

A FX forward contract is an agreement where the buyer agrees to purchase units of foreign currency at a fixed rate at future time :

The time- value of a forward contract is:

We set to calculate the forward price at time . The equation is also called the covered interest parity, or CIP:

Non-Deliverable forward

A non-deliverable currency is one whose exchange is restricted by local regulations. CIP does not hold, since covered interest arbitrage is not possible. For example:

Asia

  • CNY: China Yuan
  • TWD: New Taiwan Dollar
  • KRW: South Korean Won
  • INR: India Rupee
  • PHP: Philippine Piso
  • IDR: Indonesia Rupiah
  • MYR: Malaysian Ringgit

Latin America:

  • COP: Colombian Peso
  • VEB: Venezuelan Bolívar
  • BRL: Brazilian Real
  • PEN: Peru Sol
  • UYU: Uruguayan Peso
  • CLP: Chilean Peso
  • ARS: Argentine Peso

Europe, Middle East and Africa:

  • EGP: Egyptian Pound
  • KZT: Kazakhstani Tenge

Given CIP, we can calculate the implied yield, which is the foreign interest rate implied by the forward rate, domestic spot rate and domestic interest rate.

We know that the exponential function can be expressed as the sum of the Maclaurin series:

Applying this to the forward rate:
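
A small sketch of covered interest parity, the implied yield and the first-order approximation. It assumes continuously compounded rates and a spot rate quoted as domestic currency per unit of foreign currency; the function names and the market inputs are hypothetical.

```python
import math

def fx_forward(spot, r_dom, r_for, tau):
    """CIP forward rate with continuously compounded rates over horizon tau (years)."""
    return spot * math.exp((r_dom - r_for) * tau)

def fx_forward_linear(spot, r_dom, r_for, tau):
    """First-order (Maclaurin) approximation of the CIP forward rate."""
    return spot * (1.0 + (r_dom - r_for) * tau)

def implied_foreign_yield(spot, forward, r_dom, tau):
    """Foreign interest rate implied by the forward, the spot and the domestic rate."""
    return r_dom - math.log(forward / spot) / tau

spot, r_dom, r_for, tau = 1.10, 0.05, 0.02, 0.5
fwd = fx_forward(spot, r_dom, r_for, tau)
fwd_lin = fx_forward_linear(spot, r_dom, r_for, tau)
implied = implied_foreign_yield(spot, fwd, r_dom, tau)
print("forward:", fwd, "linear approx:", fwd_lin, "implied yield:", implied)
```

For short horizons and small rate differentials the linear approximation is very close to the exact forward, which is why forward points are roughly proportional to the rate differential times the time to maturity.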

FX Swap Contract

A FX swap contract contains two FX forward contracts at time with opposite directions.

For example, a buy/sell swap contract:

The present value of the swap contract is the sum of the present value of the two sub-contracts:

Note that the value of a swap contract is fairly insensitive to spot rate changes, compared with that of a forward contract.

FX Option

A FX option conveys the right, but not the obligation, to exchange units of foreign currency for units of domestic currency, at a future date .

For example, the buyer of a foreign currency call strike at , has the right at maturity to buy unit of at even if .

This is equivalent to the buyer of units of domestic currency put strike at , which grants the buyer the right at maturity to sell unit of at a rate of , even if the exchange rate falls below .

In formula:

Visualizing the transactions on a foreign currency call:

Visualizing the transactions on a domestic currency put:

FX options also satisfy put-call parity:

Garman-Kohlhagen

To evaluate the price of the option:

  • Assumptions on the stochastic nature of St
  • Create a “risk-free” hedge portfolio, in order to find a governing PDE for the option value, which also leads to an equivalent risk-neutral probability measure
  • Solve the PDE directly, with appropriate boundary conditions

We know that if a tradable asset follows the geometric Brownian motion:

Applying Itô's formula to the value of a derivative contract :

Setting the drift term to be zero, as the derivative contract is tradable, we can derive the Black-Scholes PDE:

However, since the foreign exchange spot rate is not tradable, we need to adjust the Black-Scholes derivation. Let and denote bank accounts in the domestic and foreign currencies, where and . Constructing a replicating portfolio and setting the drift term to be , we can derive the Garman-Kohlhagen PDE:

Solving the PDE:

Using the Feynman-Kac formula with additional derivation, we can conclude that s.t. the arbitrage-free price of the contingent claim is uniquely determined as the expected value of the discounted final payoff under , and obeys the stochastic differential equation:
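
As a minimal sketch of where this risk-neutral valuation leads, the closed-form Garman-Kohlhagen prices of European FX calls and puts can be coded as below. The names NormCdf, GarmanKohlhagen, rd, rf and vol are illustrative assumptions rather than notation from these notes; rates are assumed to be continuously compounded and the option is on one unit of foreign currency.

```cpp
#include <cmath>

// Standard normal CDF, written with the complementary error function.
double NormCdf(double x) { return 0.5 * std::erfc(-x / std::sqrt(2.0)); }

// Garman-Kohlhagen price of a European FX option on one unit of foreign currency.
// spot, strike: domestic units per foreign unit
// rd, rf:       continuously compounded domestic / foreign rates (assumption)
// vol:          annualized volatility, T: time to expiry in years
double GarmanKohlhagen(double spot, double strike, double rd, double rf,
                       double vol, double T, bool isCall) {
    const double sd = vol * std::sqrt(T);
    const double d1 = (std::log(spot / strike) + (rd - rf + 0.5 * vol * vol) * T) / sd;
    const double d2 = d1 - sd;
    const double dfDom = std::exp(-rd * T);  // domestic discount factor
    const double dfFor = std::exp(-rf * T);  // foreign discount factor
    if (isCall) {
        return spot * dfFor * NormCdf(d1) - strike * dfDom * NormCdf(d2);
    }
    return strike * dfDom * NormCdf(-d2) - spot * dfFor * NormCdf(-d1);
}
```

A quick sanity check on such a function is put-call parity: the call price minus the put price should equal the discounted spot minus the discounted strike for the same inputs.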

Practical Pricing

FX Spot Contract

The trade date is when the terms of the transaction are agreed, and the value date is when the transaction settles, which is two business days after the trade date (T+2) for most currency pairs.

The spot rate quote means:

  • , i.e. higher the , stronger the .
  • is the base currency and is set to 1 unit, whereas is the numeraire currency which is used as the numeraire.

The bid-offer spread means:

  • The dealer is willing to buy for
  • The dealer is willing to sell for

Equivalently:

  • The highest price YOU can sell is
  • The lowest price YOU can buy is

FX Forward Contract

The forward point is commonly expressed in the unit pip, or point in percentage, that is worth .

Example 1 When selling a forward for foreign currency , the bid side spot rate plus bid side forward points shall be equal to the bid side outright forward rate.

A market-maker would construct the short forward as follows. Note that borrowing and lending correspond to selling a forward and therefore the bid-side forward point.

Time Transactions
borrow
execute a short spot contract
lend
receive
execute a long spot contract
pay

This is the same as selling an outright forward contract:

Time Transactions
N/A
receive
pay

FX Swap Contract

A FX swap contract intends to adjust the timing of cash flows from to and alter the value date on an existing trade. The near rate should be consistent with the market forward rate for the near date, and the same goes for the far rate. The swap point is equal to:

A buy/sell swap on means that it buys a forward on at and sells a forward on at . This corresponds to borrowing and lending .

Example 2 A short outright forward position on can be thought of as a buy/sell swap on with a spot transaction at the near date and , similar to Example 1. Here :

Time Transactions
borrow
execute a short forward contract:
pay
receive
lend
receive
execute a long forward contract:
pay
receive
pay

This is the same as a buy/sell swap:

Time Transactions
receive
pay
receive
pay

Example 3 From a market-maker perspective:

Contract | Swap Point | T1 | T2
Buy/Sell | offer-side swap point | pay at bid-side points | sell at offer-side points
Sell/Buy | bid-side swap point | sell at bid-side points | pay at bid-side points

Note(): because a swap has less interest rate risk than an outright forward, the market-maker can easily construct a swap with bid-side points for both near and far dates.

Example 4 Say the swap point is , then a party that buys/sells the foreign currency is paying the swap point, because it is selling at a lower far rate.

Conversely, a party that sell/buy is earning the swap point.

Risk Characteristics

Contract | Transactions | FX Risk | IR Spread Risk
Spot | 1 | Yes | No
Forward (Outright) | 1 | Yes | Yes
Swap | 1 | No | Yes

FX Option

There are four ways to express an option price:

Price in units in units
Notional as



Notional as



Straddle

The meaning of can be different:

  • : at the spot rate
  • : at the forward rate (preferred by traders)
  • : delta-neutral

Risk Reversal

Where a -delta option is an option with a delta of . Risk reversal can also denote the difference in implied volatility:

Butterfly

Note that butterfly is vega () neutral, i.e. the strangle notional is usually larger than the straddle notional to create equal and offsetting vega . BF can also denote the difference in implied volatility:

Under the Black-Scholes framework, delta-neutral strike () options have the highest vega :

In addition, option gamma


📖 C++

C++ is a compiled (vs. interpreted, e.g. Python), general-purpose (vs. domain-specific, e.g. HTML) programming language created by Danish computer scientist Bjarne Stroustrup as an extension to C.

Basic

Compiler

A compiler translates a high-level language into a low-level language; together with the rest of the toolchain, it creates an executable program.

  1. Pre-processor: process preprocessing directives such as #include "foo.hpp"
  2. Compiler: turn the pre-processed code into assembly code (ASM).
    • front end: create IR (intermediate representation) in SSA (static single assignment) form. The runtime is .
    • middle end: optimize the IR, e.g. remove unnecessary operations.
    • back end: produce ASM
  3. Assembler: turn ASM into binary object code
  4. Linker: link the object files and libraries together
  5. Debugger: type checking
  6. Object Copy: generate .exe (for Windows), and .bin (for Mac)

G++

Compile with g++ at the command line:

$ g++ toto.cpp              # compile and link
$ g++ toto.cpp -E           # stop after the pre-processor and print its output
$ g++ toto.cpp --verbose    # ask the compiler to show the different steps

Running the compiled result:

$ ./a.exe

The C++ standard library is a collection of classes and functions, organized into different headers. For example, include the <iostream> header to handle input and output. Standard headers are included with angle brackets, and other non-standard headers with double quotes.

#include <iostream>
#include "foo.h"

Macro

#define N 4
std::cout << N + 2; // prints 6

Guards

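In C++, a function, class, or variable can only be defined once in a translation unit. We use include guards to make sure the contents of a header are not duplicated when it is included through multiple files.

#ifndef FOO_H
#define FOO_H
// declarations of foo.h go here
#endif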

Namespace

Related classes and functions can be grouped under a common name, which divides the global scope into sub-scopes called namespaces.

Functions and classes in the C++ standard library are defined in the std namespace. For example, the cin (standard input) and cout (standard output) objects and the endl (end line) manipulator.

char c;
std::cin >> c;
std::cout << c;
std::cout << std::endl;

Alternatively, we can use using namespace std;.

Data Type

Every variable has to have a type in C++, and the type has to be declared and cannot be changed. There are fundamental types and user-defined types (classes)

Characters In a computer, each bit stores a binary (0/1) value. A byte is 8 bits. The computer stores characters in a byte using the ASCII format.

Numbers The computer stores numbers in binary format with bits. The leftmost bit is used to store the sign of a number (see the two's-complement method). Real values are stored using a mantissa and an exponent:

Note that very few values can be exactly represented, and how close we can get depends on the number of bits available.
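
To make the representability point concrete, here is a small sketch of comparing floating-point values with a tolerance instead of ==. The helper name AlmostEqual and the default tolerance of 8 machine epsilons are assumptions chosen for illustration, not a universal rule.

```cpp
#include <cmath>
#include <limits>

// Compare two doubles with a relative tolerance instead of ==.
// The default tolerance is an illustrative choice (assumption).
bool AlmostEqual(double a, double b,
                 double relTol = 8 * std::numeric_limits<double>::epsilon()) {
    const double scale = std::fmax(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= relTol * scale;
}
```

With IEEE 754 doubles, neither 0.1 nor 0.2 is stored exactly, so 0.1 + 0.2 == 0.3 evaluates to false, while AlmostEqual(0.1 + 0.2, 0.3) treats the two values as equal.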

Type | Size (Bytes) | Value Range
bool | 1 | true or false
char | 1 | -128 to 127
short | 2 | -32,768 to 32,767
int | 4 | -2,147,483,648 to 2,147,483,647
float | 4 | 3.4E +/- 38
double | 8 | 1.7E +/- 308

C++ is a strongly, statically typed language, which means type errors need to be resolved for all variables at compile time.

Function

Every console application has to have a main() function, which in its simplest form takes no arguments and returns an integer value.

A function that adds two numbers:

#include <iostream>
using namespace std;

int Add(int a, int b)
{
    return a + b;
}

int main()
{
    int result = Add(2, 3);
    cout << " Result: " << result << endl;
}

Overloading allows two or more functions to have the same name, but they must differ in the number or types of their parameters.
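
A short sketch of overloading, reusing the Add name from the example above: the compiler picks the version whose parameter types match the arguments. The double and std::string versions are additions made for illustration.

```cpp
#include <string>

// Three overloads of the same name, distinguished by parameter types.
int Add(int a, int b) { return a + b; }
double Add(double a, double b) { return a + b; }
std::string Add(const std::string& a, const std::string& b) { return a + b; }
```

Calling Add(2, 3) selects the int version and Add(0.5, 0.25) the double version. A return type alone is not enough to distinguish two overloads.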

Function Object

Function objects, or functors, are objects that behave like functions; unlike regular functions, they can carry state.

A regular function looks like this:

int AddOne(int val)
{
    return val + 1;
}
int result = AddOne(2);

A function object implementation:

class AddOne
{
public:
    int operator()(int& val)
    {
        return val + 1;
    }
};

AddOne addone;
int val = 2;
int result = addone(val);
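
The AddOne functor holds no data, so it does not yet show the "state" that distinguishes a function object from a plain function. A minimal sketch of a stateful variant follows; the class name AddN is hypothetical.

```cpp
// A function object that carries state: the amount to add is fixed at construction.
class AddN {
public:
    explicit AddN(int n) : n_(n) {}
    // Calling the object adds the stored amount to val.
    int operator()(int val) const { return val + n_; }
private:
    int n_;  // the state remembered between calls
};
```

An object such as AddN addFive(5) can then be called like a function, addFive(2), or passed to an algorithm that expects a callable.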

Lambda

A lambda, introduced in C++11, is an unnamed inline function object that can be passed as an argument or stored in a local variable.

auto print = [] (string s) // [] is the lambda introducer/capture clause
{
    cout << s << endl;
};
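
The capture clause controls which local variables the lambda body can use. As a sketch, the two helpers below (hypothetical names CountAbove and Sum) capture one variable by value and one by reference.

```cpp
#include <algorithm>
#include <vector>

// Counts the elements above a threshold; the lambda captures threshold by value.
int CountAbove(const std::vector<int>& v, int threshold) {
    return static_cast<int>(std::count_if(
        v.begin(), v.end(),
        [threshold](int elem) { return elem > threshold; }));
}

// Sums the elements; the lambda captures total by reference and updates it.
int Sum(const std::vector<int>& v) {
    int total = 0;
    std::for_each(v.begin(), v.end(), [&total](int elem) { total += elem; });
    return total;
}
```

Capturing by value copies the variable when the lambda is created, while capturing by reference lets the lambda modify the original, so the referenced variable must outlive the lambda.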

Example 1

vector<int> v{1, 3, 2, 4, 6};
for_each(v.cbegin(), v.cend(), // range
         [](int elem) { cout << elem << endl; }); // lambda

Example 2

vector<int> v{1, 3, 2, 4, 6};
transform(v.begin(), v.end(),
          v.begin(), [] (int elem) { return elem * elem; });

Example 3

vector<Person> ppl;
sort(ppl.begin(), ppl.end(),
     [](const Person& p1, const Person& p2)
     {
         return p1.GetAge() < p2.GetAge();
     });

Extern

The keyword extern declares a function or variable that is defined in another file (translation unit).

extern int foo(int a);
int main() { return foo(100); }

Inline Function

C++ provides inline functions so that the call overhead of a small function can be reduced. When an inline function is called, the compiler may insert the entire body of the function at the point of the call; the keyword is a hint rather than a guarantee.

Typedef

Use the typedef keyword to define a type alias.

typedef double OptionPrice;
typedef double StockPrice;
typedef double Strike;
OptionPrice BSPrice(StockPrice S, Strike K);

Operators

Standard operations:

Arithmetic: +, -, *, /
Comparison: <, >, <=, >=
Negate: !
Equality, non Equality: ==, !=
Logical and, or: &&, ||
Assignment: =
Modulo: %
Increment, Decrement: i++, i--
Compound Assignment: i += 1, i -= 1, i *= 1, i /= 1

Note the difference between i++ and ++i:

i++; // return (old) i and increment i
++i; // increment i and return new i

Const

Use the const keyword to define a constant value. The compiler will stop any attempt to alter the constant values.

Since C++ is a strongly typed language, it is preferred to use const int N = 4, instead of #define N 4, as the former defines a type.

Reference

Example 1 A reference is an alias for a variable and cannot be rebound to a different variable. We can change val by changing ref:

int val = 10;
int& ref = val;
ref = 20; // this will change val to 20

Example 2 We can also bind a const reference to a const object. A compile error is raised if we attempt to change the value through either the object or the reference.

const int val = 10;
const int& ref = val;
val = 20; // error
ref = 20; // error

Example 3 We can also bind a const reference to a non-const object; thereafter we can NOT change the object using the reference.

int val = 10;
const int& ref = val;
val = 20; // ok
ref = 20; // error


Pass By Value In a function, we can pass an argument by either value or reference. When passing by value, the variable x will NOT be changed. In this case, we spend both time to create a copy inside the function and memory to store the copy.

#include <iostream>
using namespace std;

void DoubleValue(int number)
{
    number = number * 2; // changes only the local copy
}

int main()
{
    int x = 5;
    DoubleValue(x);
    cout << "x = " << x << endl;
}

x = 5

Pass By Reference When passing by reference (by adding & in the function argument parameter), the variable x WILL be changed.

#include <iostream>
using namespace std;

void DoubleValue(int& number)
{
    number = number * 2; // changes the caller's variable
}

int main()
{
    int x = 5;
    DoubleValue(x);
    cout << "x = " << x << endl;
}

x = 10

Pass By Const Reference We add const when we do not want the function argument to be tampered with when it is passed by reference. In this example, there will be a compiler error as we are trying to change the const reference number in the function.

void DoubleValue(const int& number)
{
    number = number * 2; // error, cannot change const ref "number"
}

int main()
{
    int x = 5;
    DoubleValue(x);
    cout << "x = " << x << endl;
}

Pointer

In computer memory, each stored value has an address associated with it. We use a pointer object to store the address of another object and access it indirectly.

There are two pointer operators:

  1. &: address-of operator, used to get the address of an object
  2. *: dereference operator, used to access the object

Example 1

int val = 10;
int* ptr = nullptr; // initiate an empty pointer
ptr = &val; // point ptr at the address of val
*ptr = 20; // change val using the ptr pointer

Example 2 If the object is const, a pointer cannot be used to change it.

const int val = 10;
const int* ptr = &val;
*ptr = 20; // error

Example 3 You can have a pointer that itself is const.

int val = 10;
int* const ptr = &val;
*ptr = 20; // ok

int val2 = 20;
ptr = &val2; // error, as the pointer is const

Casting

C++ allows implicit and explicit conversions of types.

short a = 1;
int b;
b = a; // implicit conversion
b = (int) a; // explicit conversion

However, the traditional explicit type-casting allows conversions between almost any types, which can lead to run-time errors. To control these conversions, C++ provides four specific casting operators:

  • dynamic_cast<new_type>( ): used only with pointers or references to class objects; can cast a derived class to its base class; base-to-derived conversions are allowed only with a polymorphic base class and are checked at run time
#include <iostream>

class Base { public: virtual ~Base() {} }; // polymorphic base
class Derived : public Base { };

int main() {

    Derived* derived_ptr = new Derived;
    Base* base_ptr = dynamic_cast<Base*>(derived_ptr); // derived-to-base, always ok

    Base* base_ptr_2 = new Derived;
    Derived* derived_ptr_2 = dynamic_cast<Derived*>(base_ptr_2);
    // ok, base class polymorphic and the object really is a Derived

    Base* base_ptr_3 = new Base;
    Derived* derived_ptr_3 = dynamic_cast<Derived*>(base_ptr_3);
    // will not work, derived_ptr_3 will be assigned a nullptr

    std::cout << "derived_ptr_2: " << derived_ptr_2 << std::endl;
    std::cout << "derived_ptr_3: " << derived_ptr_3 << std::endl;

    delete base_ptr;
    delete base_ptr_2;
    delete base_ptr_3;
    return 0;
}
Sample output (addresses vary between runs and platforms):
derived_ptr_2: 0x7fa5cec00630
derived_ptr_3: 0x0


  • static_cast<new_type>( ): conversion checked only at compile time; among other uses, it can cast pointers or references base-to-derived or derived-to-base, but with no safety check at run-time
Base* base_ptr_3 = new Base;
Derived* derived_ptr_3 = static_cast<Derived*>(base_ptr_3);
// not nullptr this time, but using derived_ptr_3 as a Derived is undefined behavior
Sample output (address varies):
derived_ptr_3: 0x7fc3d7400690


  • reinterpret_cast<new_type>( ): converts a pointer to a pointer of another, unrelated class; often leads to unsafe de-referencing
class A {};
class B {};

A* a = new A;
B* b = reinterpret_cast<B*>(a);


  • const_cast <new_type>( ): remove/set the constant-ness of an object
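
The list above gives no example for const_cast. A common use is calling an older interface that takes a non-const pointer but does not modify the data. The sketch below assumes such a function; LegacyLength and Length are hypothetical names.

```cpp
// Hypothetical legacy function: takes a non-const pointer but only reads from it.
int LegacyLength(char* s) {
    int n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// const_cast removes the const qualifier so the legacy function can be called.
// This is only safe because LegacyLength never writes through the pointer.
int Length(const char* s) {
    return LegacyLength(const_cast<char*>(s));
}
```

Writing through a pointer obtained this way is undefined behavior when the underlying object was originally declared const, so const_cast is best kept for read-only interoperability like this.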

Array (C-Style)

An array is a fixed-size collection of items of the same type that are stored in a contiguous block in memory. We define the size of the array at creation, and the array index starts at 0 in C++.

int a[10];
int b[] {1, 2, 3}; // uniform initializer syntax

The address of the array is the same as the address of the first element of the array. Therefore, we can access an array using pointer increments, which is very efficient.

int a[10];
int* ptr = &a[0]; // the same as int* ptr = a
int a0 = a[0]; // the same as int a0 = *ptr
int a3 = a[3]; // the same as int a3 = *(ptr+3) or *(a+3)

Dynamic Allocation

Dynamic memory allocation is necessary when you do NOT know the size of the array at compile time. We use the new keyword paired with the delete keyword.

int* a = new int[10];
delete[] a; // correct. this tells the runtime to clean up the whole array instead of a single variable
// delete a; // incorrect. using this version on an array is undefined behavior.

Dynamically allocate a matrix with a cast.

#include <iostream>

// Allocates 16 contiguous ints and hands the block back through a.
void func(int** a) {
    *a = new int[16];
}

int main() {
    int (*a)[4]; // pointer to rows of 4 ints
    // the cast lets func fill in a, so the flat block is viewed as a 4x4 matrix
    func(reinterpret_cast<int**>(&a));

    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            a[i][j] = 1;
        }
    }

    for (int i = 0; i < 4; i++) {
        for (int j = 0; j < 4; j++) {
            std::cout << a[i][j] << " ";
        }
        std::cout << std::endl;
    }

    delete[] reinterpret_cast<int*>(a); // release the block allocated in func
}

1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1

Library

A C++ library is a package of reusable code, typically with these two components:

  • header file
  • precompiled binary containing the machine code for the functionality implementation

There are two types of C++ libraries: static and dynamic libraries.

  • a static library has a .a (.lib on Windows) extension and the library code is compiled into the executable, so the user only needs to distribute the executable for other users to run a program built with a static library.
  • a dynamic library has a .so (.dll on Windows) extension and is loaded at run time. It saves space as many programs can share one copy of the dynamic library code, and it can be upgraded to a new version without replacing all the executables using it.

Condition

If/Else

if (condition_1)
{
    statement1;
}
else if (condition_2)
{
    statement2;
}
else
{
    statement3;
}

Switch

A switch statement tests an integral or enum value against a set of constants. We can NOT use a string in the switch statement.

#include <iostream>
using namespace std;

int main()
{
    int value = 0;
    cin >> value;
    switch(value)
    {
    case 0:
        cout << "value is zero";
        break; // without this break, execution falls through and also runs case 1
    case 1:
        cout << "value is one";
        break;
    default:
        cout << "value is not 0 or 1";
    }
}

While / Do While / For Loop

While loop:

int n = 0;
while (n < 10)
{
    cout << " n: " << n << endl;
    n = n + 1;
}

Do while loop:

int n;
do {
    cout << "Enter number (0 to end): ";
    cin >> n;
    cout << "You entered: " << n << "\n";
} while (n != 0);

For loop:

for (unsigned int n = 0; n < 10; ++n)
{
    cout << "n: " << n << endl;
}

For loop with two variables:

for (unsigned int i = 0, j = 0; i < 10 && j < 10; ++i, j+=2)
{
    cout << "i:" << i << ", j:" << j << endl;
}

Enum

The enum (enumerated) type is used to define collections of named integer constants.

enum CurrencyType {USD, EUR, GBP};
cout << USD << " " << EUR << " " << GBP;
// prints: 0 1 2

// alternative definition: after an explicit value, later enumerators continue from it
enum CurrencyType {USD, EUR=10, GBP};
cout << USD << " " << EUR << " " << GBP;
// prints: 0 10 11

Class

A class achieves data abstraction and encapsulation.

  • abstraction refers to the separation of interface and implementation
  • encapsulation refers to combining data and functions so that data is only accessible through functions.

Member Variable & Function

Define a customer class with member variables and functions.

class Customer
{
public:
    Customer(); // default constructor
    Customer(string name, string address);
    ~Customer(); // destructor, to free up resources

    string GetName();
    string GetAddress();
    void SetAddress(string address);

private:
    string name_;
    string address_;
};

Instantiate Customer class instances to represent different customers.

Customer c1("Joe", "Hyde Park");
Customer c2("Jim", "Chicago");
Customer c3("John", "New York");

// Use `.` to access member function.
c1.GetName();
c2.SetAddress("Beijing");

Protection Level

There are three protection levels to keep class data member internal to the class.

  1. public accessible to all.
  2. protected accessible in the class that defines them and in classes that inherit from that class.
  3. private only accessible within the class defining them.

Constructor / Destructor

A constructor is a special member function used to initialize the data members when an object is created. This example uses an initializer list, which initializes the members directly instead of default-constructing and then assigning them, to create more efficient constructors.

Customer::Customer()
    : name_(""),
      address_("")
{
    // less efficient alternative: assign in the body
    // name_ = "";
    // address_ = "";
}

Customer::Customer(string name, string address)
    : name_(name),
      address_(address)
{}

Customer::~Customer()
{}

Free-Store

There are several ways to create objects on a computer:

  • Automatic/Stack int a;

  • Dynamic Allocated

    • Free Store int* ptr = new int[10];
    • Heap allocated/freed by malloc/free

Summarized in a table from GeeksforGeeks:

Parameter | Stack | Heap
Basic | Memory is allocated in a contiguous block | Memory is allocated in any random order
Allocation and de-allocation | Automatic by compiler instructions | Manual by programmer
Cost | Less | More
Access time | Faster | Slower
Main issue | Shortage of memory | Memory leak/fragmentation

We use -> to access a free-store object's member functions:

Customer* c = new Customer("Joe", "Chicago");
c->GetName();
c->SetAddress("New York");

Const Member Functions

A const object can only invoke const member function on the class. A const member function is not allowed to modify any of the data members on the object on which it is invoked. However, if a data member is marked mutable, it then can be modified inside a const member function.

const Customer c1("Joe", "Hyde Park");
cout << c1.GetName(); // ok if GetName() is a const member function.
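
As a sketch of both rules, the class below has a const member function that leaves the ordinary member untouched but updates a counter marked mutable. The class name Quote and its members are hypothetical.

```cpp
#include <string>
#include <utility>

class Quote {
public:
    explicit Quote(std::string pair) : pair_(std::move(pair)), reads_(0) {}

    // const member function: it may not modify ordinary data members...
    const std::string& GetPair() const {
        ++reads_;  // ...but it may modify a member marked mutable
        return pair_;
    }

    int Reads() const { return reads_; }

private:
    std::string pair_;
    mutable int reads_;  // bookkeeping that is not part of the logical state
};
```

A const Quote object can call GetPair() and Reads() because both are declared const; a non-const member function would be rejected at compile time.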

Static Member

We use the static keyword to associate a member with the class, as opposed to class instances. A static member function can NOT directly access non-static data members, because it has no this pointer.

Static member variables can NOT be initialized through the class constructor; rather, they are initialized once outside the class body. However, a const static member variable of integral type can be initialized within the class body.

class Counter
{
public:
    Counter();
    static int GetCount();
    static void Increment();
private:
    static int count_; // non-const static needs to be initialized outside
    const static int count_2_ = 0; // const static can be initialized within
};

int Counter::count_ = 0;

Counter c;
c.Increment(); // or Counter::Increment()

This

Every non-static member function has access to a this pointer, which is initialized with the address of the object when the member function is invoked.

double Currency::GetExchangeRate()
{
    return exchangeRate_;
    // return this->exchangeRate_; // equivalent
    // return (*this).exchangeRate_; // equivalent
}

Copy Constructor

We use the copy constructor to construct an object from another already constructed object of the same type.

class Customer
{
public:
    Customer(const Customer& other);
};

Customer::Customer(const Customer& other)
    : name_(other.name_),
      address_(other.address_)
{}

Customer c2(c1);

Assignment Operator

We use the assignment operator to assign an object of the same type.

class Customer
{
public:
    Customer& operator=(const Customer& other);
};

Customer& Customer::operator=(const Customer& other)
{
    if (this != &other) // checking for self assignment
    {
        name_ = other.name_;
        address_ = other.address_;
    }
    // return the object on which the function was invoked
    return (*this);
}

Shallow / Deep Copy

The default copy constructor and assignment operator provide a shallow copy, which copies each member of the class individually. For a pointer member, the shallow copy copies the address held by the pointer, resulting in both members pointing to the same object on the free store.

A deep copy, however, creates a new object on the free store and copies the contents of the object the original pointer is pointing to.

Deep Copy copy constructor

Customer::Customer(const Customer& other)
    : name_(other.name_),
      address_(other.address_),
      account_(new Account(other.account_->GetAccountNumber(),
                           other.account_->GetAccountBalance()))
{}

Deep Copy assignment operator

Customer& Customer::operator=(const Customer& other)
{
    if (this != &other)
    {
        // allocate the copy first so account_ stays valid if new throws
        Account* copy = new Account(other.account_->GetAccountNumber(),
                                    other.account_->GetAccountBalance());
        name_ = other.name_;
        address_ = other.address_;
        delete account_;
        account_ = copy;
    }
    return (*this);
}

The Rule of 3

There are 3 operations that control the copies of an object: copy constructor, assignment operator, and destructor. If you define one of them, you will most likely need to define the other two as well.
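
A compact sketch of a class that owns a raw array and therefore defines all three operations. The class name Buffer is hypothetical and the design is kept deliberately minimal.

```cpp
#include <algorithm>
#include <cstddef>

// A minimal owning buffer that defines all three operations.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new double[n]()) {}

    ~Buffer() { delete[] data_; }                       // destructor

    Buffer(const Buffer& other)                         // copy constructor
        : size_(other.size_), data_(new double[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    Buffer& operator=(const Buffer& other) {            // assignment operator
        if (this != &other) {
            double* copy = new double[other.size_];
            std::copy(other.data_, other.data_ + other.size_, copy);
            delete[] data_;
            data_ = copy;
            size_ = other.size_;
        }
        return *this;
    }

    double& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    double* data_;
};
```

If only the destructor were defined, the compiler-generated copy operations would copy the pointer, and two Buffer objects would later delete the same array.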

Singleton Class

The Singleton design pattern makes sure only one instance of an object of a given type is instantiated in a program, and provides a global point of access to it.

  1. change the access level of the constructor to private
  2. add new public member function Instance() to create the object
  3. use static member variable to hold the object
class CurrencyFactory
{
public:
    static CurrencyFactory* Instance();
    Currency CreateCurrency(int currencyType);

private:
    CurrencyFactory();
    static CurrencyFactory* instance_;
};
// the static member must be defined once outside the class
CurrencyFactory* CurrencyFactory::instance_ = nullptr;

CurrencyFactory* CurrencyFactory::Instance()
{
    if (!instance_)
        instance_ = new CurrencyFactory;
    return instance_; // no more than one CurrencyFactory object.
}

Currency CurrencyFactory::CreateCurrency(int currencyType)
{
    switch(currencyType)
    {
    case EUR:
        return Currency("EUR", 0.7901);
    case GBP:
        return Currency("GBP", 0.6201);
    case CAD:
        return Currency("CAD", 1.1150);
    case AUD:
        return Currency("AUD", 1.1378);
    default:
        return Currency("USD", 1.0);
    }
}
#include <iostream>
#include "CurrencyFactory.h"
using namespace std;

int main()
{
    cout << "Enter amount in USD:";
    double amount;
    cin >> amount;

    // CreateCurrency takes the integer value of the currency type
    cout << "Enter currency type to convert to, as an integer code (EUR/GBP/CAD/AUD): ";
    int currencyType;
    cin >> currencyType;

    Currency currency = CurrencyFactory::Instance()->CreateCurrency(currencyType);
    cout << currency.ConvertFromUSD(amount) << endl;
}

Inheritance

Classes related by inheritance form a hierarchy consisting of base and derived classes. The derived class inherits some members from the base class subject to protection level restrictions, and may extend/override the implementation of member functions in the base class.

class Person
{
protected:
    string name_;
    string address_;
};
class Student : public Person
{
    string school_;
};

Virtual

Different derived classes may implement member functions from the base class differently. The base class uses the virtual keyword to indicate a member function that may be specialized by derived classes.

class Base
{
public:
    virtual void Method1();
    virtual void Method2();
    void Method3();
};
class Derived : public Base
{
public:
    void Method1() override; // specializes Method1()
    // uses default implementation of Method2()
    // can NOT specialize Method3(), as it is not virtual
};

Abstract Class

The base class has to either provide a default implementation for that function or declare it pure virtual. If a class has one or more pure virtual function, it is called an abstract class or interface. An abstract class cannot be instantiated.

class Base
{
public:
    virtual void Method1() = 0;
};
class Derived : public Base
{
    // Derived does not implement Method1(), so it is also abstract
};

Virtual Destructor

When we delete a derived class we should execute both the derived class destructor and the base class destructor. A virtual base class destructor is needed to make sure the destructors are called properly when a derived class object is deleted through a pointer to a base class.

If we delete a derived class object through a pointer to a base class when the base class destructor is non-virtual, the result is undefined.
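
A minimal sketch of the well-defined case, where the base class destructor is virtual. The class names and the global counter g_destroyed are hypothetical and exist only to make the destructor calls observable.

```cpp
// Counts destructor calls so the effect can be observed (illustrative global).
int g_destroyed = 0;

class Instrument {
public:
    virtual ~Instrument() { ++g_destroyed; }  // virtual base class destructor
};

class Forward : public Instrument {
public:
    ~Forward() override { ++g_destroyed; }
};

void DestroyThroughBase() {
    Instrument* p = new Forward;
    delete p;  // runs ~Forward() first, then ~Instrument()
}
```

After one call to DestroyThroughBase(), both destructors have run, so the counter has increased by two. Removing virtual from ~Instrument() would make the same delete undefined behavior.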

Polymorphism

The types related by inheritance are known as polymorphic types. We can use polymorphic types interchangeably.

We can use a pointer or a reference to a base class object to point to an object of a derived class; this is known as the Liskov Substitution Principle (LSP). This allows us to write code without needing to know the dynamic type of an object.

BankAccount* acc1 = new Savings();
acc1->ApplyInterest(); // ApplyInterest() on the Savings object

BankAccount* acc2 = new Checking();
acc2->ApplyInterest(); // ApplyInterest() on the Checking object

We can write one function which applies to all account types, taking either a pointer or a reference.

void UpdateAccount(BankAccount* acc)
{
    acc->ApplyBankingFees();
    acc->ApplyInterest();
}

void UpdateAccount(BankAccount& acc)
{
    acc.ApplyBankingFees();
    acc.ApplyInterest();
}

Standard Template Library (STL)

Sequential Container

std::array

The STL array class from the <array> header offers a more reliable alternative to C-style arrays: the size is part of the type, so we do not have to pass the size of the array as a separate parameter.

#include <array>
array<int, 3> a1 = {1, 2, 3};

a1.front();
a1.back();
a1.size();
a1.at(1);
get<1>(a1);

std::vector

Vectors are stored contiguously, like dynamic arrays, with the ability to resize automatically when an element is inserted or deleted. When the size reaches the capacity, the vector reallocates and grows its capacity by a constant factor (often doubling; the factor is implementation-defined).

#include <vector>
vector<int> v1;

v1.begin();
v1.end();
v1.size();

v1.push_back(1); // pushes an element into the vector from the back
v1.pop_back(); // removes the last element from the vector

v1.insert(v1.begin(), 5); // inserts a new element before the element at the specified position
v1.assign(3, 0); // assigns new values to the vector elements by replacing old ones
v1.erase(v1.begin()); // removes elements from the container at the specified position or range

std::list

Unlike arrays and vectors, a list is a sequential container that allows non-contiguous memory allocation.

#include <list>
list<int> l1;
for (int i = 0; i < 10; i++) {
    l1.push_front(i); // adds a new element 'i' at the beginning of the list
    l1.push_back(i); // adds a new element 'i' at the back of the list

    l1.front(); // returns the value of the first element (list must not be empty)
    l1.back(); // returns the value of the last element (list must not be empty)

    l1.begin(); // returns an iterator pointing to the first element of the list
    l1.end(); // returns an iterator pointing past the last element of the list

    l1.pop_front(); // removes the first element and reduces list size by 1
    l1.pop_back(); // removes the last element and reduces list size by 1
}

std::string

The STL string class stores characters as a sequence of bytes, allowing access to each single-byte character. The underlying character array is terminated by a \0, so the string foo actually stores four characters.
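
A small sketch of that layout: size() does not count the terminator, but the character array returned by c_str() has it right after the last character. The helper name HasTerminator is hypothetical.

```cpp
#include <string>

// Returns true when the character just past the last one is the '\0' terminator.
bool HasTerminator(const std::string& s) {
    return s.c_str()[s.size()] == '\0';
}
```

For std::string s = "foo", s.size() is 3 and HasTerminator(s) is true, which accounts for the four stored characters.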

size()

Use sizeof() to return the size of an array in bytes. Use the .size() member function to return the number of elements in an STL container.

#include <iostream>
#include <vector>

using namespace std;

int main() {
    int a[5] {1, 2, 3, 4, 5};
    cout << "The size of a: " << sizeof(a) << " bytes" << endl;

    vector<int> b {1, 2, 3, 4, 5};
    // sizeof(b) is the size of the vector object itself, not of its elements
    cout << "The size of b: " << sizeof(b) << " bytes" << endl;
    cout << "The size of b: " << b.size() << " elements" << endl;

}

The size of a: 20 bytes
The size of b: 24 bytes
The size of b: 5 elements

Associative Container

std::set

Sets are an associative container where each element is unique. The value of the element cannot be modified once it is added to the set.

#include <set>
set<int> s1;
for (int i = 0; i < 10; i++) {
    s1.begin();
    s1.end();
    s1.size();

    s1.insert(i);
    s1.erase(i);
    s1.find(i);
}

std::map

A std::map sorts its elements by the keys.
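
As an illustration of the key ordering, the helper below (hypothetical name KeysInOrder) walks a map and collects its keys. Iteration visits them in ascending key order regardless of insertion order.

```cpp
#include <map>
#include <string>
#include <vector>

// Returns the keys of a map in iteration order, which is ascending key order.
std::vector<std::string> KeysInOrder(const std::map<std::string, double>& m) {
    std::vector<std::string> keys;
    for (const auto& kv : m) {
        keys.push_back(kv.first);  // kv.first is the key, kv.second the value
    }
    return keys;
}
```

Inserting the keys GBP, EUR and CAD in that order still yields CAD, EUR, GBP when iterating, and find() returns end() for a key that is not present.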

Algorithm

The STL provides implementations of some widely used algorithms.

  • <algorithm> header: sorting, searching, copying, modifying elements
  • <numeric> header: numeric operation

Sort

int main()
{
    vector<int> values{10, 1, 22, 12, 2, 7};
    // sort takes a range
    sort(values.begin(), values.end());
}

binary_search requires the range to be sorted first:

int main()
{
    vector<int> values{10, 1, 22, 12, 2, 7};
    sort(values.begin(), values.end());
    // binary_search takes a sorted range and a value
    bool found = binary_search(values.begin(), values.end(), 12);
}

Copy

int main()
{
    vector<int> values1{ 10, 1, 22, 12, 2, 7 };
    // destination
    vector<int> values2;
    copy(values1.begin(), values1.end(), // input range
         back_inserter(values2)); // output iterator
}

Replace

int main()
{
    vector<int> values{ 10, 1, 22, 12, 2, 7 };
    replace(values.begin(), values.end(), // range
            1, // old value
            111); // new value
}

Numeric

int main()
{
    vector<int> v1{ 5, 4, 3, 2, 1 };
    vector<int> v2{ 1, 2, 3, 4, 5 };
    int r1 = accumulate(v1.begin(), v1.end(), 0); // range, initial value

    int r2 = inner_product(v1.begin(), v1.end(),
                           v2.begin(), 0);
}

Complexity Comparison

complexity.png

Smart Pointer

std::unique_ptr

A unique pointer takes unique ownership of the object it points to. It deletes the managed object either when the unique pointer is destroyed or when the pointer is reset or assigned a new object.

#include <memory>
#include <utility>

std::unique_ptr<Option> sp(new Option());
// initiates a smart pointer (or through reset: sp.reset(new Option());)

// std::unique_ptr<Option> sp2(sp);
// error: does not allow two references (sp, sp2) to the same object (new Option())

std::unique_ptr<Option> sp2(std::move(sp));
// now sp is empty (nullptr) and sp2 takes ownership of the Option object

sp2->getPrice();
// smart pointer can be used as regular pointer

std::shared_ptr

A shared_ptr counts the references to the object it points to and deletes the object when the last reference is destroyed or reset. This lets it store and pass ownership beyond the scope of a function. In OOP, a shared_ptr is often stored as a member variable, so that an object can refer to a value created outside the class.

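#include <memory>

std::shared_ptr<Option> sp2;
{
    std::shared_ptr<Option> sp(new Option());
    sp2 = sp; // sp and sp2 now share ownership (reference count is 2)
}
// sp is destroyed here; the reference count drops to 1
sp2->getPrice();
// the Option object is not deleted after the local scope ends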

Creating a vector of shared_ptr:

#include <memory>
#include <vector>

std::vector<std::shared_ptr<Option>> option_list;
for (int i = 0; i < 10; i++) {
    option_list.push_back(std::shared_ptr<Option>(new Option(i)));
}

std::weak_ptr

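A weak_ptr refers to an object managed by a shared_ptr, but does not increment the reference count, so it does not keep the object alive. It cannot be dereferenced directly: lock() returns a shared_ptr to the object, which is empty if the object has already been deleted.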

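#include <memory>

std::weak_ptr<Option> sp2;
{
    std::shared_ptr<Option> sp(new Option());
    sp2 = sp; // does not increment the reference count
}
// the Option object was deleted when sp went out of scope
// weak_ptr has no operator->; lock() returns an empty shared_ptr here
if (auto p = sp2.lock()) {
    p->getPrice(); // not reached: the object no longer exists
}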

Parallel Processing

Threading

A thread is a small sequence of programmed instructions and is usually a component of a process. Multiple threads can exist within one process, executing concurrently and sharing resources such as memory, while separate processes do not share their resources.

The std::thread class in C++ supports multi-threading; each std::thread object represents a single thread of execution. We pass a callable object (function pointer, function object, or lambda) to the constructor of the std::thread class, and call the join() method on the thread object to wait for the completion of that thread.

Next we start two threads. Both share memory and modify the balance variable at the same time without any synchronization, which leads to a concurrency issue (a data race): the final balance is unpredictable.

#include <iostream>
#include <thread>

using namespace std;

int main() {

    int balance = 0;

    // t1 starts
    thread t1([&balance] {for (int i=0; i<1000000; i++) {balance++;}});

    // t2 starts
    thread t2([&balance] {for (int i=0; i<1000000; i++) {balance--;}});

    t1.join(); // the main() waits here until t1 completes
    t2.join(); // the main() waits here until t2 completes

    cout << balance << endl;
    cout << "END OF CODE" << endl;
}

Sample output (the balance differs from run to run):

153258
END OF CODE

To fix this we introduce a mutex (mutual exclusion) object, which guards a shared resource. A thread acquires the mutex by calling its lock() method; any other thread that calls lock() on the same mutex is blocked until the owning thread calls unlock(), so only one thread at a time can access the resource.

#include <iostream>
#include <thread>
#include <mutex>

using namespace std;

int main() {

    int balance = 0;
    mutex m;

    // t1 starts
    thread t1([&balance, &m] {for (int i=0; i<1000000; i++) {
        m.lock();
        balance++;
        m.unlock();
    }});

    // t2 starts
    thread t2([&balance, &m] {for (int i=0; i<1000000; i++) {
        m.lock();
        balance--;
        m.unlock();
    }});

    t1.join(); // the main() waits here until t1 completes
    t2.join(); // the main() waits here until t2 completes

    cout << balance << endl;
    cout << "END OF CODE" << endl;
}

Output:

0
END OF CODE

Condition Variable

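A condition variable is an object that can block the calling thread until another thread notifies it to resume. Its wait functions take a unique_lock over a mutex: the lock is released while the thread is blocked and reacquired before wait returns. When a predicate is passed to wait, the thread resumes only once the predicate is true.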

#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>

using namespace std;

mutex m;
condition_variable cv;
vector<int> v;

bool ready = false;
bool processed = false;

void make_vector() {

    unique_lock<mutex> lk(m); // own the mutex
    cv.wait(lk, []{return ready;}); // wait until main() signals ready

    for (int k = 0; k < 10; ++k) {
        v.push_back(k);
    }

    processed = true;
    lk.unlock(); // manual unlocking is done before notifying
    cv.notify_one();
    // unblocks one of the threads currently waiting for this condition
    // if no threads are waiting, the function does nothing
    // if more than one thread is waiting, it is unspecified which will be selected
}

int main() {
    thread t(make_vector);

    {
        lock_guard<mutex> lk(m); // the shared flag is modified under the mutex
        cout << "main() signals ready for processing\n";
        ready = true;
    }
    cv.notify_one();

    {
        unique_lock<mutex> lk(m); // own the mutex
        cv.wait(lk, []{return processed;}); // wait for the worker's cv.notify_one
        cout << "back to main(), vector is processed\n";
    }

    for (auto i : v)
    {
        cout << i << " ";
    }

    t.join();
}

Output:

main() signals ready for processing
back to main(), vector is processed
0 1 2 3 4 5 6 7 8 9




Reference:

  • Stochastic Calculus: An Introduction with Applications, Gregory F. Lawler
  • FINM 32000, 33000, 34500, 36700, 36702, 322 Lecture Notes, the University of Chicago