Title: Statistical simulation lecture notes
Description: Chapters cover simulation from a known distribution; simulation techniques and methods (rejection sampling, importance sampling, etc.); random vectors and copulas; generating sample paths; Bayes; and the bootstrap. The notes are from Rutgers University in the USA.


Simulation Lecture Notes
Wanlin Juan
September 2019

1 Simulation From a Known Distribution

Assumptions

We know how to simulate from U(0,1)
...
, xn ∼ U(0, 1), i.i.d.
...
rk ,
...

M
yi
E
...
M=50000
...
Simulate from an exponential distribution x ∼ exp(λ)
Density function: f(x) = λe^{−λx} 1{x > 0}

Algorithm
1. Simulate u ∼ U(0, 1)
2. Compute x = −(1/λ) log(u)
Claim: x ∼ f(x)

Reason
∀t > 0,
P(x ≤ t) = P(−(1/λ) log(u) ≤ t)
         = P(u ≥ e^{−λt})
         = 1 − P(u < e^{−λt})
         = 1 − e^{−λt}
         = ∫₀ᵗ λe^{−λs} ds
         = ∫₀ᵗ f_exp(s) ds
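As a quick illustration (not part of the original notes), a minimal NumPy sketch of this inverse-transform step; the rate λ = 2 and the sample size are arbitrary choices:

import numpy as np

rng = np.random.default_rng(0)

def sample_exponential(lam, size):
    # Inverse transform: x = -(1/lam) * log(u), with u ~ U(0, 1)
    u = rng.uniform(size=size)
    return -np.log(u) / lam

x = sample_exponential(lam=2.0, size=50_000)
print(x.mean())  # should be close to 1/lam = 0.5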


B
...
Simulate u ∼ U (0, 1)
2
...
g
...
Then x = −(1/λ) log(1 − t) = F⁻¹(t)

C
...
2
...
Simulate u1 , u2 ,
...
Compute x = u1 +
...
Yn ∼ g(•),

√n (Ȳ − EY) / √Var(Y) → N(0, 1), n → ∞

E(u) = 1/2
Var(u) = E(u²) − (E(u))² = ∫₀¹ u² du − (1/2)² = 1/12

Q: Is n = 12 large enough?
A: It is enough for u ∼ U(0, 1)
...
1. Simulate u1 ∼ U(0, 1) and calculate θ = 2πu1 ∼ U(0, 2π)
2. ... Calculate x = r cos θ, y = r sin θ

Claim: x ∼ N(0, 1), y ∼ N(0, 1), and x is independent of y, i.e.
(x, y)ᵀ ∼ N( (0, 0)ᵀ, I₂ )

Joint density function:
f(x, y) = φ(x)φ(y) = (1/√(2π)) e^{−x²/2} · (1/√(2π)) e^{−y²/2} = (1/(2π)) e^{−(x²+y²)/2}

Take the Box-Muller transformation,
r = √(x² + y²),  cos θ = x / √(x² + y²)
Cartesian coordinates (x, y) ↔ polar coordinates (r, θ)
f(r, θ) = f(x, y)|J| (J is the Jacobian matrix)
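A minimal sketch of the Box-Muller step in NumPy (the radius step is elided in the preview above, so the standard form r = √(−2 log u2) is assumed here):

import numpy as np

rng = np.random.default_rng(0)

def box_muller(size):
    # Map two U(0,1) draws to two independent N(0,1) draws
    u1 = rng.uniform(size=size)
    u2 = rng.uniform(size=size)
    theta = 2 * np.pi * u1            # theta ~ U(0, 2*pi)
    r = np.sqrt(-2 * np.log(u2))      # assumed radius step: r = sqrt(-2 log u2)
    return r * np.cos(theta), r * np.sin(theta)

x, y = box_muller(100_000)
print(x.std(), y.std(), np.corrcoef(x, y)[0, 1])  # ~1, ~1, ~0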

D1
...
Call the algorithm for normal distributions to get x1,
...
Calculate y = x1² +
...
Simulate x1,
...
Calculate y = x1² +
...
Simulate from an F-distribution
F-distribution: X = (Y1/d1) / (Y2/d2), where Y1 ∼ χ²_{d1}, Y2 ∼ χ²_{d2}, and Y1, Y2 are independent
...
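A small sketch (mine, not the notes') of simulating an F-distribution from chi-squares built as sums of squared normals; the degrees of freedom are illustrative:

import numpy as np

rng = np.random.default_rng(0)

def sample_F(d1, d2, size):
    # X = (Y1/d1)/(Y2/d2), Y1 ~ chi2_d1, Y2 ~ chi2_d2 independent,
    # each chi-square built as a sum of squared N(0,1) draws
    y1 = (rng.standard_normal((size, d1)) ** 2).sum(axis=1)
    y2 = (rng.standard_normal((size, d2)) ** 2).sum(axis=1)
    return (y1 / d1) / (y2 / d2)

x = sample_F(5, 10, 100_000)
print(x.mean())  # E[X] = d2/(d2 - 2) = 1.25 for d2 = 10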
Simulate from a multivariate normal distribution
Bivariate case:
x = (x1, x2)ᵀ ∼ N( (µ1, µ2)ᵀ, [σ1², ρσ1σ2; ρσ1σ2, σ2²] )
More generally, x = (x1, x2, ..., xp)ᵀ ∼ N( (µ1, µ2, ..., µp)ᵀ, Σ_{p×p} )

Algorithm:
1. Simulate z1, z2 ∼ N(0, 1), i.i.d.
2. Calculate x1 = µ1 + σ1 z1,  x2 = µ2 + ρσ2 z1 + √(1 − ρ²) σ2 z2

Claim: x = (x1, x2)ᵀ ∼ N( (µ1, µ2)ᵀ, [σ1², ρσ1σ2; ρσ1σ2, σ2²] )
Why: linear combinations of normals are still normal.
Thus, we only need to check
E(x1) = µ1 + σ1 E(z1) = µ1
E(x2) = µ2 + ρσ2 E(z1) + √(1 − ρ²) σ2 E(z2) = µ2
Var(x1) = σ1² Var(z1) = σ1²
Var(x2) = ρ²σ2² Var(z1) + (1 − ρ²)σ2² Var(z2) = σ2²
Cov(x1, x2) = Cov(σ1 z1, ρσ2 z1 + √(1 − ρ²) σ2 z2)
            = Cov(σ1 z1, ρσ2 z1) + Cov(σ1 z1, √(1 − ρ²) σ2 z2)
            = ρσ1σ2 Var(z1) + √(1 − ρ²) σ1σ2 Cov(z1, z2)
            = ρσ1σ2

Check AAᵀ = Σ:
A = [σ1, 0; ρσ2, √(1 − ρ²) σ2]
AAᵀ = [σ1², ρσ1σ2; ρσ1σ2, σ2²] = Σ

More generally, suppose
x = (x1, x2, ..., xp)ᵀ ∼ N( (µ1, ..., µp)ᵀ, Σ_{p×p} )
Then, if we can find A such that AAᵀ = Σ_{p×p}, we can set x = (µ1, ..., µp)ᵀ + A(z1, z2, ..., zp)ᵀ.

Scalar version:
z ∼ N(0, 1)
x = c + az ∼ N(c, a²)

Call an algorithm to solve for AAᵀ = Σ_{p×p} (Cholesky decomposition).
Algorithm:
1. Simulate z1, ..., zp ∼ N(0, 1), i.i.d.
2. Set x = µ + A(z1, ..., zp)ᵀ
Claim: x ∼ N(µ, Σ)

This leads to:
for i = 1, 2,
...

For j = 1, 2,
...
,p
Set vi = σij ;
For k = 1, 2,
...
Simulate from the double exponential distribution (Laplace distribution)
x ∼ f_DE(x)
Density: f_DE(x) = (1/(2θ)) e^{−|x−µ|/θ}, for x ∈ (−∞, +∞)
CDF: F_DE(x) = ∫_{−∞}^x f_DE(t) dt = 1/2 + (1/2) sign(x − µ) · {1 − e^{−|x−µ|/θ}}
F_DE⁻¹(t) = µ − θ · sign(t − 1/2) · log{1 − 2|t − 1/2|}, for t ∈ (0, 1)

Algorithm 1: use the algorithm in Part B
...
1. Simulate u1, u2 ∼ U(0, 1)
2. ...
If u2 < 0.5, ...
Return x
Claim: x ∼ f_DE(x; µ, θ)
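A short sketch of the inverse-CDF sampler for the Laplace distribution, using the F⁻¹ formula above (the location and scale values are just examples):

import numpy as np

rng = np.random.default_rng(0)

def sample_laplace(mu, theta, size):
    # x = mu - theta * sign(u - 1/2) * log(1 - 2|u - 1/2|), u ~ U(0,1)
    u = rng.uniform(size=size)
    return mu - theta * np.sign(u - 0.5) * np.log(1 - 2 * np.abs(u - 0.5))

x = sample_laplace(mu=1.0, theta=2.0, size=100_000)
print(x.mean(), x.var())   # ~mu and ~2*theta^2 = 8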

G1
...
Simulate u ∼ U (0, 1)
2
...
Simulate from the Binomial distribution
Algorithm
1. Simulate u1, ..., un ∼ U(0, 1), i.i.d.
2. Set x = 1{u1 ≤ p} + ... + 1{un ≤ p}
Claim: x ∼ Binomial(n, p)
Why: x = Σ_{i=1}^n {independent Bernoulli(p)}
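A one-function sketch of the Bernoulli-sum construction (parameters are arbitrary):

import numpy as np

rng = np.random.default_rng(0)

def sample_binomial(n, p, size):
    # x = sum of n independent Bernoulli(p) indicators 1{u_i <= p}
    u = rng.uniform(size=(size, n))
    return (u <= p).sum(axis=1)

x = sample_binomial(n=20, p=0.3, size=100_000)
print(x.mean(), x.var())   # ~ n*p = 6 and ~ n*p*(1-p) = 4.2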

G3
...
Multinomial(1, (p1, p2, ...))

Algorithm
1. ...
Set (x1, x2, x3) = (1{0 < u ≤ p1}, ...)
Claim: x ∼ Multinomial(1, (p1, p2, ...))

Simulate from the Multinomial distribution
Algorithm
1. ..., un ∼ U(0, 1)
2. ...

H
...
P(x = k) = e^{−λ} λᵏ / k!

Connections between Poisson(λ) and exp(λ):
Poisson x is the number of events in [0, 1], when the times between consecutive events are i.i.d. exponential(λ)
..., Tx, Tx+1 ∼ exp(λ) ⇒ x ∼ Poisson(λ)

Algorithm
1. ...
While (s ≤ 1)
  Simulate u ∼ U(0, 1)
  ...
Return x = k − 1

Alternatively, recall u ∼ U(0, 1) ⇒ X = F⁻¹(u) ∼ F(•)

Q: In the discrete case, how do we define F⁻¹(u)?
A: x = F⁻¹(u) = the largest integer such that F(x) ≤ u

Algorithm (numerical)
1. Generate u ∼ U(0, 1)
2. ...
Return x
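A sketch of the exponential-interarrival algorithm for Poisson(λ) described above (λ = 3 is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(0)

def sample_poisson(lam):
    # Count exp(lam) inter-arrival times until their running sum exceeds 1
    s, k = 0.0, 0
    while s <= 1.0:
        s += -np.log(rng.uniform()) / lam   # one exp(lam) inter-arrival time
        k += 1
    return k - 1                            # number of arrivals in [0, 1]

x = [sample_poisson(3.0) for _ in range(20_000)]
print(np.mean(x))                           # ~ lambda = 3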

2 Simulation Techniques and Methods

A. Rejection Sampling
...
Assumptions
1. ...
2. ...

Remark: C has to be more than 1, since ∫f(x)dx = ∫g(x)dx = 1
...

Algorithm
1. ... If u ≤ f(z)/(C g(z)), set x = z (accept z), and then go to 3;
2. ...
3. ...
Claim: x ∼ f(x)

Reason (intuitively)
...

Reason (in math)
Define
Aj = {Accept z as x in the j-th loop}
   = {Reject loop 1} ∩ {Reject loop 2} ∩ ... ∩ {Reject loop j − 1} ∩ {Accept in loop j}
We can easily verify that
Σ_{j=1}^{+∞} P(Aj) = Σ_{j=1}^{+∞} (1 − 1/C)^{j−1} (1/C) = 1
Algorithm converges!
Rejection rate at each loop is 1 − 1/C, if ...
P(x ≤ t) = ... = ...
...
So we can only use the Laplace distribution to generate the Normal distribution, but cannot use the Normal distribution to get the Laplace
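A sketch of rejection sampling in the direction the remark allows: Laplace(0,1) as the helper g for a standard normal target f. The constant C = sup f/g = √(2/π)·e^{1/2}, attained at |x| = 1, is my calculation (not spelled out in the preview):

import numpy as np

rng = np.random.default_rng(0)

def normal_pdf(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # target f

def laplace_pdf(x):
    return 0.5 * np.exp(-np.abs(x))                 # helper g = Laplace(0, 1)

C = np.sqrt(2 / np.pi) * np.exp(0.5)                # C = sup_x f(x)/g(x)

def rejection_sample():
    while True:
        z = rng.laplace(0.0, 1.0)                   # simulate z ~ g
        u = rng.uniform()
        if u <= normal_pdf(z) / (C * laplace_pdf(z)):
            return z                                # accept z as x

x = np.array([rejection_sample() for _ in range(20_000)])
print(x.mean(), x.std())    # ~0, ~1; acceptance rate per loop is 1/C ~ 0.76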
...
Importance Sampling Method
Goal
To simulate x ∼ f(x), which is difficult
...
We know f(x) (up to a constant)
...
There exists a candidate distribution (helper) g(z) close to f(z) and easy to simulate from
...
The assumptions are similar to those of rejection sampling, but weaker
...

Algorithm
1. Simulate z1, ..., zN ∼ g(z)
2. ... Resample x from z1, ..., zN according to the probability
P(x = zj | z1, ..., zN) = wj / Σ_{k=1}^N wk, where wj = f(zj)/g(zj)
3. ...
P(X ≤ t | z1, ..., zN) = Σ_{k=1}^N 1{zk ≤ t} · P(X = zk | z1, ..., zN) ≈ ∫_{−∞}^t f(s) ds
...

Remark
For computing efficiency, try to find g(z) close to f(z), so that the wj's are not too close to 0 or 1
...

E[a(x)] = ∫ a(x) f(x) dx
        = ∫ [a(x) f(x) / g(x)] g(x) dx
        = E[ a(z) f(z) / g(z) ],  z ∼ g
        ≈ (1/N) Σ_{i=1}^N a(zi) f(zi) / g(zi)
        = (1/N) Σ_{i=1}^N a(zi) wi
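A minimal sketch of this weighted-average estimator; the target f = N(0,1), helper g = Laplace(0,1) and a(x) = x² are illustrative choices of mine:

import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # target density

def g(x):
    return 0.5 * np.exp(-np.abs(x))                 # helper density

a = lambda x: x**2

N = 100_000
z = rng.laplace(0.0, 1.0, size=N)    # z_1, ..., z_N ~ g
w = f(z) / g(z)                      # importance weights w_i
print(np.mean(a(z) * w))             # (1/N) * sum a(z_i) w_i  ~  E[X^2] = 1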

C. Gibbs Sampler
...
Assumption: We know how to simulate from f(x|y, z), f(y|x, z) and f(z|x, y)
...

Algorithm
1. ...
For k = 0, 1, ...
Stop after a large number of iterations N,
set (x, y, z) = (x^(N), y^(N), z^(N)) and return (x, y, z)
Claim: When N is large, (x, y, z) ∼ f(x, y, z)
Remarks
1
...

2
...

Markov Chain:
(x^(0), y^(0))ᵀ → (x^(1), y^(1))ᵀ → (x^(2), y^(2))ᵀ → ... → (x^(N), y^(N))ᵀ
(... Junker from CMU)
Need to simulate
(x, y)ᵀ ∼ f(x, y) = (2x + 3y + 2)/28, for 0 < x < 2, 0 < y < 2
...
We can use Rejection Sampling and Importance Sampling to simulate from f(x, y), using the uniform distribution as the helper
...

f(x|y) = f(x, y)/f(y) = f(x, y) / ∫f(x, y)dx = [(2x + 3y + 2)/28] / [(3y + 4)/14] = (2x + 3y + 2)/(6y + 8)

f(y|x) = f(x, y)/f(x) = f(x, y) / ∫f(x, y)dy = [(2x + 3y + 2)/28] / [(2x + 5)/14] = (2x + 3y + 2)/(4x + 10)

Suppose we know how to simulate from f(x|y) for a fixed y and f(y|x) for a fixed x
1. ... Set k = 0 and starting point (x^(0), y^(0)) = (1, 1)
2. ...
x^(k+1) ∼ f(x | y^(k))
y^(k+1) ∼ f(y | x^(k+1))
3. ...
F(y|x) = ∫₀ʸ f(t|x) dt = ∫₀ʸ [(2x + 3t + 2)/(4x + 10)] dt = [1.5y² + 2(x + 1)y]/(4x + 10)
Solve u = F(y|x) for y
...
1. Simulate u ∼ U(0, 1)
2. ...

F⁻¹(v|y) = √(v(6y + 8) + (1.5y + 1)²) − (1.5y + 1)

1. ...
Given y, set x = √(v(6y + 8) + (1.5y + 1)²) − (1.5y + 1)
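A sketch of the two-variable Gibbs sampler for this example. The closed-form inverses below come from solving u = F(y|x) and v = F(x|y) as quadratics in y and x (my algebra, consistent with the formulas above); the chain length and burn-in are arbitrary:

import numpy as np

rng = np.random.default_rng(0)

def sample_x_given_y(y):
    v = rng.uniform()
    c = 1.5 * y + 1.0
    return np.sqrt(v * (6 * y + 8) + c**2) - c            # x = F^{-1}(v | y)

def sample_y_given_x(x):
    u = rng.uniform()
    b = 2.0 * (x + 1.0)
    return (-b + np.sqrt(b**2 + 6 * u * (4 * x + 10))) / 3.0   # solves u = F(y | x)

x, y = 1.0, 1.0                       # starting point (x^(0), y^(0)) = (1, 1)
draws = []
for k in range(20_000):
    x = sample_x_given_y(y)
    y = sample_y_given_x(x)
    draws.append((x, y))
draws = np.array(draws[5_000:])       # discard burn-in
print(draws.mean(axis=0))             # exact means are E[X] = 23/21, E[Y] = 8/7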

15

D
...
−→ xk −→
...
, xk ,
...

Theorem 1
...
−→ xk −→
...
Key Condition: There exists a density function f(x) such that (*) holds
...
The chain is also π-irreducible (the set of possible values for xk does not change for all k = 0, 1, 2, ...)
The chain is aperiodic (not going to the same value periodically)
...

Recall Gibbs,
x^(k+1) ∼ f(x | y^(k), z^(k))
y^(k+1) ∼ f(y | x^(k+1), z^(k))
z^(k+1) ∼ f(z | x^(k+1), y^(k+1))
⇒ p(x^(k+1), y^(k+1), z^(k+1) | x^(k), y^(k), z^(k))
 = f(x^(k+1) | y^(k), z^(k)) f(y^(k+1) | x^(k+1), z^(k)) f(z^(k+1) | x^(k+1), y^(k+1))

∫∫∫ p(x^(k+1), y^(k+1), z^(k+1) | x^(k), y^(k), z^(k)) f(x^(k), y^(k), z^(k)) dx^(k) dy^(k) dz^(k)
 = ∫∫∫ f(x^(k+1) | y^(k), z^(k)) f(y^(k+1) | x^(k+1), z^(k)) f(z^(k+1) | x^(k+1), y^(k+1)) f(x^(k), y^(k), z^(k)) dx^(k) dy^(k) dz^(k)
 = ∫∫ f(x^(k+1) | y^(k), z^(k)) f(y^(k+1) | x^(k+1), z^(k)) [∫ f(x^(k), y^(k), z^(k)) dx^(k)] dy^(k) dz^(k) · f(z^(k+1) | x^(k+1), y^(k+1))
 = ∫∫ f(x^(k+1) | y^(k), z^(k)) f(y^(k+1) | x^(k+1), z^(k)) f(y^(k), z^(k)) dy^(k) dz^(k) · f(z^(k+1) | x^(k+1), y^(k+1))
 = ∫∫ f(x^(k+1), y^(k), z^(k)) f(y^(k+1) | x^(k+1), z^(k)) dy^(k) dz^(k) · f(z^(k+1) | x^(k+1), y^(k+1))
 = ∫ f(x^(k+1), z^(k)) f(y^(k+1) | x^(k+1), z^(k)) dz^(k) · f(z^(k+1) | x^(k+1), y^(k+1))
 = ∫ f(x^(k+1), y^(k+1), z^(k)) dz^(k) · f(z^(k+1) | x^(k+1), y^(k+1))
 = f(x^(k+1), y^(k+1)) f(z^(k+1) | x^(k+1), y^(k+1))
 = f(x^(k+1), y^(k+1), z^(k+1))
Verify (*)!

E. Metropolis-Hastings (M-H)
...
Goal: to simulate from
...

M-H Algorithm
1. ...
2. ..., choose a new xk+1 as follows:
a. ...
Accept xk+1 = z with the following probability:
α(xk, z) = min{ 1, [f(z)/f(xk)] / [g(z|xk)/g(xk|z)] }
Otherwise set xk+1 = xk
3. ...
Stop!
Claim: When N is large, x* ∼ f(x)
Reason: verify (*)

Special cases (Algorithms)
1. ...
Independent Chain M-H Algorithm: g(z|x) = g(z)
⇒ In M-H, change α(xk, z) to min{ 1, [f(z)g(xk)] / [f(xk)g(z)] }
(MCMC always depends on the previous one, even if x and z are independent
...
Initial value: x0 = 0
2. ...
Simulate z ∼ g(z) = f_t3(z)
...
Simulate u ∼ Bernoulli(α(xk, z)), where
α(xk, z) = min{ 1, [f_t3(z)(1 − sin(20z))1{|z|≤3} f_t3(xk)] / [f_t3(xk)(1 − sin(20xk)) f_t3(z)] }
         = min{ 1, [(1 − sin(20z))1{|z|≤3}] / [1 − sin(20xk)] }
Set
xk+1 = z, if u = 1
xk+1 = xk, if u = 0
3. ...
Pick {x5000, x5050, ...
...
Claim: {x5000, x5050, ...
...
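A sketch of this independent-chain M-H run; the t₃ factors cancel in α exactly as shown above, and the chain length and thinning below mirror the x₅₀₀₀, x₅₀₅₀, ... idea (details elided in the preview are my assumptions):

import numpy as np

rng = np.random.default_rng(0)

def target_factor(x):
    # f(x) is proportional to f_t3(x) * (1 - sin(20x)) * 1{|x| <= 3};
    # only this non-t3 factor matters in alpha
    return (1.0 - np.sin(20.0 * x)) * (abs(x) <= 3.0)

x = 0.0                                    # initial value x_0 = 0
chain = []
for k in range(10_000):
    z = rng.standard_t(df=3)               # proposal z ~ g = t_3
    alpha = min(1.0, target_factor(z) / target_factor(x))
    if rng.uniform() < alpha:              # u ~ Bernoulli(alpha)
        x = z
    chain.append(x)

draws = chain[5_000::50]                   # e.g. keep x_5000, x_5050, ...
print(np.mean(draws))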
Pearson Correlation Coefficient
(Continuous x and y
...)

Population Version
ρ(X, Y) = Cov(X, Y) / √(Var(X)Var(Y)) = (EXY − EX·EY) / √([EX² − (EX)²][EY² − (EY)²])

Sample Version
ρ̂ = [(1/n) Σ_{i=1}^n Xi Yi − X̄Ȳ] / (s1 s2),
where s1 = √[(1/(n−1)) Σ_{i=1}^n (Xi − X̄)²], s2 = √[(1/(n−1)) Σ_{i=1}^n (Yi − Ȳ)²]

1
...

2
...

3
...
Spearman's Correlation
(Good for ordinal data and skewed data
...)
Then
ρs(X, Y) = ρ(F1(X), F2(Y)) = [E[F1(X)F2(Y)] − E[F1(X)]E[F2(Y)]] / √((1/12) × (1/12))
         = 12 E[F1(X)F2(Y)] − 3 = 12 ∫∫ F1(x)F2(y) f(x, y) dx dy − 3

Sample Version: Sort the samples within the X's and the Y's separately
..., Xn ⇒ r1^(X), r2^(X), ...
..., Yn ⇒ r1^(Y), r2^(Y), ...
ρ̂s → ρs when n → ∞ (consistent estimator)
2. ...
X and Y are not linearly related, but F1(X) and F2(Y) are linearly related
...
Kendall's τ
(Also related to ranks
...)

Population Version
Suppose (X, Y) ∼ F(X, Y), (X′, Y′) ∼ F(X, Y)
τk = P{(X − X′)(Y − Y′) > 0} − P{(X − X′)(Y − Y′) < 0}
Concordance: (X − X′)(Y − Y′) > 0
Discordance: (X − X′)(Y − Y′) < 0
X and Y positively correlated ⇒ 0 < τk ≤ 1
X and Y negatively correlated ⇒ −1 ≤ τk < 0
X and Y independent ⇒ τk = 0

Sample Version
Suppose the dataset is (x1, y1), ...
For each pair, we say (xi, yi), (xj, yj)
...
τ̂k = [# of positive pairs − # of negative pairs] / C(n, 2)
    = Σ_{i=1}^n Σ_{j=1}^n [1{(xi − xj)(yi − yj) > 0} − 1{(xi − xj)(yi − yj) < 0}] / [2 C(n, 2)]

Remark
1. ...
When n is large, computation can be a little slow
...
Further/Alternative expression:
τk = 4 ∫∫ F(x, y) f(x, y) dx dy − 1
4. ...
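A direct O(n²) implementation of the sample version above (illustrative data):

import numpy as np

def kendall_tau(x, y):
    # (# concordant - # discordant) / C(n, 2), via the double sum above
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    dx = x[:, None] - x[None, :]
    dy = y[:, None] - y[None, :]
    s = np.sign(dx * dy).sum()        # counts every pair twice, with sign
    return s / (n * (n - 1))          # = (#conc - #disc) / C(n, 2)

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = x + rng.normal(size=500)
print(kendall_tau(x, y))              # positive dependence -> tau > 0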


D
...

Independence ⇒ ρ = 0, ρs = 0, τk = 0
Let’s start with a random vector of length = 2 (general case length > 2)
...
It is a way to describe the dependence
between X and Y
...
F1 (x) is the marginal distribution of X, F2 (y) is the
marginal distribution of Y
...
Vice versa, since F1 (x) = F (x, y)dy, F2 (y) = F (x, y)dx
...

C(t, s) = ...

X, Y are independent ⇒ C(t, s) = ts ⇒ τk = 4 ∫∫ C(t, s) dC(t, s) − 1 = 4 ∫₀¹∫₀¹ ts dt ds − 1 = 0

More generally, for random vector (x1 , x2 ,
...
, xn )
...
, 0 ≤ tn ≤ 1, C(t1 ,
...
, Fn−1 (tn )), where
F (x1 ,
...

Similar as before,
F (x1 ,
...
, Fn (xn ))
Again, Joint distribution F (x1 ,
...
, tn )
...


E
...
xn ) such that xi ∼ Fi (x) for any distribution Fi (•), ∀i = 1, 2,
...
For Multivariate normal distribution: see (1
...

2
...
D)
...
Marginal distributions Fi (x), i = 1, 2,
...

2
...


Algorithm
Call the Cholesky Decomposition algorithm to get A, where AAᵀ = Σ
1. Simulate z1, ..., zn ∼ N(0, 1)
2. Set z* = (z1*, ..., zn*)ᵀ = A(z1, ..., zn)ᵀ
3. For i = 1, ..., n, set ui = Φ(zi*/σii) ∼ U(0, 1), xi = Fi⁻¹(ui) ∼ Fi(x)
...
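A small sketch of the Gaussian-copula algorithm above with two illustrative exponential marginals (the correlation 0.7 and the rates are my choices):

import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(0)

def Phi(z):
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))        # standard normal CDF

Sigma = np.array([[1.0, 0.7],
                  [0.7, 1.0]])
A = np.linalg.cholesky(Sigma)                      # A A^T = Sigma
inv_cdfs = [lambda u: -np.log(1 - u) / 1.0,        # F_1^{-1}: exponential(rate 1)
            lambda u: -np.log(1 - u) / 2.0]        # F_2^{-1}: exponential(rate 2)

def gaussian_copula_draw():
    z_star = A @ rng.standard_normal(2)            # z* ~ N(0, Sigma)
    u = [Phi(z_star[i] / np.sqrt(Sigma[i, i])) for i in range(2)]
    return [inv_cdfs[i](u[i]) for i in range(2)]   # x_i = F_i^{-1}(u_i) ~ F_i

x = np.array([gaussian_copula_draw() for _ in range(20_000)])
print(x.mean(axis=0))                              # marginal means ~ 1 and ~ 0.5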

[observed data matrix with entries x11, ..., x1p; ...; xn1, ..., xnp]
...
, x̃10
...

Covariance Matrix: Σ = Cov(x*), where xi* = Φ⁻¹(F(xi))
Then call the Gaussian Copula algorithm to generate x̃1,
...

F
...
, xn ) such that marginal distribution xi ∼
Fi (x)
...


Algorithm
Call Cholesky Decomposition algorithm to get A, where AAT = Σ
...
Simulate z1 ,
...

2. ... Set (z1*, ..., zn*)ᵀ = A(z1, ..., zn)ᵀ
...
Simulate s ∼ χ²_r
... z*/σ ...
For i = 1, 2, ..., n
...

Note: You can use a step function (the empirical CDF) to estimate Fi, and then calculate Fi⁻¹(ui)
...
Archimedean Copulas
For a given strictly decreasing function h(t) : [0, 1] → [0, +∞), we can get a copula
C(t, s) = h⁻¹(h(t) + h(s))
Examples
1. ...
Clayton Copula: h_C(t) = t^{−θ} − 1
3. ...
Define Brownian Motion
One-dimensional Standard Brownian Motion
{W(t) | t ∈ [0, T]} has the following properties:
1. ...
2. W(t) is continuous in t
3. ...
W(t) − W(s) ∼ N(0, t − s)
Notation: W(t) ∼ BM(0, 1); W(t) = W(t) − W(0) ∼ N(0, t)

Brownian Motion BM(µ, σ²)
drift: µ
diffusion coefficient: σ²
X(t) = µt + σW(t), X(0) = 0
dX(t) = µ dt + σ dW(t)

Brownian Motion BM(µ(t), σ²(t))
X(t) = ∫ µ(t) dt + ∫ σ(t) dW(t)
dX(t) = µ(t) dt + σ(t) dW(t)

B. Simulate W(t1), ..., W(tn) at a fixed time path t0 = 0 < t1 < ... < tn = T
...
In essence, we need to simulate the increments
(W(t1) − W(0), W(t2) − W(t1), ..., W(tn) − W(tn−1))ᵀ ∼ N(0, D), D = diag(t1, t2 − t1, ..., tn − tn−1).
Summing the increments, (W(t1), ..., W(tn))ᵀ ∼ N(0, Σ), where Σ = L D Lᵀ with L the lower-triangular matrix of ones (equivalently, Σij = min(ti, tj)).

...
1. Simulate z = (z1, ..., zn)ᵀ, zi ∼ N(0, 1) i.i.d.
2. Set (W(t1), ..., W(tn))ᵀ = A(z1, ..., zn)ᵀ, where AAᵀ = Σ
3. ...

Claim: When n is large, this gives a good approximation to BM(0, 1)
Algorithm 1-b (Random Walk)
1. ... For i = 1, ..., n, iteratively update W(ti) = W(ti−1) + √(ti − ti−1) · zi, zi ∼ N(0, 1)
2. ...

Simulate from BM(µ, σ²)
X(t) = µt + σW(t), on the grid 0 = t0 < t1 < ... < tn = T
X(ti) − X(ti−1) = µ(ti − ti−1) + σ(W(ti) − W(ti−1))
Algorithm
1. ...
2. For i = 1, ..., n,
X(ti) = X(ti−1) + µ(ti − ti−1) + σ √(ti − ti−1) Zi, Zi ∼ N(0, 1)
3. ...

Claim: When n is large, the path is from BM(µ, σ²)
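A sketch of the random-walk construction for BM(µ, σ²) on a grid; setting µ = 0, σ = 1 recovers BM(0, 1) (grid and parameters are arbitrary):

import numpy as np

rng = np.random.default_rng(0)

def simulate_bm(mu, sigma, t_grid):
    # X(t_i) = X(t_{i-1}) + mu*(t_i - t_{i-1}) + sigma*sqrt(t_i - t_{i-1})*Z_i
    dt = np.diff(t_grid)
    increments = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(len(dt))
    return np.concatenate([[0.0], np.cumsum(increments)])   # X(t_0) = 0

t = np.linspace(0.0, 1.0, 1001)
path = simulate_bm(mu=0.2, sigma=0.5, t_grid=t)
print(path[-1])    # X(1) ~ N(mu, sigma^2) here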
...
Set X(t0 ) = 0
...
For i = 1, 2,
...
Connecting neighboring points to become a path
...


C
...
Basic task: for given W(u) and W(t) at times u and t, 0 ≤ u < t ≤ T, simulate W(s) for u < s < t
...

Recall: if
(X, Y)ᵀ ∼ N( (µX, µY)ᵀ, [Σ11, Σ12; Σ21, Σ22] ), with X being 1×1 and Y being k×1,
then X | Y ∼ N(µ_{X|Y}, σ²_{X|Y}), where
µ_{X|Y} = µX + Σ12 Σ22⁻¹ (Y − µY)
σ²_{X|Y} = Σ11 − Σ12 Σ22⁻¹ Σ21

Here X = W(s) (1×1) and Y = (W(u), W(t))ᵀ (2×1), with
(W(s), W(u), W(t))ᵀ ∼ N( (0, 0, 0)ᵀ, [s, u, s; u, u, u; s, u, t] )
µX = 0, µY = 0, Σ11 = s, Σ12 = (u, s), Σ22 = [u, u; u, t]

Thus,
W(s) | W(u), W(t) ∼ N(µ_{s|u,t}, σ²_{s|u,t})
µ_{s|u,t} = [(t − s)W(u) + (s − u)W(t)] / (t − u)
σ²_{s|u,t} = (s − u)(t − s) / (t − u), ∀ 0 ≤ u < s < t ≤ T

For a given BM(0,1), say W (t0 ) = X0 , W (t1 ) = X1 ,
...
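A one-function sketch of this conditional (Brownian-bridge) fill-in step, directly using the mean and variance derived above:

import numpy as np

rng = np.random.default_rng(0)

def fill_in(W_u, W_t, u, t, s):
    # Simulate W(s) | W(u), W(t) for u < s < t
    mean = ((t - s) * W_u + (s - u) * W_t) / (t - u)
    var = (s - u) * (t - s) / (t - u)
    return mean + np.sqrt(var) * rng.standard_normal()

W0, W1 = 0.0, rng.standard_normal()          # e.g. W(0) = 0 and W(1) ~ N(0, 1)
print(fill_in(W0, W1, u=0.0, t=1.0, s=0.5))  # refine the path at s = 0.5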
Multidimensional Brownian Motion
X(t) = (X1(t), ..., Xd(t))ᵀ, driven by the Brownian motions W1(t), ..., Wd(t),


has the following properties:
1
...
Continuous sample paths
3
...
Set W(0)=0
...
For i = 1,
...
t
...
Set X(t0 ) = 0
...
For i = 1,
...
Set X(t0 ) = 0
...
For i = 1, 2,
...
Geometric Brownian Motion
• A continuous-time stochastic process in which the logarithm of the randomly varying quantity follows a Brownian Motion
...
• A stochastic process S(t) is GBM(µ, σ²) if it follows the differential equation
dS(t)/S(t) = µ dt + σ dW(t)     (∗)

Advantages:
• Mathematically simple and tractable
...
• Percentages of change: [S(t2) − S(t1)]/S(t1), ..., [S(tn) − S(tn−1)]/S(tn−1)
...


Itô's integration solution of (∗):
S(t) = S(0) e^{(µ − σ²/2)t + σW(t)}
E[S(t)] = S(0) e^{µt}
Var[S(t)] = [S(0)]² e^{2µt} (e^{σ²t} − 1)
∀ 0 ≤ r < t ≤ T,  S(t) = S(r) e^{(µ − σ²/2)(t − r) + σ[W(t) − W(r)]}

Algorithm for GBM(µ, σ 2 ) (Random Walk Construction)
For i = 1,
...
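A sketch of the random-walk construction for GBM(µ, σ²) using the exact transition above (S0, µ, σ and the grid are arbitrary):

import numpy as np

rng = np.random.default_rng(0)

def simulate_gbm(S0, mu, sigma, t_grid):
    # S(t_i) = S(t_{i-1}) * exp((mu - sigma^2/2)*dt + sigma*sqrt(dt)*Z_i)
    dt = np.diff(t_grid)
    z = rng.standard_normal(len(dt))
    log_increments = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    return S0 * np.concatenate([[1.0], np.exp(np.cumsum(log_increments))])

t = np.linspace(0.0, 1.0, 253)
S = simulate_gbm(S0=100.0, mu=0.05, sigma=0.2, t_grid=t)
print(S[-1])     # E[S(1)] = S0 * exp(mu) ~ 105.1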
CIR (Interest Rate Model)
By Cox, Ingersoll and Ross (1985)
...
dr(t) = a(b − r(t)) dt + σ √(r(t)) dW(t)
where W(t) ∼ BM(0, 1), a > 0, b > 0
...
If 2ab > σ², then r(t) remains strictly positive for all t (almost surely)
...
• The diffusion term σ√(r(t)) decreases to 0 as r(t) approaches the origin, and this prevents r(t) from taking negative values
...
For i = 1, ..., n, set
r(ti) = r(ti−1) + a(b − r(ti−1))(ti − ti−1) + σ √(r(ti−1)) √(ti − ti−1) Zi, Zi ∼ N(0, 1)
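A sketch of this Euler update for the CIR model; the max(·, 0) inside the square root is a practical safeguard I added, since the discretized r(t) can dip slightly below 0 even when 2ab > σ²:

import numpy as np

rng = np.random.default_rng(0)

def simulate_cir(r0, a, b, sigma, t_grid):
    # r(t_i) = r(t_{i-1}) + a*(b - r)*dt + sigma*sqrt(r)*sqrt(dt)*Z_i
    r = np.empty(len(t_grid))
    r[0] = r0
    for i in range(1, len(t_grid)):
        dt = t_grid[i] - t_grid[i - 1]
        z = rng.standard_normal()
        r[i] = (r[i - 1] + a * (b - r[i - 1]) * dt
                + sigma * np.sqrt(max(r[i - 1], 0.0) * dt) * z)
    return r

t = np.linspace(0.0, 5.0, 1001)
r = simulate_cir(r0=0.03, a=0.5, b=0.04, sigma=0.1, t_grid=t)
print(r.mean())   # drifts toward the long-run mean b = 0.04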

G. Jump diffusion
...
N(t) is a counting process, and Y1, ...
...
In particular, suppose the random arrival times of the jumps are 0 < τ1 < ...
...
We often model (not always):
• τj − τj−1 ∼ exp(λ) ⇔ N(t) ∼ Poisson(λt)
• Yj ∼ lognormal, i.i.d.

Itô's representation of (∗) (omitting the derivation):
S(t) = S(0) e^{(µ − σ²/2)t + σW(t)} ∏_{j=1}^{N(t)} Yj

Algorithm to simulate S(t) for 0 = t0 < t1 < ...
For i = 1, ..., simulate Y_{N(ti−1)+1}, ..., Y_{N(ti)} ∼ lognormal
...

Alternatively, we can simulate the jump exactly
...

Within each (τk , τk+1 ), we can just use the regular way to simulate from GBM(µ, σ 2 ) for the
grids
...
The estimation uncertainty is quantified by "confidence", which is related to "repeated experiments".
...

To account for the uncertainty, a Bayesian approach treats the parameter as random.
...
Bayesian Approach Steps
1
...

2
...

3
...
The posterior
distribution could be very complex, but we can simulate from the distribution
...


B
...

... 1/σ²

Improper prior: approximated by a proper prior
e.g., Normal mean µ: π(µ) ∝ c
Approximation: π(µ) ∼ N(0, 1000000)
But a uniform prior depends on the parameterization
...
Now if my interest is ψ = θ², a uniform prior ψ ∼ U(0, 1) is not equivalent to θ ∼ U(0, 1)
...

• If the data size is large, in most situations the conclusions based on the posterior are very close to those obtained by MLE (and related inference)
...


C
...
F_{θ|y}(t) = P(θ ≤ t | y) = ∫_{−∞}^t p(θ|y) dθ

If we have an explicit formula for p(θ|y_obs), just compute it
...
Otherwise, simulate θ1*, ..., θk* ∼ p(θ|y_obs) using the tricks mentioned before
...
≤ θ_[k]

Interval Estimator
A credible interval/credible set is close to, but not really, a confidence interval
...
∫_{θ∈C} p(θ|y) dθ = P(θ ∈ C | y_obs) ≥ 1 − α
C is called a level 1 − α credible set
..., θk* ∼ p(θ|y_obs),

p-value ≈ [Σ_{i=1}^k 1{θi* > θ0}] / k

Alternatively, many Bayesians try to promote the concept of the Bayes Factor, which can compare more than 2 models
...)
BF12 = P(Y|M1) / P(Y|M2) = [P(M1|y_obs)/P(M1)] / [P(M2|y_obs)/P(M2)]
...
P(Y|M1) is the marginal distribution of Y given M1
...
Not a significance test!
Sometimes the BF does not exist, and it relies heavily on the choice of P(M1), P(M2)
...

• The patties were kept in two different freezers
...
high quality (expensive)
B
...

• Study results: 13 out of 16 prefer the more expensive patties
... 0.6? What is P(p ≥ 0. ...
p̂ = 13/16 = 0.8125, var(p̂) = p̂(1 − p̂)/n
Thus, p̂ ∼ N(p, p̂(1 − p̂)/n), when n is large
95% Confidence interval: p̂ ± 1.96 √(p̂(1 − p̂)/n)
Issue: n = 16 is not large!

Bayesian solution:
Specify prior distribution π(p) ∼ Beta(α, β)
Different choices of (α, β) = (1, 1) or (0.5, 0.5) or (2, 2)
(plots of Uniform/Jeffreys/skeptical priors)
Likelihood: L(p|y_obs) = f(y_obs|p) = C(16, 13) p¹³ (1 − p)³
Posterior:
p(p|y_obs) = π(p)f(y_obs|p) / ∫ π(p)f(y_obs|p) dp ∝ π(p)f(y_obs|p)
∝ p^{α−1}(1 − p)^{β−1} p¹³(1 − p)³
= p^{α+13−1}(1 − p)^{β+3−1}
⇒ p(p|y_obs) = [Γ(α + β + 16) / (Γ(α + 13)Γ(β + 3))] p^{α+13−1}(1 − p)^{β+3−1}
∼ Beta(α + 13, β + 3)


Bayesian Inference:
Point Estimator: θ̂_mean = mean of posterior = (α + 13)/(α + β + 16) = 14/18 ≈ 0.78
Interval Estimator: (Q_{Beta(14,4)}(0.025), Q_{Beta(14,4)}(0.975))
P(p ≥ 0.6 | y_obs = 13) = 0.954

Hypothesis Testing:
M1: p ≥ 0.6
...
Assume π(M1) = 0. ...
BF12 = (0.954/0.046) / [π(M1)/π(M2)]
...
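A sketch of simulating from this Beta(α + 13, β + 3) posterior to reproduce the quantities above (a uniform prior is assumed; the 0.954 figure is the value quoted in the notes):

import numpy as np

rng = np.random.default_rng(0)

alpha, beta = 1.0, 1.0                                  # uniform prior Beta(1, 1)
draws = rng.beta(alpha + 13, beta + 3, size=200_000)    # posterior Beta(14, 4)

print(draws.mean())                          # posterior mean = 14/18 ~ 0.78
print(np.quantile(draws, [0.025, 0.975]))    # 95% credible interval
print((draws >= 0.6).mean())                 # P(p >= 0.6 | y_obs) ~ 0.95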

Linear Regression
y = Xβ + ε, ε ∼ N(0, σ²I)

Bayesian Solution:
Prior: π(β, σ²) = π(β|σ²)π(σ²)
π(β|σ²) ∼ N(b0, σ²B0)
π(σ²) ∼ Gamma(c0, C0)
Likelihood:
y_obs | β, σ² ∼ N(Xβ, σ²I)
f(y_obs|β, σ²) = [1/(2πσ²)^{n/2}] exp{ −(1/(2σ²)) (y_obs − Xβ)ᵀ(y_obs − Xβ) }
Posterior:
p(β, σ²|y_obs) ∝ π(σ²) π(β|σ²) f(y_obs|β, σ²)
∝ (1/σ)ⁿ e^{−(1/(2σ²))(y_obs − Xβ)ᵀ(y_obs − Xβ)} · (1/σ)ᵖ e^{−(1/(2σ²))(β − b0)ᵀB0⁻¹(β − b0)} · (1/σ²)^{c0+1} e^{−C0/σ²}
Bayesian inference is based on this posterior p(β, σ²|y_obs)
..., (β_(M), σ_(M))
...


Example 3: What is the probability that each of these NFL football teams wins their home game this weekend?
Home win-loss record. Two simple simulations:
1. ...
Pool every team together:
p̂_win = (all home wins in total) / (total home games by all teams) = 54%

Statistical model - Hierarchical model to model this:
yi ∼ Binomial(ni, pi), i = 1, 2, ...

[Table 1: home win-loss record (Win/Loss counts) for each team; the listed teams include ATC and NEP]
Thus, pi |y1 ,
...
We can use MLE to get α̂_hist, β̂_hist
...
yAB = 0 or 1 for all (A, B) pairings
...

B
...
Applications of bootstrap
Example 1
Assume x ∼ F, θ = T(F)
...
, xn
...
, xn )
Goal
Estimate ŝe(θ̂) = √(var(θ̂))

Under some conditions,
(θ − θ̂) / ŝe(θ̂) → N(0, 1), n → +∞

Since
P( z_{α/2} ≤ (θ − θ̂)/ŝe(θ̂) ≤ z_{1−α/2} ) = 1 − α,
Thus, the confidence interval is θ̂ ± ŝe(θ̂) z_{α/2}

Step 1
Estimate θ̂

Step 2
Bootstrap:
x1*(1), ..., xn*(1) → θ1*
x1*(2), ..., xn*(2) → θ2*
...
x1*(N), ..., xn*(N) → θN*

Step 3
ŝe(θ̂) = √[ (1/N) Σ_{i=1}^N (θi* − θ̂)² ]
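A generic sketch of Steps 1-3 (resample, re-estimate, take the RMS deviation); the statistic (the median) and the data are placeholders:

import numpy as np

rng = np.random.default_rng(0)

def bootstrap_se(x, stat, N=2000):
    theta_hat = stat(x)                                   # Step 1
    theta_star = np.array([stat(rng.choice(x, size=len(x), replace=True))
                           for _ in range(N)])            # Step 2
    se = np.sqrt(np.mean((theta_star - theta_hat) ** 2))  # Step 3
    return theta_hat, se

x = rng.exponential(scale=2.0, size=100)
theta_hat, se = bootstrap_se(x, np.median)
print(theta_hat, se, theta_hat - 1.96 * se, theta_hat + 1.96 * se)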

Example 2
Goal
Estimate Bias(θ̂) = E_θ̂(θ̂ − θ) = E_{θ*}(θ* − θ̂)

Example 3: Linear Regression
yi = β0 + β1 xi + εi, εi ∼ N(0, σ²)
β0, β1 are unknown
...

Goal
Make inference for β1

Step 1
Estimate β̂1 using the Least Squares method

Step 2
Bootstrap

Step 3
ŝe(β̂1^LS) = √[ (1/N) Σ_{i=1}^N (β̂1^{*LS} − β̂1^LS)² ]

There might be issues since there might be outliers in the original data set, so we might select the outliers when we bootstrap
...
In order to bootstrap from the original data set to generate the new data, we take the residuals as the new data
...
(x1, y1), ..., (xn, yn) → r1, ...