Foundations of Stochastic Modeling - 2010 Lecture Notes | More Info | Notesale | Buy and Sell Study Notes Online | Extra Student Income | University Notes

Search for notes by fellow students, in your own course and all over the country.

Browse our notes for titles which look like what you need, you can preview any of the notes via a sample of the contents. After you're happy these are the notes you're after simply pop them into your shopping cart.

My Basket

Buy These Notes

You have nothing in your shopping cart yet.

Title: Foundations of Stochastic Modeling - 2010 Lecture Notes
Description: Columbia Business School - First Year of the Doctoral Program in Decisions, Risk and Operations • Stochastic processes o Notes from Prof Assaf Zeevi's "Foundations of Stochastic Modelling".

Buy These Notes Preview

Document Preview

Extracts from the notes are below, to see the PDF you'll receive please use the links above

Foundations of Stochastic Modeling

Page 1

FOUNDATIONS OF STOCHASTIC
MODELING
LECTURE 1 – 19th January 2010

Probability Preliminaries


Set operations
o

When we talk of a probability space, we quote a triple (W,  ,  ) , where


W is the space of outcomes



 is a s -field – the space of “admissible sets/events”
...



 is a probability distribution/measure based on which statements are
made about the probability of various events
...
In particular,
for A Î  , (A) =  (w Î W : w Î A)
...
We will assume this always
hold throughout this course and omit this condition
...
Then  ( n An ) = 1
...

o



Definition (limsup and liminf): Let {an } be a real-value sequence
...
Similarly
lim infn an := supm infn ³m an

Graphically (source: Wikipedia):
an
lim supn ¥ an
lim infn ¥ an

n
Note that these limits need not be finite
...
If
lim supn an > a , then an > a infinitely often – in other words, there are
infinitely many elements in the sequence above a
...
and lim inf an < a  an < a i
...

We now attempt to extend this definition to sets
...
Then

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

{A }
n

be a sequence of

Foundations of Stochastic Modeling

Page 3

lim supn An := m n ³m An = limm ¥ n ³m An
lim infn An := m n ³m An = limm ¥ n ³m An

We can interpret these as follows
...
o
...

{lim inf

n

An } = {w Î W : w Î An ev
...

Remark: Take {An , ev
...
Then

{A ,
n

c
ev
...
o
...

Proof

of

(i):

Let

us

Bm =  n ³m An
...
In other words, the sets Bm increase monotonically to
B =  m Bm =

 
m

n ³m

An = lim infn An

Since the events are increasing, a simple form of monotone convergence gives us
that  (Bn )   (B )
...
Thus
 (Bm ) £ infn ³m  (An )

supm  (Bm ) £ supm infn ³m  (An )
 (B ) £ lim infn  (An )

And given how we have defined B:
 (lim infn An ) £ lim infn  (An )

As required
...
Then

{A } be a sequence of
 (A i
...
) = 0
...
1 We offer two proofs – the first is somewhat
mechanical, the second is more intuitive
...



The second proof will require a lemma:
Lemma (Fubini’s): If {Xn } is a sequence of real-valued random variables with

X n ³ 0 , then

1

 ( å n Xn ) = å n  (Xn ) (which could be infinite)
...
o
...
To see

this more clearly, note that the first statement is simply a shorthand for  (w Î W : w Î An i
...
) = 0
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 5

effectively a condition under which we can exchange a summation and an
integration
...
We also have that

å  (A ) = å  ( )
=  (å  )
n

n

An

n

n

An

<¥

å

This means that the random variable

n

A

must be less than or equal to

n

infinity for every outcome in W ; in other words

å

n

A < ¥

a
...

n

This must mean that A = 1 only finitely many times
...

o



Definition (independence): A sequence of random variables

{X , , X }
n

1

are

independent if and only if
 (X1 £ x1 , , X n £ x n ) = i =1  (X i £ x i )
n

For all x 1 , , x n Î 
...

o

Proposition (second Borrel-Cantelli Lemma): Let

independent measurable events such that

c

n

n

be a sequence of

å  (A ) = ¥ , then  (A
n

n

Proof: We showed in a remark above that

{A , i
...

= m n ³m Anc
Let us fix m > 1
...
o
...

Foundations of Stochastic Modeling


Page 6

(

n ³m

)

( )

Anc = n ³m  Anc

= n ³m 1 -  (An )

£ n ³m exp (-(An ))

(

)

= exp -å n ³m (An )
=0

But we have that


( 
m

n ³m

)

Anc £ å m 

(

n ³m

Anc

)

=0
As required
...

Let us understand these two statements intuitively


The first statement tells us that for every single outcome in the sample
space W , the sequence of random variables Xn tends to X
...

o

Example: Consider the sequence X n = n1 U [0,1]
...

Proof: In this case, W = [0,1]
...



Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

o

Page 7

Definition (convergence in probability): Xn p X as n  ¥ if, for all

e>0

(

)

 Xn - X > e  0

as n  ¥

Let us understand this concept as compared to convergence almost surely
...

This, however, allows the possibility that for some outcomes

w,

X n (w)  X (w) infinitely often, provided that the probability of these w
is small and gets smaller
...
For every outcome, we require
X n (w)  X (w)
...
o
...
If An is the event that “Xn is far from X”, convergence in
probability requires
requires that

å
o

n

limn ¥ (An ) = 0 , whereas convergence almost surely

(An i
...
) = 0

– which would, for example, be satisfied if

(An ) < ¥
...
Imagine carrying out an
“experiment” by generating random variables X1 , X 2 , 
...



If X n  X almost surely, then it is certain that in every experiment we
might carry out, the values X1 (w), X 2 (w),  will tend to X (w )
...
The probability of

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 8

this happening decreases with the number of X generated, but it is still
possible that in certain experiments, it will happen
...
Let
{ n n}

us fix e Î (0,1) , and consider the definition

(

)

 X n - 0 > e =  (X n > e)
=  (U n £
1
=
n
0

1
n

)

This proves the Xn do indeed converge to 0 in probability
...
o
...
So Xn cannot be converging to 0 almost surely
...

o



Claim: Xn  X a
...
 Xn p X as n  ¥
...

Proof: Recall that convergence almost surely can be written as

(

)

 X n - X > e i
...
= 0

or in other words

(

}) = 0

{

 lim supn X n - X > e

We have, however, that lim infn An Í lim supn An
...
v
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011



Foundations of Stochastic Modeling

o

Page 9

Remark: It is interesting to note that if Xn p X , it is possible to find a

subsequence

k = 1, 2, 
...
s
...

Proof: Assume we have Xn
...
The
convergence of Xn almost surely follows by the BC Lemma
...
We then define random

variables X1 and X2 as follows:

ìï1
X1 (w) = ïí
ïï0
î

ì
ï0
X 2 (w ) = ï
í
ï
1
ï
î

w Î [0, 12 )
otherwise

w Î [0, 12 )
otherwise

Graphically:

X 1 (w )

X 2 (w )

0

w

1
2

0

1

w

1
2

1

Similarly, we define X3, X4 and X5 to be random variables divided into three
parts:
X 3 (w)

X 4 (w)

1

1

0

1
3

w
2
3

1

0

X 5 (w)

1
3

w
2
3

1

0

1
3

We continue this pattern to form a pyramid of random variables
...
Now, consider that  (X1 = 1) =  (X 2 = 1) = 12
...

However, if we fix any w Î [0,1] , X n (w) = 1 infinitely often (because every set of
random variables will involve at least one that is equal to 1 at that w )
...

o

Definition (convergence in expectation) – also called L1 convergence
...

1

Proof: For the first part, consider that

Xn = Xn - X + X £ Xn - X + X
 Xn - X £ Xn - X
Taking expectations yields the result
...
Taking limits yields the
answer required
...
Then for

all a > 0
 (X > a ) £

(X )
a

Proof:

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 11

(X ) =  (X , X > a ) +  (X , X £ a )

³  (X , X > a )
³ a  (X > a )

As required
...

1

Proof: Fix an e > 0 and use Markov’s Inequality

(

)

 Xn - X > e £

(

 Xn - X

e

)0

As required
...
Clearly, X n  0 a
...
(Note
{ n}

that this does not hold true for w = 0
...

However,

it

is

clear

that

 (Xn ) = n  (U £ n1 ) = 1
...
(This illustrates an interesting example where
taking limits inside and outside expectations gives a different result
...
)
o

Example: Let Xn be a sequence of IID random variables with exponential

distribution with parameter 1
...
s
...
Consider

 (X n > (1 + e)log n ) = e -(1+e ) log n
1
= 1+e
n
This means that

å  (X
n

n

> (1 + e)log n ) < ¥

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 12

So by BC-1,

 (Xn > (1 + e)log n, i
...
) = 0
As such,

Xn £ (1 + e)log n ev
...
o
...
s
...



Note: This only applies to the lim sup – to say that the actual sequence tends to

1 is nonsensical, because most of the Xn will indeed be very small!

LECTURE 2 – 26th January 2010



Interchange arguments
o

Example: In our last lecture, we considered the sequence of random variables

X n = n  U £ 1
...
We showed,
{ n}
however, that (X n ) = 1 for all n
...
It looks like certain conditions need to be
satisfied
...
(Clearly, in the example
above, the n that sits outside the indicator prevents us from bounding Xn)
...
Because X n  X a
...
, (Bn )  0
...



Remark: It is clear that this theorem still holds if Xn only converges in

probability to X rather than almost surely, because all that is required is for the
probability of rogue events to fall to 0
...

o

Lemma (Fatou): If {Xn } is a sequence of non-negative random variables, then

 (lim infn Xn ) £ lim infn (Xn )
(Note: there is no converse for lim sup)
...
Note that:


Ym  Y a
...



infn ³m (Xn ) ³ (infn ³m Xn ) = (Ym )
...

The

(w)(dw)
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

LHS

then

ò inf

n

has

value

X n (w)(dw)
...
So we can apply
bounded convergence:

lim inf Xn ³  (limn ¥ min[Ym , k ])
=  (min[Y , k ])

As k  ¥ , the last line tends to (Y ) = (lim infn Xn )
...

Proof: Since Xn £ Y , we have that Y + X n ³ 0 and Y - X n ³ 0
...

o

Theorem (MON – Monotone Convergence): Let



{X }
n

be a sequence of

random variables such that 0 £ X1 £ X 2 £  almost surely, then


X n  X (possibly ¥ ) almost surely
...

Proof: Since Xn > 0, we can apply Fatou’s Lemma, to get

lim infn (Xn ) ³ (lim infn X ) = (X )
However, the fact that Xn < X also gives (X n ) £ (X )  lim infn (Xn ) £ (X )
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011



Foundations of Stochastic Modeling

o

Page 15

{X }

Proposition (Frobini I): Let

n

be a sequence of random variables with

Xn ³ 0 for all n, then

 ( å n Xn ) = å n (Xn )
Proof: Define

Sn = å m =1 Xn  S = å m =1 X m
¥

n

(possibly ¥)

(The last step follows because the Xn are non-negative, which means the sequence

{S }
n

is non-decreasing almost surely – there is no outcome in the probability

space in which adding an extra term to the sum decreases it)
...
However,
 (Sn ) = 

(å

n
m =1

Xn

)

Finite n

n
= å m =1 (X n ) 

å

¥
m =1

(X n )

And

(S ) = 

(å

¥
m =1

Xm

)

Combining these two results yields

å

¥
m =1

(Xn ) = 

(å

¥
m =1

Xm

)

As required
...
Then

 ( å n Xn ) = å n (Xn )
Proof: As above, set S n =
Sn =

å

n
m =1

å

n
m =1

X m
...
s
...
If the sum
was equal to infinity of a set of non-zero measure, the expectation would blow
up
...
)
Now, note that

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 16

Sn - å m =1 Xm =

å
= å
£å

¥

Xm - å m =1 Xm
¥

n
m =1
¥

m =n +1

¥

m =n +1

Xn
Xm

0

a
...
as n  ¥

(The last line follows from the fact the sum is finite almost surely
...

The above implies that Sn 

å

¥

m =1

Xm

almost surely
...
First note that
Finite n

å m =1 (Xm ) = 

n

(å

n
m =1

)

X m = (Sn )

Now take limits of both sides
limn

å

(X m ) = å m =1 (X m ) = limn (Sn )
¥

n
m =1

By dominated convergence, we have

limn (Sn ) = (limn Sn ) = 

(å

¥
m =1

Xm

)

Combining the last two equations, we do indeed find that

å

¥
m =1

(Xm ) = 

(å

¥
m =1

Xm

)

As required
...
Is there a
“common thread” that runs through these conditions?

o

Definition (Uniform Integrability): A sequence of random variables {Xn } is

said to be uniformly integrable (u
...
) if for all e > 0 , there exists a K (e) < ¥
such that

(

)

(

supn  X n : X n > K (e) º supn  X n 

{ Xn >K (e )}

)£ e

Note that the supremum implies that the K we find must work uniformly for the
entire sequence – in other words, for a given e , there must be a single K (e)
which makes the tail expectation small for every n
...

o

Proposition: Let

{X }
n

be a sequence of random variables and suppose

X n  X almost surely


If {Xn } are uniformly integrable, then (X n )  (X )
...



{X }

If Xn > 0 for all n and (Xn )  (X ) < ¥ , then

n

are uniformly

integrable
...
We have almost identified the “bare
minimum” condition for interchange
...
i


£ supn (a + e)
<¥

Now, let Yn = Xn - X
...
Hence, {Yn }
is also uniformly integrable (since {Xn } is uniformly integrable, and X is
integrable)
...
s
...
s
...
Thus,
n
(Yn )  0 , as required
...
Thus, applying bounded

(

)

convergence, fa  éê Xn ùú
...

Thus, the RHS above can be made arbitrary close to the difference
between these two expectations
...
i
...



If supn  X n



If Xn £ Yn and {Yn } is uniformly integrable, then

{X }
n

is uniformly

integrable
...

Proof (part 1): We will prove the first part of the statement above
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 19

Proposition (Holder’s Inequality): Let X and Y be random variables

such that  X

p

< ¥ and  Y

q

< ¥
...
Then

(

 XY £  | X |p

) ( | Y | )
1/ p

q

1/q

When p = q = 2, this inequality is none other than the Cauchy-Schwartz
inequality
(XY ) £ (X 2 ) (Y 2 )
Proof: If X = 0 or Y = 0 a
...
, the inequality holds trivially
...
Suppose we are

able to show that, for every point w in the sample space,

X (w)Y (w)
xy

p

1 X (w )
1 Y (w )
£
+
p
p x
q yp

q

Then, taking expectations and rearranging, this becomes Holder’s
Inequality
...
The statement then becomes
ab £

a p aq
+
p
q

With a and b non-negative
...

The expression then follows trivially from the convexity of the
exponential function:
æ s t ö es et
exp çç + ÷÷÷ £ +
çè p q ÷ø p
q
We have therefore proved Holder’s Inequality
...

Proof:

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011



Foundations of Stochastic Modeling

Page 20

q
æ q
ö
æ q
ö
 X =  çç X ; X < 1÷÷ +  çç X ; X ³ 1÷÷
è
ø
è
ø
æ p
ö÷
£ 1 +  çç X ; X ³ 1÷
è
ø
£¥

As required
...
We
then use Holder’s Inequality to obtain

(

(

)

 Xn ; Xn > K =  Xn 
æ
£ çç  Xn
è

p

)
æ
çç 
çç ( {
è

{ Xn >K }
1/ p

÷ö
÷ø

1/q

ö÷
÷
Xn >K } ÷
÷ø

)

q

Taking a supremum

æ
 Xn ; Xn > K £ supn çç  Xn
è

(

)

p

1/ p

÷ö
÷ø

((X

> K ))

1/q

n

By the assumption in the theorem, the first expression is finite
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011



Foundations of Stochastic Modeling

Page 21

Proof (part 1 – alternative): We propose an alternative, somewhat simpler

proof to part 1:
p -1
æ
ö÷
ì
ï X n ïü
ï
ççç
ï
ïý ; X > a ÷÷÷
 X n ; X n > a £  ç X n ïí
÷÷
n
çç
ï
a ï
ï
ï
÷ø÷
ï
ï
î
þ
èç
1
= p-1  X n ; X n > a
a
K
= p-1
a

(

)

(

)

Setting a ³ K / a p-1 gives the required result
...
What condition is needed on
the distribution of the X for this to happen? Clearly, if the Xn are IID, this will
not be the case, since every additional variable in the sum add something to the
sum – IID variables are either “dead at the start” or “never die out”
...
Then

å

n

{X }
n

be a sequence of independent

X n converges if and only if for some (and therefore

every) K > 0, the following conditions hold:


å ( X
n

n

)

> K < ¥ (clearly, for example, if the Xn are IID without

bounded support, we’ll never be able to satisfy this)
...



å  (X



å

n

n

n

)

; Xn £ K < ¥

(

)

ar X n ; X n £ K < ¥

Note that there are no issues of existence of the expected value and the variance
in the last two points because these are taken over a finite range – namely on
X n Î (-K , K )
...
We will

cover it later in this course
...
If

{X }
n

n

be independent

ar(X n ) < ¥ , then

X m converges almost surely
...

o

Example: Set X n = Yn / n , where the Yn are IID exponential random variables

with mean 1
...
In light of the result we derived in the
previous lecture (that draws from an exponential will be arbitrarily large
infinitely often) one might expect this series not to converge
...



First condition:

(

)

 Xn > K = 

(

Yn
n

>K

)

=  (Yn > nK )
= e -nK

This does indeed sum to a finite number

å ( X
n

n

)

>K <¥

The first condition is therefore met
...
So, as we expected, Sn does not converge almost surely
...
The intuition here is

(-1)n / n does converge
...
We wonder whether this will
“spoil” the convergence
...



Second condition

(

)

1
 Yn ; Yn £ n
n
1
= (Yn )
n
2p - 1
=
n

(

 Xn ; Xn £ 1 =

We know, however, that

å

)

diverges, so the infinite sum of these

k
n

expectations only converges if p = ½, in which case the numerator is 0
...



Third condition (assuming p = ½)

(

)

ar X n ; X n £ 1 =
=
=

1
n2
1

(

ar Yn | Yn £ n

)

ar(Yn )
n2
(Y 2 ) - 0
n2

The infinite sum of the variances is therefore finite
...

o

Example: Set



Xn = Yn / n , where Yn are IID random variables with

probability ½ of being 1 and probably ½ of being –1
...
Since, in this case, (X n ) = 0 , we can simply apply the second
form of the 3-series theorem:

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 24

(

)

1
ar X n ; Yn £ n
n
(X 2 )
=
n

(

ar Xn ; X n £ 1 =

)

The infinite sum of these variances clearly diverges, and so Sn does not converge
...





The Strong Law of Large numbers
o

Theorem (SLLN): Let {Xn } be IID with  X1 < ¥ , then

1
n
X k  (X1 )
å
k =1
n

a
...

Note: there is a simpler proof of this theorem that assumes the fourth moment of
X1 is bounded
...

Proof: Let us fix n
...
We will first prove the
n

result for the truncated form, and later show that the result then carries
over to the non-truncated form
...
Note that
1
...
Since X1 has finite mean, it is the case that
X1 

{ X1 £n}

 X1 

{ X1 <¥}

= X1

a
...

(1) justifies using dominated convergence, and (2) gives us the limit
...
To do this, we will need two lemmas
...
Then
1
n
a  a¥
å
k =1 k
n

as n  ¥

Proof: Let e > 0 , and choose an N such that

ak > a¥ - e whenever k ³ N
Then
lim infn ¥

ì1
ü
ï
ï
1
n -N
n
N
ï
ï
³
+
lim
in
f
(
)
a
a
a
e
í
ý
å
å
k
n
k
¥
k =1
k =1
ï
ï
n
n
ï
ï
în
þ
³ 0 + a¥ - e

By a similar argument, lim sup £ a ¥
...

Lemma (Kronecker’s Lemma): Let

sequence
...
By the assumption in our proof,

u¥ = limn un exists
...
By the Cesaro Sum
Property, so does the second term
...



Using these two lemmas, our (rough) strategy will be as follows:


Step 3a: Show that

å

n

wk

converges almost surely, using the

k =1 k

Kolmogorov 3-series theorem
...

Step 3c: Note that since
1
n

å

å

n

wk

k =1 k

converges almost surely,

wk converges to 0 almost surely
...
Furthermore, the wk are independent of each other (because

they are build up from the Yk, which are built up from the Xk, which are
independent)
...
(Note that

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 27

some steps in the exposition below require some results about moments of
random variable which we will prove in a homework – these are indicated
by a *)
...
Thus, we
can write

å

¥
k =1

ar

( )£ò
wk
k

C1

¥
0

C2 + y

£C3 ò

¥

0

(

(

2y  X1 > y

 X1 > y

)

* = C 3  X1
<¥
So the sum does indeed converge
...
But by Step 3a, (A) = 1 , since A
occurs almost surely
...
s
...
s
...
All

that remains to be shown is that the result also applies to the full series

{X }
...

Thus, for large enough n, Xn = Yn almost surely
...

As such
1
n

å

n

Xk (w) - n1 å k =1Yk (w)  limn Xn (w) -Yn (w)
n

k =1

a
...

Thus, the SLLN holds for the original sequence Xn
...
Let Sn = å k =1 Xn
...
s

Foundations of Stochastic Modeling

Page 29

We might wonder, however, whether this still holds true if, instead of dividing by
n, we divide by some quantity an, where an  ¥ , but possibly slower than n
...

1 +e

Let us consider, for example, an = n (log n )2

...
Consider that

this is really very close to a

æ X ö÷
s2
ar ççç n ÷÷ =
1 +e
çè an ÷ø÷
n (log n )2
And this implies that

å

n

ar

( )<¥
Xn
an

Thus, by the Kolmogorov 3-series theorem,

å

Xn
n an

converges almost surely, and

by Kronecker’s Lemma, this means that S n / an  0
...

However, it is interesting to note that if we take an = n (slightly smaller), then
no almost sure result holds anymore
...
These arguments are clearly
extremely sensitive to scaling
...

Comments:


This is clearly a very difficult definition to use, because there are “too
many” bounded continuous function
...



If X n  X almost surely, then f (X n )  f (X ) , because f is continuous,
and  éêë f (Xn )ùúû   éêë f (X )ùúû , by bounded convergence (since the functions
are bounded)
...



Each member of

{X }

as well as X need not even live on the same

n

probability space
...

Note that this is radically different to the context of almost sure
convergence, which fixes an w in the probability space and requires
convergence
...

o

Definition (Weak convergence): A sequence of real valued random variables

{X }
n

converges weakly to X if and only if
Fn (x )  F (x )

for all continuity points of F, where Xn ~ Fn and X ~ F
...

o

Example: Take X1 , X 2 ,  IID, with (X1 ) = m and ar(X1 ) = s 2 , then

å

¥
i =1

(Xi - m)
n

 sN (0,1)

This is none other than the central limit theorem
...
We will prove this late in this lecture
...
Now, fix k < n (the number of people amongst whom we are looking
for a match), and let

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 31

Yn = min {i : X i = X j for some j < i }
This is effectively the smallest i for which Xi matches an earlier X
...
This is somewhat inconvenient – let’s try to use weak convergence
arguments to get an approximation to this instead
...
Similarly, an ~ bn if

= 1
...

(

)

Now, note that log(1 - x ) ~ -x as x  0
...
Therefore
æY
ö÷
êx n ú i - 1
ê
ú
log  ççç n > x ÷÷ ~ -å ië =2 û
÷
çè n
n
ø÷
ê
ú
1
êx n ú
= - å ië =2 û (i - 1)
n
11 2
xn
~n2
= -x 2 / 2
As such
æY
÷ö
 ççç n > x ÷÷  exp - 12 x 2
÷÷ø
çè n

(

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

)

Foundations of Stochastic Modeling

Page 32

The RHS is clearly a distribution, because it takes value 1 at x = 0 and decreases
thereafter, and we therefore have weak convergence by the second definition
...
For the lower bound:

ò

¥

x

e -y

2

æ1
¥æ
3ö 2
1ö 2
dy ³ ò ççç1 - 4 ÷÷÷e -y /2 dy = çç - 3 ÷÷÷e -x /2
x ç
çè x x ÷ø
y ÷ø
è

/2

(Which can be shown by differentiating the result)
...
Making this more specific
e
x

to a normally distributed random variable, suppose X ~ N (0, s 2 ) , then
 (X > x ) =
=
~

ò

1

¥
x

1

2ps
2

2

sò

exp-y
¥

x /s

2ps
1 s -x 2 /2s 2
e
2p x

2

/2 s 2

exp-y

2

/2

dy
dy

It is interesting to note, therefore, how similar our result looks to the central
limit theorem, despite the fact we did not center the original variables Yn
...
(In fact, the random variable with
this tail integral is a Rayleigh random variable)
...

o

Example:

Let

X1 , X 2 , 

be

IID

N (0, s 2 )

random

variables,

and

let

M n = max (X1 , , Xn )
...
We would like

to show that

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 33

{

 {an (M n - an ) £ x }  exp -e -x

}

"x Î 

We first note the RHS has no discontinuity points
...
Finally, we note
the RHS is monotone
...

Let us now show convergence:

(

 Mn £

x
an

+ an

)

(
é
= ê  (X
ë

=  X1 £

x
an

£

1

x
an

(

+ an , , Xn £

)

ù
+ an ú
û

é
= ê1 -  X1 >
ë

x
an

x
an

+ an

)

n

)

ù
+ an ú
û

n

We will now require a lemma
Lemma: Consider two (possibly non-real) sequences

{a }
n

and

{b }
...

This

is

precisely

(

an = - X1 >

x
an

the

situation

in

this

case,

with

bn = n  ¥

and

)

+ an  0 , since an  ¥
...
To do this, we will make extensive use of the Lemma in the previous
example (namely that  (X1 > x ) =

(

n ⋅  X1 >

x
an

)

1

+ an ~ n ⋅

=
=

=

2p

1

=

~

1
x 2p

⋅

x
an

e -x

2

/2

)

ìï
1
⋅ exp ïí- 12
ïîï
+ an

(

x
an

2ü
ï
+ an ïý
ïþï

{

2p
1

x
an

1
2
a2
n ⋅ exp - x 2 - x - 2n
2 an
+ an

2p
1

x
an

1
-x 2 /2 an2 -an2 /2
n ⋅ e -x e
e
+ an

2p

x
an

1
x

2p an
an
+ an

x
an
-x

)

}

1
1
-x 2 /2 an2
-a2 /2
n ⋅ e -x e
an 2p
e n
+ an
an 2p
1/n

=
2
2
1
-x -x /2 an
n ⋅e e
an 2p  (X1 > an )
+ an
e -x e

-x 2 /2 an2

e

As such, by the Lemma

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

(

 Mn £

Page 34

x
an

)

+ an

(

é
= ê1 -  X1 >
ë -x
 e -e

x
an

)

ù
+ an ú
û

n

As required
...
To see why,

with K > 0, and consider that by Markov’s Inequality

(

)

 Xn > K £
o

 Xn

K

£

<¥


supn  Xn

K

Proposition: Let {Xn } be a sequence of random variables


If Xn  X , then {Xn } is tight



If

{X }
n

is tight, then there exists a subsequence

{n }
k

such that

{X }
nk

converges weakly
...

o

Definition (Characteristic Function): A characteristic function (CF) of a

random variable X is given by

jX (q) =  éëê exp(iqX )ùûú

qÎ

Remarks


jX (0) = 1



jX (q) £ 1 , because by Jensen’s inequality, jX (q) £  exp(i qX ) = 1



jX(n ) (0) = i n (X n ) , provided  X



If X and Y are independent, jX +Y (q) = jX (q)jY (q) , though the converse

{

n

}

< ¥
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

o

Page 35

Examples: If X ~ N (m, s 2 ) , then

(

jX (q) = exp i mq - 12 s 2 q 2

)

Note also that if Y = m + sX , then

(

)

jY (q) =  e iq[ m+sX ] = e iqm + jX (sq)

This is convenient in relating all normally distributed random variables to the
standard normal
...
If


jX (q)  j(q) for all q Î 



j(q) is continuous at 0
...

Comments:


If Xn  X , then the two conditions in the proposition trivially hold –
this is really an “if and only if” statement
...
To
see

why,

consider,

for

example,

X n ~ N (0, n )
...
To see why,
consider that with K > 0,
æ
K ö÷
1
 X n > K =  çççN (0,1) >
÷÷÷ 
çè
2
nø

(

)

In other words, the distribution in question is not tight – tail probabilities
grown with n
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling


Page 36

The Central Limit Theorem
o

X1 , X 2 , 

Theorem: Let

(X1 ) = m

be IID with mean

and variance

s 2 = ar(X1 )
...
Let S n =

å

n
i =1

X i , and

consider its characteristic function
jS

n

/ n

(
(

)

é
ù
(q) =  ê exp i qS n / n ú
ë
û
n
é
iq
Xj
=  ê exp
n å j =1
ë
n
=  éê j =1 exp iq X j
n
ë
n
é
iq
Xj
=  j =1  ê exp
n
ë
n
= éêjX q ùú
n û
ë

(
(

( )

)ùúû

)ùúû
)ùúû

Now, let us take a Taylor Expansion

jX

q2
j ¢¢ (0) + Rn
2n X
n
q
q2 2
= jX (0) +
(X ) +
i (X 2 ) + Rn
2n
n
q 2s 2
= 1+ Rn
2n

( )= j
q

n

(0) +
X

q

jX¢ (0) +

Now, consider that
jS

n

( ( ))

(q) = jX
n

/

n

q

n

n

æ
ö
q s2
= ççç1 + Rn ÷÷÷
÷ø
çè
2n
2

= (1 + an )

bn

2 2
Now, suppose we can show that anbn = n éê q2sn + Rn ùú =
ë
û

q2s2
2

other words, that nRn  0 ) as n  ¥ , then we would have
jS

n

/

æ q 2 s 2 ö÷
çç÷÷
q

(
)
exp
çç
n
2 ÷ø
è
= j(q)

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

+ nRn 

q2s2
2

(or in

Foundations of Stochastic Modeling

Page 37

This is the characteristic function of N (0, s 2 ) , and it is continuous at q = 0
...

All we now need to do is to prove that nRn  0 as n  ¥
...
Let us apply expectations
n ö
æ n +1
n
÷÷
çç x
x
2
(
)
ix
÷÷
£  çç

 e ix - å m =0
÷÷
ç (n + 1) !
!
m!
n
÷ø
çè
n

Now, use Jensen’s Inequality on the LHS to obtain

( )

 e

i qX

2
2ö
æ
çç X q
æ 2 (iX q)m ÷ö
2 X q ÷÷
ç
÷÷ =  é f (X , q)ù
÷÷ £  çç
-  ççå m =0

êë
ûú
çç 3!
çè
m ! ÷ø
2 ÷÷÷
è
ø

Now, observe that if q  0 , f (X , q)  0 almost surely, since  X is bounded
...
We can therefore apply the

dominated convergence theorem to conclude that

 éëê f (X , q)ùûú  0

as q  0

More details?
But we have

(

Rn =  f (X ,

q
n

)

)

3
ì
ü
ï
ï
q
ï
2ï
X
2 q ï
ï
n
ï
= n ï
X
í
ý
ï
ï
n
3!
ï
ï
ï
ï
ï
ï
î
þ
3 3
ì
ü
ï
ï
q
2
ï
2 q ï
ïX n
ï
= í
X
ý
ï
n ïï
3!
ï
ï
ï
ï
ï
î
þ

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 38

Inside goes to 0 as n -> infinity for all theta, and is dominated by |X|^2 theta^2
which has finite expectation, hence by dominated convergence it all goes to 0
...



Odds & Ends
o

Proposition (The Skorohod Representation): Let

{X }
n

be a sequence of

random variables such that X n  X
...

Proof: Let Fn be the distribution of Xn and F be the distribution of X
...
Now, define

F -1 (p) = inf {x : F (x ) ³ p}
(This generalized inverse deals with cases in which there are discontinuities in the
distribution)
...

Now, take (W,  , ) = ([0,1],  , uniform distribution)
...
Let

X n¢ = Fn-1 (U )

X ¢ = F -1 (U )

Note that

 (X n¢ £ x ) =  (Fn¢(U ) £ x )

=  (U £ Fn (x ))
= Fn (x )
d

d

Similarly,  (X ¢ £ x ) = F (x )
...
Note
also, however, that by () , X n¢  X ¢ for all but continuity points in W
...
Thus, Xn¢  X ¢ almost surely
...
Let f be a function with  (X Î D f ) = 0 with Df being the set of
discontinuity points of f
...

LECTURE 4 – 9th February 2011



Introduction to Large Deviations
o

Consider a sequence of IID random variables,

Sn =

å

n
i =1

X1 , X 2 ,  ~ N (m, s 2 )
...
Fix a > m
...
It is easy to derive in the
case of a Gaussian variable – we seek to develop this result in more generality
...
The
central limit theorem would tell us that

Sn » nm + sZ n

Z ~ N (0,1)

A “natural” question, therefore, would be to consider

(Sn > nm + sK n )
The theory of large deviations goes even further – since a > m , we can write
(Sn > na ) = (Sn > n m + sKn )
In other words, the central limit theorem deals with “normal” deviations, of order
n from the mean
...
We will now see why the central limit theorem is powerless to deal with
these large deviations
...
Fix a > m
...
The central limit theorem is clearly

not equipped to deal with large deviations
...
Then for any a > m ,
1
log  (Sn > na )  -I (a )
n
Where

I (a ) = supqÎ éëêqa - log M (q)ùûú
I (a ) = supqÎ éêëqa - j(q)ùúû
(We denote j(q) = log M (q) )
...

Remark: The same conclusions hold if M (q ) exists in a neighborhood of the

origin (plus some mild technicalities)
...
In this case,

{

I (a ) = supq qa - qm - 12 q 2 s 2

This is simply a quadratic, with argmax q * =
I (a ) =

a -m

s2

}

...

o

Proof of Cramer’s Theorem: We will carry out this proof in two steps


Step 1: establish an upper bound

lim supn


1
log  (Sn > na ) £ -I (a )
n

Step 2: establish a lower bound

lim infn

1
log  (Sn > na ) ³ -I (a )
n

Fix a > m and q > 0
...

Lemma (Generalized Form of Markov’s Inequality): Let

g :  +   + be an increasing function such that  g(X1 ) < ¥
...



Now, consider that in our case
...
Note that we have used the
generalized form of Markov’s Inequality in the last line)
...
To do
this, we choose q such that

q Î arg supq>0 {qa - j(q)}
Note, however, that

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling


The

derivative

Page 43
of

our

objective,

at

q = 0,

is

given

by

a - j ¢(0) = a - m > 0
...

Together, these two points imply that extending our optimization
problem to q Î  will not change our optimum, since it necessarily lies in
the positive quadrant
...
Going
back to the above, we therefore find that

 (Sn > na ) £ exp (-nI (a ))
Precisely as requested
...
Fix a Î (0,1) and q1 , q2 Î 
...
Assume,

for notational convenience, that X has a density p(x) – this is not,
however, required
...
Suppose

g:

is such that

 g(X1 ) < ¥ , then
 ëêég(X )ûúù =

ò

g(x )p(x ) dx =

ò

g(x )

é
p(x )
p(x ) ùú
pq (x ) dx = q êêg(X1 )
pq (x )
pq (x ) úûú
ëê

Similarly,
é
p(X1 , , X n ) ùú
 éêëg(X1 , , X n )ùúû = q êêg(X1 , , X n )
pq (X1 , , X n ) úúû
êë

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 44

Where here, g : n  
...

Now, consider
é
ù
 (Sn > na ) =  ê {S >na } ú
n
ë
û
é
p(X1 , , X n ) ùú
= q êê {S >na }
p0 (X1 , , X n ) úúû
êë n
é
p(X i ) ùú
n
= q êê {S >na } i =1
n
pq (X i ) úûú
ëê
é
ù
p(X i )
n
ú
= q êê {S >na } i =1 qX
ú
n
i
e p(X i ) / M (q) úû
êë
é
ù
n
éM (q)ù
ê
ú
ê
ú
ë
û
ú
= q êê {S >na }
ú
n
ê n
exp q å i =1 X i ú
êë
úû
é
ù
= q ê {S >na } exp (-qSn + nj(q))ú
n
ë
û

(

)

Now, fix a d > 0
é
ù
 (Sn > na ) ³ q ê 
exp (-qSn + nj(q))ú
na
S
na
n
<
£
+
d
}
êë { n
úû
é
ù
³ q ê 
exp -q na + d n + nj(q) ú
na
S
na
n
<
£
+
d
}
êë { n
úû

( (

(

)

)

= e -n (qa -j(q ))e -qd n q na < Sn £ na + d n

)

This is starting to look like what we want – it is crucial, however, to
ensure that the last probability stays bounded away from 0 – if not, it
tends to negative infinity
...
Our hope is to choose our new distribution
(characterized by q ) to ensure we get what we want
...
Then the mean of Sn under our new
distribution would indeed be na, which gives us hope that the probability
above will be bounded away from 0
...
Under the conditions of
Cramer’s Theorem, it turns out this is also satisfied
...
Now, returning to our expression above,
1
q *d 1
log  (Sn > na ) = - éêq *a - j(q * )ùú + log q* na < Sn £ na + d n
ë
û
n
n n

(

)

Taking a lim inf,
lim infn

1
log  (Sn > na ) ³ - éêq *a - j(q * )ùú
ë
û
n

The last thing we now need to verify is that the q* required is indeed the
maximizer of the function above, to get I(a)
...
In this case, they are a = j ¢(q) ,
precisely as desired
...
These extra conditions are precisely those that ensure such a q does exist
...
Fix a > m
...
Using
this result, however, we can be more exact – we can find a sequence an such that
Sn eventually lies below n m + an almost surely
...
Let us
consider, for d > 0 , an = 2(1 + d )s 2n log n
...
With that choice of
envelope,

 (Sn > n m + an ) £ exp (-(1 + d )log n )
= n -(1+d )
This

{S

is

indeed

summable
...
Using a similar kind of argument, we could show

n

that

{S

n

³ n m - an } eventually almost surely
...

o

Lemma,



Moderate deviations theory
...
What about deviations in between?
Theorem

(Moderate

Deviations): Under the conditions of Cramer’s

Theorem, for any sequence an such that


an



an

n

n

¥
0

for any a > 0 ,
n

an2

log  (Sn > n m + a an )  -

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

a2
2s 2

Foundations of Stochastic Modeling

Page 47

Proof of upper bound: WLOG, set m = 0 and a = 1
...
Let us now use a Chernoff bound to
deduce that



(

Sn

>

n

an

n

) £ exp(q
=e

But j( nq ) ~

q2 s 2
2n 2

an

n

nj( nq ) -q

e

S
) éê exp(q nn )ùú
ë
û
a
n
n

, so



(

Sn
n

>

an

Sn

>

n

æ

2

) £ C exp ççççè q2sn

2

-q

an ö÷
÷÷
n ÷ø÷

Optimizing, we find



(

n

an
n

)

æ a2 ÷ö
ç
£ C exp çç- n 2 ÷÷÷
çè 2ns ø÷

As required
...

Remarks: If S0 = 0:


(Sn ) = n (X1 ) and ar(Sn ) = n ar(X1 )



By the Strong Law of Large Numbers, Sn » n (X1 )



Sn » n (X1 ) + s nN (0,1)

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

{X }
n

Foundations of Stochastic Modeling

o

Page 48

Question: Suppose T is a positive, integer valued random variable
...
We then have

(ST ) = 

(å

T
i =1

(

)

)

T
X i =   êé å i =1 X i | T úù =  (T [X1 ]) = (T )(X1 )
ë
û

Unfortunately, this answer is not particularly interesting or useful, simply
because in all “interesting” situations, T is allowed to depend on the behavior of
the system
...

o

Example: Consider a random walk defined as follows
IID ì
ï1
Xi ~ ï
í
ï1
ï
î
Sn

In this case,

n

with prob p
with prob 1 - p

p>

Sn = å i =1 Xi
n

1
2

 2p - 1 > 0 a
...
, and so Sn  ¥ almost surely
...

Furthermore,

by

definition,

ST = 0  (ST ) = 0
...
Thus, it is clear, in this case that
(ST ) ¹ (T )(X1 )
Clearly, therefore, there can be some issues
...

o

Definition (Filtration): We previously defined a probability space by the triplet

(W,  , ) , where

 represented possible events
...
This sigma algebra constrains all
the information that is available from knowing the value of the variables

X 0 , , Xn
...

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

is

a

filtration,

and

clearly,

Foundations of Stochastic Modeling

o

Page 49

Definition (Stopping time): Suppose T is a non-negative integer-valued random

variable
...
In other words, we require

{T = k } Î 
Example (Hitting Times): Let T = inf {n ³ 0 : X

"k

k

o

n

Î A}
...



If  Xi < ¥ and (T ) < ¥ , then (ST ) = (T )(X 0 )

Proof: Let ST =

å

T
i =1

X i = å i =1 X i {i £T }
...
Going back to our sum

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 50

(

)

¥
é
ù
(ST ) = å i =1  ê  Xi {i £T } | i -1 ú
êë
úû
¥
é
ù
= å i =1  ê {i £T }  (Xi | i -1 )ú
ë
û
¥
é
ù
= å i =1  ê {i £T }  (X1 )ú
ë
û
¥
= (X1 )å i =1  {i £T }

(

)

Using Fubini I again:

(ST ) = (X1 )

(å

¥
i =1

{i £T }

)

= (X1 )(T )


Second part: In this case, consider that, by the triangle inequality
ST £ å i =1 X i
T

By part 1, however,



(å

T
i =1

)

Xi = (T ) X1 < ¥

The result then follows by Fubini II
...
We let

Z n = k + Sn , with

0
–

intuitively, k is our initial wealth, and Zn is our wealth at time n
...
But consider also that, by definition
(ZT ) = 0 (ZT = 0) + N (ZT = N )

= N (ZT = N )

Now, unfortunately, the Xi are not non-negative
...
Let us now assume (T ) < ¥ (this will be proved
in homework 2)
...
Let T be a stopping time with respect to

{ } ,
n

and

such that (T 2 ) < ¥
...
We seek

to extend this to stopping times
...
Now

(å
=  (å

(ST2 ) = 

T

)
X ) + 2 ( å

Xi
i =1

T
i =1

2

2
i

T -1
i =1

å

T
j =i +1

Xi X j

)

But by Wald I,


(å

T
i =1

)

X i2 = (T )(X12 )
= s 2 (T )

Now, consider the second term



(å

T -1
i =1

å

T
j =i +1

) (å

Xi X j = 

¥
i =1

å

¥
j =i +1

Xi X j {T -1³i} {j £T }

)

Now, imagine it were possible to apply Fubini 2
...

As such, we have that {T ³ j } {T ³i +1} = {T ³ j }
...
X is bounded, and so X £ K

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 52

ST2 =

(å

£

(å

T

)
X )

Xi
i =1

2

2

T
i =1

i

£ (KT )2 = K 2T 2
This has a finite expectation, since T has a second moment
...
Then


S n Î n



Provided  Xi < ¥ , then  Sn £ n  X1 < ¥



 éëêSn | n -1 ùûú =  éëêSn -1 + Xn | n-1 ùûú = Sn -1 + (Xn )

Thus, if we let (Xn ) = 0 , then our random walk is a martingale
...
Then


Condition 1 holds
...

{ }
n

and put

Di = M i - M i -1
...
Indeed, it expresses our martingale as a
sum of mean-0 uncorrelated random variables
...



As we will see later, the boundedness condition is required to be able to
argue that why exactly is this needed?

 Di D j £  Di

2

 Dj

2

<¥

(Note: in this case, it would be enough to require the second moment of
each difference to be bounded, but it seems less clunky to require the
slightly more stringent uniform boundedness condition)
...

n

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011



Foundations of Stochastic Modeling

o

Example: Let

{X }

Page 54
(X i ) = 1

be IID, with

n

 Xi < ¥
...
This is a martingale:
n



Condition 1 satisfied
...
Let us consider some
simple examples
...
When is it the case

that (MT ) = (M 0 ) = 0 ? Effectively, we are asking when it is the case that
(ST ) = m (T )
This is precisely the subject matter of Wald’s First Identity
...
When is it

true that (MT ) = (M 0 ) = 0 ? Effectively, we are asking when it is the case that

(ST2 ) = s 2 (T )
This is precisely the subject matter of Wald’s Second Identity
...
Then for each m > 1
 (MT m ) =  (M 0 )
Where T  m = min(T , m )
...



Remark: Consider that



Consider that limn ¥ MT n  MT
...

By the theorem above, however,  éêëMT n ùúû =  éêëM 0 ùúû , which implies than
limn ¥  éëêMT n ùûú =  éëêM 0 ùûú
...

o

Corollary

II:

Let

MT n £ Z ,

with

(Z ) < ¥
...

o

Corollary III: Let (M n : n ³ 0) be a martingale with respect to

{ }
n

and T be

a stopping time such that (T ) < ¥
...

ë
û

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 56

Remark: This is a stronger version of Wald I
...
In the case of
a random walk, the X are exactly the martingale differences
...

Proof: We have

MT n =

å

T n
i =1

Di + M 0

M0 is integrable
...
By Fubini I, we can then swap the expectation and
sum, since the summands are positive
¥
¥
¥
é
ù
é
ù
å i =1 Di {T ³i} = å i =1  ê Di {T ³i} ú = å i =1  ê {T ³i}  éê Di | n -1 ùú ú
ë
û
ë
û
ë
û

Since the martingale differences are uniformly bounded bt C, this is < C (T )
...
As such, our
martingale is bounded by an integrable variable, and the OST holds by corollary
II
...

We now consider very large times T
...
Then
Mn  M ¥
for some finite bounded random variable M ¥
...
s
...
This is a random-walk like structure, even
n

though the Di may not be independent
...
If we succeed in showing that, we have
effectively shown that Mn/n converges to 0
...
Then
Mn
n

0

a
...

n
D
 =
Proof: Put M
å k =1 kk (where, as usual Dk = M k - M k -1
...

and  M
n
Thus, it is a martingale
...

n
 2£
M
å k =1
n

supn ³1 (Dk2 )

k2

<¥

This implies that

 2 < ¥  sup  M
 <¥
supn ³1  M
n
n ³1
n
(Notice that it is often easier to work with second moments rather than first
moments, despite the fact finiteness of second moments is a stronger condition

Transcribed by Daniel Guetta from
lectures by Assaf Zeevi, Spring 2011

Foundations of Stochastic Modeling

Page 58

than finiteness for first moments)
...
s

Buy These Notes Preview

Notesale: Turn your study into money

Already a Member? >

Search for notes by fellow students, in your own course and all over the country.

My Basket

Document Preview