Stochastic Processes in Science, Engineering, and Finance
by Frank E. Beichelt
1 RANDOM EVENTS AND THEIR PROBABILITIES

Probability theory comprises mathematically based theories and methods for investigating random phenomena. ...

A random experiment is characterized by two properties:
1. ...
2. ...

Thus, the outcomes of a random experiment cannot be predicted with certainty. ...

Examples of random experiments are:
1) Counting the number of vehicles arriving at a filling station in a day. ...
The possible outcomes are, as in the previous random experiment, nonnegative integers. ...
4) Recording the lifespans of technical systems or organisms. ...
The possible outcomes are, as in the random experiments 3 and 4, nonnegative numbers. ...
This 'profit' can be negative, i.e. any real number can be the outcome. ...

Random Events A possible outcome a of a random experiment is called an elementary or a simple event. ... Here and in what follows, the sample space is denoted as M. ...

A random event (briefly: event) A is a subset of M. ...

Let A and B be two events. ...


© 2006 by Taylor & Francis Group, LLC

Figure 1.1 The events A\B, A ∩ B, and B\A as subsets of the sample space M (Venn diagram)

A\B is the set of all those elementary events which are elements of A, but not of B. ... Note that A\B = A\(A ∩ B). ... If A occurs, then the complementary event Ā cannot occur, and vice versa. ...
Let A₁, A₂, ..., Aₙ be a sequence of random events. The rules of de Morgan hold:

  the complement of the union A₁ ∪ A₂ ∪ ... ∪ Aₙ is the intersection Ā₁ ∩ Ā₂ ∩ ... ∩ Āₙ,   (1.1)

  the complement of the intersection A₁ ∩ A₂ ∩ ... ∩ Aₙ is the union Ā₁ ∪ Ā₂ ∪ ... ∪ Āₙ.   (1.2)

By definition, M contains all elementary events so that it must always occur. ... Two events A and B are called disjoint or (mutually) exclusive if their joint occurrence is impossible, i.e. if A ∩ B = ∅. ... In particular, A and Ā are disjoint events (Figure 1.1). ...

Probability Let M be the set of all those random events A which can occur when carrying out the random experiment, including M and ∅. ... A probability P assigns to each event A a number P(A) with the properties

I) 0 ≤ P(A) ≤ 1,
II) P(M) = 1,
III) for any sequence of pairwise disjoint events A₁, A₂, ..., i.e. A_i ∩ A_j = ∅ for i ≠ j,

  P(∪_{i=1}^∞ A_i) = Σ_{i=1}^∞ P(A_i).   (1.3)

The number P(A) is the probability of event A. ... This interpretation of the probability is justified by the following implications from properties I) to III):

1) P(Ā) = 1 − P(A).

2) If A ⊆ B, then P(B\A) = P(B) − P(A). ...

3) For any events A and B, P(B\A) = P(B) − P(A ∩ B), i.e. ...

4) For any events A, B, and C,

  P(A ∪ B) = P(A) + P(B) − P(A ∩ B),
  P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).   (1.4)

5) In generalizing implications 4), one obtains the inclusion-exclusion formula: for any random events A₁, A₂, ..., Aₙ,

  P(A₁ ∪ A₂ ∪ ... ∪ Aₙ) = Σ_{k=1}^n (−1)^{k+1} P_k

with

  P_k = Σ P(A_{i₁} ∩ A_{i₂} ∩ ... ∩ A_{i_k}),

where the summation runs over all k-dimensional vectors (i₁, i₂, ..., i_k) with 1 ≤ i₁ < i₂ < ... < i_k ≤ n, i.e. ...
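The inclusion-exclusion formula is easy to verify numerically. The following sketch uses a small made-up sample space with equally likely outcomes; the three events are illustration values, not taken from the text.

```python
# Numerical check of the inclusion-exclusion formula on a toy sample space
# (the events below are invented illustration values).
from itertools import combinations

omega = set(range(10))                       # equally likely outcomes
A = [{0, 1, 2, 3}, {2, 3, 4, 5}, {5, 6, 7}]  # three overlapping events

def P(E):
    return len(E) / len(omega)

# Left-hand side: P(A1 ∪ A2 ∪ A3) computed directly.
lhs = P(set().union(*A))

# Right-hand side: Σ_k (-1)^(k+1) P_k, where P_k sums the probabilities
# of all k-fold intersections.
n = len(A)
rhs = sum(
    (-1) ** (k + 1) * sum(P(set.intersection(*c)) for c in combinations(A, k))
    for k in range(1, n + 1)
)

print(lhs, rhs)  # both print 0.8
```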

The probabilities of random events are usually unknown. ... If in a series of n repetitions of one and the same random experiment the event A has been observed m = m(A) times, then the relative frequency of A is given by

  p_n(A) = m(A)/n.

As n increases, the relative frequency tends to the probability of A:

  lim_{n→∞} p_n(A) = P(A).   (1.5)

Conditional Probability Two random events A and B can depend on each other in the following sense: the occurrence of B will change the probability of the occurrence of A, and vice versa. ... This is done by defining the conditional probability of A given B. ... Let P(B) > 0. Then the conditional probability of A given B or, equivalently, the conditional probability of A on condition B is defined as

  P(A|B) = P(A ∩ B)/P(B).   (1.6)

Hence, if A and B are arbitrary random events, this definition implies a product formula for P(A ∩ B):

  P(A ∩ B) = P(A|B) P(B).
A set {B₁, B₂, ..., Bₙ} is called an exhaustive set of random events if

  ∪_{i=1}^n B_i = M.

Let B₁, B₂, ..., Bₙ be an exhaustive and disjoint set of random events with property P(B_i) > 0 for all i = 1, 2, ..., n. Then the following formulas are true:

  P(A) = Σ_{i=1}^n P(A|B_i) P(B_i),   (1.7)

  P(B_i|A) = P(A|B_i) P(B_i)/P(A) = P(A|B_i) P(B_i) / Σ_{j=1}^n P(A|B_j) P(B_j);  i = 1, 2, ..., n.   (1.8)

Equation (1.7) is the formula of total probability, and (1.8) is called Bayes' theorem or formula of Bayes. ...

Independence If the occurrence of B has no influence on the occurrence of A, then

  P(A|B) = P(A).

... In this case the product formula becomes

  P(A ∩ B) = P(A) P(B).   (1.9)

Obviously, (1.9) is symmetric in A and B. ... Hence, defining independence of two random events by (1.9) does not require the assumption P(B) > 0. ...

Note that if A and B are independent random events, then the pairs A and B̄, Ā and B, and Ā and B̄ are independent as well. ...

The events A₁, A₂, ..., Aₙ are called independent if for any subset {A_{i₁}, A_{i₂}, ..., A_{i_k}} of the set {A₁, A₂, ..., Aₙ},

  P(A_{i₁} ∩ A_{i₂} ∩ ... ∩ A_{i_k}) = P(A_{i₁}) P(A_{i₂}) ... P(A_{i_k}).

Specifically, the independence of the A_i implies for k = n a direct generalization of formula (1.9):

  P(A₁ ∩ A₂ ∩ ... ∩ Aₙ) = P(A₁) P(A₂) ... P(Aₙ).   (1.10)
Example 1.1 In a set of traffic lights, the colour 'red' (as well as green and yellow) is indicated by two bulbs which operate independently of each other. ... What is the probability that in the time interval [0, 200 hours] colour 'red' is visible if it is known that a bulb survives this interval with probability 0.95?

Let A and B denote the events that bulb 1 and bulb 2, respectively, survive the interval [0, 200]. The event of interest is

  C = A ∪ B = 'red light is clearly visible in [0, 200]'.

By (1.4),

  P(C) = P(A) + P(B) − P(A ∩ B) = 0.95 + 0.95 − (0.95)² = 0.9975.

Another possibility of solving this problem is to apply the rule of de Morgan (1.1):

  P(C̄) = P(Ā ∩ B̄) = (1 − 0.95)(1 − 0.95) = 0.0025,

so that P(C) = 1 − 0.0025 = 0.9975.
Example 1.2 1% of the population in a country are HIV-positive. ... A test indicates with probability 0.98 that a person is HIV-positive if it is HIV-positive, and with probability 0.04 that it is HIV-positive if it is not. What is the probability that a test person is HIV-positive if the test indicates that?

To solve the problem, random events A and B are introduced:

  A = 'The test indicates that a person is HIV-positive.'
  B = 'The test person is HIV-positive.'

Then,

  P(B) = 0.01,  P(B̄) = 0.99,
  P(A|B) = 0.98,  P(Ā|B) = 0.02,
  P(A|B̄) = 0.04,  P(Ā|B̄) = 0.96.

The formula of total probability (1.7) is applicable to determining P(A):

  P(A) = P(A|B) P(B) + P(A|B̄) P(B̄) = 0.98 ⋅ 0.01 + 0.04 ⋅ 0.99 = 0.0494.

Bayes' theorem (1.8) now yields the desired probability:

  P(B|A) = P(A|B) P(B)/P(A) = (0.98 ⋅ 0.01)/0.0494 = 0.1984.

Although the initial parameters of the test look acceptable, this result is quite unsatisfactory: in view of P(B|A) = 0.1984, about 80% of the positive test indications are false alarms. In such a situation the test has to be repeated several times. ... On the other hand,

  P(B̄|Ā) = P(Ā|B̄) P(B̄)/P(Ā) = (0.96 ⋅ 0.99)/(1 − 0.0494) = 0.99979.

This result is, of course, an excellent feature of the test.
...
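The computations of example 1.2 are easy to reproduce; a minimal sketch using the numbers given in the example:

```python
# Recomputing example 1.2 (HIV test) with the total probability formula (1.7)
# and Bayes' theorem (1.8); the numbers are those given in the example.
p_B = 0.01             # prior: person is HIV-positive
p_A_given_B = 0.98     # test positive | person positive
p_A_given_notB = 0.04  # test positive | person negative (false alarm)

# Total probability formula (1.7):
p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)

# Bayes' theorem (1.8):
p_B_given_A = p_A_given_B * p_B / p_A

print(round(p_A, 4), round(p_B_given_A, 4))  # 0.0494 0.1984
```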
1.2 RANDOM VARIABLES

1.2.1 Basic Concepts

All the outcomes of the random experiments 1 to 6 at page 1 are real numbers. ... With such outcomes, no quantitative analysis of the random experiment is possible. ... Or consider a problem in quality control. ... The random experiment consists in checking the quality of the units in a sample of size n. ... Usually, one is not primarily interested in these sequences, but in the total number of faulty units in a sample. ... This leads to the concept of a random variable:

Given a random experiment with sample space M, a random variable X is a real function on M: X = X(a), a ∈ M. ...

The set of all possible values or realizations which X can assume is called the range of X and is denoted as R = {X(a), a ∈ M}. ... (When flipping a coin, a '−1' ('+1') may be assigned to head (tail).) ... By introducing a random variable X, one passes from the sample space M of a random experiment to the range R of X, which is simply another sample space for otherwise the same random experiment. ... The advantage of introducing random variables X is that they do not depend on the physical nature of the underlying random experiment. ... This 'probabilistic law' is called the probability distribution of X and will be denoted as P_X. ...

A discrete random variable has a finite or a countably infinite range, i.e. the set of its possible values can be written as a finite or an infinite sequence (examples 1 and 2). ... Let the range of X be {x₀, x₁, x₂, ...}. Further, let p_i be the probability of the random event that X assumes value x_i:

  p_i = P(X = x_i);  i = 0, 1, 2, ...

The set {p₀, p₁, p₂, ...} can be identified with the probability distribution P_X of X, since for any interval (a, b] the interval probabilities are given by

  P(X ∈ (a, b]) = P(a < X ≤ b) = Σ_{x_i ∈ (a,b]} p_i.

On the other hand, any sequence of nonnegative numbers {p₀, p₁, p₂, ...} with Σ_i p_i = 1 defines the probability distribution of a discrete random variable. ...

The range of a continuous random variable X is a finite or an infinite interval. ...

The distribution function of a random variable X is defined as

  F(x) = P(X ≤ x),  −∞ < x < +∞.   (1.11)

Any distribution function F(x) has properties

1) F(−∞) = 0,  F(+∞) = 1,
2) F(x) is nondecreasing in x.   (1.12)

On the other hand, every function F(x) which is continuous from the right and satisfies properties (1.12) is the distribution function of a random variable X. ...

The interval probabilities are given by

  P(a < X ≤ b) = F(b) − F(a).   (1.13)

Figure 1.2 Qualitative graph of the distribution function of a continuous random variable

The definition (1.11) of the distribution function applies to discrete random variables as well. Let {x₀, x₁, x₂, ...} with x₀ < x₁ < x₂ < ... be the range of X. Then,

  F(x) = P(X ≤ x) = 0 for x < x₀,
  F(x) = P(X ≤ x) = Σ_{i=0}^k p_i for x_k ≤ x < x_{k+1};  k = 0, 1, 2, ...   (1.14)

If the range of X is finite, say {x₀, x₁, ..., xₙ}, then (1.14) has to be supplemented by F(x) = 1 for xₙ ≤ x. ... Therefore (Figure 1.3), the distribution function of a discrete random variable is a step function with jumps of height p_i at the points x_i. ...

Given p₀, p₁, ..., the distribution function F(x) is uniquely determined, and vice versa. ... Hence, the probability distribution of any random variable X can be identified with its distribution function. ...

Figure 1.3 Qualitative graph of the distribution function of a discrete random variable

1.2.2 Discrete Random Variables

1.2.2.1 ...

... However, to get quick information on essential features of a random variable, it is desirable to condense as much as possible of this information to some numerical parameters. ... The most important numerical parameter is the mean value

  E(X) = Σ_{i=0}^∞ x_i p_i.

Thus, the mean value of a discrete random variable X is a 'weighted mean' of all its possible values x_i. ...

Another motivation of this definition (see section 1.…2): the arithmetic mean of n values of X, obtained from n independent repetitions of the underlying random experiment, tends to E(X) as n tends to infinity. ...

If X has range {1, 2, ...}, then its mean value can be written in the form

  E(X) = Σ_{i=1}^∞ P(X ≥ i) = Σ_{i=1}^∞ Σ_{k=i}^∞ p_k.   (1.15)

If y = h(x) is a real function, then the mean value of the random variable Y = h(X) can be obtained from the probability distribution of X:

  E(Y) = Σ_{i=0}^∞ h(x_i) p_i.   (1.16)

In particular, the variance of X is the mean value of h(X) = (X − E(X))². Hence, Var(X) is the mean squared deviation of X from its mean value E(X):

  Var(X) = E((X − E(X))²).

The standard deviation of X is defined as

  σ = √Var(X),

and the coefficient of variation of X is

  V(X) = σ/μ,  where μ = E(X).
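These numerical parameters can be computed directly from the values x_i and probabilities p_i; a minimal sketch (the fair-die distribution is chosen purely for illustration):

```python
# Mean, variance, standard deviation and coefficient of variation of a
# discrete random variable, computed from {x_i} and {p_i}
# (a fair die is an illustrative assumption, not from the text).
import math

xs = [1, 2, 3, 4, 5, 6]
ps = [1 / 6] * 6

mean = sum(x * p for x, p in zip(xs, ps))               # E(X)
var = sum((x - mean) ** 2 * p for x, p in zip(xs, ps))  # Var(X)
sigma = math.sqrt(var)                                  # standard deviation
cv = sigma / mean                                       # coefficient of variation

print(round(mean, 6), round(var, 6))  # 3.5 2.916667
```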

Variance, standard deviation, and coefficient of variation are measures for the variability of X. ...

The n-th moment μₙ of X is the mean value of Xⁿ:

  μₙ = E(Xⁿ) = Σ_{i=0}^∞ x_iⁿ p_i.
1.2.2.2 Important Discrete Probability Distributions

Uniform Distribution A random variable X with range R = {x₁, x₂, ..., xₙ} has a (discrete) uniform distribution if p_i = P(X = x_i) = 1/n; i = 1, 2, ..., n. Mean value and variance are

  E(X) = (1/n) Σ_{i=1}^n x_i,  Var(X) = (1/n) Σ_{i=1}^n (x_i − E(X))².

In particular, if x_i = i, then

  E(X) = (n + 1)/2,  Var(X) = (n − 1)(n + 1)/12.

For example, the number shown by a fair die has a uniform distribution with R = {1, 2, ..., 6} and p_i = 1/6. ...

Geometric Distribution A random variable X with range R = {1, 2, ...} has a geometric distribution with parameter p, 0 < p < 1, if

  p_i = P(X = i) = p (1 − p)^{i−1};  i = 1, 2, ...

Mean value and variance are E(X) = 1/p and Var(X) = (1 − p)/p². For instance, if X is the random integer indicating how frequently one has to toss a die to get for the first time a '6', then X has a geometric distribution with p = 1/6. ...

Sometimes the geometric distribution is defined with range R = {0, 1, ...}, i.e. p_i = P(X = i) = p (1 − p)^i. In this case, mean value and variance are

  E(X) = (1 − p)/p,  Var(X) = (1 − p)/p².
Poisson Distribution A random variable X with range R = {0, 1, ...} has a Poisson distribution with parameter λ if

  p_i = P(X = i) = (λ^i/i!) e^{−λ};  i = 0, 1, ...

The parameter λ is equal to mean value and variance of X:

  E(X) = λ,  Var(X) = λ.

Bernoulli Distribution ... Mean value and variance are

  E(X) = p and Var(X) = p(1 − p).

... In case R = {0, 1}, X is a (0,1)-variable.
Binomial Distribution A random variable X with range R = {0, 1, ..., n} has a binomial distribution with parameters p and n if

  p_i = P(X = i) = C(n, i) p^i (1 − p)^{n−i};  i = 0, 1, 2, ..., n,

where C(n, i) = n!/(i!(n − i)!) denotes the binomial coefficient. Frequently the following notation is used:

  p_i = b(i, n, p) = C(n, i) p^i (1 − p)^{n−i}.

The binomial distribution occurs in the following situation: a random experiment, the outcome of which is a (0,1)-variable, is independently repeated n times. ... The outcome X_i of experiment i can be considered the indicator variable of a random event A with probability p = P(A):

  X_i = 1 if A occurs,  X_i = 0 if Ā occurs;  i = 1, 2, ..., n.

If the occurrence of event A is interpreted as 'success', then the sum

  X = Σ_{i=1}^n X_i

is equal to the number of successes in a Bernoulli trial of length n. ...

Note that the number of experiments which have to be performed in a Bernoulli trial till the first occurrence of event A has a geometric distribution with parameter p and range {1, 2, ...}.
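The interpretation of the binomial distribution as the number of successes in n independent trials can be checked directly; a small sketch (the values of n, p and the seed are illustrative assumptions):

```python
# The binomial pmf b(i, n, p) and its interpretation as the number of
# successes in a Bernoulli trial of length n, checked by simulation.
import math, random

def b(i, n, p):
    """Binomial probability P(X = i) = C(n, i) p^i (1-p)^(n-i)."""
    return math.comb(n, i) * p ** i * (1 - p) ** (n - i)

n, p = 10, 0.3
assert abs(sum(b(i, n, p) for i in range(n + 1)) - 1.0) < 1e-12

# X as a sum of n independent (0,1)-indicator variables.
random.seed(1)
trials = 100_000
freq2 = sum(
    1 for _ in range(trials)
    if sum(random.random() < p for _ in range(n)) == 2
) / trials

print(round(b(2, n, p), 4), freq2)  # exact value 0.2335, simulation close to it
```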
Negative Binomial Distribution A random variable X with range R = {0, 1, ...} has a negative binomial distribution with parameters p and r, 0 < p < 1, r > 0, if

  P(X = i) = C(r + i − 1, i) p^i (1 − p)^r;  i = 0, 1, ...

Mean value and variance are

  E(X) = pr/(1 − p),  Var(X) = pr/(1 − p)².

... (see geometric distribution).
Hypergeometric Distribution A random variable X with range R = {0, 1, ..., min(n, M)} has a hypergeometric distribution with parameters M, N, and n, M ≤ N, n ≤ N, if

  p_m = P(X = m) = C(M, m) C(N − M, n − m) / C(N, n);  m = 0, 1, ..., min(n, M).

As an application, consider the lottery '5 out of 45'. ... More importantly, as example 1.4 shows, it arises in statistical quality control. ...

Approximations In view of the binomial coefficients involved in the definition of the binomial and hypergeometric distribution, the following approximations are useful for numerical analysis:

Poisson Approximation to the Binomial Distribution If n is sufficiently large and p is sufficiently small, then

  C(n, i) p^i (1 − p)^{n−i} ≈ (λ^i/i!) e^{−λ};  λ = n p,  i = 0, 1, ...

Binomial Approximation to the Hypergeometric Distribution If N is sufficiently large compared to n, then

  C(M, m) C(N − M, n − m) / C(N, n) ≈ C(n, m) p^m (1 − p)^{n−m},  p = M/N.
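The quality of the Poisson approximation can be inspected numerically; a small sketch (the values of n and p are chosen for illustration):

```python
# Comparing the binomial pmf with its Poisson approximation for large n
# and small p (parameter values are illustrative assumptions).
import math

n, p = 1000, 0.004
lam = n * p  # λ = np = 4

def binom_pmf(i):
    return math.comb(n, i) * p ** i * (1 - p) ** (n - i)

def poisson_pmf(i):
    return lam ** i / math.factorial(i) * math.exp(-lam)

max_err = max(abs(binom_pmf(i) - poisson_pmf(i)) for i in range(21))
print(max_err)  # roughly on the order of 1e-3
```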


Example 1.3 Only 0.01% of trout eggs will develop into adult fishes. ... It is assumed that the eggs develop independently of each other. The number X of eggs in a batch of n = 40,000 eggs which develop into adult fishes has a binomial distribution with parameters n = 40,000 and p = 0.0001:

  P(X = i) = C(40,000, i) (0.0001)^i (0.9999)^{40,000−i};  i = 0, 1, ..., 40,000.

The Poisson approximation with λ = n p = 4 yields

  p_i = P(X = i) ≈ (4^i/i!) e^{−4}.

The desired probability is

  p_a = 1 − p₀ − p₁ − p₂ ≈ 1 − 0.0183 − 0.0733 − 0.1465 = 0.7619.
Example 1.4 A delivery of 10,000 transistors contains 200 defective ones. ... A sample of size n = 100 is taken. ... The probability of rejection p_r is the producer's risk, since the delivery is in line with the agreement. ... Let X be the random number of defective transistors in the sample. X has a hypergeometric distribution with parameters M = 200, N = 10,000 and n = 100:

  p_m = P(X = m) = C(200, m) C(9,800, 100 − m) / C(10,000, 100).

Since N is large enough compared to n, the binomial approximation with p = M/N = 0.02 applies:

  p_m ≈ C(100, m) (0.02)^m (0.98)^{100−m}.

Thus, the delivery is rejected with probability p_r ≈ 0.… For the sake of comparison: the Poisson approximation with λ = n p = 2 yields p_r ≈ 0.…

1.2.3 Continuous Random Variables

1.2.3.1 ...

... This property of a continuous random variable results from its definition:

A random variable is called continuous if its distribution function F(x) has a first derivative. ...

The function

  f(x) = F'(x) = dF(x)/dx,  x ∈ R_X,

is called the probability density function of X (briefly: probability density or simply density). ... A density has property (Figure 1.4)

  ∫_{−∞}^{+∞} f(x) dx = 1.

Conversely, every nonnegative function f(x) satisfying this condition is the probability density of a certain random variable X. ... The range of X coincides with the set of all those x for which its density is positive: R = {x, f(x) > 0} (Figure 1.4).


Figure 1.4 Probability density f(x): the ordinate f(x₀) and the area F(x₀) to the left of the point x₀

In terms of its distribution function, the mean value of X is given by

  E(X) = ∫₀^∞ [1 − F(x)] dx − ∫_{−∞}^0 F(x) dx.

If X is nonnegative, the continuous analogue of (1.15) is

  E(X) = ∫₀^∞ [1 − F(x)] dx.   (1.17)

If h(x) is a real function and X any continuous random variable with density f(x), then the mean value of the random variable Y = h(X) can directly be obtained from the density of X:

  E(Y) = ∫_{−∞}^{+∞} h(x) f(x) dx.   (1.18)

In particular, the mean value of h(X) = (X − E(X))² is the variance of X:

  Var(X) = ∫_{−∞}^{+∞} (x − E(X))² f(x) dx.

Standard deviation and coefficient of variation are defined and motivated as with discrete random variables. ...

The following relationship between variance, second moment and mean value is also valid for discrete random variables:

  Var(X) = E(X²) − (E(X))².   (1.19)

For a continuous random variable X, the interval probability (1.13) becomes

  P(a < X ≤ b) = F(b) − F(a) = ∫_a^b f(x) dx.

The α-percentile x_α (also denoted as α-quantile q_α) of a random variable X is defined as

  F(x_α) = α.

The 0.5-percentile x₀.₅ is called the median. ... Thus, in a long series of random experiments with outcome X, about 50% of the observed values will be to the left and to the right of x₀.₅. ...

A probability distribution is symmetric with symmetry center a if f(x) satisfies f(a − x) = f(a + x) for all x. ...
Figure 1.5 α-percentile and median of a continuous distribution (the area α under the density lies to the left of x_α)

A density f(x) is called unimodal if it has only one maximum. ... For any random variable X with finite mean value μ and variance σ², the random variable

  Z = (X − μ)/σ

is a standardized random variable, i.e. E(Z) = 0 and Var(Z) = 1.

1.2.3.2 Important Continuous Probability Distributions

Uniform Distribution A random variable X has a uniform distribution over an interval [c, d] if its density is constant there: f(x) = 1/(d − c), c ≤ x ≤ d. ... If the distribution function is not explicitly given, it can only be represented as an integral over the density. ...

Thus, for any subinterval [a, b] of [c, d], the corresponding interval probability is

  P(a < X ≤ b) = (b − a)/(d − c),

i.e. ...

Mean value and variance of X are

  E(X) = (c + d)/2,  Var(X) = (d − c)²/12.

Pareto Distribution A random variable X has a Pareto distribution with parameters c and d if it has distribution function

  F(x) = 1 − (d/x)^c,  x ≥ d > 0, c > 0.

Mean value and variance are

  E(X) = c d/(c − 1),  c > 1,
  Var(X) = c d²/((c − 1)²(c − 2)),  c > 2.

Exponential Distribution A random variable X has an exponential distribution with parameter λ > 0 if it has density

  f(x) = λ e^{−λx},  x ≥ 0.

Mean value and variance are

  E(X) = 1/λ,  Var(X) = 1/λ².

Frequently, the parameter λ is denoted as 1/μ. ...

Erlang Distribution A random variable X has an Erlang distribution with parameters n and λ if it has density

  f(x) = λ (λx)^{n−1} e^{−λx}/(n − 1)!,  x ≥ 0;  n = 1, 2, ...

Mean value and variance are

  E(X) = n/λ,  Var(X) = n/λ².

Gamma Distribution A random variable X has a gamma distribution with parameters α and β if it has density

  f(x) = (β^α/Γ(α)) x^{α−1} e^{−βx};  x > 0, α > 0, β > 0,

where the gamma function Γ(z) is defined by

  Γ(z) = ∫₀^∞ x^{z−1} e^{−x} dx,  z > 0.

Mean value and variance are

  E(X) = α/β,  Var(X) = α/β².

Special cases: exponential distribution for α = 1 and β = λ, Erlang distribution for α = n and β = λ.
...

Beta Distribution A random variable X has a beta distribution with parameters α and β if it has density

  f(x) = x^{α−1} (1 − x)^{β−1} / B(α, β),  0 < x < 1,

where B(α, β) = Γ(α) Γ(β)/Γ(α + β). Mean value and variance are

  E(X) = α/(α + β),  Var(X) = αβ/((α + β)² (α + β + 1)).


Weibull Distribution A random variable X has a Weibull distribution with scale parameter θ and form parameter β if it has distribution function and density (Figure 1.6)

  F(x) = 1 − e^{−(x/θ)^β},  f(x) = (β/θ)(x/θ)^{β−1} e^{−(x/θ)^β};  x > 0, θ > 0, β > 0.

Mean value and variance are

  E(X) = θ Γ(1/β + 1),  Var(X) = θ² [Γ(2/β + 1) − (Γ(1/β + 1))²].

Figure 1.6 Densities of the Weibull distribution

Special cases: exponential distribution for θ = 1/λ and β = 1, Rayleigh distribution for β = 2. ... Rosin and E. Rammler ... In the forties of the past century, the Swedish engineer W. Weibull ...

Normal Distribution A random variable X has a normal (or Gaussian) distribution with parameters μ and σ² if it has density (Figure 1.7)

  f(x) = (1/(√(2π) σ)) e^{−(x−μ)²/(2σ²)},  −∞ < x < +∞.

As the notation of the parameters indicates, mean value and variance are

  E(X) = μ,  Var(X) = σ².

A normally distributed random variable with these parameters is denoted as X = N(μ, σ²). ... Different from most other probability distributions, the standardization of a normally distributed random variable also has a normal distribution. ...

The density of the standardized normal distribution is denoted as φ(x):

  φ(x) = (1/√(2π)) e^{−x²/2},  −∞ < x < +∞;

the corresponding distribution function is denoted as Φ(x).

Figure 1.7 Density of the normal distribution (Gaussian bell curve)

Since φ(x) is symmetric with symmetry center 0,

  Φ(x) = 1 − Φ(−x).

This is the reason for introducing the following notation (analogously for other distributions with symmetry center 0):

  z_α = x_{1−α},  0 < α < 1/2.

Generally, if X = N(μ, σ²), the interval probabilities (1.13) can be expressed in terms of Φ:

  P(a < X ≤ b) = Φ((b − μ)/σ) − Φ((a − μ)/σ).
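Since Φ(x) = (1 + erf(x/√2))/2, such interval probabilities can be computed without tables; a small sketch (μ and σ are illustrative values, reproducing the well-known two-sigma rule):

```python
# Interval probabilities of an N(μ, σ²) random variable via Φ, using the
# identity Φ(x) = (1 + erf(x/√2)) / 2 from the math module.
import math

def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def interval_prob(a, b, mu, sigma):
    """P(a < X <= b) = Φ((b-μ)/σ) − Φ((a-μ)/σ)."""
    return Phi((b - mu) / sigma) - Phi((a - mu) / sigma)

# Two-sigma rule: P(μ − 2σ < X ≤ μ + 2σ) ≈ 0.9545.
p = interval_prob(600 - 6, 600 + 6, 600, 3)
print(round(p, 4))  # 0.9545
```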

Logarithmic Normal Distribution A random variable X has a logarithmic normal distribution with parameters μ and σ if it has density

  f(y) = (1/(√(2π) σ y)) exp{−(1/2)((ln y − μ)/σ)²};  y > 0, σ > 0, −∞ < μ < ∞.

Thus, X has a logarithmic normal distribution with parameters μ and σ if it has structure X = e^Y, where Y = N(μ, σ²). ... Therefore, the distribution function of X is

  F(y) = Φ((ln y − μ)/σ),  y > 0.

Mean value and variance are

  E(X) = e^{μ + σ²/2},  Var(X) = e^{2μ + σ²} (e^{σ²} − 1).




Cauchy Distribution A random variable X has a Cauchy distribution with parameters λ and μ if it has density

  f(x) = λ / (π [λ² + (x − μ)²]),  −∞ < x < +∞, λ > 0.

Mean value and variance do not exist.
...


Inverse Gaussian Distribution A random variable X has an inverse Gaussian distribution with parameters α and β if it has density

  f(x) = √(α/(2π x³)) exp(−α(x − β)²/(2β² x));  x > 0, α > 0, β > 0.

Mean value and variance are

  E(X) = β,  Var(X) = β³/α.

... Mean value and variance are

  E(X) = μ,  Var(X) = σ².
...
Example 1.5 A company needs wooden shafts of a length of 600 mm. Shafts deviating from this nominal value by more than 6 mm are rejected. The producer delivers shafts of random length X which has an N(600, σ²)-distribution.

1) If σ = 3, the probability that a shaft is rejected is

  P(|X − 600| > 6) = 1 − [Φ(6/3) − Φ(−6/3)] = 2 [1 − Φ(2)] = 2 [1 − 0.97725] = 0.0455.

Thus, 4.55% of the shafts will be rejected. ...

2) What is the value of σ if the company rejects on average 10% of the shafts?

By making use of the previous derivation with σ = 3 replaced by σ,

  P(|X − 600| > 6) = 1 − [Φ(6/σ) − Φ(−6/σ)] = 2 [1 − Φ(6/σ)].

Setting this probability equal to 0.1 gives Φ(6/σ) = 0.95, or, equivalently,

  6/σ = z₀.₀₅ = x₀.₉₅ = 1.64,

since the 0.95-percentile of the standardized normal distribution is x₀.₉₅ = 1.64. Thus, σ = 6/1.64 = 3.66.

Example 1.6 The lifetime X of African wild dogs has mean value E(X) = 8.86230 [years] and variance σ² = 21.45964. ...

1) Assuming that the lifetime of an African wild dog has a Weibull distribution, the parameters θ and β of this distribution satisfy

  E(X) = θ Γ(1 + 1/β) = 8.86230,  Var(X) = θ² [Γ(1 + 2/β) − Γ²(1 + 1/β)] = 21.45964.

Dividing the second equation by the square of the first eliminates θ:

  Γ(1 + 2/β)/Γ²(1 + 1/β) − 1 = 21.45964/8.86230² = 0.27323,

which holds for β = 2. Hence, θ = 8.86230/Γ(1.5) = 8.86230/0.88623 = 10.

2) Given that a wild dog has already reached an age of 5 years, what is the probability that it survives the age of 10 years? By the conditional probability (1.6), the probability of interest is

  P(X > 10 | X > 5) = P(X > 10)/P(X > 5) = e^{−(10/10)²}/e^{−(5/10)²} = e^{−0.75} = 0.47237.

For comparison, the unconditional survival probability is P(X > 10) = e^{−1} = 0.36788.
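The conditional survival probability in example 1.6 follows directly from the Weibull survival function; a minimal numerical check:

```python
# Recomputing example 1.6, part 2: conditional survival probability of a
# Weibull distributed lifetime with θ = 10 and β = 2.
import math

theta, beta = 10.0, 2.0
survival = lambda x: math.exp(-(x / theta) ** beta)  # P(X > x) = e^{-(x/θ)^β}

# P(X > 10 | X > 5) = P(X > 10) / P(X > 5)
p_cond = survival(10.0) / survival(5.0)
p_uncond = survival(10.0)

print(round(p_cond, 5), round(p_uncond, 5))  # 0.47237 0.36788
```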
1.2.4 Mixtures of Random Variables

... To emphasize the dependency on a special parameter θ, in this section the notation P_{X,θ} instead of P_X is used. ...

Mixtures of random variables or their probability distributions arise from the assumption that the parameter θ is a realization of a random parameter Θ, and all the probability distributions being elements of the set {P_{X,θ}, θ ∈ R_Θ} are mixed. ...

Discrete Random Variable Θ With range R_Θ = {θ₀, θ₁, ...} and q_n = P(Θ = θ_n); n = 0, 1, ..., the mixture of the distribution functions F_X(x, θ_n) is the weighted sum

  G(x) = Σ_{n=0}^∞ F_X(x, θ_n) q_n.

Continuous Random Variable Θ Let Θ be continuous with density f_Θ(θ). Then the mixture of probability distributions of type P_{X,θ} is defined as

  G(x) = ∫_{R_Θ} F_X(x, θ) f_Θ(θ) dθ.

... If Θ is continuous, G(x) is the weighted integral of F_X(x, θ) with weight function f_Θ(θ). ... In either case, G(x) has the properties of a distribution function; let Y denote a random variable with distribution function G(x). ...

If X is continuous, the respective densities of Y are

  g(x) = Σ_{n=0}^∞ f_X(x, θ_n) q_n  and  g(x) = ∫_{R_Θ} f_X(x, θ) f_Θ(θ) dθ.

...

If X is discrete with probability distribution

  P_{X,θ} = {p_i(θ) = P(X = x_i; θ); i = 0, 1, ...},

then the mixture is given by

  P(Y = x_i) = Σ_{n=0}^∞ p_i(θ_n) q_n  or  P(Y = x_i) = ∫_{R_Θ} p_i(θ) f_Θ(θ) dθ;  i = 0, 1, ...   (1.21)

The probability distribution of Θ is sometimes called structure or mixing distribution. ...

The mixture of probability distributions provides a method for producing types of probability distributions which are specifically tailored to serve the needs of certain applications. ...

Example 1.7 (mixture of exponential distributions) Let X have an exponential distribution with parameter λ:

  F_X(x, λ) = P(X ≤ x) = 1 − e^{−λx},  x ≥ 0.

The parameter λ is assumed to be a realization of a random variable L which has an exponential distribution with parameter μ, i.e. with density f_L(λ) = μ e^{−μλ}, λ > 0. Mixing yields the distribution function

  G(x) = ∫₀^{+∞} F_X(x, λ) f_L(λ) dλ = ∫₀^{+∞} (1 − e^{−λx}) μ e^{−μλ} dλ = 1 − μ/(x + μ),  x ≥ 0.

This is a Pareto distribution.
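Example 1.7 can be verified by simulation: draw λ from the exponential structure distribution, then draw X from Exp(λ), and compare the empirical distribution with the Pareto form G(x) = 1 − μ/(x + μ). The values of μ, x and the seed below are illustrative choices:

```python
# Monte Carlo check of example 1.7: mixing Exp(λ) over an exponentially
# distributed parameter L (parameter μ) gives G(x) = 1 − μ/(x + μ).
import random

random.seed(7)
mu = 2.0
n = 200_000

def draw_Y():
    lam = random.expovariate(mu)     # structure variable L ~ Exp(μ)
    return random.expovariate(lam)   # X given λ

x = 3.0
est = sum(draw_Y() <= x for _ in range(n)) / n
exact = 1.0 - mu / (x + mu)          # Pareto distribution function

print(round(est, 3), exact)  # estimate close to the exact value 0.6
```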
Example 1.8 (mixture of binomial distributions) Let X have a binomial distribution with parameters n and p:

  P(X = i) = C(n, i) p^i (1 − p)^{n−i};  i = 0, 1, 2, ..., n.

The parameter n is considered to be a value of a Poisson with parameter λ distributed random variable N:

  P(N = n) = (λⁿ/n!) e^{−λ};  n = 0, 1, ...

Then, from (1.21), the mixture of the binomial distributions with regard to the structure distribution P_N is obtained as follows:

  P(Y = i) = Σ_{n=0}^∞ C(n, i) p^i (1 − p)^{n−i} (λⁿ/n!) e^{−λ}
           = Σ_{n=i}^∞ C(n, i) p^i (1 − p)^{n−i} (λⁿ/n!) e^{−λ}
           = ((λp)^i/i!) e^{−λ} Σ_{k=0}^∞ [λ(1 − p)]^k/k!
           = ((λp)^i/i!) e^{−λ} e^{λ(1−p)}
           = ((λp)^i/i!) e^{−λp};  i = 0, 1, ...

This is a Poisson distribution with parameter λp.
Mixed Poisson Distribution Consider the family of Poisson distributions {P_{X,λ}; λ > 0}. A random variable Y with range {0, 1, ...} is said to have a mixed Poisson distribution if its probability distribution is a mixture of the Poisson distributions P_{X,λ} with regard to any structure distribution, i.e. ...

A mixed Poisson distributed random variable Y has the following properties:

  (1) E(Y) = E(L),
  (2) Var(Y) = E(L) + Var(L),
  (3) P(Y > n) = ∫₀^∞ (λⁿ e^{−λ}/n!) F̄_L(λ) dλ,

where F_L(λ) = P(L ≤ λ) is the distribution function of the structure variable L and F̄_L(λ) = 1 − F_L(λ).

Example 1.9 (mixed Poisson distribution, gamma structure distribution) Let the random structure variable L have a gamma distribution with density

  f_L(λ) = (β^α/Γ(α)) λ^{α−1} e^{−βλ};  λ > 0, α > 0, β > 0.

Then the mixture is

  P(Y = i) = ∫₀^∞ (λ^i e^{−λ}/i!) f_L(λ) dλ
           = (β^α/(Γ(α) i!)) ∫₀^∞ λ^{α+i−1} e^{−(β+1)λ} dλ
           = (Γ(α + i)/(Γ(α) i!)) (β/(β + 1))^α (1/(β + 1))^i;  i = 0, 1, ...

This is a negative binomial distribution with parameters r = α and p = 1/(β + 1).
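Example 1.9 can be checked numerically by evaluating the mixture integral with a simple quadrature and comparing it with the negative binomial probabilities (α, β, the integration step and cutoff are illustrative assumptions):

```python
# Numerical check of example 1.9: a Poisson distribution mixed over a
# gamma structure distribution is negative binomial with r = α and
# p = 1/(β + 1).
import math

alpha, beta = 3.0, 2.0
r, p = alpha, 1.0 / (beta + 1.0)

def gamma_density(lam):
    return beta ** alpha / math.gamma(alpha) * lam ** (alpha - 1) * math.exp(-beta * lam)

def mixed_pmf(i, dlam=1e-3, upper=60.0):
    # ∫ (λ^i e^{-λ} / i!) f_L(λ) dλ by the rectangle rule
    return sum(
        (lam ** i * math.exp(-lam) / math.factorial(i)) * gamma_density(lam) * dlam
        for lam in (k * dlam for k in range(1, int(upper / dlam)))
    )

def neg_binom_pmf(i):
    return math.comb(int(r) + i - 1, i) * p ** i * (1 - p) ** r

for i in range(5):
    assert abs(mixed_pmf(i) - neg_binom_pmf(i)) < 1e-3

print("mixture matches negative binomial")
```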


1.2.5 Functions of a Random Variable

Let X be a continuous random variable and y = h(x) a real function. ...

Theorem 1.1 Let Y = αX + β with real constants α ≠ 0 and β. Then,

  F_Y(y) = F_X((y − β)/α)  for α > 0,
  F_Y(y) = 1 − F_X((y − β)/α)  for α < 0,
  f_Y(y) = (1/|α|) f_X((y − β)/α)  for α ≠ 0,
  E(Y) = α E(X) + β,  Var(Y) = α² Var(X).

...

  F_Y(y) = P(Y ≤ y) = P(αX + β ≤ y) = P(X > (y − β)/α) = 1 − F_X((y − β)/α)  for α < 0.

...

For α > 0, the variance of Y is

  Var(Y) = ∫ (y − E(Y))² f_Y(y) dy = ∫ (y − α E(X) − β)² (1/α) f_X((y − β)/α) dy.

Substituting x = (y − β)/α yields Var(Y) = α² ∫ (x − E(X))² f_X(x) dx = α² Var(X). (The integrals involved refer to the ranges of X and Y.) ...

If X = N(μ, σ²), then the standardization of X, namely

  Z = (X − μ)/σ = (1/σ) X − μ/σ,

also has a normal distribution. ... Usually, Y = αX + β has not the same distribution type as X. ...

This distribution function characterizes the class of shifted exponential distributions. ...

Strictly Monotone Function y = h(x) Let y = h(x) be a strictly monotone function with inverse function x = h⁻¹(y). If y = h(x) is strictly increasing, then, for any random variable X,

  F_Y(y) = P(h(X) ≤ y) = P(X ≤ h⁻¹(y)) = F_X(h⁻¹(y)).

If y = h(x) is strictly decreasing, then, for any random variable X,

  F_Y(y) = P(h(X) ≤ y) = P(X > h⁻¹(y)) = 1 − F_X(h⁻¹(y)).

By differentiation, applying the chain rule, the density of Y is in either case seen to be

  f_Y(y) = f_X(h⁻¹(y)) |dh⁻¹(y)/dy| = f_X(x(y)) |dx(y)/dy|.

This holds for all y in the range of Y. Outside of this range, the distribution function of Y is 0 or 1 and the density of Y is 0.
...
Example 1.10 A solid of mass m moves along a straight line with a random velocity X, which is uniformly distributed over the interval [0, V]. The kinetic energy Y = h(X) = (1/2) m X² of the solid is a random variable. ...

In view of y = h(x) = (1/2) m x², it follows that

  x = h⁻¹(y) = √(2y/m)  and  dx/dy = √(1/(2my)),  0 < y < (1/2) m V².

Hence the density of Y is

  f_Y(y) = (1/V) √(1/(2my)),  0 < y < (1/2) m V².

The mean kinetic energy of the solid is

  E(Y) = ∫₀^{mV²/2} y (1/V) √(1/(2my)) dy = (1/V) √(1/(2m)) ∫₀^{mV²/2} √y dy
       = (1/V) √(1/(2m)) (2/3) [y^{3/2}]₀^{mV²/2} = (1/6) m V².

Of course, this result can more easily be obtained by (1.18):

  E(Y) = ∫₀^V (1/2) m x² (1/V) dx = (1/2) m (1/V) ∫₀^V x² dx = (1/6) m V².
1.3 TRANSFORMATION OF PROBABILITY DISTRIBUTIONS

The probability distributions or at least moments of random variables can frequently be obtained from special functions, so called (probability- or moment-) generating functions of random variables or, equivalently, of their probability distributions. ... Examples will be considered in the following chapters. ... Formally, going over from a probability distribution to its generating function is a transformation of this distribution. ...


1.3.1 z-Transformation

Let X be a discrete random variable with range {0, 1, ...} and probability distribution {p_i = P(X = i); i = 0, 1, ...}. The z-transform of X (or of its probability distribution) is defined as the power series

  M(z) = E(z^X) = Σ_{i=0}^∞ p_i z^i.   (1.22)

For our purposes it is sufficient to assume that z is a real number. From (1.22), M(1) = Σ p_i = 1, so that the series converges absolutely for |z| ≤ 1. ...

Therefore, M(z) can be differentiated (as well as integrated) term by term:

  M'(z) = Σ_{i=0}^∞ i p_i z^{i−1}.

Letting z = 1 yields M'(1) = E(X). ...

Taking the second derivative of M(z) gives

  M''(z) = Σ_{i=0}^∞ (i − 1) i p_i z^{i−2}.

Therefore, M''(1) = E(X²) − E(X). Hence,

  E(X) = M'(1),  Var(X) = M''(1) + M'(1) − (M'(1))².

Continuing in this way, all moments of X can be generated by derivatives of M(z). ... In view of (1.22), the probabilities p_i can be recovered from M(z) by

  p_i = M^{(i)}(0)/i!;  i = 0, 1, ...   (1.23)

Hence, M(z) is also called a probability generating function. ...


Let X have a Poisson distribution with parameter λ. Then,

  M(z) = Σ_{i=0}^∞ (λ^i/i!) e^{−λ} z^i = e^{−λ} Σ_{i=0}^∞ (λz)^i/i! = e^{−λ} e^{+λz} = e^{λ(z−1)}.

The first two derivatives are

  M'(z) = λ e^{λ(z−1)},  M''(z) = λ² e^{λ(z−1)}.

Thus, mean value, second moment and variance of X are

  E(X) = λ,  E(X²) = λ(λ + 1),  Var(X) = λ.
...
Let X have a binomial distribution with parameters p and n, i.e. p_i = b(i, n, p); i = 0, 1, ..., n. Then,

  M(z) = Σ_{i=0}^n C(n, i) (pz)^i (1 − p)^{n−i}.

This is a binomial series so that

  M(z) = [pz + (1 − p)]^n.

Hence,

  M'(1) = np  and  M''(1) = (n − 1) n p²,

so that

  E(X) = np,  E(X²) = (n − 1) n p² + np,  Var(X) = np(1 − p).
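The moment relations E(X) = M'(1) and M''(1) = E(X²) − E(X) are easy to confirm numerically, since the derivatives at z = 1 are the sums Σ i p_i and Σ (i − 1) i p_i; a small sketch for the binomial case (n and p are illustrative choices):

```python
# Checking E(X) = M'(1) and Var(X) = M''(1) + M'(1) − (M'(1))² for the
# binomial z-transform M(z) = [pz + (1-p)]^n, using term-by-term
# derivatives M'(1) = Σ i p_i and M''(1) = Σ (i-1) i p_i.
import math

n, p = 12, 0.25
pmf = [math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]

M1 = sum(i * pi for i, pi in enumerate(pmf))            # M'(1)
M2 = sum((i - 1) * i * pi for i, pi in enumerate(pmf))  # M''(1)

EX = M1                   # mean value, should equal np = 3
VarX = M2 + M1 - M1 ** 2  # variance, should equal np(1-p) = 2.25

print(round(EX, 6), round(VarX, 6))  # 3.0 2.25
```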
Convolution Let {p₀, p₁, ...} and {q₀, q₁, ...} be the probability distributions of two discrete random variables X and Y, each with range {0, 1, ...}, and let a sequence {r₀, r₁, ...} be defined by

  rₙ = p₀ qₙ + p₁ q_{n−1} + ... + pₙ q₀;  n = 0, 1, ...   (1.24)

The sequence {r₀, r₁, ...} is called the convolution of the sequences {p₀, p₁, ...} and {q₀, q₁, ...}.


For deriving the z-transform of the convolution, the following formula is needed:

  Σ_{n=0}^∞ Σ_{i=0}^n a_{in} = Σ_{i=0}^∞ Σ_{n=i}^∞ a_{in}.   (1.25)

If Z denotes that random variable whose probability distribution is the convolution {r₀, r₁, ...}, then, by (1.25),

  M_Z(z) = Σ_{n=0}^∞ rₙ zⁿ = Σ_{n=0}^∞ Σ_{i=0}^n p_i q_{n−i} zⁿ = (Σ_{i=0}^∞ p_i z^i)(Σ_{k=0}^∞ q_k z^k).

Thus, the z-transform of Z is the product of the z-transforms of X and Y:

  M_Z(z) = M_X(z) ⋅ M_Y(z).   (1.26)

1.3.2 Laplace Transformation

Let f(x) be any real-valued function on [0, +∞) with properties

1) f(x) is piecewise continuous,
2) there exist real constants a and s₀ such that |f(x)| ≤ a e^{s₀x} for all x ≥ 0.

Then the Laplace transform of f(x) is defined as

  f̂(s) = ∫₀^∞ e^{−sx} f(x) dx.

Notation If z = x + iy is any complex number (i.e. i = √−1 and x, y are real numbers), then R(z) denotes the real part of z: R(z) = x. The integral converges for all s with R(s) > s₀. With regard to the applications considered in this book, s can be assumed to be real. ...

Specifically, if f(x) is the probability density of a nonnegative random variable X, then f̂(s) has a simple interpretation:

  f̂(s) = E(e^{−sX}).   (1.27)

This relationship is identical to (1.22) with z replaced by e^{−s}. ...

The n-fold derivative of f̂(s) with respect to s is

  dⁿ f̂(s)/dsⁿ = (−1)ⁿ ∫₀^∞ xⁿ e^{−sx} f(x) dx,   (1.28)

so that the moments of X are obtained by letting s = 0:

  E(Xⁿ) = (−1)ⁿ [dⁿ f̂(s)/dsⁿ]_{s=0}.

Hence, f̂(s) is a moment generating function. However, the Laplace transform is also a probability (density) generating function, since via a (complex) inversion formula the density of X can be obtained from its Laplace transform. ...

Partial integration in f̂(s) yields (s > s₀ ≥ 0)

  L{∫₀^x f(u) du} = (1/s) f̂(s),   (1.29)

  L{f'} = s f̂(s) − f(0),   (1.30)

and, more generally, for the n-th derivative,

  L{f^{(n)}} = sⁿ f̂(s) − s^{n−1} f(0) − s^{n−2} f'(0) − ... − s f^{(n−2)}(0) − f^{(n−1)}(0).

Linearity For any two functions f₁ and f₂ with Laplace transforms f̂₁(s) and f̂₂(s),

  L{f₁ + f₂} = L{f₁} + L{f₂} = f̂₁(s) + f̂₂(s).   (1.31)

Convolution The convolution f₁ ∗ f₂ of two functions f₁ and f₂, which are defined on the interval [0, +∞), is given by

  (f₁ ∗ f₂)(x) = ∫₀^x f₂(x − u) f₁(u) du.

Analogously to (1.26),

  L{f₁ ∗ f₂} = L{f₁} L{f₂} = f̂₁(s) f̂₂(s).   (1.32)

A proof of this relationship is easily established:

  L{f₁ ∗ f₂} = ∫₀^∞ e^{−sx} ∫₀^x f₂(x − u) f₁(u) du dx
             = ∫₀^∞ e^{−su} f₁(u) ∫_u^∞ e^{−s(x−u)} f₂(x − u) dx du
             = ∫₀^∞ e^{−su} f₁(u) ∫₀^∞ e^{−sy} f₂(y) dy du
             = f̂₁(s) f̂₂(s).

Thus, (1.32) means that the Laplace transform of the convolution of two functions is equal to the product of the Laplace transforms of these functions. ... In the proof of (1.32), Dirichlet's formula had been applied:

  ∫₀^z ∫₀^y f(x, y) dx dy = ∫₀^z ∫_x^z f(x, y) dy dx.   (1.33)

Obviously, formula (1.33) is the continuous analogue of (1.25).
Retransformation The Laplace transform f̂(s) is called the image of f(x), and f(x) is the pre-image of f̂(s). ... Properties (1.29) to (1.32) of the Laplace transformation suggest that Laplace transforms should be decomposed as far as possible into terms and factors (for instance, decomposing a fraction into partial fractions), because the retransformations of the arising less complex terms and factors are usually easier done than the retransformation of the original image. ... Comprehensive tables of pre-images and their images are available. These tables contain important functions and their Laplace transforms. ... There is also a general inversion formula. Its application requires knowledge of complex calculus.
...
Example 1.11 Let X have an exponential distribution with parameter λ:

  f(x) = λ e^{−λx},  x ≥ 0.

Its Laplace transform is

  f̂(s) = ∫₀^∞ e^{−sx} λ e^{−λx} dx = λ/(s + λ).

It exists for s > −λ. From (1.28),

  dⁿ f̂(s)/dsⁿ = (−1)ⁿ n! λ/(s + λ)^{n+1}.

Thus, the n-th moment is

  E(Xⁿ) = n!/λⁿ;  n = 0, 1, ...
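Both claims of example 1.11 can be checked numerically: the defining integral of f̂(s) can be approximated by a simple quadrature and compared with λ/(s + λ). The parameter values below are illustrative assumptions:

```python
# Numerical check of example 1.11: for f(x) = λ e^{-λx} the Laplace
# transform is f̂(s) = λ/(s + λ).
import math

lam, s = 2.0, 0.5

# f̂(s) = ∫₀^∞ e^{-sx} λ e^{-λx} dx by the midpoint rule on [0, 40];
# the integrand beyond 40 is negligible for these parameters.
dx = 1e-4
f_hat = sum(
    math.exp(-s * x) * lam * math.exp(-lam * x) * dx
    for x in ((k + 0.5) * dx for k in range(int(40 / dx)))
)

print(round(f_hat, 4), lam / (s + lam))  # 0.8 0.8
```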
Example 1.12 The definition of the Laplace transform can be extended to functions defined on the whole real axis (−∞, +∞). Let f(x) be the density of an N(μ, σ²)-distribution. Its Laplace transform is defined as

  f̂(s) = (1/(√(2π) σ)) ∫_{−∞}^{+∞} e^{−sx} e^{−(x−μ)²/(2σ²)} dx.

Substituting u = (x − μ)/σ yields

  f̂(s) = (1/√(2π)) e^{−μs} ∫_{−∞}^{+∞} e^{−σsu} e^{−u²/2} du
        = (1/√(2π)) e^{−μs + σ²s²/2} ∫_{−∞}^{+∞} e^{−(u+σs)²/2} du.

The remaining integral equals √(2π). Hence,

  f̂(s) = e^{−μs + σ²s²/2}.

a) Moment Generating Function Let X be a random variable with density f(x) and t a real parameter. The moment generating function of X is defined as

  M(t) = E(e^{tX}).

M(t) arises from the Laplace transform of f(x) by letting s = −t. ...

b) Characteristic Function Let X be a random variable with density f(x), t a real parameter and i = √−1. The characteristic function of X is defined as

  ψ(t) = E(e^{itX}) = ∫_{−∞}^{+∞} e^{itx} f(x) dx.

Obviously, ψ(t) is the Fourier transform of f(x). ...

Characteristic functions belong to the most important mathematical tools for solving probability theoretic problems, e.g. for proving limit theorems and for characterizing and analyzing stochastic processes. ...

The characteristic function has quite analogous properties to the Laplace transform (if the latter exists) with regard to its relationship to the probability distribution of sums of independent random variables.
...
1.4 CLASSES OF PROBABILITY DISTRIBUTIONS BASED ON AGING BEHAVIOUR

This section is restricted to the class of nonnegative random variables. Such random variables arise, in particular, as lifetimes of technical systems or organisms. Hence, a terminology is used tailored to this application: a nonnegative random variable X with distribution function F(x) is interpreted as the lifetime of a system. In the engineering context, a failure of a system need not be equivalent to the end of its useful life, since a failed system can usually be repaired or replaced.

Residual Lifetime Let F_t(x) be the distribution function of the residual lifetime X_t of a system, which has already worked for t time units without failing:

F_t(x) = P(X_t ≤ x) = P(X - t ≤ x | X > t).

By (1.6),

F_t(x) = P(X - t ≤ x ∩ X > t)/P(X > t) = P(t < X ≤ t + x)/P(X > t).

Applying (1.13) yields the desired result:

F_t(x) = [F(t + x) - F(t)]/F̄(t);   x ≥ 0, t ≥ 0,      (1.34)

where F̄(t) = 1 - F(t) = P(X > t). The corresponding conditional survival probability F̄_t(x) = 1 - F_t(x) is given by

F̄_t(x) = F̄(t + x)/F̄(t);   x ≥ 0, t ≥ 0.      (1.35)

Hence, using (1.16), the mean residual lifetime μ(t) = E(X_t) becomes

μ(t) = (1/F̄(t)) ∫ from t to ∞ of F̄(x) dx.      (1.36)
Example 1.13 (uniform distribution) The random variable X has a uniform distribution over [0, T]:

f(x) = 1/T for 0 ≤ x ≤ T;   F(x) = x/T for 0 ≤ x ≤ T, F(x) = 1 for x > T.

(Figure: illustration of the residual lifetime X_t of a system which is still operating at time t.)

By (1.34), for 0 ≤ t < T and 0 ≤ x ≤ T - t,

F_t(x) = [(t + x)/T - t/T]/[1 - t/T] = x/(T - t).

Thus, X_t is uniformly distributed over the interval [0, T - t], and the conditional failure probability is increasing with increasing t, t < T.
Example 1.14 (exponential distribution) Let X have an exponential distribution with parameter λ, i.e. its density and distribution function are

f(x) = λ e^(-λx),   F(x) = 1 - e^(-λx),   x ≥ 0.      (1.37)

Then, by (1.35), the conditional survival probability is

F̄_t(x) = e^(-λ(t+x))/e^(-λt) = e^(-λx) = F̄(x).

Thus, the residual lifetime of the system has the same distribution as the lifetime of a new system, whatever its age t. The exponential distribution is the only continuous probability distribution, which has this so-called memoryless property or lack of memory property: a system with exponentially distributed lifetime which has already operated for t time units is, stochastically, as good as new. Or, equivalently, if the system has not failed in the interval [0, t], then, with respect to its failure behaviour in [t, ∞), it is at time t as good as new.

The fundamental relationship F̄_t(x) = F̄(x) is equivalent to

F̄(t + x) = F̄(t) F̄(x).      (1.38)

It can be shown that the distribution function of the exponential distribution is the only one which satisfies the functional equation (1.38).
The engineering (biological) background of the conditional failure probability motivates the following definition.

Definition 1.1 A system is aging (rejuvenating) in the interval [t₁, t₂], t₁ < t₂, if for an arbitrary but fixed x the conditional failure probability F_t(x) is increasing (decreasing) for increasing t, t₁ ≤ t ≤ t₂.

Note that here and in what follows the terms 'increasing' and 'decreasing' have the meaning of 'nondecreasing' and 'nonincreasing', respectively.
A quantitative characterization of the aging behaviour of a system is provided by its failure rate. To derive this concept, the conditional system failure probability F_t(Δt) of a system in [t, t + Δt] is considered relative to the length Δt of this interval:

F_t(Δt)/Δt = [F(t + Δt) - F(t)]/[Δt F̄(t)].

For Δt → 0, the first ratio on the right-hand side tends to f(t). Hence, the limit

λ(t) = lim as Δt → 0 of F_t(Δt)/Δt

exists. This limit is called failure rate or hazard function and denoted as λ(t):

λ(t) = f(t)/F̄(t).      (1.39)

(In demography and in actuarial science, λ(t) is called force of mortality.) Since λ(t) = -d ln F̄(t)/dt, integration on both sides of (1.39) yields

F̄(x) = exp(-∫₀^x λ(t) dt).

If introducing the integrated failure rate

Λ(x) = ∫₀^x λ(t) dt,

F(x), F_t(x) and the corresponding survival probabilities can be written as follows:

F(x) = 1 - e^(-Λ(x)),   F̄(x) = e^(-Λ(x)),

F_t(x) = 1 - e^(-[Λ(t+x)-Λ(t)]),   F̄_t(x) = e^(-[Λ(t+x)-Λ(t)]);   x ≥ 0, t ≥ 0.      (1.40)
This representation of F_t(x) implies an important property of the failure rate:

A system ages in [t₁, t₂], t₁ < t₂, if its failure rate λ(t) is increasing in this interval.

The failure rate is identically constant, i.e. λ(t) ≡ λ, if and only if the lifetime is exponentially distributed with parameter λ.      (1.41)

This property of the failure rate can be used for its statistical estimation: At time t = 0 a specified number of independently operating, identical systems start working. The relative frequencies of failures in successive small time intervals, related to the respective numbers of systems still operating, provide estimates of the failure rate over these intervals.

For instance, if X has a Weibull distribution with parameters β and θ, then

λ(x) = (β/θ)(x/θ)^(β-1),   x > 0.

This failure rate is increasing in x for β > 1 and decreasing for β < 1. If β = 1, the failure rate is identically constant: λ(t) ≡ λ = 1/θ.
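The Weibull formula can be checked against the definition (1.39) directly: with the Weibull survival function F̄(x) = exp(-(x/θ)^β), a numerical derivative of F gives the density, and f/F̄ should reproduce (β/θ)(x/θ)^(β-1). A minimal sketch (parameter values are arbitrary):

```python
import math

def weibull_sf(x, beta, theta):
    # Weibull survival function F̄(x) = exp(-(x/θ)^β)
    return math.exp(-((x / theta) ** beta))

def hazard_numeric(x, beta, theta, dx=1e-6):
    # λ(x) = f(x)/F̄(x), with f(x) ≈ [F̄(x) - F̄(x+dx)]/dx
    f = (weibull_sf(x, beta, theta) - weibull_sf(x + dx, beta, theta)) / dx
    return f / weibull_sf(x, beta, theta)

beta, theta = 2.5, 3.0
for x in (0.5, 1.0, 2.0, 4.0):
    closed_form = (beta / theta) * (x / theta) ** (beta - 1)
    assert abs(hazard_numeric(x, beta, theta) - closed_form) < 1e-3
```

For β = 2.5 > 1 the computed hazard indeed increases along the grid, i.e. this Weibull distribution is IFR.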
In what follows, some important nonparametric classes of probability distributions of nonnegative random variables are considered. Originally, they were defined with regard to applications in reliability engineering. The most obvious classes are IFR (increasing failure rate) and DFR (decreasing failure rate):

F(x) is an IFR- (DFR-) distribution if, for every fixed x ≥ 0, the conditional failure probability F_t(x) is increasing (decreasing) in t.

If the density f(x) = F'(x) exists, then, from (1.39) and (1.40), F(x) is IFR (DFR) if and only if the failure rate λ(x) is increasing (decreasing) in x.

Another characterization of IFR and DFR is based on the Laplace transform f̂(s) of the density f(x) = F'(x). For n = 0, 1, ..., let

a₋₁(s) ≡ 1,   a₀(s) = (1/s)[1 - f̂(s)],   aₙ(s) = ((-1)ⁿ/n!) dⁿa₀(s)/dsⁿ.      (1.42)

Then F(x) is IFR (DFR) if and only if

aₙ²(s) ≥ (≤) aₙ₋₁(s) aₙ₊₁(s);   n = 0, 1, ...; s > 0.

If f(x) does not exist, then this statement remains valid if f̂(s) is the Laplace-Stieltjes transform of F(x).
The IFR- (DFR-) class is equivalent to the aging (rejuvenation) concept proposed in definition 1.1 with t₁ = 0 and t₂ = ∞. The following nonparametric classes present modifications and more general concepts of aging and rejuvenation than the ones given by definition 1.1.
IFRA- (DFRA-) Distribution The failure rate (force of mortality) of human beings (as well as of other organisms) is usually not (strictly) increasing; in early childhood it even decreases. But the average failure rate will definitely increase. Hence, the definition of the classes IFRA (increasing failure rate average) and DFRA (decreasing failure rate average) makes sense:

F(x) is an IFRA- (DFRA-) distribution if the function

-(1/t) ln F̄(t)

is increasing (decreasing) in t.

This definition is motivated by the fact that, in view of (1.39), the average failure rate over the interval [0, t] is

λ̄(t) = (1/t) ∫₀^t λ(x) dx = -(1/t) ln F̄(t).
NBU- (NWU-) Distribution Since

F̄_t(x) = F̄(x)

is equivalent to F̄(t + x) = F̄(t) F̄(x), a new system has a smaller failure probability than a used system of age t if and only if

F̄(t + x) ≤ F̄(t) F̄(x)      (1.43)

for all x ≥ 0, t ≥ 0. Hence, F(x) is called an NBU- (NWU-) distribution — new better (worse) than used — if (1.43) holds with '≤' ('≥'). As the classes IFR and DFR (as well as other classes), NBU and NWU can be characterized by properties of Laplace transforms of their probability densities (Vinogradov [85]): With the notation (1.42), F(x) is NBU (NWU) if and only if

aₙ₊₁(s) ≤ (≥) a₀(s) aₙ(s);   n = 0, 1, ...; s > 0.
NBUE- (NWUE-) Distribution According to (1.16) and (1.36), the mean lifetime μ of a new system and the mean residual lifetime μ(t) of a system, which is still operating at age t (used system), are given by

μ = ∫₀^∞ F̄(x) dx,   μ(t) = (1/F̄(t)) ∫ from t to ∞ of F̄(x) dx.      (1.44)

When comparing μ and μ(t), one arrives at the classes NBUE (new better than used in expectation) and NWUE (new worse than used in expectation):

F(x) is an NBUE- (NWUE-) distribution if

(1/μ) ∫ from t to ∞ of F̄(x) dx ≤ (≥) F̄(t)   for all t ≥ 0.

An important role plays the density f_S(x) = F̄(x)/μ, x ≥ 0. The corresponding distribution function is

F_S(t) = 1 - F̄_S(t) = (1/μ) ∫₀^t F̄(x) dx.      (1.45)

Hence, F(x) is an NBUE- (NWUE-) distribution if and only if

F̄_S(x) ≤ (≥) F̄(x)   for all x ≥ 0.
2-NBU- (2-NWU-) Distribution F(x) is a 2-NBU- (2-NWU-) distribution if the corresponding distribution function F_S(x), defined by (1.45), satisfies (1.43) with '≤' ('≥'). Obviously, this is equivalent to F_S(x) being NBU (NWU).

NBUL- (NWUL-) Distribution Multiplying both sides of (1.43) by e^(-sx) and integrating over [0, ∞), one obtains for NBU

∫₀^∞ e^(-sx) F̄(t + x) dx ≤ F̄(t) ∫₀^∞ e^(-sx) F̄(x) dx,

and for NWU,

∫₀^∞ e^(-sx) F̄(t + x) dx ≥ F̄(t) ∫₀^∞ e^(-sx) F̄(x) dx.

F(x) is an NBUL- (NWUL-) distribution if the respective inequality holds for all s ≥ 0, t ≥ 0. Equivalently, F(x) is NBUL (NWUL) if

∫₀^∞ e^(-sx) F̄_t(x) dx ≤ (≥) ∫₀^∞ e^(-sx) F̄(x) dx;   s, t ≥ 0.

Implications between some of these nonparametric distribution classes are:

IFR ⇒ IFRA ⇒ NBU ⇒ NBUE
DFR ⇒ DFRA ⇒ NWU ⇒ NWUE

Knowledge of the nonparametric class a distribution function belongs to and knowledge of some of its numerical parameters allow the construction of lower and/or upper bounds on this otherwise unknown distribution function.
1) Let F(x) = P(X ≤ x) be IFR and μₙ = E(Xⁿ) the nth moment of X. Then

F̄(x) ≥ e^(-x/μₙ^(1/n)) for x ≤ μₙ^(1/n),   F̄(x) ≥ 0 for x > μₙ^(1/n).

In particular, for n = 1, with μ = μ₁ = E(X),

F̄(x) ≥ e^(-x/μ) for x ≤ μ,   F̄(x) ≥ 0 for x > μ.      (1.46)
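The lower bound (1.46) can be checked numerically for a concrete IFR distribution. The sketch below (hypothetical test case) uses the Erlang distribution with 2 phases and parameter λ, which is IFR, with survival function F̄(x) = (1 + λx)e^(-λx) and mean μ = 2/λ:

```python
import math

lam = 1.0
mu = 2.0 / lam                      # mean of the Erlang(2, λ) distribution (IFR)

def sf(x):
    # Erlang(2, λ) survival function: F̄(x) = (1 + λx) e^{-λx}
    return (1.0 + lam * x) * math.exp(-lam * x)

for k in range(1, 100):
    x = mu * k / 100.0              # grid over (0, μ]
    assert sf(x) >= math.exp(-x / mu)   # lower bound (1.46)
```

At x = μ the two sides are 3e^(-2) ≈ 0.406 and e^(-1) ≈ 0.368, so the bound is not tight but is respected everywhere on (0, μ].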

2) The lower bound (1.46) can be improved if, in addition to μ, the second moment μ₂ of X is known. The improved bound depends on a parameter α, 0 < α < 1, which is the solution of the equation

μ₂/μ² - 1 = [2α - α² + 2(1 - α) ln(1 - α)]/α².

3) In the following bounds, γ denotes a coefficient measuring the deviation of F from the exponential distribution; γ ≤ 0 (γ ≥ 0) if F(x) is IFR (DFR).

4) If F(x) is IFR, then (Solov'ev [77])

sup over x of |F̄(x) - e^(-x/μ)| ≤ √(1 - 2γ) - 1.

5) If F(x) is DFR, then (Brown [14]),

sup over x of |F̄(x) - e^(-x/μ)| ≤ 1 - e^(-γ),

sup over x of |F̄(x) - F̄_S(x)| ≤ 1 - e^(-γ),

where F_S(x) is given by (1.45).

6) If F(x) is IFRA, then

F̄(x) ≤ 1 for x < μ,   F̄(x) ≤ e^(-rx) for x ≥ μ,

where r = r(x, μ) is the solution of

1 - rμ = e^(-rx).

8) If F(x) is NBUE (NWUE), then

F̄_S(x) ≤ (≥) e^(-x/μ),   x ≥ 0.
1.5 ORDER RELATIONS BETWEEN RANDOM VARIABLES

Frequently, two random variables X and Y with distribution functions F(x) and G(x) have to be compared. This can be done by so-called stochastic orders. These 'stochastic orders' have proved a powerful tool for the approximate analysis of complex stochastic models, which elude a mathematically rigorous treatment, in particular in queueing, inventory, and reliability theory, and recently in actuarial science. The present state of art of theory and applications can be found in the monograph Müller and Stoyan [62].

Usual Stochastic Order X is smaller than Y with regard to the usual stochastic order if

F̄(x) ≤ Ḡ(x) for all x.      (1.47)

Thus, X assumes large values with lower probability than Y. This order relation was the first one to be studied; for that reason it was simply called the stochastic order.

Notation: X ≤_st Y
With regard to the previous section: F(x) is IFR (DFR) if and only if

X_{t₂} ≤_st X_{t₁}   (X_{t₂} ≥_st X_{t₁})   for 0 ≤ t₁ ≤ t₂,

where X_t is the residual lifetime of a system operating at time t. Analogously, F(x) is NBU (NWU) if and only if X_t ≤_st X (X_t ≥_st X) for all t ≥ 0. Let the random variable X_S have the distribution function F_S(x) given by (1.45). Then F(x) is 2-NBU (2-NWU) if and only if

X_{S,t} ≤_st X_S   (X_{S,t} ≥_st X_S)   for all t ≥ 0,

where X_{S,t} is the residual lifetime belonging to X_S.

Properties of the usual stochastic order:

1) If X ≤_st Y, then E(X) ≤ E(Y).

2) If X ≤_st Y, then E(h(X)) ≤ E(h(Y)) for all increasing functions h(·) for which these mean values exist, and vice versa.
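Property 2 can be illustrated by simulation. In the following sketch (arbitrary choice of distributions and test function), X ~ Exp(2) and Y ~ Exp(1), so that F̄(x) = e^(-2x) ≤ Ḡ(x) = e^(-x) and hence X ≤_st Y:

```python
import random

random.seed(7)
# X ~ Exp(2) and Y ~ Exp(1): F̄(x) = e^{-2x} ≤ Ḡ(x) = e^{-x}, hence X ≤_st Y
xs = [random.expovariate(2.0) for _ in range(100_000)]
ys = [random.expovariate(1.0) for _ in range(100_000)]

def mean(values):
    return sum(values) / len(values)

def h(t):                       # an increasing test function on [0, ∞)
    return t ** 3

assert mean(xs) < mean(ys)                                   # E(X) ≤ E(Y)
assert mean([h(t) for t in xs]) < mean([h(t) for t in ys])   # E(h(X)) ≤ E(h(Y))
```

The exact values here are E(X³) = 6/2³ = 0.75 versus E(Y³) = 6, so the inequality is far from marginal.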

Hazard Rate Order This stochastic order is closely related to the distribution function of the residual lifetime. Let X_t and Y_t be the residual lifetimes at age t belonging to X and Y, respectively. If the usual stochastic order

X_t ≤_st Y_t

is required to hold for all t ≥ 0, then, according to (1.35), this is equivalent to

F̄(t + x)/F̄(t) ≤ Ḡ(t + x)/Ḡ(t)   for all x ≥ 0, t ≥ 0,

or, equivalently, to the ratio F̄(t)/Ḡ(t) being decreasing in t. This relationship motivates the following order relation:

X is smaller than Y with respect to the hazard rate order (failure rate order) if the ratio F̄(t)/Ḡ(t) is decreasing with increasing t.

Notation: X ≤_hr Y

Properties:

1) X ≤_hr Y if and only if X_t ≤_st Y_t for all t ≥ 0.

2) Let X ≤_hr Y and h(·) be an increasing real function. Then h(X) ≤_hr h(Y).

3) If X ≤_hr Y, then X ≤_st Y.
The orders introduced so far compare the location of random variables. However, for many applications it is useful to include the variability aspect: of two random variables with comparable location, the one which is more spread out should be considered the larger one. This aspect is taken into account by convex orders.

(a) X is said to be smaller than Y in convex order if for all real-valued convex functions h(·) with the property that E(h(X)) and E(h(Y)) exist,

E(h(X)) ≤ E(h(Y)).

Notation: X ≤_cx Y

(b) X is said to be smaller than Y in increasing convex order if for all real-valued increasing convex functions h(·) with property that E(h(X)) and E(h(Y)) exist,

E(h(X)) ≤ E(h(Y)).      (1.48)

Notation: X ≤_icx Y

The increasing concave order X ≤_icv Y is defined analogously via increasing concave functions. In actuarial science, 'increasing convex order' had been known as 'stop-loss order', whereas in decision theory 'increasing concave order' had been called 'second order stochastic dominance'.

Properties:

1) X ≤_icx Y if and only if -Y ≤_icv -X. Hence, only one of these stochastic orders needs to be investigated.

2) If X ≤_st Y, then X ≤_icx Y.

3) If X ≤_cx Y, then

E(Xⁿ) ≤ E(Yⁿ) and E((X - E(X))ⁿ) ≤ E((Y - E(Y))ⁿ) for n = 2, 4, ...

4) Let (c - x)⁺ = max(0, c - x) and (x - c)⁺ = max(0, x - c). Condition (1.48) needs to be checked only for a simple class of increasing convex functions, namely the so-called wedge functions

h(x) = (x - c)⁺,   c real.      (1.50)

Analogously, the function h(x) = (c - x)⁺ is convex and decreasing in x, and the corresponding wedge functions serve as test functions for the decreasing convex case.
1.6 MULTIDIMENSIONAL RANDOM VARIABLES

1.6.1 Basic Concepts

Let (X₁, X₂, ..., Xₙ) be an n-dimensional vector, the components of which are random variables. Then (X₁, X₂, ..., Xₙ) is called a random vector, a multidimensional random variable or, more precisely, an n-dimensional random vector or an n-dimensional random variable. The joint (probability) distribution function of X₁, X₂, ..., Xₙ is defined by

F(x₁, x₂, ..., xₙ) = P(X₁ ≤ x₁, X₂ ≤ x₂, ..., Xₙ ≤ xₙ).      (1.51)

This function characterizes the probability distribution of (X₁, X₂, ..., Xₙ). The distribution functions of the Xᵢ, denoted as F_{Xᵢ}(xᵢ) = P(Xᵢ ≤ xᵢ), can be obtained from the joint distribution function:

F_{Xᵢ}(xᵢ) = F(∞, ..., ∞, xᵢ, ∞, ..., ∞);   i = 1, 2, ..., n.      (1.52)

F_{X₁}(x), F_{X₂}(x), ..., F_{Xₙ}(x) are the marginal distributions of (X₁, X₂, ..., Xₙ). The marginal distributions of a random vector cannot fully characterize its probability distribution, since they do not contain information on the statistical dependence between the Xᵢ.

Independence The random variables X₁, X₂, ..., Xₙ are said to be independent if for all vectors (x₁, x₂, ..., xₙ),

F(x₁, x₂, ..., xₙ) = F_{X₁}(x₁) F_{X₂}(x₂) ⋯ F_{Xₙ}(xₙ).      (1.53)

Identical Distribution The random variables X₁, X₂, ..., Xₙ are said to be identically distributed if they have the same distribution function F(x): F_{Xᵢ}(x) = F(x); i = 1, 2, ..., n. For independent, identically distributed random variables,

F(x₁, x₂, ..., xₙ) = F(x₁) F(x₂) ⋯ F(xₙ).

Thus, the joint distribution function of a random vector with independent components is equal to the product of its marginal distribution functions.
1.6.2 Two-Dimensional Random Variables

1.6.2.1 Discrete Components

Consider a random vector (X, Y), the components X and Y of which are discrete random variables with respective ranges x₀, x₁, ... and y₀, y₁, ... and probability distributions

{pᵢ = P(X = xᵢ); i = 0, 1, ...} and {qⱼ = P(Y = yⱼ); j = 0, 1, ...}.

Furthermore, let

rᵢⱼ = P(X = xᵢ ∩ Y = yⱼ).

Then {rᵢⱼ; i, j = 0, 1, ...} is the joint or two-dimensional probability distribution of the random vector (X, Y). The marginal distributions of (X, Y) are obtained as

pᵢ = Σⱼ rᵢⱼ,   qⱼ = Σᵢ rᵢⱼ.      (1.54)

By (1.4),

P(X = xᵢ | Y = yⱼ) = rᵢⱼ/qⱼ   and   P(Y = yⱼ | X = xᵢ) = rᵢⱼ/pᵢ

are the conditional probability distributions of X given Y = yⱼ and of Y given X = xᵢ, respectively. The corresponding conditional mean values are

E(X | Y = yⱼ) = Σᵢ xᵢ rᵢⱼ/qⱼ,   E(Y | X = xᵢ) = Σⱼ yⱼ rᵢⱼ/pᵢ.

The conditional mean value E(X | Y) of X given Y is a random variable, since the condition is random: it assumes the value E(X | Y = yⱼ) with probability qⱼ. The mean value of E(X | Y) is

E(E(X | Y)) = Σⱼ E(X | Y = yⱼ) P(Y = yⱼ) = Σᵢ xᵢ Σⱼ rᵢⱼ = Σᵢ xᵢ pᵢ = E(X).

Because the roles of X and Y can be changed,

E(E(X | Y)) = E(X),   E(E(Y | X)) = E(Y).      (1.55)

Note, moreover, the following implication of (1.53): X and Y are independent if and only if the random events "X = xᵢ" and "Y = yⱼ" are independent for all i, j = 0, 1, 2, ..., i.e. if and only if rᵢⱼ = pᵢqⱼ for all i, j.
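The identity E(E(X | Y)) = E(X) of (1.55) can be verified exactly (in rational arithmetic) for any small joint distribution. A minimal sketch with a hypothetical two-by-two joint distribution:

```python
from fractions import Fraction as F

# A toy joint distribution r[(x_i, y_j)] = P(X = x_i, Y = y_j); hypothetical numbers
r = {(0, 0): F(1, 8), (0, 1): F(3, 8), (1, 0): F(1, 4), (1, 1): F(1, 4)}
q = {y: sum(p for (_, b), p in r.items() if b == y) for y in (0, 1)}  # q_j = P(Y = y_j)
marg_x = {x: sum(p for (a, _), p in r.items() if a == x) for x in (0, 1)}

# E(X | Y = y_j) = Σ_i x_i r_ij / q_j
cond = {y: sum(x * r[(x, y)] for x in (0, 1)) / q[y] for y in (0, 1)}

e_x = sum(x * p for x, p in marg_x.items())
# Formula (1.55): E(E(X | Y)) = Σ_j E(X | Y = y_j) P(Y = y_j) = E(X)
assert sum(cond[y] * q[y] for y in (0, 1)) == e_x
```

Here E(X | Y = 0) = 2/3 and E(X | Y = 1) = 2/5, and the weighted average (2/3)(3/8) + (2/5)(5/8) = 1/2 equals E(X).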


1.6.2.2 Continuous Components

Let X and Y be continuous random variables. The joint distribution function of (X, Y),

F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y),

has the following properties:

1) F_{X,Y}(-∞, -∞) = 0,   F_{X,Y}(+∞, +∞) = 1.

2) 0 ≤ F_{X,Y}(x, y) ≤ 1.

3) F_{X,Y}(x, +∞) = F_X(x),   F_{X,Y}(+∞, y) = F_Y(y).      (1.56)

4) F_{X,Y}(x, y) is increasing in x as well as in y.

(Properties 1 to 4 also hold for random vectors with discrete components.)

Assuming its existence, the mixed partial derivative of F_{X,Y}(x, y) with respect to x and y,

f_{X,Y}(x, y) = ∂²F_{X,Y}(x, y)/(∂x ∂y),      (1.57)

is called the joint probability density of (X, Y). It satisfies f_{X,Y}(x, y) ≥ 0 for all x, y, and its integral over the whole plane is 1. Conversely, any function of two variables x and y satisfying these two conditions can be considered to be the joint density of a random vector (X, Y). By (1.56) and (1.57), the marginal densities of X and Y are

f_X(x) = ∫ from -∞ to +∞ of f_{X,Y}(x, y) dy,   f_Y(y) = ∫ from -∞ to +∞ of f_{X,Y}(x, y) dx.      (1.58)

If X and Y are independent, then, according to (1.53), F_{X,Y}(x, y) = F_X(x) F_Y(y). Hence, in terms of the densities: if X and Y are independent, then the joint density of the random vector (X, Y) is the product of its marginal densities:

f_{X,Y}(x, y) = f_X(x) f_Y(y).

For continuous random variables, the condition X = x has probability 0, so that conditional probabilities cannot be introduced as in the discrete case. Hence, consider for Δx > 0

P(Y ≤ y | x ≤ X ≤ x + Δx) = P(Y ≤ y ∩ x ≤ X ≤ x + Δx)/P(x ≤ X ≤ x + Δx)

= [∫ from -∞ to y of (1/Δx)(∫ from x to x+Δx of f_{X,Y}(u, v) du) dv] / [(1/Δx)(F_X(x + Δx) - F_X(x))].

Letting Δx → 0 yields the conditional distribution function of Y given X = x:

F_Y(y | x) = (1/f_X(x)) ∫ from -∞ to y of f_{X,Y}(x, v) dv.

Differentiation yields the desired conditional density:

f_Y(y | x) = f_{X,Y}(x, y)/f_X(x),      (1.59)

and, by symmetry, f_X(x | y) = f_{X,Y}(x, y)/f_Y(y). The conditional mean value of Y given X = x is

E(Y | x) = ∫ from -∞ to +∞ of y f_Y(y | x) dy.      (1.60)

As in the discrete case,

E(E(Y | X)) = E(Y),   E(E(X | Y)) = E(X).      (1.61)

If X and Y are independent, then

E(X Y) = E(X) E(Y).      (1.62)
The covariance Cov(X, Y) between random variables X and Y is defined as

Cov(X, Y) = E{[X - E(X)][Y - E(Y)]}.      (1.63)

This representation of the covariance is equivalent to

Cov(X, Y) = E(X Y) - E(X) E(Y).      (1.64)

In particular, Cov(X, X) is the variance of X:

Cov(X, X) = Var(X) = E((X - E(X))²).

By (1.62), if X and Y are independent, then the covariance between these two random variables is 0: Cov(X, Y) = 0. The covariance can assume any value between -∞ and +∞; a normalized measure of dependence is therefore desirable.

The correlation coefficient between X and Y is defined as

ρ(X, Y) = Cov(X, Y)/(√Var(X) √Var(Y)).      (1.65)

The correlation coefficient has the following properties:

1) If X and Y are independent, then ρ(X, Y) = 0.

2) If Y = aX + b is a linear function of X, then ρ(X, Y) = 1 for a > 0 and ρ(X, Y) = -1 for a < 0.

3) For any random variables X and Y, -1 ≤ ρ(X, Y) ≤ 1.

X and Y are said to be uncorrelated if ρ(X, Y) = 0. Obviously, X and Y are uncorrelated if and only if

E(X Y) = E(X) E(Y).

Thus, independent random variables are uncorrelated. But if X and Y are uncorrelated, they need not be independent.
Example 1.15 The random vector (X, Y) has the joint probability distribution {rᵢⱼ} given in Table 1.5.

Table 1.5 Joint probability distribution of (X, Y) in example 1.15

          Y = -1   Y = 0   Y = +1
 X = -2    1/16     1/16    1/16
 X = -1    2/16     1/16    2/16
 X = +1    2/16     1/16    2/16
 X = +2    1/16     1/16    1/16

The row and column sums yield the marginal distributions of X and Y. Accordingly, the mean values of X and Y are

E(X) = (3/16)(-2) + (5/16)(-1) + (5/16)(+1) + (3/16)(+2) = 0,

E(Y) = (6/16)(-1) + (4/16)(0) + (6/16)(+1) = 0.

The mean value of the product XY is

E(X Y) = (1/16)(-2)(-1) + (1/8)(-1)(-1) + (1/8)(+1)(-1) + (1/16)(+2)(-1)
       + (1/16)(-2)(0) + (1/16)(-1)(0) + (1/16)(+1)(0) + (1/16)(+2)(0)
       + (1/16)(-2)(+1) + (1/8)(-1)(+1) + (1/8)(+1)(+1) + (1/16)(+2)(+1) = 0.

Hence, Cov(X, Y) = E(XY) - E(X)E(Y) = 0, i.e. X and Y are uncorrelated. On the other hand,

P(X = 2, Y = -1) = 1/16 ≠ P(X = 2) ⋅ P(Y = -1) = (3/16)(6/16) = 18/256 = 9/128.

Thus, X and Y are uncorrelated, but not independent.
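The arithmetic of example 1.15 can be replayed exactly in rational arithmetic. The following sketch encodes Table 1.5 and reproduces the three conclusions (zero means, zero covariance, dependence):

```python
from fractions import Fraction as Fr

# Joint distribution r_ij of (X, Y) from Table 1.5 (first index: x, second: y)
r = {}
for x, row in [(-2, (1, 1, 1)), (-1, (2, 1, 2)), (1, (2, 1, 2)), (2, (1, 1, 1))]:
    for y, sixteenths in zip((-1, 0, 1), row):
        r[(x, y)] = Fr(sixteenths, 16)

px = {x: sum(p for (a, _), p in r.items() if a == x) for x in (-2, -1, 1, 2)}
qy = {y: sum(p for (_, b), p in r.items() if b == y) for y in (-1, 0, 1)}

e_x = sum(x * p for x, p in px.items())
e_y = sum(y * p for y, p in qy.items())
e_xy = sum(x * y * p for (x, y), p in r.items())

assert sum(r.values()) == 1
assert e_x == 0 and e_y == 0 and e_xy == 0        # Cov(X, Y) = 0: uncorrelated
assert r[(2, -1)] != px[2] * qy[-1]               # yet not independent
```

Using `Fraction` instead of floats makes the equality checks exact rather than approximate.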

Example 1.16 The random vector (X, Y) has the joint density

f_{X,Y}(x, y) = (1/(4π))(x² + y²) e^(-(x²+y²)/2),   -∞ < x, y < +∞.

According to (1.58), the marginal density of X is

f_X(x) = (1/(4π)) e^(-x²/2) [ x² ∫ from -∞ to +∞ of e^(-y²/2) dy + ∫ from -∞ to +∞ of y² e^(-y²/2) dy ].

The integrand in the first integral is, up to the factor √(2π), the density of an N(0, 1)-distributed random variable; the second integral is, up to the same factor, the variance of an N(0, 1)-distributed random variable. Thus,

f_X(x) = (1/(2√(2π)))(x² + 1) e^(-x²/2),   -∞ < x < +∞,

and, by symmetry, f_Y(y) = (1/(2√(2π)))(y² + 1) e^(-y²/2). Obviously, f_{X,Y}(x, y) ≠ f_X(x) ⋅ f_Y(y), so that X and Y are dependent. Since f_X(x) and f_Y(y) are symmetric with regard to the origin, E(X) = E(Y) = 0. The integrands occurring in the computation of E(XY) are asymmetric with regard to the origin, so that the corresponding integrals vanish. Hence,

E(X Y) = E(X) E(Y) = 0.

Thus, X and Y are uncorrelated, but dependent.

The following example shows that the correlation coefficient may give absolutely misleading information on the degree of statistical dependence between two random variables if this dependence is other than linear.

Example 1.17 Let Y = sin X with X uniformly distributed over the interval [0, π]: f_X(x) = 1/π, 0 ≤ x ≤ π. Then

E(X) = π/2,   E(Y) = (1/π) ∫₀^π sin x dx = 2/π,   E(X Y) = (1/π) ∫₀^π x sin x dx = 1.

Thus, the covariance between X and Y is 0:

Cov(X, Y) = 1 - (2/π)(π/2) = 0.

Despite the functional relationship between the random variables X and Y, they are uncorrelated.
Bivariate Normal Distribution The random vector (X, Y) has a bivariate (two-dimensional) normal distribution if its joint density is

f_{X,Y}(x, y) = [1/(2π σ_x σ_y √(1-ρ²))] exp{ -[1/(2(1-ρ²))] [ (x-μ_x)²/σ_x² - 2ρ (x-μ_x)(y-μ_y)/(σ_x σ_y) + (y-μ_y)²/σ_y² ] }      (1.66)

with -∞ < x, y < +∞; the parameters satisfy -∞ < μ_x, μ_y < +∞, σ_x > 0, σ_y > 0 and -1 < ρ < +1. By (1.58), the corresponding marginal densities are

f_X(x) = (1/(√(2π) σ_x)) exp(-(x - μ_x)²/(2σ_x²)),   -∞ < x < +∞,

f_Y(y) = (1/(√(2π) σ_y)) exp(-(y - μ_y)²/(2σ_y²)),   -∞ < y < +∞.

Thus, X = N(μ_x, σ_x²), Y = N(μ_y, σ_y²), and ρ = ρ(X, Y) is the correlation coefficient between X and Y. For ρ = 0, the density (1.66) factors into f_X(x) f_Y(y). Since the independence of X and Y is equivalent to f_{X,Y}(x, y) = f_X(x) f_Y(y), X and Y are independent if and only if ρ = 0. Therefore:

If the random vector (X, Y) has a bivariate normal distribution, then X and Y are independent if and only if they are uncorrelated.
The conditional density of Y given X = x is obtained from (1.59):

f_Y(y | x) = [1/(√(2π) σ_y √(1-ρ²))] exp{ -[y - ρ(σ_y/σ_x)(x - μ_x) - μ_y]² / (2σ_y²(1-ρ²)) }.      (1.67)

Thus, on condition X = x, the random variable Y has a normal distribution with parameters

E(Y | X = x) = ρ(σ_y/σ_x)(x - μ_x) + μ_y,   Var(Y | X = x) = σ_y²(1 - ρ²).      (1.68)

Example 1.18 Let (X, Y) have a bivariate normal distribution with parameters μ_x = μ_y = 16, σ_x = σ_y = 2 and ρ = 0.5. Then, by (1.68),

E(Y | x) = 0.5 ⋅ (2/2)(x - 16) + 16 = x/2 + 8,

Var(Y | x) = σ_y²(1 - ρ²) = 4(1 - 0.25) = 3.

Hence,

f_Y(y | x) = (1/(√(2π) √3)) exp{ -(1/2) [(y - x/2 - 8)/√3]² },   -∞ < y < +∞.

For x = 10, the conditional mean is E(Y | X = 10) = 13, the conditional standard deviation is √3 ≈ 1.732. Some conditional interval probabilities are:

P(14 < Y ≤ 16 | X = 10) = Φ((16-13)/√3) - Φ((14-13)/√3) = 0.958 - 0.718 = 0.240,

P(12 < Y ≤ 14 | X = 10) = Φ((14-13)/√3) - Φ((12-13)/√3) = 0.718 - 0.282 = 0.436.
Distribution of the Product of two Random Variables Let (X, Y) be a random vector with joint probability density f_{X,Y}(x, y), and

Z = X Y.

The distribution function of Z is (Figure 1.9 shows the integration region)

F_Z(z) = ∫∫ over {(x,y); xy ≤ z} of f_{X,Y}(x, y) dx dy

with {(x, y); xy ≤ z} = {-∞ < x ≤ 0, z/x ≤ y < ∞} ∪ {0 ≤ x < ∞, -∞ < y ≤ z/x}. Hence,

F_Z(z) = ∫ from -∞ to 0 ∫ from z/x to ∞ of f_{X,Y}(x, y) dy dx + ∫ from 0 to ∞ ∫ from -∞ to z/x of f_{X,Y}(x, y) dy dx.

Differentiation with regard to z yields the probability density of Z:

f_Z(z) = ∫ from -∞ to 0 of (-1/x) f_{X,Y}(x, z/x) dx + ∫ from 0 to ∞ of (1/x) f_{X,Y}(x, z/x) dx.      (1.69)

(Figure 1.9 Derivation of the distribution function of a product.)

If X and Y are nonnegative, these formulas simplify, for z ≥ 0, to

F_Z(z) = ∫ from 0 to +∞ ∫ from 0 to z/x of f_{X,Y}(x, y) dy dx,   f_Z(z) = ∫ from 0 to +∞ of (1/x) f_{X,Y}(x, z/x) dx.      (1.70)
Distribution of the Ratio of two Random Variables Let (X, Y) be a random vector with joint probability density f_{X,Y}(x, y), and

Z = Y/X.

The distribution function of Z is (Figure 1.10 shows the integration region, bounded by the straight line y = zx)

F_Z(z) = ∫∫ over {(x,y); y/x ≤ z} of f_{X,Y}(x, y) dx dy

with {(x, y); y/x ≤ z} = {-∞ < x ≤ 0, zx ≤ y < ∞} ∪ {0 ≤ x < ∞, -∞ < y ≤ zx}. Hence,

F_Z(z) = ∫ from -∞ to 0 ∫ from zx to ∞ of f_{X,Y}(x, y) dy dx + ∫ from 0 to ∞ ∫ from -∞ to zx of f_{X,Y}(x, y) dy dx.

(Figure 1.10 Derivation of the distribution function of a ratio.)

Differentiation with regard to z yields the probability density of Z:

f_Z(z) = ∫ from -∞ to +∞ of |x| f_{X,Y}(x, zx) dx.      (1.71)

If X and Y are nonnegative, then, for z ≥ 0,

F_Z(z) = ∫ from 0 to +∞ ∫ from 0 to zx of f_{X,Y}(x, y) dy dx,   f_Z(z) = ∫ from 0 to +∞ of x f_{X,Y}(x, zx) dx.      (1.72)
Example 1.19 The random vector (X, Y) has the joint density

f_{X,Y}(x, y) = λν e^(-(λx + νy)),   x ≥ 0, y ≥ 0; λ > 0, ν > 0.

Thus, X and Y are independent and exponentially distributed with parameters λ and ν, respectively. Hence, by (1.72), the density of the ratio Z = Y/X is

f_Z(z) = ∫₀^∞ x λν e^(-(λ+νz)x) dx,   z ≥ 0.

After inserting the factor λ + νz into the integrand, the integral becomes the mean value of an exponentially distributed random variable with parameter λ + νz. Therefore,

f_Z(z) = λν/(λ + νz)²,   F_Z(z) = νz/(λ + νz);   z ≥ 0.

The mean value of Z does not exist. (Compare the result of applying formula (1.17) to determining E(Z).)
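The closed-form distribution function F_Z(z) = νz/(λ + νz) of the ratio is easy to check by Monte Carlo simulation (parameter values and the evaluation point are arbitrary):

```python
import random

random.seed(3)
lam, nu = 2.0, 3.0
n = 200_000
count = 0
for _ in range(n):
    x = random.expovariate(lam)      # X ~ Exp(λ)
    y = random.expovariate(nu)       # Y ~ Exp(ν)
    if y / x <= 1.0:                 # event {Z = Y/X ≤ 1}
        count += 1

empirical = count / n
exact = nu * 1.0 / (lam + nu * 1.0)  # F_Z(1) = νz/(λ + νz) at z = 1
assert abs(empirical - exact) < 0.01
```

With λ = 2 and ν = 3, F_Z(1) = 3/5 = 0.6, and the empirical frequency agrees to within sampling error.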
Example 1.20 A system has the random lifetime (= time to failure) X. It takes Y time units to replace a failed system by an equivalent new one. Lifetime and replacement time are assumed to be independent. Then

A = X/(X + Y)

is the proportion of time within a cycle of length X + Y in which the system is operating; A is the availability of the system in a cycle. Since, for 0 < t ≤ 1,

P(A ≤ t) = P(X/(X + Y) ≤ t) = P(Y/X ≥ (1 - t)/t),

the distribution function of A is

F_A(t) = 1 - F_Z((1 - t)/t),   0 < t ≤ 1,

where Z = Y/X. Differentiation yields the density

f_A(t) = (1/t²) f_Z((1 - t)/t),   0 < t ≤ 1.

Specifically, if f_Z(z) is the same as in example 1.19, then

f_A(t) = λν/[(λ - ν)t + ν]²,   F_A(t) = λt/[(λ - ν)t + ν];   0 < t ≤ 1.

For λ ≠ ν, the mean value of A is

E(A) = [ν/(λ - ν)] [ (λ/(λ - ν)) ln(λ/ν) - 1 ].      (1.73)

If λ = ν, then A is uniformly distributed over [0, 1], so that E(A) = 1/2.
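The mean availability (1.73) can be cross-checked by simulating complete cycles. With λ = 2 and ν = 1 the closed form reduces to 2 ln 2 - 1 ≈ 0.386, and the simulated cycle averages should agree (sample size and parameters are arbitrary choices):

```python
import math, random

random.seed(5)
lam, nu = 2.0, 1.0                    # X ~ Exp(λ) lifetime, Y ~ Exp(ν) replacement time
n = 200_000
total = 0.0
for _ in range(n):
    x = random.expovariate(lam)
    y = random.expovariate(nu)
    total += x / (x + y)              # availability A = X/(X + Y) in one cycle
sim = total / n

exact = (nu / (lam - nu)) * ((lam / (lam - nu)) * math.log(lam / nu) - 1.0)  # (1.73)
assert abs(exact - (2.0 * math.log(2.0) - 1.0)) < 1e-12
assert abs(sim - exact) < 0.01
```

Note that E(A) ≈ 0.386 differs from the naive guess E(X)/(E(X) + E(Y)) = 0.5/1.5 ≈ 0.333, since the mean of a ratio is not the ratio of the means.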


1
...
3 n-Dimensional Random Variables
Let (X 1 , X 2 ,
...
, x n ) = P(X 1 ≤ x 1 , X 2 ≤ x 2 ,
...

Provided its existence, the nth mixed partial derivative of the joint distribution function with respect to the x 1 , x 2 ,
...
, X n ) :
f (x 1 , x 2 ,
...
, x n )

...
∂x n

(1
...
Hence they will not be given here
...
52), whereas the marginal densities are
f X (x i ) =
i

+∞ +∞ +∞

∫ ∫ ⋅⋅⋅ ∫

−∞ −∞ −∞

f (x 1 ,
...
, x n ) dx 1
...
dx n
...
75)

If the X i are independent, then, from (1
...
, X n ) is
equal to the product of the densities of the X i :
f (x 1 , x 2 ,
...
f X n (x n )
...
76)

The joint distribution function (density) also allows for determining the joint probability distributions of all subsets of {X₁, X₂, ..., Xₙ}. For instance, the joint distribution function of the random vector (Xᵢ, Xⱼ), i < j, is

F_{Xᵢ,Xⱼ}(xᵢ, xⱼ) = F(∞, ..., ∞, xᵢ, ∞, ..., ∞, xⱼ, ∞, ..., ∞),

and the joint distribution function of X₁, X₂, ..., X_k, k < n, is

F_{X₁,X₂,...,X_k}(x₁, x₂, ..., x_k) = F(x₁, x₂, ..., x_k, ∞, ..., ∞).      (1.77)

The corresponding joint densities are obtained by integrating out the remaining variables:

f_{Xᵢ,Xⱼ}(xᵢ, xⱼ) = ∫ ⋯ ∫ of f(x₁, x₂, ..., xₙ) dx₁ ⋯ dx_{i-1} dx_{i+1} ⋯ dx_{j-1} dx_{j+1} ⋯ dxₙ

and

f_{X₁,X₂,...,X_k}(x₁, x₂, ..., x_k) = ∫ ⋯ ∫ of f(x₁, x₂, ..., xₙ) dx_{k+1} dx_{k+2} ⋯ dxₙ.      (1.78)
Conditional densities can be obtained analogously to the two-dimensional case. For instance, the conditional density of (X₁, ..., X_{i-1}, X_{i+1}, ..., Xₙ) given Xᵢ = xᵢ is

f(x₁, ..., x_{i-1}, x_{i+1}, ..., xₙ | xᵢ) = f(x₁, x₂, ..., xₙ)/f_{Xᵢ}(xᵢ),      (1.79)

and the conditional density of (X_{k+1}, X_{k+2}, ..., Xₙ) given X₁ = x₁, ..., X_k = x_k is

f(x_{k+1}, x_{k+2}, ..., xₙ | x₁, ..., x_k) = f(x₁, x₂, ..., xₙ)/f_{X₁,X₂,...,X_k}(x₁, x₂, ..., x_k).      (1.80)
Mean Values Let h(x₁, x₂, ..., xₙ) be a function of n variables. Then the mean value of the random variable Y = h(X₁, X₂, ..., Xₙ) is defined as

E(Y) = ∫ ⋯ ∫ of h(x₁, ..., xₙ) f(x₁, ..., xₙ) dx₁ dx₂ ⋯ dxₙ.      (1.81)

In particular, the mean value of the product X₁ X₂ ⋯ Xₙ is

E(X₁ X₂ ⋯ Xₙ) = ∫ ⋯ ∫ of x₁ x₂ ⋯ xₙ f(x₁, ..., xₙ) dx₁ dx₂ ⋯ dxₙ.

In view of (1.76), if the Xᵢ are independent, then

E(X₁ X₂ ⋯ Xₙ) = E(X₁) E(X₂) ⋯ E(Xₙ).      (1.82)
The conditional mean value of Y = h (X 1 , X 2 ,
...
, X k = x k '
is
© 2006 by Taylor & Francis Group, LLC

1 PROBABILITY THEORY

59
E(Y x 1 , x 2 ,
...


+∞



−∞

h(x 1 , x 2 ,
...
83)

f (x 1 , x 2 ,
...
dx n
...
, X (x 1 , x 2 ,
...
83) the x 1 , x 2 ,
...
, X k yields
the corresponding random mean value of Y given X 1 , X 2 ,
...
, X k )
...
, X k ) is
E X ,X ,
...
, X k ) ) = E(Y)
...
84)

The mean value of E(Y X 1 , X 2 ,
...
, X k−1 is again a random variable:
E X ,X ,
...
, X k )) = E(Y X k )
...
85)

From this it is obvious how to obtain the conditional mean value
E(Y x i , x i ,
...
, X i )
1
2
k
with regard to any subsets {x i , x i ,
...
, X i } of the respec1
2
1
2
k
k
tive sets {x 1 , x 2 ,
...
, X n }
...
+ Y m x i , x i ,
...
, x i )
1
2
1
2
k
k
and

(1
...
+ Y m X i , X i ,
...
, X i )
...
87)
1

2

k

1

2

k

Let
c i j = Cov (X i , X j )
be the covariance between X i and X j ; i, j = 1, 2,
...
It is useful to unite the c ij in
the covariance matrix C :
C = ((c i j )) ; i, j = 1, 2,
...

The main diagonal of C consists of the variances of the X i :
c i i = Var(X i ); i = 1, 2,
...


© 2006 by Taylor & Francis Group, LLC

60

STOCHASTIC PROCESSES

n-Dimensional Normal Distribution Let ( X 1 , X 2 ,
...
, μ n ) and covariance matrix
C = ((c ij ))
...
, x n )
...
, X n ) has an n-dimensionally normal (or Gaussian) distribution if it has joint density
f (x) =

1
(2π) n C

exp ⎛ − 1 (x − μ) C −1 (x − μ) T ⎞ ,


2

(1
...
, x n − μ n )
...
88), f (x) becomes
f (x 1 , x 2 ,
...
89)

Σ i=1 Σ j=1 C i j (x i − μ i )(x j − μ j ⎞ ,

n

n

where C i j is the cofactor of c i j
...
89) becomes the density of the bivariate normal distribution (1
...
Generalizing from the bivariate special case, it can be shown that the
random variables X i have an N(μ i , σ 2 )− distribution with σ 2 = c ii ; i = 1, 2,
...
, X n ) has an n-dimensional normal distribution
...
76) of the joint density and, therefore, the independence of the X i follows:
f X (x 1 , x 2 , ⋅⋅⋅, x n ) =

n ⎡

x − μ 2⎞ ⎤

⎢ 1
exp ⎜ − 1 ⎛ i σ i ⎞ ⎟ ⎥
...
90)

Theorem 1
...
, X n ) has an n-dimensionally normal
distribution and the random variables Y 1 , Y 2 ,
...
e
...
, m ,

then the random vector (Y 1 , Y 2 ,
...

Maximum of n Independent Random Variables Let X 1 , X 2 ,
...
, X n }
...
, X n ≤ x '
...
F X (x)
...
91)
1

n

2

Minimum of n Independent Random Variables Let X 1 , X 2 ,
...
, X n
...
, X n > x)
...
F X n (x),
1

2

(1
...
F X n (x)
...
93)

Example 1
...
, X n
...
Hence, its lifetime is
X = max X 1 , X 2 ,
...
91)
...

By (1
...

Substituting u = 1 − e −λx yields
n
1
1
E(X) = 1 ∫ 0 1 − u du = 1 ∫ 0 [1 + u +
...

λ
λ
1−u

Hence,
E(X) = 1 ⎡ 1 + 1 + 1 +
...

n⎥
λ⎢


2 3
b) Under otherwise the same assumptions as in case a), the system fails as soon as
the first subsystem fails (series system)
...
, X n }
and has distribution function (1
...
In particular, if the lifetimes of the subsystems
are identically exponentially distributed with parameter λ , then
F Y (x) = 1 − e −nλx , x ≥ 0
...

© 2006 by Taylor & Francis Group, LLC

62

STOCHASTIC PROCESSES

1
...
7
...

Then the mean value of the sum Z = X + Y is












E(Z) = Σ i=0 Σ j=0 (x i + y j ) r ij = Σ i=0 x i Σ j=0 r ij + Σ i=0 y j Σ j=0 r ij
...
54),
E(X + Y) = E(X) + E(Y)
...
, X n ,
E( X 1 + X 2 +
...
+ E(X n )
...
94)
(1
...
} and probability distributions
{p i = P(X = i; i = 0, 1,
...

Then,
k

P(Z = k) = P(X + Y = k) = Σ i=0 P(X = i) P(Y = k − i)
...


r k = p 0 q k + p 1 q k−1 +
...


Thus, according to (1
...
is the
convolution of the probability distributions of X and Y
...
26),
M Z (z) = M X (z) M Y (z)
...
96)
The z-transform M Z (z) of the the sum Z = X + Y of two independent discrete
random variables X and Y with common range R = {0, 1,
...

By induction, if Z = X 1 + X 2 +
...
M X n (z)
...
97)

Example 1
...
+ X n be a sum of independent random variables, where X i has a Poisson distribution with parameter λ i ; i = 1, 2,
...
The ztransform of X i is (section 1
...
1)
M X (z) = e λ i (z−1)
...
97),

63


...


Thus, the sum of independent, Poisson distributed random variables has a Poisson
distribution the parameter of which is the sum of the parameters of the Poisson distributions of these random variables
...
7
...
, n; are random variables with respective distribution
functions, densities, mean values and variances
F X (x i ), f X (x i ), E(X i ), and Var(X i ); i = 1, 2,
...

i

i

The joint density of the X 1 , X 2 ,
...
, x n )
...

Mean Value of a Sum Applying (1
...
, x n ) = x 1 + x 2 +
...


−∞ −∞

+∞



−∞

(x 1 + x 2 +
...
, x n ) dx 1 dx 2
...


From (1
...



i

Hence,
E( X 1 + X 2 +
...
+ E(X n )
...
98)

The mean value of the sum of (discrete or continuous) random variables
is equal to the sum of the mean values of these random variables
...




(1
...
99 ) can be written in the form
n

n

n

Var ⎛ Σ i=1 X i ⎞ = Σ i=1 Var( X i ) + 2 Σ i,j=1;i ...
+ X n ) = Var( X 1 ) + Var( X 2 ) +
...


© 2006 by Taylor & Francis Group, LLC

(1
...
101)

64

STOCHASTIC PROCESSES
The variance of a sum of uncorrelated random variables is equal to the sum
of the variances of these random variables
...
, α n be any sequence of finite real numbers
...
102)

n
n
n
Var ⎛ Σ i=1 α i X i ⎞ = Σ i=1 α 2 Var( X i ) + 2 Σ i,j=1, i ...
103)


i

If the X i are uncorrelated,
n
n
Var ⎛ Σ i=1 α i X i ⎞ = Σ i=1 α 2 Var( X i )
...
104)

For independent, identically distributed random variables with mean μ and variance
σ 2 , formulas (1
...
75) simplify to
n
E ⎛ Σ i=1 X i ⎞ = nμ ,



n
Var ⎛ Σ i=1 X i ⎞ = n σ 2
...
105)

Note Formulas (1
...
105) hold for discrete and continuous random variables
...
On condition Y = y , the distribution function of the sum Z = X + Y is
F Z (Z ≤ z Y = y) = P(X + y ≤ z) = P(X ≤ z − y) = F X (z − y)
and, on condition X = x,
F Z (Z ≤ z X = x) = P(Y + x ≤ z) = P(Y ≤ z − x) = F Y (z − x)
...


(1
...


(1
...
107) are equivalent definitions of the convolution of the densities
f X and f Y
...
106) can be written as
+∞

+∞

F Z (z) = ∫ −∞ F Y (z − x) dF X (x) = ∫ −∞ F X (z − y) dF Y (y)
...
108)

The integrals in (1
...

Notation F Z (z) = F X ∗ F Y (z) = F Y ∗ F X (z)

© 2006 by Taylor & Francis Group, LLC

1 PROBABILITY THEORY

65

The distribution function (probability density) of the sum of two independent
random variables is given by the convolution of their distribution functions
(probability densities)
...
106), simply
the convolution of F and g or, equivalently, the convolution of G and f
...
106) and (1
...
109)

z

z

z ≥ 0
...
110)

F Z (z) = ∫ 0 F X (z − x) f Y (x)dx = ∫ 0 F Y (z − y) f X (y)dy ,
f Z (z) = ∫ 0 f Y (z − x) f X (x) dx = ∫ 0 f X (z − y) f Y (y) dy ,

Moreover, if L( f ) denotes the Laplace transform of a function f defined on [0, ∞) (its
existence provided), then, by (1
...


(1
...


(1
...

By (1
...


(1
...
+ X n of n independent, continuous random
variables X i is obtained by repeated application of formula (1
...
The resulting
function is the convolution of the densities f X , f X ,
...
∗ f X n (z)
...
114)

In particular, if the X i are identically distributed with density f , then f Z is the n-fold
convolution of f with itself or, equivalently, the nth convolution power f ∗(n) (z) of f
...
115)

i = 2, 3,
...
For nonnegative random variables, this formula simplifies to
z
f ∗(i) (z) = ∫ 0 f ∗(i−1) (z − x) f (x) dx, z ≥ 0
...
116)

66

STOCHASTIC PROCESSES

From (1
...
+ X n is equal to the
product of the Laplace transforms of these random variables:
L( f Z ) = L( f X ) L( f X )
...

1

2

(1
...
108) yields the distribution function of a sum of the n
independent random variables X 1 , X 2 ,
...
∗ F X n (z)
...
118)

In particular, if the X i are independent and identically distributed with distribution
function F, then F Z (z) is equal to the n th convolution power of F:
F Z (z) = F ∗(n) (z)
...
119

F Z (z) can be recursively obtained from
+∞

F ∗(i) (z) = ∫ −∞ F ∗(i−1) (z − x) dF(x);

(1
...
; F ∗(0) (x) ≡ 1, F ∗(1) (x) ≡ F(x)
...
120) becomes
z

F ∗(i) (z) = ∫ 0 F ∗(i−1) (z − x) dF(x)
...
121)

Example 1
...


i

i

(1
...

If λ 1 = λ 2 = λ , then

f Z (z) = λ 2 z e −λ z , z ≥ 0
...
122)

This is the density of an Erlang distribution with parameters n = 2 and λ (section 1
...

If λ 1 ≠ λ 2 , then
f Z (z) =

λ 1 λ 2 ⎛ −λ z
e 2 − e −λ 1 z ⎞ ,

λ1 − λ2 ⎝

z ≥ 0
...
, X n be independent, identically distributed exponential random
variables with density f (x) = λ e −λ x ; x ≥ 0
...


© 2006 by Taylor & Francis Group, LLC

1 PROBABILITY THEORY

67

Hence, by (1
...
+ X n is
f Z (s) = ⎛ λ ⎞ n
...

(n − 1)!

Hence, Z has an Erlang distribution with parameters n and λ
...
24 (Normal distribution) The random variables X i are independent and
have a normal distribution with parameters μ i and σ 2 ; i = 1, 2 :
i
f X (x) =
i


(x − μ i ) 2 ⎞
1
⎟ ; i = 1, 2
...
12 (page 33), the Laplace transforms of the X i are
f X (s) = e
i

−μ i s+ 1 σ 2 s 2
i
2

; i = 1, 2
...
111), the density of the sum Z = X 1 + X 2 has the Laplace transform
f Z (s) = f X (s) f X (s) = e
1
2

−(μ 1 +μ 2 )s+ 1 (σ 2 +σ 2 ) s 2
1 2
2


...
Thus, the sum of two independent, normally distributed random variables also
has a normal distribution
...
+ Xn
is a sum of independent random variables with X i = N(μ i , σ 2 ); i = 1, 2,
...
+ μ n , σ 2 + σ 2 +
...

n
1
2

(1
...
, X can be represented as sum of independent, identically as N(μ/n, σ 2 /n) -distributed random variables
More generally, if the random vector (X_1, X_2, ..., X_n) has a joint normal distribution, then X_1 + X_2 + ... + X_n has a normal distribution even without the independence assumption.

1.7.3 Sums of a Random Number of Random Variables

For instance, the total claim size an insurance company is confronted with a year is
the sum of a random number of random individual claim sizes.

Theorem 1.3 (Wald's identities) Let X_1, X_2, ... be a sequence of independent, identically distributed random variables with finite mean E(X) and finite variance Var(X). Let further N be a positive, integer-valued random variable, which is independent of all X_1, X_2, ... Then mean value and variance of the sum Z = X_1 + X_2 + ... + X_N are

    E(Z) = E(X) E(N),    (1.125)

    Var(Z) = Var(X) E(N) + [E(X)]^2 Var(N).    (1.126)

Proof By conditioning on N,

    E(Z) = Σ_{n=1}^∞ E(X_1 + X_2 + ... + X_N | N = n) P(N = n)
         = Σ_{n=1}^∞ E(X_1 + X_2 + ... + X_n) P(N = n)
         = Σ_{n=1}^∞ n E(X) P(N = n) = E(X) E(N).

This proves (1.125). To verify (1.126), consider

    E(Z^2) = Σ_{n=1}^∞ E([X_1 + X_2 + ... + X_n]^2) P(N = n).

By (1.19),

    E(Z^2) = Σ_{n=1}^∞ {Var(X_1 + X_2 + ... + X_n) + [E(X_1 + X_2 + ... + X_n)]^2} P(N = n)
           = Σ_{n=1}^∞ {n Var(X) + n^2 [E(X)]^2} P(N = n)
           = Var(X) E(N) + [E(X)]^2 E(N^2).

Hence,

    Var(Z) = E(Z^2) - [E(Z)]^2 = Var(X) E(N) + [E(X)]^2 (E(N^2) - [E(N)]^2).

This is the identity (1.126).

Wald's identity (1.125) remains valid if the assumed independence between N and the sequence X_1, X_2, ... is somewhat weakened.
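Wald's identities are easy to check by simulation. The concrete distributions below are arbitrary illustrative assumptions, chosen so that all moments in (1.125) and (1.126) are simple:

```python
import numpy as np

# Simulation sketch of Wald's identities (1.125) and (1.126).
# Illustrative assumptions: X_i ~ Exp(1), so E(X) = Var(X) = 1;
# N ~ 1 + Poisson(4), independent of the X_i, so E(N) = 5, Var(N) = 4.
rng = np.random.default_rng(3)
reps = 40_000
N = 1 + rng.poisson(4, reps)
Z = np.array([rng.exponential(1.0, n).sum() for n in N])

EZ_theory = 1.0 * 5.0                          # E(X) E(N) = 5
VarZ_theory = 1.0 * 5.0 + 1.0 ** 2 * 4.0       # Var(X)E(N) + E(X)^2 Var(N) = 9
```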

Definition 1.4 (stopping time) A positive, integer-valued random variable N is a stopping time for a sequence of random variables X_1, X_2, ... if the occurrence of the random event 'N = n' is completely determined by the sequence X_1, X_2, ..., X_n and does not depend on X_{n+1}, X_{n+2}, ...; n = 1, 2, ...

Sometimes, a stopping time defined in this way is called a Markov time and only a finite Markov time is called a stopping time.

The terminology refers to the following situation: the random variables X_1, X_2, ... are observed one after the other. The observation is stopped with X_N, i.e. X_{N+1}, X_{N+2}, ... will not be observed.

Theorem 1.4 Under otherwise the same assumptions and notation as in theorem 1.3, let N be a stopping time with E(N) < ∞ for the sequence X_1, X_2, ... Then

    E(Z) = E(X) E(N).    (1.127)

Proof Let binary random variables Y_i be defined as follows:

    Y_i = 1 if N ≥ i,  Y_i = 0 otherwise;  i = 1, 2, ...

The event 'Y_i = 1' is equivalent to 'N ≥ i', the complement of 'N ≤ i - 1', and is, therefore, completely determined by X_1, X_2, ..., X_{i-1}. Hence, Y_i is independent of X_i. Since E(Y_i) = P(N ≥ i) and E(X_i Y_i) = E(X_i) E(Y_i),

    E(Z) = E(Σ_{i=1}^N X_i) = E(Σ_{i=1}^∞ X_i Y_i) = Σ_{i=1}^∞ E(X_i) E(Y_i) = E(X) Σ_{i=1}^∞ P(N ≥ i).

Now the formula E(N) = Σ_{i=1}^∞ P(N ≥ i) yields (1.127).
Example 1.25 a) Let X_i = 1 if the i th flipping of a fair coin yields 'head' and X_i = 0 otherwise. Then

    N = min {n; X_1 + X_2 + ... + X_n = 8}    (1.128)

is a finite stopping time for {X_1, X_2, ...}: N is the random number of flippings until 'head' appears for the eighth time. From (1.127),

    E(X_1 + X_2 + ... + X_N) = (1/2) E(N).

Since X_1 + X_2 + ... + X_N = 8, it follows that E(N) = 16.

b) Let X_i = 1 if the i th flipping of a fair coin yields 'head' and X_i = -1 otherwise. Then N given by (1.128) is again a finite stopping time for {X_1, X_2, ...}, and (1.127) would yield

    E(X_1 + X_2 + ... + X_N) = 0 ⋅ E(N).

The left-hand side of this equation is equal to 8, whereas the right-hand side vanishes for every finite E(N). Therefore, Wald's equation (1.127) is not applicable: E(N) must be infinite.

1.8.1 Inequalities for Probabilities

In what follows, all occurring mean values are assumed to exist. Let X be a random variable with mean value μ = E(X) and variance σ^2 = Var(X). Then, for any ε > 0, Chebyshev's inequality states that

    P(|X - μ| ≥ ε) ≤ σ^2 / ε^2.    (1.129)

To prove (1.129), assume for simplicity that X has density f(x). Then

    σ^2 = ∫_{-∞}^{+∞} (x - μ)^2 f(x) dx ≥ ∫_{|x-μ| ≥ ε} (x - μ)^2 f(x) dx
        ≥ ε^2 ∫_{|x-μ| ≥ ε} f(x) dx = ε^2 P(|X - μ| ≥ ε).

This proves the two-sided Chebyshev inequality (1.129). The following one-sided Chebyshev inequality is proved analogously:

    P(X - μ ≥ ε) ≤ σ^2 / (σ^2 + ε^2).

Example 1.26 The height X of trees in a forest stand has mean value μ = 20 m and standard deviation σ = 2 m. By Chebyshev's inequality, the probability that the height of a tree differs from μ by at least ε = 4 m is bounded by σ^2/ε^2 = 0.250. Now assume, in addition, that X = N(20, 4). Then the exact probability that the height of a tree differs at least 4 m from μ is

    P(|X - 20| ≥ 4) = P(X - 20 ≥ 4) + P(X - 20 ≤ -4) = 2 Φ(-2) = 0.0456.

Thus, Chebyshev's inequality gives a rather rough estimate.
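The numbers in example 1.26 are quickly recomputed; the standard normal distribution function is expressed through the error function:

```python
from math import erf, sqrt

# Recomputation of example 1.26: the Chebyshev bound versus the exact
# probability under the additional assumption X = N(20, 4).
mu, sigma, eps = 20.0, 2.0, 4.0

def Phi(x):                     # standard normal distribution function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

chebyshev_bound = sigma ** 2 / eps ** 2        # = 0.250
exact = 2.0 * Phi(-eps / sigma)                # = 2*Phi(-2), about 0.0456
```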
a) For any random variable X,

    P(|X - μ| ≤ n σ) ≥ 1 - 1/n^2;  n = 1, 2, ...

This results from (1.129) by letting ε = n σ.

b) For any random variable X with a bell-shaped density f(x) and mode equal to μ,

    P(|X - μ| ≤ n σ) ≥ 1 - 4/(9 n^2);  n = 1, 2, ...
Inequalities of Markov Type Let y = h(x) be a nonnegative, strictly increasing function on [0, ∞). Then, for any ε > 0,

    P(|X| ≥ ε) ≤ E(h(|X|)) / h(ε).    (1.130)

If X has density f, then (1.130) is proved as follows:

    E(h(|X|)) = ∫_{-∞}^{+∞} h(|y|) f(y) dy
              ≥ ∫_{+ε}^{+∞} h(|y|) f(y) dy + ∫_{-∞}^{-ε} h(|y|) f(y) dy
              ≥ h(ε) ∫_{+ε}^{+∞} f(y) dy + h(ε) ∫_{-∞}^{-ε} f(y) dy
              = h(ε) P(|X| ≥ ε).

Letting h(x) = x^a with a > 0 yields Markov's inequality:

    P(|X| ≥ ε) ≤ E(|X|^a) / ε^a.    (1.131)

From (1.131), Chebyshev's inequality is obtained by letting a = 2 and replacing X with X - μ. Letting h(x) = e^{bx} with b > 0 in (1.130) yields an exponential inequality:

    P(|X| ≥ ε) ≤ e^{-bε} E(e^{b|X|}).    (1.132)

Markov's inequality (1.131) and the exponential inequality (1.132) are usually superior to Chebyshev's inequality, since, given X and ε, their right-hand sides can be minimized with respect to a and b.
1.8.2 Inequalities for Moments

Let g(x) and h(x) be two functions which are both nondecreasing or both nonincreasing. Then

    E(g(X)) E(h(X)) ≤ E(g(X) h(X)).    (1.133)

If g is nonincreasing and h nondecreasing or vice versa, then

    E(g(X)) E(h(X)) ≥ E(g(X) h(X)).    (1.134)

As an important special case, let g(x) = x^r and h(x) = x^s; r, s ≥ 0. Then (1.133) yields

    E(X^r) E(X^s) ≤ E(X^{r+s}).    (1.135)

Hölder's Inequality Let r and s be positive numbers satisfying

    1/r + 1/s = 1.

Then

    E(|X Y|) ≤ (E(|X|^r))^{1/r} (E(|Y|^s))^{1/s}.

For r = s = 2, Hölder's inequality implies the inequality of Schwarz:

    E(|X Y|) ≤ √(E(X^2) E(Y^2)).

Inequality of Jensen Let h(x) be a convex (concave) function. Then

    h(E(X)) ≤ E(h(X))    (h(E(X)) ≥ E(h(X))).    (1.136)
1.9.1 Convergence Criteria

1) Convergence in Probability A sequence of random variables {X_1, X_2, ...} converges in probability towards a random variable X if, for every ε > 0,

    lim_{i→∞} P(|X_i - X| > ε) = 0.

2) Convergence in Mean of p th Order A sequence of random variables {X_1, X_2, ...} with property E(|X_i|^p) < ∞; i = 1, 2, ..., converges in mean of p th order towards a random variable X with E(|X|^p) < ∞ if

    lim_{i→∞} E(|X_i - X|^p) = 0.

In the special case p = 1, {X_1, X_2, ...} converges in mean towards X. If p = 2, {X_1, X_2, ...} converges in mean square or in square mean towards X.

3) Convergence with Probability 1 A sequence of random variables {X_1, X_2, ...} converges with probability 1 or almost sure towards a random variable X if

    P(lim_{i→∞} X_i = X) = 1.

4) Convergence in Distribution Let F_i(x) be the distribution function of X_i and F(x) the distribution function of X. Then the sequence {X_1, X_2, ...} converges in distribution towards X if, at every continuity point x of F,

    lim_{i→∞} F_i(x) = F(x).

Implications
a) 3 implies 1, 2 implies 1, and 1 implies 4.
b) If {X_1, X_2, ...} converges towards a finite constant a in distribution, then {X_1, X_2, ...} converges towards a in probability as well. Hence, if the limit is a finite constant, convergence in distribution and convergence in probability are equivalent.
c) If {X_1, X_2, ...} converges towards a random variable X in probability, then there exists a subsequence {X_{i_1}, X_{i_2}, ...} of {X_1, X_2, ...}, which converges towards X with probability 1.
9
...
They essentially deal with the
convergence behaviour of arithmetic means X n for n → ∞, where
Xn = 1
n

n

Σ i=1 X i
...
5 Let { X 1 , X 2 ,
...
Then the sequence of arithmetic means {X 1 , X 2 ,
...




n→∞

Proof In view of Var(X n ) = σ 2 /n , Chebyshev's inequality (1
...



n ε2
Letting n → ∞ proves the theorem
...
5 is the following one
...
6 Let { X 1 , X 2 ,
...
On condition
lim Var(X i ) = 0,

i→∞

the sequence {X 1 − μ 1 , X 2 − μ 2 ,
...

Example 1
...

Thus, X has a Bernoulli distribution with
E(X) = p,

Var(X) = p(1 − p)
...
The corresponding sequence of

© 2006 by Taylor & Francis Group, LLC

1 PROBABILITY THEORY

75

indicator variables be X 1 , X 2 ,
...
The X i are independent and identically distributed as X
...
5 is applicable: With respect to convergence in probability,
n

lim X n = lim 1 Σ i=1 X i = p
...
1)
...
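The convergence of relative frequencies in example 1.27 can be visualized numerically; p, sample size and seed below are arbitrary illustrative assumptions:

```python
import numpy as np

# Sketch of example 1.27: the relative frequency of an event A with
# p = P(A) = 0.3 settles near p as the number of independent repetitions
# grows (law of large numbers).
rng = np.random.default_rng(0)
p = 0.3
x = rng.random(100_000) < p                     # indicator variables X_i
running_mean = np.cumsum(x) / np.arange(1, x.size + 1)
final = float(running_mean[-1])
```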

The following theorem does not need assumptions on variances. But pairwise independence of the {X_1, X_2, ...} is required, i.e. X_i and X_j are independent for i ≠ j.

Theorem 1.7 Let {X_1, X_2, ...} be a sequence of pairwise independent, identically distributed random variables with finite mean μ. Then the corresponding sequence of arithmetic means X̄_1, X̄_2, ... converges in probability towards μ.

Theorems 1.5 to 1.7 are called weak laws of large numbers, whereas the following two theorems are strong laws of large numbers, since the underlying convergence criterion is convergence with probability 1.

Theorem 1.8 Let {X_1, X_2, ...} be a sequence of independent, identically distributed random variables with finite mean μ. Then the corresponding sequence of arithmetic means X̄_1, X̄_2, ... converges with probability 1 towards μ.

Theorems 1.5 and 1.8 imply that the sequence of relative frequencies p_1(A), p_2(A), ... converges towards p = P(A), in probability and with probability 1, respectively. The following theorem abandons the assumption of identically distributed random variables.

Theorem 1.9 Let {X_1, X_2, ...} be a sequence of independent random variables with finite means μ_i = E(X_i) and finite variances σ_i^2 = Var(X_i). On condition that

    Σ_{i=1}^∞ (σ_i / i)^2 < ∞,

the sequence Y_1, Y_2, ... with Y_n = (1/n) Σ_{i=1}^n (X_i - μ_i) converges with probability 1 towards 0.

1.9.3 Central Limit Theorem

The central limit theorem provides the theoretical base for the dominant role of the normal distribution in probability theory and its applications. There are several variations of the central limit theorem; two classical versions are given here.

Theorem 1.10 Let Y_n = X_1 + X_2 + ... + X_n be the sum of n independent, identically distributed random variables X_i with finite mean E(X_i) = μ and finite variance Var(X_i) = σ^2, and let Z_n be the standardization of Y_n:

    Z_n = (Y_n - nμ) / (σ √n).

Then, for all x,

    lim_{n→∞} P(Z_n ≤ x) = Φ(x) = (1/√(2π)) ∫_{-∞}^x e^{-u^2/2} du.

Corollary Under the conditions of theorem 1.10, the sum Y_n = X_1 + X_2 + ... + X_n has for large n approximately a normal distribution with mean value nμ and variance nσ^2:

    Y_n ≈ N(nμ, nσ^2).    (1.139)

(The fact that Y_n has mean value nμ and variance nσ^2 follows from the addition rules for means and variances of independent random variables.) As a rule of thumb, (1.139) yields satisfactory results for sufficiently large n.

The following theorem shows that the assumptions of theorem 1.10 can be weakened; in particular, the X_i need not be identically distributed.

Theorem 1.11 Let Y_n = X_1 + X_2 + ... + X_n be the sum of independent random variables X_i with finite means μ_i = E(X_i) and finite variances σ_i^2 = Var(X_i), and let Z_n be the standardization of Y_n:

    Z_n = (Y_n - E(Y_n)) / √Var(Y_n) = (Y_n - Σ_{i=1}^n μ_i) / √(Σ_{i=1}^n σ_i^2).

Then Z_n has the property

    lim_{n→∞} P(Z_n ≤ x) = Φ(x) = (1/√(2π)) ∫_{-∞}^x e^{-u^2/2} du    (1.140)

provided that

    lim_{n→∞} Σ_{i=1}^n σ_i^2 = ∞    (1.141)

and

    lim_{n→∞} max_{i=1,2,...,n} (σ_i^2 / Σ_{j=1}^n σ_j^2) = 0.    (1.142)

Conditions (1.141) and (1.142) imply that no term X_i in the sum dominates the rest and that, for n → ∞, the contributions of the individual X_i to the sum uniformly tend to 0. Under the assumptions of theorem 1.10, the X_i a priori have this property.
...
Example 1.28 On weekdays, a car dealer sells on average one car (of a certain make) per μ = 2.4 days with a standard deviation of σ = 1.6 days. Let X_i be the time span between selling the (i-1) th and the i th car (X_0 = 0). Then Y_n = X_1 + X_2 + ... + X_n is the time point at which the n th car is sold (selling times negligibly small). If the X_i are assumed to be independent,

    E(Y_35) = 35 ⋅ 2.4 = 84,  Var(Y_35) = 35 ⋅ 1.6^2 = 89.6.

In view of (1.139), Y_35 has approximately an N(84, 89.6)-distribution. Hence, the probability that the 35th car is sold within the first 95 days is approximately

    P(Y_35 ≤ 95) ≈ Φ((95 - 84)/9.47) = Φ(1.16) = 0.877.

How many cars must the dealer have in stock at the beginning of a period of 75 days so that, with a probability of not less than 0.95, every demand within this period can be satisfied? (It is assumed that this special make of a car is delivered by the manufacturer at no other times.) The stock size n has to satisfy P(Y_{n+1} ≤ 75) ≤ 0.05, i.e., by the normal approximation,

    Φ((75 - 2.4 (n+1)) / (1.6 √(n+1))) ≤ 0.05.

Since the 0.05-percentile of the standardized normal distribution is x_{0.05} = -1.65, this is equivalent to

    (75 - 2.4 (n+1)) / (1.6 √(n+1)) ≤ -1.65.

Solving this quadratic inequality in √(n+1) yields √(n+1) ≥ 6.17, i.e. n + 1 ≥ 38.1. Hence, the dealer should have n = 38 cars in stock.
78

STOCHASTIC PROCESSES

Normal Approximation to the Binomial Distribution As pointed out in section
1
...

A series of n random experiments with respective outcomes X 1 , X 2 ,
...
Then
Yn = X1 + X2 +
...
The random variable Y n has a binomial
distribution with parameters n and p
...


Since the assumptions of theorem 1
...

Thus,
⎛ i + 1 − np
2 2
P(i 1 ≤ Z n ≤ i 2 ) ≈ Φ ⎜

⎜ np(1 − p)



⎛ i − 1 − np
⎟ − Φ⎜ 1 2



⎜ np(1 − p)




⎟;




0 ≤ i 1 ≤ i 2 ≤ n
...


The term ±1/2 is called a continuity correction
...
These
approximation formulas are the better, the larger n is and the nearer p is to 1/2
...


The approximation of the binomial distribution by the normal distribution is known
as the central limit theorem of Moivre-Laplace
...
12 (Moivre-Laplace) If the random variable X has a binomial distribution with parameters n and p, then, for all x,
⎛ X − np

lim P ⎜
≤ x⎟ =



n→∞ ⎜ np(1 − p)



x

2
1
∫ e −u /2 du
...
29 Electronic circuits are subjected to a quality test
...
What is the probability that the proportion of faulty units
in a sample of 1000 circuits is between 4% and 6%?
Let X be the random number of faulty circuits in the sample. X has a binomial distribution with parameters n = 1000 and p = 0.05:

    P(40 ≤ X ≤ 60) = Σ_{i=40}^{60} (1000 choose i) (0.05)^i (0.95)^{1000-i}.

For numerical reasons, it makes sense to apply the normal approximation: since

    E(X) = 1000 ⋅ 0.05 = 50  and  np(1-p) = 1000 ⋅ 0.05 ⋅ 0.95 = 47.5 > 10,

its application will yield satisfactory results:

    P(40 ≤ X ≤ 60) ≈ Φ((60 + 0.5 - 50)/6.892) - Φ((40 - 0.5 - 50)/6.892)
                   = Φ(1.523) - Φ(-1.523) = 0.872.
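With exact integer arithmetic the binomial sum is no longer a numerical obstacle, so the quality of the continuity-corrected approximation can be checked directly:

```python
from math import comb, erf, sqrt

# Recheck of example 1.29: exact binomial probability versus the normal
# approximation with continuity correction (n = 1000, p = 0.05).
def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

n, p = 1000, 0.05
exact = sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(40, 61))

s = sqrt(n * p * (1 - p))                # sqrt(47.5), about 6.892
approx = Phi((60 + 0.5 - 50) / s) - Phi((40 - 0.5 - 50) / s)
```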

Normal Approximation to the Poisson Distribution Let

    Y_n = X_1 + X_2 + ... + X_n

be the sum of n independent, Poisson distributed random variables X_1, X_2, ..., X_n with respective parameters λ_1, λ_2, ..., λ_n. Then, by the results of section 1.7.2 on z-transforms,

    M_{Y_n}(z) = e^{(λ_1 + λ_2 + ... + λ_n)(z - 1)}.

Hence, Y_n has a Poisson distribution with parameter λ = λ_1 + λ_2 + ... + λ_n. Since the assumptions of theorem 1.11 are fulfilled, a Poisson distributed random variable X with sufficiently large parameter λ = E(X) has approximately an N(λ, λ)-distribution:

    X ≈ N(λ, λ),  F_X(x) ≈ Φ((x - λ)/√λ),

    P(X = i) ≈ Φ((i + 1/2 - λ)/√λ) - Φ((i - 1/2 - λ)/√λ).

Since the distribution of a nonnegative random variable is approximated by the normal distribution, the assumption

    E(X) = λ > 3 √Var(X) = 3 √λ,  i.e.  λ > 9,

has to be made.

Example 1.30 The number X of traffic accidents a day in a town has a Poisson distribution with parameter λ = 12.

1) What is the probability that there are exactly 10 traffic accidents a day?

    P(X = 10) = (12^10 / 10!) e^{-12} = 0.1048.

The normal approximation yields

    P(X = 10) ≈ Φ((10 + 0.5 - 12)/√12) - Φ((10 - 0.5 - 12)/√12)
              = Φ(-0.43) - Φ(-0.73) = 0.3336 - 0.2330 = 0.1006.

2) What is the probability that there are at least 10 traffic accidents a day? For computational reasons, it is convenient to apply the normal approximation:

    P(X ≥ 10) = Σ_{i=10}^∞ (12^i / i!) e^{-12} ≈ 1 - Φ((9 + 0.5 - 12)/√12) = 1 - Φ(-0.73) = 0.7673.
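The same quantities computed without intermediate table rounding (the Φ-arguments in the example are rounded to two decimals, which shifts the last digits slightly):

```python
from math import erf, exp, sqrt

# Recheck of example 1.30 (X Poisson distributed with lambda = 12 > 9).
def Phi(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

lam = 12.0

def pmf(i):
    out = exp(-lam)
    for k in range(1, i + 1):
        out *= lam / k
    return out

exact_10 = pmf(10)                                            # about 0.1048
approx_10 = Phi((10.5 - lam) / sqrt(lam)) - Phi((9.5 - lam) / sqrt(lam))
exact_tail = 1.0 - sum(pmf(i) for i in range(10))             # P(X >= 10)
approx_tail = 1.0 - Phi((9.5 - lam) / sqrt(lam))
```

Without rounding, the approximation gives about 0.097 for part 1 and about 0.765 for part 2, close to the table-based values in the example.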
10 EXERCISES
Sections 1
...
3
1
...
Let A, B and C be the
events that a casting does not weigh more than 1 or 5 kg , exactly 10 kg , and at least
20 kg, respectively
...

1
...
Based on this random experiment, three random events are introduced as follows:
A = 'no person has gene g'
B='at least one person has gene g'
C = 'not more than one person has gene g'
(1) Characterize verbally the random events A ∩ B, B ∪ C and (A ∪ B) ∩ C
...

1
...
3; P(B) = 0
...
2
...

1
...
The results are:

diameter

yes
no

surface quality
acceptable unacceptable
170
15
8
7

A plate is selected at random from these 200
...

(1) Determine the probabilities P(A), P(B) and P( A ∩ B) from the matrix
...
1, determine P( A ∪ B) and P( A ∪ B)
...
5) A company optionally equips its newly developed PC Ibson with 2 or 3 hard
disk drives and with or without extra software and analyzes the first 1000 orders:

extra software yes
no

© 2006 by Taylor & Francis Group, LLC

hard disk drives
three
two
520
90
70
320

82

STOCHASTIC PROCESSES

A PC is selected at random from the first 1000 orders
...

(1) Determine the probabilities
P(A), P(B), and P(A ∩ B)
from the matrix
...
1 determine the probabilities
P(A ∪ B), P(A B), P(B A), P(A ∪ B B) and P(A |B)
...
6) 1000 bits are independently transmitted from a source to a sink
...
0005
...
7) To construct a circuit a student needs, among others, 12 chips of a certain type
...

How many chips have to be provided so that, with a probability of not less than 0
...
8) It costs $ 50 to find out whether a spare part required for repairing a failed device
is faulty or not
...

Is it on average more profitable to use a spare part without checking if
(1) 1% of all spare parts of that type
(2) 3% of all spare parts of that type
(3) 10 % of all spare parts of that type
are faulty ?
1
...
99 if the circuit is faultless
...
90 if the circuit
is faulty
...
02
...
10) Suppose 2% of cotton fabric rolls and 3% of nylon fabric rolls contain flaws
...

(1) What is the probability that a randomly selected roll used by the manufacturer
contains flaws?
(2) Given that a randomly selected roll used by the manufacturer does not contain
flaws, what is the probability that it is a nylon fabric roll?

© 2006 by Taylor & Francis Group, LLC

1 PROBABILITY THEORY

83

1

4

s

t

3
2

5

1
...
The figure indicates the possible
interruption of an edge (connection between two nodes of the transmission graph) by
a switch
...
All 5 switches operate independently
...
Only switch
3 allows for transmitting information into both directions
...

1
...
Random noise may cause transmission failures: If a 0 was sent,
then a 1 will arrive at the sink with probability 0
...
If a 1 was sent, then a 0 will arrive at the sink with probability 0
...
(Figure)
...
What is the probability that a 1 had been sent?
(2) A 0 has arrived
...
90

0

0
0
...
10

1

1

0
...
13) A biologist measured the weight of 132 eggs of a certain bird species [gram]:
i
1
weight x i
38
number of eggs n i 4

© 2006 by Taylor & Francis Group, LLC

2
41

3
42

4
43

5
44

6
45

7
46

8
47

9
48

10
50

6

7

10

13

26

33

16

10

7

84

STOCHASTIC PROCESSES

There are no eggs weighing less than 38 or more than 49
...

(1) Determine the probability distribution of X, i
...
p i = P(X = x i ); i = 1, 2,
...

(2) Determine P(43 ≤ X ≤ 48) and P(X > 45)
...

1
...
0

15
...
1

15
...
3

4

15
...
5 > 15
...

(1) Determine the probabilities p i = P(X = x i ); i = 1, 2,
...

(2) Determine the probabilities P(X ≤ 15
...
4), and P(15
...
5)
...

1
...
13
...

1
...
14
...

1
...

The probability that for some reason or other a passenger does not show up is 0
...

The passengers behave independently
...
18) Water samples are taken from a river once a week
...
It is known that on
average 10% of the samples are polluted
...
19) From the 300 chickens of a farm, 100 have contracted bird flu
...
20) Some of the 140 trees in a park are infested with a fungus
...

(1) If 25 trees from the 140 are infested, what is the probability that the sample contains at least one infested tree?
(2) If 5 trees from the 140 are infested, what is the probability that the sample contains at least two infested trees?
© 2006 by Taylor & Francis Group, LLC

85

1 PROBABILITY THEORY

1
...
Suppose that the
number of flaws follows a Poisson distribution with a mean of 0
...
What is the probability of more than 2 flaws in a section of length 10 centimetre?
1
...
1 per centimetre squared
...
23) The random number of crackle sounds produced per hour by an old radio has a
Poisson distribution with parameter λ = 12
...
24) Show that the following functions are probability density functions for some
value of c and determine c:

(1) f (x) = c x 2 ,

0≤x≤4

(2) f (x) = c (1 + 2x),
(3) f (x) = c e −x ,

0≤x≤2

0≤x<∞

These functions are assumed to be identically 0 outside their respective ranges
...
25) Consider a random variable X with probability density function
2
f (x) = x e −x /2 , x ≥ 0
...
5, P(X ≤ x) = 0
...
95
...
26) A road traffic light is switched on every day at 5:00 a
...
It always begins with
'red' and holds this colour 2 minutes
...
This cycle continues till midnight
...
m
...
27) According to the timetable, a lecture begins at 8:15
...

What is the probability that Sluggish arrives after Durrick in the venue?
1
...
24
...
29) The lifetimes of bulbs of a particular type have an exponential distribution with
parameter λ [h −1 ]
...
Their lifetimes can be assumed independent
...
30) The probability density function of the annual energy consumption of an enterprise [in 10 8 kwh ] is
f (x) = 30(x − 2) 2 ⎡ 1 − 2(x − 2) + (x − 2) 2 ⎤ ,



2 ≤ x ≤ 3
...

(2) What is the probability that the annual energy consumption exceeds 2
...
31) Assume X is normally distributed with mean 5 and standard deviation 4
...
5 , P(X > x) = 0
...
2 , P(3 < X < x) = 0
...
99
...
32) The response time of an average male car driver is normally distributed with
mean value 0
...
06 (in seconds)
...
6 seconds?
(2) What is the probability that the response time is between 0
...
55 seconds?
1
...

(1) Determine the probability that the tensile strength of a sample is at least 28 psi
...
34) The total monthly sick-leave time X of employees of a small company has a
normal distribution with mean 100 hours and standard deviation 20 hours
...
1?
1
...
; 0 ≤ θ ≤ 1
...


87

1 PROBABILITY THEORY
1
...
What distribution type arises when mixing the F α with regard
to the structure distribution density
f (α) = λ e −λα , λ > 0, α > 0 ?
Sections 1
...
5
1
...
Assume that an arriving customer does not find an available taxi, the previous one left 3 minutes ago, and no
other customers are waiting
...
38) The random variable X has distribution function
F(x) = λ x /(1 + λ x), λ > 0, x ≥ 0
...

1
...

3
x, 3 ≤ x ≤ 4
10

0,

otherwise

With the notation introduced in section 1
...

(1) Show that X ≤ Y
...

st
st
1
...

(1) Show that A ≤ B and A ≤ B
...
Show that if X is independent of
A and B then A + X ≤ B + X
...
Show that
AX ≤ BX
...
6 to 1
...
41) Every day a car dealer sells X cars of type I and Y cars of type II
...
1

0
...
1

0
...
1

2

0

0
...
1

Y
X

(1) Determine the marginal distributions of (X, Y)
...

1
...


(1) Are X and Y independent?
(2) Determine the probability density of Z = X Y
...
43) The random vector (X, Y) has joint density
f X,Y (x, y) = 6x 2 y,

0 ≤ x, y ≤ 1
...

1
...
20 of them achieve an average
daily turnover of $8000, whereas 4 achieve an average daily turnover of $ 10,000
...
The daily
turnovers of all shop-assistants are independent and have a normal distribution
...

(1) Determine E(Z) and Var(Z)
...
45) A helicopter is allowed to carry at most 8 persons provided that their total
weight does not exceed 620 kg
...

(1) What are the probabilities of exceeding the permissible load with 7 and 8 passengers, respectively?
(2) What would the maximum total permissible load have to be to ensure that, with
probability 0
...
46) A freighter has to be loaded with 2000 tons of hard coal
...

What is the smallest number n = n min of railway carriages which are necessary to
make sure that with a probability of not less than 0
...
47) In a certain geographical region, the height X of women has a normal distribution with E(X ) = 168 cm and Var(X ) = 64 cm , whereas the height Y of men has a
normal distribution with E(Y) = 175 cm and Var(Y) = 100 cm
...

Hint The desired probability has structure P(X ≥ Y) = P(X + (−Y) ≥ 0)
...
48)* Let X 1 and X 2 be independent and identically distributed with density
1
f (x) = π λ , x ∈ (−∞, +∞)
...
2
...
2)
...

1
...


(1) Determine the z-transform of X and by means of it E(X ) and Var(X )
...

Determine the z-transform of Z = X 1 + X 2 and by means of it E(Z) and Var(Z)
...

1
...
, X k be independent, binomially distributed random variables
with respective parameters (n 1 , p 1 ), (n 2 , p 2 ),
...

Under which condition has the sum Z = X 1 + X 2 +
...

1
...
e
...

⎩ 0, otherwise

(1) Are X and Y independent?
(2) Determine the density of the sum Z = X + Y
...
52) Let X have a Laplace distribution with parameters λ and μ , i
...
X has density
f (x) = λ e −λ x−μ ,
2

λ > 0, − ∞ < μ < +∞, − ∞ < x < +∞
...

1
...
Let B n be
the number of people in a sample of n randomly selected citizens from this town
who suffer from this disease (Bernoulli trial)
...
06 ≥ 0
...
05 for all n with n ≥ n 0
...


© 2006 by Taylor & Francis Group, LLC

CHAPTER 2

Basics of Stochastic Processes

2.1 INTRODUCTION

A change of the conditions under which a random experiment takes place will influence the outcome of the experiment, i.e. the probability distribution of X will change. Taking this dependence into account leads to more general random experiments than the ones defined in section 1.1. To illustrate such generalized random experiments, two simple
examples will be considered
...
1 a) At a fixed geographical point, the temperature is measured every
day at 12:00
...
The value
of x i will vary from year to year and, therefore, it can be considered a realization of a
random variable X i
...
Apart from random fluctuations of the temperature, the X i also
depend on a deterministic parameter, namely on the time, or, more precisely, on the
day of the year
...
Nevertheless, indexing the daily
temperatures is necessary, because modeling the obviously existing statistical dependence between the daily temperatures requires knowledge of the joint probability
distribution of the random vector ( X 1 , X 2 , X 3 )
...
The random outcomes of this generalized random experiment are sequences
of random variables {X 1 , X 2 ,
...
If on the ith day temperature x i has been measured,
then the vector (x 1 , x 2 ,
...
, 365] : x(t) = x i for t = i
...
, x 365 )
is a realization of the random vector (X 1 , X 2 ,
...

b) If a sensor graphically records the temperature over the year, then the outcome of
the measurement is a continuous function of time t: x = x(t), 0 ≤ t ≤ 1, where x(t) is
realization of the random temperature X(t) at time t at a fixed geographical location
...
It
will be denoted as {X(t), 0 ≤ t ≤ 1}
...
, X(t n )); 0 ≤ t 1 < t 2 <
...

This knowledge allows for statistically modelling the dependence between the X(t i )
in any sequence of random variables
X(t 1 ), X(t 2 ),
...

It is quite obvious that, for small time differences t i+1 − t i , there is a strong statistical
dependence between X(t i ) and X(t i+1 )
...

Example 2
...
For instance, if at a fixed time point and a fixed
observation point the temperature is measured along a vertical of length L to the
earth's surface, then one obtains a function x = x(h) with 0 ≤ h ≤ L which obviously
depends on the distance h of the measurement point to the earth's surface
...
Hence, the temperature
at distance h is a random variable X(h) and the generalized random experiment 'measuring the temperature along a vertical of length L', denoted as {X(h), 0 ≤ h ≤ L}, has
outcomes, which are real functions of h: x = x(h), 0 ≤ h ≤ L
...

Then the observation x depends on a vector of deterministic parameters:
x = x(θ), θ = (h, t)
...
However, this book only considers one-dimensional
parameter spaces
...
Despite maintaining constant production conditions, minor variations of the rope diameter can technologically not be avoided
...
This function will randomly vary from rope to rope
...
If X(d) denotes the

(Figure 2.1 Random variation of the diameter of a nylon rope)
diameter of a randomly selected rope at a distance d from the origin, then it makes
sense to introduce the corresponding generalized random experiment
{X(d ), 0 ≤ d ≤ 10}
with outcomes x = x(d ) , 0 ≤ d ≤ 10 (Figure 2
...

In contrast to the random experiments considered in chapter 1, the outcomes of
which are real numbers, the outcomes of the generalized random experiments, dealt
with in examples 2
...
2, are real functions
...
However, the terminology stochastic processes is more common and will be used throughout the
book
...
To simplify the terminology and in view of the overwhelming majority of applications, in this book the parameter t is interpreted as time
...
Further, let Z denote the set of all values,
the random variables X(t) can assume for all t ∈ T
...

If T is a finite or countably infinite set, then {X(t), t ∈ T} is called a stochastic process in discrete time or a discrete-time stochastic process
...
} (example 2
...
On the
other hand, every sequence of random variables can be thought of as a stochastic
process in discrete time
...
A stochastic process
{X(t), t ∈ T} is said to be discrete if its state space Z is a finite or a countably infinite set, and a stochastic process {X(t), t ∈ T} is said to be continuous if Z is an interval
...
Throughout this book the
state space Z is assumed to be a subset of the real axis
...
e
...
Such a function is called a sample path, a trajectory or a realization
of the stochastic process
...
The sample
paths of a stochastic process in discrete time are, therefore, sequences of real numbers, whereas the sample paths of stochastic processes in continuous time can be any
functions of time
...
The set of all sample paths of
a stochastic process with parameter space T is, therefore, a subset of all functions
over the domain T
...
Random fluctuations of the voltage
are for instance caused by thermal noise
...
2)
...

This is characterized by random fluctuations in the energy of received signals caused
by the dispersion of radio waves as a result of inhomogeneities in the atmosphere and
by meteorological and industrial noise
...
) 'Classic' applications of stochastic processes in economics are modeling the development of share prices, profits, and prices of commodities over time
...
In statistical quality control, they model the fluctuation of quality criteria over time
...
One of the first applications of stochastic processes can be found in biology: modeling the development
in time of the number of species in a population
...
2 Voltage fluctuations caused by random noise
© 2006 by Taylor & Francis Group, LLC

t

2 BASICS OF STOCHASTIC PROCESSES

95

2
...
Let F t (x) be the distribution function of X(t):
F t (x) = P(X(t) ≤ x), t ∈ T
...
In view of the statistical dependence, which generally exists between the X(t 1 ), X(t 2 ),
...
, t n , the family of the one-dimensional distribution functions {F t (x), t ∈ T}
does not completely characterize a stochastic process (see examples 2
...
2)
...
, for all n-tuples t 1 , t 2 ,
...
, x n } with
x i ∈ Z , the joint distribution function of the random vector
(X(t 1 ), X(t 2 ),
...
,t n (x 1 , x 2 ,
...
, X(t n ) ≤ x n )
...
1)

The set of all these joint distribution functions defines the probability distribution of
the stochastic process
...
, X(t n ) ∈ A n )
for all t 1 , t 2 ,
...
, n; n = 1, 2,
...
2)
m(t) = E(X(t)), t ∈ T
...
If the densities f t (x) = dF t (x) /dx exist, then
+∞

m(t) = ∫ −∞ x f t (x) dx ,

t ∈ T
...

Hence, in view of (1
...
64),

C(s, t) = Cov (X(s), X(t)) = E([X(s) − m(s)] [X(t) − m(t)]) ; s, t ∈ T,

(2
...


© 2006 by Taylor & Francis Group, LLC

(2
...


(2
...


(2
...


(2
...
3 shows that this need not be the case
...
According to (1
...
8)
ρ(s, t) =

...
This is useful when considering covariances and correlations between X(s) and Y(s) with regard to different
stochastic processes {X(t), t ∈ T} and {Y(t), t ∈ T}
...
Example 2.3 (cosine wave with random amplitude) Let

    X(t) = A cos ωt,

where A is a nonnegative random variable with E(A) < ∞. Processes of this type occur, for instance, in electrical engineering. (Random deviations of the amplitude from a nominal value are technologically unavoidable.) The trend function of this process is

    m(t) = E(A) cos ωt.

By (2.4), the covariance function is

    C(s, t) = E([X(s) - m(s)][X(t) - m(t)]) = E([A - E(A)]^2)(cos ωs)(cos ωt),

hence,

    C(s, t) = Var(A)(cos ωs)(cos ωt).

Although X(s) and X(t) for s ≠ t are in general not identically distributed, they are strongly correlated; actually, the correlation function between X(s) and X(t) is identically equal to 1: ρ(s, t) ≡ 1.

The process of example 2.3 has a special feature: once the random variable A has assumed a value a, the process develops in a strictly deterministic way. (The same comment refers to examples 2.6 and 2.7.) Many stochastic processes occurring in applications do not have this property. The following example
belongs to this category
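The covariance function of example 2.3 is easy to confirm by simulation; the frequency ω and the amplitude distribution below are arbitrary illustrative assumptions:

```python
import numpy as np

# Simulation sketch of example 2.3: X(t) = A cos(wt). The covariance
# function is C(s,t) = Var(A) cos(ws) cos(wt); w and the amplitude
# distribution are illustrative assumptions.
rng = np.random.default_rng(5)
w = 2.0
A = rng.uniform(0.5, 1.5, 100_000)      # E(A) = 1, Var(A) = 1/12

s, t = 0.3, 1.1
Xs, Xt = A * np.cos(w * s), A * np.cos(w * t)
emp_cov = float(np.cov(Xs, Xt)[0, 1])
theory = (1.0 / 12.0) * np.cos(w * s) * np.cos(w * t)
```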
...
3 Pulse code modulation

Example 2
...
The symbol 0 is transmitted by
sending nothing during a time interval of length one
...
The
source has started operating in the past
...
9)

where the A n ; n = 0, ±1, ±2,
...

h(t) = ⎨
elsewhere
⎩ 0

For any t,
X(t) =

0 with probability
p

...
3 is generated
by the following partial sequence of a signal:

...

© 2006 by Taylor & Francis Group, LLC

98

STOCHASTIC PROCESSES

Note that the time point t = 0 coincides with the beginning of a new transmission period
...

For n ≤ s, t < n + 1; n = 0, ±1, ±2,
...


If m ≤ s < m + 1 and n ≤ t < n + 1 with m ≠ n , then X(s) and X(t) are independent
...


...
A modification of the pulse code modulation process is considered in example 2
...
As the following example shows, the pulse code modulation is a special
shot noise process
...
Example 2.5 (shot noise process) At time points T_n, pulses of random intensity A_n are induced. The sequences {T_1, T_2, ...} and {A_1, A_2, ...} are assumed to have the properties

1) T_1 < T_2 < ...  and  lim_{n→∞} T_n = ∞,
2) E(A_n) < ∞;  n = 1, 2, ...

The sequence {T_1, T_2, ...} is called a pulse process; together with the intensities A_n attached to the T_n, it will be called a marked point process. If h(t) denotes the response of a system at time t to a pulse of unit intensity induced at time 0, with h(t) = 0 for t < 0, then the process

    X(t) = Σ_{n=1}^∞ A_n h(t - T_n)    (2.11)

is called a shot noise process or just shot noise. The factors A_n are sometimes called the amplitudes of the shot noise process; in example 2.4 they are even constant. If the pulse process is given by {T_n; n = 0, ±1, ±2, ...} with amplitudes {A_n; n = 0, ±1, ±2, ...}, then

    X(t) = Σ_{n=-∞}^{+∞} A_n h(t - T_n).    (2.12)

A classic application is the fluctuation of the anode current in vacuum tubes. This fluctuation is caused by random current impulses, which are initiated by emissions of electrons from the cathode at random time points (Schottky effect); the resulting current is a shot noise process of the form (2.11).
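A shot noise path value is easy to simulate when the pulse times form a Poisson process. The response function, rates and amplitude distribution below are arbitrary illustrative assumptions; the stated long-run mean follows from Campbell's theorem, a standard fact not derived in the text above:

```python
import numpy as np

# Simulation sketch of a shot noise process X(t) = sum A_n h(t - T_n) with
# Poisson pulse times on [0, t], iid exponential amplitudes and response
# h(u) = exp(-alpha*u). By Campbell's theorem, E(X(t)) tends to
# lam * E(A) / alpha as t grows.
rng = np.random.default_rng(11)
lam, alpha, t = 3.0, 1.5, 50.0

def shot_noise(t):
    n = rng.poisson(lam * t)
    T = rng.uniform(0.0, t, n)           # pulse times of the Poisson process
    A = rng.exponential(1.0, n)          # amplitudes, E(A) = 1
    return float(np.sum(A * np.exp(-alpha * (t - T))))

vals = np.array([shot_noise(t) for _ in range(5_000)])
mean_theory = lam * 1.0 / alpha          # = 2.0
```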


2
...
In the context of example 2
...
) Or, has the sample path of the temperature in
January any influence on the temperature curve in February? For reliably predicting
tomorrow's temperature at 12:00, is it sufficient to know the present temperature or
would knowledge of the temperature curve during the past two days allow a more accurate prediction? What influence has time on trend or covariance function?
Special importance have those stochastic processes, for which the joint distribution
functions (2
...
e
...
, t n to each other have an impact on the joint distribution of the
random variables X(t 1 ), X(t 2 ),
...

Strong Stationarity A stochastic process {X(t), t ∈ T} is said to be strongly stationary or strictly stationary if for all n = 1, 2, ..., for all n-tuples (t_1, t_2, ..., t_n) with t_i ∈ T and t_i + h ∈ T; i = 1, 2, ..., n, and for all vectors (x_1, x_2, ..., x_n) with x_i ∈ Z, the joint distribution function of the random vector (X(t_1), X(t_2), ..., X(t_n)) satisfies

    F_{t_1, t_2, ..., t_n}(x_1, x_2, ..., x_n) = F_{t_1+h, t_2+h, ..., t_n+h}(x_1, x_2, ..., x_n).    (2.13)

Thus, the joint distribution functions of a strongly stationary process are invariant against a common shift of all time points. In particular, by letting n = 1, property (2.13) implies that the one-dimensional distribution functions F_t(x) do not depend on t. In this case there exists a distribution function F(x) so that
F t (x) ≡ F(x) , t ∈ T
...
14)

Hence, trend- and variance function of {X(t), t ∈ T} do not depend on t either:
m(t) = E(X(t)) ≡ m = constant ,

(2
...

The trend function of a strongly stationary process is, therefore, a parallel to the time
axis and the fluctuations of its sample paths around the trend function experience no
systematic changes with increasing t
...
13) yields for all s < t ,
F 0, t−s (x 1 , x 2 ) = F s, t (x 1 , x 2 ),
i
...
the joint distribution function of the random vector (X s , X t ) , and, therefore, the
mean value of the product X s X t , depend only on the difference τ = t − s , and not on
the absolute values of s and t
...
4),
C(s, t) = E[X(s) X(t)] − m 2 for s, t ∈ T,
C(s, t) must have the same property:
C(s, t) = C(s, s + τ) = C(0, τ) = C(τ)
...

(2
...
e
...


(2
...
, X(t n )} in order to check
whether a stochastic process is strongly stationary or not
...
15) and
(2
...
Hence, based on these two properties, another concept of stationarity has
been introduced
...

Second Order Process A stochastic process {X(t), t ∈ T} is said to be a second order process if

E(X 2 (t)) < ∞ for all t ∈ T
...
18)

The existence of the second moments of X(t) as required by assumption (2
...
8
...


© 2006 by Taylor & Francis Group, LLC

2 BASICS OF STOCHASTIC PROCESSES

101

Weak Stationarity A stochastic process {X(t), t ∈ T} is said to be weakly stationary if it is a second order process which has properties (2.15) and (2.16).

A weakly stationary process is generally not strongly stationary. But if a second order process is strongly stationary, then, as shown above, it is also weakly stationary.

Further important properties of stochastic processes are based on properties of their increments.


A stochastic process {X(t), t ∈ T} is said to have homogeneous or stationary increments if for arbitrary, but fixed t_1, t_2 ∈ T the increment X(t_2 + τ) − X(t_1 + τ) has the same probability distribution for all τ with t_1 + τ ∈ T, t_2 + τ ∈ T.

A stochastic process with homogeneous (stationary) increments need not be stationary in any sense. A stochastic process {X(t), t ∈ T} is said to have independent increments if for all n = 2, 3, ... and for all n-tuples (t_1, t_2, ..., t_n) with t_i ∈ T and t_1 < t_2 < ... < t_n, the increments

X(t_2) − X(t_1), X(t_3) − X(t_2), ..., X(t_n) − X(t_{n−1})

are independent random variables.

Gaussian Process A stochastic process {X(t), t ∈ T} is a Gaussian process if the random vectors (X(t_1), X(t_2), ..., X(t_n)) have a joint normal (Gaussian) distribution for all n = 1, 2, ... and all n-tuples (t_1, t_2, ..., t_n) with t_i ∈ T and t_1 < t_2 < ... < t_n.

Gaussian processes have an important property: A Gaussian process is strongly stationary if and only if it is weakly stationary.

Markov Process A stochastic process {X(t), t ∈ T} has the Markov(ian) property if for all (n+1)-tuples (t_1, t_2, ..., t_{n+1}) with t_i ∈ T and t_1 < t_2 < ... < t_{n+1}, and for any A_i ⊆ Z; i = 1, 2, ..., n+1,

P(X(t_{n+1}) ∈ A_{n+1} | X(t_n) ∈ A_n, ..., X(t_1) ∈ A_1) = P(X(t_{n+1}) ∈ A_{n+1} | X(t_n) ∈ A_n).   (2.19)


The Markov property has the following implication: If t_{n+1} is a time point in the future, t_n the present time point and, correspondingly, t_1, t_2, ..., t_{n−1} time points in the past, then the future development of the process depends only on its present state, not on its evolution in the past. Stochastic processes having the Markov property are called Markov processes. A Markov process with discrete parameter space T is called a discrete-time Markov process. Otherwise it is called a continuous-time Markov process. A Markov process with discrete state space Z is called a Markov chain. Thus, a discrete-time Markov chain has both a discrete state space and a discrete parameter space.

Markov processes play an important role in all sorts of applications, mainly for four reasons: 1) Many practical phenomena can be modeled by Markov processes. 2) ... 3) Computer algorithms are available for numerical evaluations. 4) ... In this book, the practical importance of Markov processes is illustrated by many examples.

Theorem 2.1 A Markov process is strongly stationary if and only if its one-dimensional probability distributions do not depend on time, i.e. if there exists a distribution function F(x) with

F_t(x) = P(X(t) ≤ x) = F(x) for all t ∈ T.

Hence, for Markov processes, property (2.14) is necessary and sufficient for strong stationarity.

A second order process {X(t), t ∈ T} is said to be mean-square continuous at t = t_0 if

lim_{h→0} E([X(t_0 + h) − X(t_0)]²) = 0.   (2.20)

The convergence used in (2.20) is convergence in mean square (section 1). There is a simple criterion for a second order stochastic process to be mean-square continuous at t_0: A second order process {X(t), t ∈ T} is mean-square continuous at t_0 if and only if its covariance function C(s, t) is continuous at (s, t) = (t_0, t_0).



The following two examples make use of two addition formulas from trigonometry:

cos α cos β = (1/2)[cos(β − α) + cos(α + β)],
cos(β − α) = cos α cos β + sin α sin β.

Example 2.6 (cosine wave with random amplitude and random phase) In modifying the preceding example, let

X(t) = A cos(ωt + Φ),

where A and Φ are independent random variables and E(A²) < ∞. The random parameter Φ is assumed to be uniformly distributed over [0, 2π] and independent of A. Since

E(cos(ωt + Φ)) = (1/2π) ∫_0^{2π} cos(ωt + ϕ) dϕ = (1/2π) [sin(ωt + ϕ)]_0^{2π} = 0,

the trend function of this process is identically zero: m(t) ≡ 0. Hence, by (2.4), its covariance function is

C(s, t) = E{[A cos(ωs + Φ)][A cos(ωt + Φ)]}
        = E(A²) (1/2π) ∫_0^{2π} cos(ωs + ϕ) cos(ωt + ϕ) dϕ
        = E(A²) (1/2π) ∫_0^{2π} (1/2) {cos ω(t − s) + cos[ω(s + t) + 2ϕ]} dϕ.

Since the integral of the second term is zero, C(s, t) depends only on the difference τ = t − s:

C(τ) = (1/2) E(A²) cos ωτ.

Thus, the process {X(t), t ∈ (−∞, +∞)} is weakly stationary.
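The covariance formula of example 2.6 is easy to check numerically. The sketch below estimates E[X(s) X(s + τ)] by Monte Carlo; the distribution of A is not fixed in the example, so A ~ U[0, 1] is an arbitrary assumption made here only for illustration:

```python
import math
import random

def cov_estimate(omega, s, tau, n_paths=200_000, seed=1):
    """Monte Carlo estimate of C(tau) = E[X(s) X(s + tau)] for the
    cosine wave X(t) = A cos(omega*t + Phi) with Phi ~ U[0, 2*pi]."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n_paths):
        a = rng.random()                        # A ~ U[0, 1], so E(A^2) = 1/3
        phi = rng.uniform(0.0, 2.0 * math.pi)   # random phase
        acc += (a * math.cos(omega * s + phi)) * (a * math.cos(omega * (s + tau) + phi))
    return acc / n_paths

omega, tau = 2.0, 0.5
est = cov_estimate(omega, s=0.7, tau=tau)            # result should not depend on s
exact = 0.5 * (1.0 / 3.0) * math.cos(omega * tau)    # (1/2) E(A^2) cos(omega*tau)
print(est, exact)
```

Repeating the run with a different value of s leaves the estimate unchanged, which is exactly the weak stationarity asserted above.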

Example 2.7 Let A and B be two random variables with

E(A) = E(B) = 0,  Var(A) = Var(B) = σ²,  E(AB) = 0.

The stochastic process {X(t), t ∈ (−∞, +∞)} is defined by

X(t) = A cos ωt + B sin ωt.

Its trend function is identically zero: m(t) ≡ 0. Hence, by (2.4), its covariance function is C(s, t) = E(X(s) X(t)). Therefore,

C(s, t) = E(A² cos ωs cos ωt + B² sin ωs sin ωt) + E(AB cos ωs sin ωt + AB sin ωs cos ωt)
        = σ² (cos ωs cos ωt + sin ωs sin ωt) + E(AB) (cos ωs sin ωt + sin ωs cos ωt)
        = σ² cos ω(t − s).

Thus, the process {X(t), t ∈ (−∞, +∞)} is weakly stationary.
8 (randomly delayed pulse code modulation) Based on the stochastic
process {X(t), t ∈ (−∞, +∞)} defined in example 2
...
Thus, when shifting the
sample paths of the process {X(t), t ∈ (−∞, +∞)} exactly Z time units to the right, one
obtains the corresponding sample paths of the process {Y(t), t ∈ (−∞, +∞)}
...
3 exactly Z = z time
units to the right yields the corresponding section of the sample path of the process
{Y(t), t ∈ (−∞, +∞)} shown in Figure 2
...

The trend function of the process {Y(t), t ∈ (−∞, +∞)} is
m(t) ≡ a (1 − p)
...
Then
P( B) = t − s ,

P(B) = 1 − t − s
...

Therefore,
C(s, t) = 0 if t − s > 1 and/or B occurs
...
Hence, the covariance function of {Y(t), t ∈ (−∞, +∞)} given t − s ≤ 1 can be obtained as follows:
y(t)

a

1

-1

0

0

1

z 1

1

2

0

3

0

4

1

5

Figure 2
...
5 Covariance function of the randomly delayed pulse code modulation

C(s, t) = E(X(s) X(t) B)P(B) + E(X(s) X(t) B) P(B) − m(s) m(t)
= E(X(s)) E(X(t)) P(B) + E([X(s)] 2 ) P(B) − m(s) m(t)
= [a(1 − p)] 2 t − s + a 2 (1 − p)(1 − t − s ) − [a(1 − p)] 2
...

elsewhere

Thus, the process {Y(t), t ∈ (−∞, +∞)} is weakly stationary
...
3 to example 2
...
4
...
4 EXERCISES
2
...

Is this process weakly stationary?
2
...

Determine its trend function m(t) and, for μ = 2 and σ = 0
...


2
...

(1) Determine trend-, covariance- and correlation function of {X(t), t ∈ (−∞, +∞)}
...
4) Let X(t) = A(t) sin(ω t + Φ) , where A(t) and Φ are independent, nonnegative random variables for all t, and let Φ be uniformly distributed over [0, 2π]
...

2
...
, a n be a sequence of real numbers and {Φ 1 , Φ 2 ,
...
Determine covariance- and correlation function of the stochastic process {X(t), t ∈ (−∞, +∞)} given by
n

X(t) = Σ i=1 a i sin(ω t + Φ i )
...
6)* A modulated signal (pulse code modulation) {X(t), t ∈ (−∞, +∞)} is given by
+∞
X(t) = Σ −∞ A n h(t − n) ,

where the A n are independent and identically distributed random variables which
can only take on values −1 and +1 and have mean value 0
...

0
elsewhere

1) Sketch a section of a possible sample path of the process {X(t), t ∈ (−∞, +∞)}
...

3) Let Y(t) = X(t − Z), where the random variable Z has a uniform distribution over
[0, 1]
...
7) Let {X(t), t ∈ (−∞, +∞)} and {Y(t), t ∈ (−∞, +∞)} be two independent, weakly
stationary stochastic processes, whose trend functions are identically 0 and which
have the same covariance function C(τ)
...

2
...

Verify: (1) The discrete-time stochastic process {X(t); t = 1, 2,
...
(2) The continuous-time stochastic process {X(t), t ≥ 0} is neither weakly nor strongly stationary
...
9) Let {X(t), t ∈ (−∞, +∞)} and {Y(t), t ∈ (−∞, +∞)} be two independent stochastic
processes with trend- and covariance functions m X (t), m Y (t) and C X (s, t), C Y (s, t),
respectively
...


Determine the covariance functions of the stochastic processes {U(t), t ∈ (−∞, +∞)}
and {V(t), t ∈ (−∞, +∞)}
...
1 BASIC CONCEPTS
A point process is a sequence of real numbers {t 1 , t 2 ,
...
and

lim t i = +∞
...
1)

That means, a point process is a strictly increasing sequence of real numbers, which
does not have a finite limit point
...
), failure time points of machines, time points of traffic accidents, occurrence of nature catastrophies, occurrence of supernovas,
...
Hence, the t i are called event times
...

If not stated otherwise, the assumption t 1 ≥ 0 is made
...
For instance, sequences {t 1 , t 2 ,
...
Then t i denotes the distance of the
i th pothole from the beginning of the road
...
(This is the base
of the well-known Bitterlich method for estimating the total number of trees in a forest stand
...
1), they have to be considered finite samples from a point process
...
} can equivalently be represented by the sequence of its
interevent (interarrival) times
{ y 1 , y 2 ,
...
; t 0 = 0
...
This number is denoted as n(t) :
n(t) = max {n, t n ≤ t}
...
Here and in what follows, it is assumed that more than one
event cannot occur at a time
...

The number of events, which occur in an interval (s, t] , s < t, is
n(s, t) = n(t) − n(s)
...

⎩ 0 otherwise
Then,

(3
...

Example 3
...
Then, within the first 16 seconds, n(16) = 3 cars passed
the control point, and in the interval (31, 49] exactly n(31, 49) = n(49) − n(30) = 5
cars passed the control point
...
2), given the time
span A = (10, 20] [51, 60]
I 18 (A) = I 24 (A) = I 51 (A) = I 57 (A) = I 59 (A) = 1,
I i (A) = 0 for i ≠ 18, 24, 51, 57, 59
...

Recurrence Times The forward recurrence time of a point process {t 1 , t 2 ,
...
, t 0 = 0
...
3)
Hence, a(t) is the time span from t (usually interpreted as the 'presence') to the occurrence of the next event
...

(3
...

The backward recurrence time b(t) with respect to time point t is
b(t) = t − t n(t)
...
5)

Thus, b(t) is the time which has elapsed from the last event time before t to time t
...
For instance: If t i is the time point the i th customer arrives at a supermarket, then the customer will spend there a certain amount of
money m i
...
If t i denotes the time of the
i th bank robbery in a town, then the amount m i the robbers got away with is of interest
...
If t i is the time of the i th
supernova in a century, then its light intensity m i is of interest to astronomers, and
so on
...
}, a sequence of two-dimensional vectors
{(t 1 , m 1 ), (t 2 , m 2 ),
...
6)

with m i being an element of a mark space M is called a marked point process
...

Random Point Processes Usually the event times are random variables
...
} with
T 1 < T 2 <
...
7)

is a random point process
...
; T 0 = 0,
a random point process can equivalently be defined as a sequence of positive random
variables {Y 1 , Y 2 ,
...


In either case, with the terminology introduced in section 2
...
Thus, a point process (3
...
A
point process is called simple if at any time point t not more than one event can occur
...
} is said to be recurrent if its corresponding sequence of interarrival times {Y 1 , Y 2 ,
...
The most important recurrent
point processes are homogenous Poisson processess and renewal processes (sections
3
...
1 and 3
...

Random Counting Processes Let
N(t) = max {n, T n ≤ t}
be the random number of events occurring in the interval (0, t]
...
} is called the
random counting process belonging to the random point process {T 1 , T 2 ,
...

© 2006 by Taylor & Francis Group, LLC

110

STOCHASTIC PROCESSES

Conversely, every stochastic process {N(t), t ≥ 0} in continuous time having these
three properties is the counting process of a certain point process {T 1 , T 2 ,
...
} , {Y 1 , Y 2 ,
...
For that reason, a random point process is frequently defined as a continuous-time stochastic process {N(t), t ≥ 0} with properties 1 to 3
...

The most important characteristic of a counting process {N(t), t ≥ 0} is the probability distribution of its increments N(s, t) = N(t) − N(s) , which determines for all intervals [s, t), s < t, the probabilities
p k (s, t) = P(N(s, t) = k);

k = 0, 1,
...


(3
...


(3
...

Figure 3
...

Note In what follows the attribute 'random' is usually omitted if it is obvious from the
notation or the context that random point processes or random counting processes are
being dealt with
...
1 (stationarity) A point process {T 1 , T 2 ,
...
} is strongly stationary (section 2
...
, i k with 1 ≤ i 1 < i 2 <
...
and
for any τ = 0, 1, 2,
...
, Y i } and {Y i +τ , Y i +τ ,
...

1

2

k

1

2

k

n(t)

6
4
2
0

t1

t2

t3

t4 t5

t6

Figure 3
...
} is strongly stationary,
the corresponding counting process {N(t), t ≥ 0} has homogeneous increments and
vice versa
...
1:
Corollary A point process {T 1 , T 2 ,
...

Hence, for a stationary point process, the probability distribution of any increment
N(s, t) depends only on the difference τ = t − s :
p k (τ) = P(N(s, s + τ) = k);

k = 0, 1,
...


(3
...


(3
...
} nor its
corresponding counting process {N(t), t ≥ 0} can be stationary as defined in section
2
...
In particular, since only simple point processes are considered, the sample paths
of {N(t), t ≥ 0} are step functions with jump heights being equal to 1
...
, T −2 , T 1 , T 0 , T 1 , T 2 ,
...
Then their sample
paths are also doubly infinite sequences: {
...
} and only the increments of the corresponding counting process over finite intervals are finite
...
By making use of notation (3
...


(3
...

Hence, the mean number of events occurring in any interval (s, t] of length τ = t − s is
m(s, t) = λ (t − s) = λτ
...
} of a stationary random point process, λ is estimated
by the number of events occurring in [0, t] divided by the length of this interval:
λ = n(t)/t ,
In example 3
...
233
...
This function allows to determine the mean number of events m(s, t) occurring in an interval (s, t] : For any s, t with 0 ≤ s < t,
t

m(s, t)) = ∫ s λ(x) dx
...


(3
...
14)

Hence, for Δt → 0,
so that for small Δt the product λ(t) Δt is approximately the mean number of events
in (t, t + Δt]
...
14) is: If Δt is sufficiently small, then
λ(t) Δt is approximately the probability of the occurrence of an event in the interval
[t, t + Δt]
...

(For Landau's order symbol o(x) , see (1
...
)
Random Marked Point Processes Let {T 1 , T 2 ,
...
Then the sequence
{(T 1 , M 1 ), (T 2 , M 2 ),
...
15)

is called a random marked point process
...
6)
...
} considered in example 2
...

Random marked point processes are dealt with in full generality in Matthes, Kerstan,
and Mecke [60]
...

Compound Stochastic Processes Let {(T 1 , M 1 ), (T 2 , M 2 ),
...
The stochastic process {C(t), t ≥ 0} defined by

for 0 ≤ t < T 1
⎪0
C(t) = ⎨
⎪ N(t) M for
t ≥ T1
⎩ Σ i=1 i
is called a compound (cumulative, aggregate) stochastic process
...
If {T 1 , T 2 ,
...
If T i is the time of
the i th breakdown of a machine and M i the corresponding repair cost, then C(t) is
the total repair cost in [0, t)
...
2 POISSON PROCESSES
3
...
1

Homogeneous Poisson Processes

3
...
1
...
Moreover, there is a close relationship between the homogeneous
Poisson process and the exponential distribution (theorem 3
...

Definition 3
...

3) Its increments N(s, t) = N(t) − N(s), 0 ≤ s < t, have a Poisson distribution with parameter λ(t − s) :
(λ(t − s)) i −λ(t−s)
(3
...
,
i!
or, equivalently, introducing the length τ = t − s of the interval [s, t], for all τ > 0,
P(N(s, t) = i) =

P(N(s, s + τ) = i) =

(λτ) i −λτ
e
; i = 0, 1,
...
17)

(3
...

Thus, the corresponding Poisson point process {T 1 , T 2 ,
...
1
...
1 A counting process {N(t), t ≥ 0} with N(0) = 0 is a homogeneous Poisson process with intensity λ if and only if it has the following properties:
a) {N(t), t ≥ 0} has homogeneous and independent increments
...
e
...

c) P(N(t, t + h) = 1) = λ h + o(h)
...
2 implies properties a), b) and c), it is only necessary to show that a homogeneous Poisson process satisfies properties b) and c)
...
17):
P(N(t, t + h) ≥ 2) = e −λh
= λ 2 h 2 e −λh

© 2006 by Taylor & Francis Group, LLC

∞ (λh) i

Σ

i=2

i!

(λh) i
≤ λ 2 h 2 = o(h)
...
17) and the simplicity of the Poisson process proves c):
P(N(t, t + h) = 1) = 1 − P(N(t, t + h) = 0) − P(N(t, t + h) ≥ 2)
= 1 − e −λh + o(h) = 1 − (1 − λ h) + o(h)
= λ h + o(h)
...
In view of the assumed homogeneity of the increments, it is sufficient to prove the validity of (3
...
Thus, letting
p i (t) = P(N(0, t) = i) = P(N(t) = i) ; i = 0, 1,
...

i!

From a),
p 0 (t + h) = P(N(t + h) = 0) = P(N(t) = 0, N(t, t + h) = 0)
= P(N(t) = 0) P( N(t, t + h) = 0) = p 0 (t) p 0 (h)
...

h
Taking the limit as h → 0 yields
p 0 (t) = −λ p 0 (t)
...
18) holds for i = 0
...

Because of c), the sum in the last row is o(h)
...
18)

3 POINT PROCESSES

115

p i (t + h) − p i (t)
= −λ [ p i (t) − p i−1 (t)] + o(h)
...
19)
p i (t) = −λ[ p i (t) − p i−1 (t)]; i = 1, 2,
...
18) is obtained by induction
...
1 is that the properties a), b) and c) can be
verified without any quantitative investigations, only by qualitative reasoning based
on the physical or other nature of the process
...

Note Throughout this chapter, those events, which are counted by a Poisson process
{N(t), t ≥ 0}, will be called Poisson events
...
} be the point process, which belongs to the homogeneous Poisson
process {N(t), t ≥ 0}, i
...
T n is the random time point at which the n th Poisson event
occurs
...


(3
...


(3
...


On the right-hand side of this equation, all terms but one cancel:
f T n (t) = λ

(λt) n−1 −λt
e ; t ≥ 0, n = 1, 2,
...
22)

Thus, T n has an Erlang distribution with parameters n and λ
...
; k = 1, 2,
...

are independent and identically distributed as T 1 (see example 1
...
Moreover,
n

T n = Σ i=1 Y i
...
2 Let {N(t), t ≥ 0} be a counting process and {Y 1 , Y 2 ,
...
Then {N(t), t ≥ 0} is a homogeneous Poisson
process with intensity λ if and only if the Y 1 , Y 2 ,
...

The counting process {N(t), t ≥ 0} is statistically equivalent to both its corresponding
point process {T 1 , T 2 ,
...
Hence, {T 1 , T 2 ,
...
} are sometimes also called Poisson
processes
...
2 From previous observations it is known that the number of traffic accidents N(t) in an area over the time interval [0, t) can be described by a homogeneous
Poisson process {N(t), t ≥ 0}
...
e
...
25 [h −1 ]
...

In view of the independence and the homogeneity of the increments of {N(t), t ≥ 0},
p can be determined as follows:
p = P(N(10) − N(0) ≤ 1) P(N(16) − N(10) ≥ 2) P(N(24) − N(16) = 0)
= P(N(10) ≤ 1) P(N(6) ≥ 2) P(N(8) = 0)
...
25⋅10 + 0
...
25⋅10 = 0
...
25⋅6 − 0
...
25⋅6 = 0
...
25⋅8 = 0
...

Hence, the desired probability is p = 0
...

(2) What is the probability that the 2 nd accident occurs not before 5 hours?
Since T 2 , the random time of the occurrence of the second accident, has an Erlang
distribution with parameters n = 2 and λ = 0
...
25⋅5 (1 + 0
...

2

Thus, P(T 2 > 5) = 0
...


© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

117

The following examples make use of the hyperbolic sine and cosine functions:
x
−x
sinh x = e − e ,
2

x
−x
cosh x = e + e ,
2

x ∈ (−∞, +∞)
...
3 (random telegraph signal ) A random signal X(t) have structure
(3
...
Signals of this structure are called random telegraph signals
...
Obviously, X(t) = 1 or X(t) = −1 and Y determines
the sign of X(0)
...
2 shows a sample path x = x(t) of the stochastic process
{X(t), t ≥ 0} on condition Y = 1 and T n = t n ; n = 1, 2,
...
To see this, firstly note that
X(t) 2 = 1 < ∞ for all t ≥ 0
...
With
I(t) = (−1) N(t) ,
its trend function is m(t) = E(X(t)) = E(Y) E(I(t))
...

It remains to show that the covariance function C(s, t) of this process depends only
on t − s
...
e
...
2 Sample path of the random telegraph signal

© 2006 by Taylor & Francis Group, LLC

t

118

STOCHASTIC PROCESSES

Analogously,
P(I(t) = −1) = P(odd number of jumps in [0, t])
= e −λt

∞ (λt) 2i+1

Σ

i=0 (2i + 1)!

= e −λt sinh λt
...

Since
C(s, t) = Cov [X(s), X(t)]
= E[(X(s) X(t))] = E[Y I(s) Y I(t)]
= E[Y 2 I(s) I(t)] = E(Y 2 ) E[I(s) I(t)]
and E(Y 2 ) = 1, the covariance function of {X(t), t ≥ 0} has structure
C(s, t) = E[I(s) I(t)]
...
6) and the homogeneity of the increments of {N(t), t ≥ 0}, assuming
s < t,
p 1,1 = P(I(s) = 1, I(t) = 1) = P(I(s) = 1)P( I(t) = 1 I(s) = 1)
= e −λs cosh λs P(even number of jumps in (s, t])
= e −λs cosh λs e −λ(t−s) cosh λ(t − s)
= e −λt cosh λs cosh λ(t − s)
...

Now
E[I(s)I(t)] = p 1,1 + p −1,−1 − p 1,−1 − p −1,1 ,
so that
C(s, t) = e −2 λ(t−s) , s < t
...

Hence, the random telegraph signal {X(t), t ≥ 0} is a weakly stationary process
...
3 Let {N(t), t ≥ 0} be a homogeneous Poisson process with intensity λ
...
, n ; has a binomial distribution with parameters p = s/t and n
...
, n
...
24)

This proves the theorem
...
2
...
2 Homogeneous Poisson Process and Uniform Distribution
Theorem 3
...
24), for s < t,
P(T 1 ≤ s T 1 ≤ t) = P(N(s) = 1 N(t) = 1) = s
...
To prove it, the joint probability
density of the random vector (T 1 , T 2 ,
...

Theorem 3
...
, T n ) is
⎧ λ n e −λt n for 0 ≤ t 1 < t 2 <
...
, t n ) = ⎨

...
25)

t1

P(T 1 ≤ t 1 , T 2 ≤ t 2 ) = ∫ 0 P(T 2 ≤ t 2 T 1 = t) f T (t) dt
...
2, the interarrival times
Y i = T i − T i−1 ; i = 1, 2,
...


© 2006 by Taylor & Francis Group, LLC

120

STOCHASTIC PROCESSES

Hence, since T 1 = Y 1 ,
t1

P(T 1 ≤ t 1 , T 2 ≤ t 2 ) = ∫ 0 P(T 2 ≤ t 2 T 1 = t ) λe −λt dt
...
Thus, the desired two-dimensional distribution function is
t1

F(t 1 , t 2 ) = P(T 1 ≤ t 1 , T 2 ≤ t 2 ) = ∫ 0 (1 − e −λ(t 2 −t) ) λ e −λt dt
t1

= λ ∫ 0 (e −λt − e −λ t 2 ) dt
...

Partial differentiation yields the corresponding two-dimensional probability density
⎧ λ 2 e −λt 2

f (t 1 , t 2 ) = ⎨
⎪0


for 0 ≤ t 1 < t 2

...

The formulation of the following theorem requires a result from the theory of ordered samples: Let {X 1 , X 2 ,
...
e
...
The corresponding ordered sample is denoted as
(X ∗ , X ∗ ,
...
≤ X ∗
...
, X ∗ } is
n
1 2
⎧ n!/ x n ,
f ∗ (x ∗ , x ∗ ,
...
< x ∗ ≤ x,
n
1
2

...
26)

For the sake of comparison: The joint probability density of the original (unordered)
sample {X 1 , X 2 ,
...
, x n ) = ⎨
⎩ 0 ,

0 ≤ xi ≤ x
elsewhere


...
27)

Theorem 3
...
; T 0 = 0
...
, the
random vector {T 1 , T 2 ,
...


© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

121

Proof By definition, for disjoint, but otherwise arbitrary subintervals [t i , t i + h i ] of
[0, t] , the joint probability density of {T 1 , T 2 ,
...
, t n N(t) = n)
=

P(t i ≤ T i < t i + h i ; i = 1, 2,
...

h 1 h 2
...
,h n )→0
lim

Since the event ' N(t) = n ' is equivalent to T n ≤ t < T n+1 ,
P(t i ≤ T i < t i + h i ; i = 1, 2,
...
, n ; t < T n+1 )
P(N(t) = n)

∞ t n +h n t n−1 +h n−1



=

t





tn

t n−1


...
dx n dx n+1

t1
(λt) n −λt
e
n!

h h
...
h n
= 1 2 n
=
n!
...
, t n N(t) = n) = ⎨
⎩ 0,

0 ≤ t1 < t2 <
...

elsewhere

(3
...
26)
...
, T n are 'purely randomly' distributed over [0, t]
...
4 (shot noise) Shot noise processes have been formally introduced in example 2
...
Now an application is discussed in detail: In the circuit, depicted in Figure
3
...
A current pulse is initiated in the circuit as soon as the cathode emits a photoelectron due to the light falling on it
...


(3
...
3 Photodetection circuit (Example 3
...
be the sequence of random time points, at which the cathode emits
photoelectrons and {N(t), t ≥ 0} be the corresponding counting process
...


(3
...
29), X(t) can also be written in the form
N(t)

X(t) = Σ i=1 h(t − T i )
...
For determining the trend function of the shot noise {X(t), t ≥ 0}, note
that according to theorem 3
...
, T n are uniformly distributed over [0, t]
...

t
t

Therefore,
E(X(t) N(t) = n) = E ⎛


n

Σ i=1 h(t − T i )

N(t) = n ⎞


n

= Σ i=1 E(h(t − T i ) N(t) = n)
t
= ⎛ 1 ∫ 0 h(x) dx ⎞ n
...
7) yields


E(X(t)) = Σ n=0 E(X(t) N(t) = n) P(N(t) = n)
(λ t) n −λt

t
= 1 ∫ 0 h(x) dx Σ n=0 n
e
t
n!
t
t
= ⎛ 1 ∫ 0 h(x) dx ⎞ E(N(t)) = ⎛ 1 ∫ 0 h(x) dx ⎞ (λt)
...


(3
...




Since, on condition ' N(t) = n ', the T 1 , T 2 ,
...

t
Thus, for s < t, substituting x = s − y,
s
E(h(s − T i ) h(t − T i ) N(t) = n) = 1 ∫ 0 h(x) h(t − s + x) dx
...
5, on condition ' N(t) = n ' the T 1 , T 2 ,
...

Hence,
E(h(s − T i ) h(t − T j ) N(t) = n) = E(h(s − T i ) N(t) = n) E(h(t − T j ) N(t) = n)
t
s
= ⎛ 1 ∫ 0 h(s − x) dx ⎞ ⎛ 1 ∫ 0 h(t − x) dx ⎞
⎝t
⎠ ⎝t

t
s
= ⎛ 1 ∫ 0 h(x) dx ⎞ ⎛ 1 ∫ 0 h(x) dx ⎞
...

⎝t
⎠ ⎝t


Applying once more the total probability rule,
s
E(X(s) X(t)) = ⎛ 1 ∫ 0 h(x)h(t − s + x) dx ⎞ E(N(t))
⎝t

t
s
+ ⎛ 1 ∫ 0 h(x) dx ⎞ ⎛ 1 ∫ 0 h(x) dx ⎞ ⎡ E(N 2 (t)) − E(N(t)) ⎤
...
31) and (2
...


124

STOCHASTIC PROCESSES

More generally, for any s and t, C(s, t) can be written in the form
min(s,t)

C(s, t) = λ ∫ 0

h(x) h( t − s + x) d x
...

By letting s → ∞ , keeping τ = t − s constant, trend- and covariance function become


(3
...


(3
...
They imply that, for large t,
the shot noise process {X(t), t ≥ 0} is approximately weakly stationary
...
7, and for more general formulations of
this theorem see, for instance, Brandt, Franken, and Lisek [13], Stigman [78]
...

Provided the A i are identically distributed as A, independent of each other, and independent of all T k , then determining trend- and covariance function of the generalized shot noise {X(t), t ≥ 0} does not give rise to principally new problems
...
34)

h(x) h( t − s + x) d x
...
35)

m(t) = λ E(A)∫ 0 h(x) dx ,
min(s,t)

C(s, t) = λ E(A 2 )∫ 0

If the process of inducing current impulses by photoelectrons has already been operating for an unboundedly long time (the circuit was switched on a sufficiently long
time ago), then the underlying shot noise process {X(t), t ∈ (−∞, +∞)} is given by
+∞
X(t) = Σ −∞ A i h(t − T i )
...

Example 3
...
Hence,
the arrival of a customer is a Poisson event
...
To cope with this situation, the service system must be modeled as having an infinite number of servers
...

© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

125

Let G(t) = P(Z ≤ t) be the distribution function of Z, and X(t) be the random number
of customers in the system at time t, X(0) = 0
...
; t ≥ 0
...
e
...
Given N(t) = n, the arrival
times T 1 , T 2 ,
...
4, independent
and uniformly distributed over [0, t]
...
Thus, the probability that any of the n customers who arrived in [0, t] is still in the system at time t, is
p(t) = ∫ 0 (1 − G(t − x)) 1 dx = 1 ∫ 0 (1 − G(x)) dx
...
, n
...
7),
p i (t) =
=



Σ

n=i

P(X(t) = i N(t) = n) ⋅ P(N(t) = n)

n
⎛ n ⎞ [p(t)] i [1 − p(t)] n−i ⋅ (λt) e −λt
...
Thus, from example 1
...


Hence, X(t) has a Poisson distribution with parameter
E(X(t)) = λ t p(t)
...

For t → ∞ the trend function tends to
lim m(t) =

t→∞

E(Z)
,
E(Y)

(3
...


© 2006 by Taylor & Francis Group, LLC

126

STOCHASTIC PROCESSES

By letting
ρ = E(Z)/E(Y) ,
the stationary state probabilities of the system become
p i = lim p i (t) =
t→∞

ρ i −ρ
e ;
i!

i = 0, 1,
...
37)

If Z has an exponential distribution with parameter μ , then
t
λ
m(t) = λ ∫ 0 e −μx dx = μ ⎛ 1 − e −μt ⎞
...


3
...
2 Nonhomogeneous Poisson Processses
In this section a stochastic process is investigated, which, except for the homogeneity
of its increments, has all the other properties listed in theorem 3
...
Abandoning the
assumption of homogeneous increments implies that a time-dependent intensity function λ = λ(t) takes over the role of λ
...
As in section 3
...
3 A counting process {N(t), t ≥ 0} satisfying N(0) = 0 is called a nonhomogeneous Poisson process with intensity function λ(t) if it has properties
(1) {N(t), t ≥ 0} has independent increments,
(2) P(N(t, t + h) ≥ 2) = o(h),
(3) P(N(t, t + h) = 1) = λ(t) h + o(h)
...

2) Computation of the probability density of the random event time T i (time point at
which the i th Poisson event occurs)
...
, T n ); n = 1, 2,
...


© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

127

Thus,
p 0 (s, t + h) − p 0 (s, t)
o(h)
= −λ(t) p 0 (s, t) +

...

0
∂t 0
Since N(0) = 0 or, equivalently, p 0 (0, 0) = 1 , the solution is
p 0 (s, t) = e −[Λ(t)−Λ(s)] ,
where

(3
...


(3
...


(3
...


(3
...


(3
...

2) Let F T (t) = P(T 1 ≤ t) be the distribution function and f T (t) the probability den1
1
sity of the random time T 1 to the occurrence of the first Poisson event
...

1

From (3
...

Hence,
t
− λ(x) dx
F T (t) = 1 − e ∫ 0
,
1

© 2006 by Taylor & Francis Group, LLC

t
− λ(x) dx
f T (t) = λ(t)e ∫ 0
, t ≥ 0
...
43)

128

STOCHASTIC PROCESSES

A comparison of (3
...
40) shows that the intensity function λ(t) of the nonhomogeneous Poisson process {N(t), t ≥ 0} is identical to the failure rate belonging
to T 1
...
44)

the distribution function of the nth event time T n is
F T n (t) =

∞ [Λ(t)] i
e −Λ(t) ,
i!
i=n

Σ

n = 1, 2,
...
45)

Differentiation with respect to t yields the probability density of T n :
f T n (t) =

[Λ(t)] n−1
λ(t) e −Λ(t) ; t ≥ 0 , n = 1, 2,
...
46)

Equivalently,
f T n (t) =

[Λ(t)] n−1
f (t); t ≥ 0 , n = 1, 2,
...
17), the mean value of T n is
E(T n ) =



⎛ n−1 [Λ(t)] i ⎞
⎟ dt
...
47)

Hence, the mean time
E(Y n ) = E(T n ) − E(T n−1 )
between the (n − 1) th and the n th event is
E(Y n ) =


1
[Λ(t)] n−1 e −Λ(t) dt ; n = 1, 2,
...
48)

Letting λ(t) ≡ λ and Λ(t) ≡ λ t yields the corresponding characteristics for the homogeneous Poisson process, in particular E(Y n ) = 1/λ
...
Thus, from (3
...

2
Differentiation with respect to t 2 yields the corresponding probability density:
f T (t 2 t 1 ) = λ(t 2 ) e −[Λ(t 2 )−Λ(t 1 )] ,
2

0 ≤ t1 < t2
...
59), the joint probability density of (T 1 , T 2 ) is
⎧ λ(t 1 ) f T 1 (t 2 ) for t 1 < t 2
f (t 1 , t 2 ) = ⎨

...
, T n ) :
⎧ λ(t 1 )λ(t 2 )
...
< t n
f (t 1 , t 2 ,
...
49)
elsewhere
⎩ 0,
This result includes as a special case formula (3
...

As with the homogeneous Poisson process, the nonhomogeneous Poisson counting
process {N(t), t ≥ 0}, the corresponding point process of Poisson event times
{T 1 , T 2 ,
...
} are statistically equivalent stochastic processes
...
4 Intensity of the arrival of cars at a filling station

Example 3
...
m
...
4)
2
λ(t) = 10 + 35
...


1) What is the mean number of cars arriving for petrol weekdays between 5:00 and
11:00? According to (3
...
4 t e −t /8 ⎞ dt



6
2
= ⎡ 10 t − 141
...



⎦0


2) What is the probability that at least 90 cars arrive for petrol weekdays between
6:00 and 8:00 ? The mean number of cars arriving between 6:00 and 8:00 is
8

3

∫ 6 λ(t) dt = ∫ 1 (10 + 35
...
6 e −t /8 ⎤ = 99
...
Thus, desired probability is
P(N(6, 8) ≥ 90) =

∞ 99 n
e −0
...

n!
n=90

Σ

By using the normal approximation to the Poisson distribution (section 1
...
3):
∞ 99 n


e −0
...
1827
...
8173
...
2
...
In view of their flexibility, they are now a favourite point process model for many other applications
...

Let {N(t), t ≥ 0} be a homogeneous Poisson process with parameter λ
...
The basic idea of Dubourdieu
was to consider λ a realization of a positive random variable L, which is called the
(random) structure or mixing parameter
...
2
...

Definition 3
...
Then the counting process {N L (t), t ≥ 0} is said to be a mixed Poisson process with structure parameter L if it has the following properties:
(1) {N L L=λ (t), t ≥ 0} has independent, homogeneous increments for all λ ∈ R L
...

(2) P ⎛ N L L=λ (t) = i ⎞ =


i!
Thus, on condition L = λ, the mixed Poisson process is a homogeneous Poisson process with parameter λ :
{N L L=λ (t), t ≥ 0} = {N λ (t), t ≥ 0}
...
50)
P ⎛ N L (t) = i ⎞ = E ⎜
e
⎟ ; i = 0, 1,
...
; then
P ⎛ N L (t) = i ⎞ =



∞ (λ k t) i
e −λ k t π k
...
51)

In applications, a binary structure parameter L is particularly important
...
52)

for 0 ≤ π ≤ 1, λ 1 ≠ λ 2
...
Hence, for convenience, throughout this section the assumption is made
that L is a continuous random variable with density f L (λ)
...


Obviously, the probability p 0 (t) = P(N L (t) = 0) is the Laplace transform of f L (λ)
with parameter s = t (section 1
...
2):

p 0 (t) = f L (t) = E(e −L t ) = ∫ 0 e −λ t f L (λ) d λ
...


Therefore, all state probabilities of a mixed Poisson process can be written in terms
of p 0 (t) :
i (i)
p i (t) = P(N L (t) = i) = (−1) i t p 0 (t) ;
i!

i = 1, 2,
...
53)

Mean value and variance of N L (t) are (compare with the parameters of the mixed
Poisson distribution given in section 1
...
4)):
E(N L (t)) = t E(L),

Var (N L (t)) = t E(L) + t 2 Var(L)
...
54)

The following theorem lists two important properties of mixed Poisson processes
...
6 (1) A mixed Poisson process {N L (t), t ≥ 0} has homogeneous increments
...
e
...

Proof (1) Let 0 = t 0 < t 1 <
...
Then, for any nonnegative integers
i 1 , i 2 ,
...
, n)


= ∫ 0 P(N λ (t k−1 + τ, t k + τ) = i k ; k = 1, 2,
...
, n) f L (λ)d λ
= P(N L (t k−1 , t k ) = i k ; k = 1, 2,
...

(2) Let 0 ≤ t 1 < t 2 < t 3
...

This proves the theorem if the mixing parameter L is a continuous random variable
...

Multinomial Criterion Let 0 = t_0 < t_1 < ⋯ < t_n. Then, for any nonnegative integers i_1, i_2, ..., i_n with i = i_1 + i_2 + ⋯ + i_n,

P(N_L(t_{k−1}, t_k) = i_k; k = 1, 2, ..., n | N_L(t_n) = i)
= [i!/(i_1! i_2! ⋯ i_n!)] (t_1/t_n)^{i_1} ((t_2 − t_1)/t_n)^{i_2} ⋯ ((t_n − t_{n−1})/t_n)^{i_n}.   (3.55)

Thus, conditionally on N_L(t_n) = i, the increments of a mixed Poisson process follow the same multinomial distribution as those of a homogeneous Poisson process (compare with (3.4) and (1.15)).

By means of (3.55), the joint distribution of the increments N_L(0, t) = N_L(t) and N_L(t, t + τ) will be derived:

P(N_L(t) = i, N_L(t, t + τ) = k)
= P(N_L(t) = i | N_L(t + τ) = i + k) P(N_L(t + τ) = i + k)
= [(i + k)!/(i! k!)] (t/(t + τ))^i (τ/(t + τ))^k ∫_0^∞ [λ(t + τ)]^{i+k}/(i + k)! · e^{−λ(t+τ)} f_L(λ) dλ   (3.56)

for i, k = 0, 1, ... Next the correlation between two neighbouring increments is considered. As a first step into this direction, the mean value of the product of the increments N_L(t) = N_L(0, t) and N_L(t, t + τ) has to be determined. From (3.56),

E([N_L(t)][N_L(t, t + τ)]) = Σ_{i=1}^∞ Σ_{k=1}^∞ i k (t^i τ^k)/(i! k!) ∫_0^∞ λ^{i+k} e^{−λ(t+τ)} f_L(λ) dλ
= t τ ∫_0^∞ λ² [Σ_{i=0}^∞ (λt)^i/i!] [Σ_{k=0}^∞ (λτ)^k/k!] e^{−λ(t+τ)} f_L(λ) dλ
= t τ ∫_0^∞ λ² e^{λt} e^{λτ} e^{−λ(t+τ)} f_L(λ) dλ
= t τ ∫_0^∞ λ² f_L(λ) dλ = t τ E(L²).

Since, by (3.54), E(N_L(t)) = t E(L) and E(N_L(t, t + τ)) = τ E(L), the covariance between the two increments is

Cov(N_L(t), N_L(t, t + τ)) = t τ E(L²) − t τ [E(L)]² = t τ Var(L) > 0.

Thus, two neighbouring increments of a mixed Poisson process are positively correlated: roughly speaking, many events in the past tend to be followed by many events in the future. This property of a stochastic process is also called positive contagion.

A mixed Poisson process, the structure parameter L of which has a gamma distribution, is called a Polya process. Let the gamma density of L be

f_L(λ) = [β^α/Γ(α)] λ^{α−1} e^{−βλ},  λ > 0, α > 0, β > 0.   (3.57)

Then formula (3.50) together with example 1.9 (section 1.2.4) yields

P(N_L(t) = i) = ∫_0^∞ [(λt)^i/i!] e^{−λt} [β^α/Γ(α)] λ^{α−1} e^{−βλ} dλ
= [Γ(i + α)/(i! Γ(α))] (t/(t + β))^i (β/(t + β))^α;  i = 0, 1, ...   (3.58)

Thus, N_L(t) has a negative binomial distribution with parameters α and p = t/(t + β). In particular, for an exponential structure distribution (α = 1), N_L(t) has a geometric distribution with parameter p = t/(t + β).

© 2006 by Taylor & Francis Group, LLC
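Because of the Γ-ratio in (3.58), the state probabilities are best evaluated in log space. A minimal sketch (the function name is ours); the probabilities must sum to 1 and have mean αt/β, in agreement with (3.54):

```python
from math import lgamma, log, exp

def polya_state_prob(i, t, alpha, beta):
    """P(N_L(t) = i) of the Polya process, formula (3.58)."""
    log_p = (lgamma(i + alpha) - lgamma(alpha) - lgamma(i + 1)
             + i * log(t / (t + beta)) + alpha * log(beta / (t + beta)))
    return exp(log_p)
```

For instance, with α = 0.36, β = 1.5 and t = 2 the mean value is 0.36·2/1.5 = 0.48.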
The finite-dimensional distributions of the Polya process are obtained from (3.58) and the multinomial criterion (3.55): Let 0 = t_0 < t_1 < ⋯ < t_n; n = 1, 2, ... Then, for any nonnegative integers i_0 = 0 ≤ i_1 ≤ i_2 ≤ ⋯ ≤ i_n,

P(N_L(t_k) = i_k; k = 1, 2, ..., n)
= P(N_L(t_{k−1}, t_k) = i_k − i_{k−1}; k = 1, 2, ..., n | N_L(t_n) = i_n) P(N_L(t_n) = i_n)
= [i_n!/Π_{k=1}^n (i_k − i_{k−1})!] Π_{k=1}^n ((t_k − t_{k−1})/t_n)^{i_k−i_{k−1}} · [Γ(i_n + α)/(i_n! Γ(α))] (t_n/(β + t_n))^{i_n} (β/(β + t_n))^α
= [Γ(i_n + α)/Γ(α)] (β/(β + t_n))^α Π_{k=1}^n [ (t_k − t_{k−1})^{i_k−i_{k−1}} / (i_k − i_{k−1})! ] (β + t_n)^{−i_n}.   (3.59)

For the following three reasons it is not surprising that the Polya process is increasingly used for modeling real-life point processes, in particular customer flows:
1) The finite-dimensional distributions of this process are explicitly available.
2) In contrast to the homogeneous Poisson process, its increments are positively correlated (positive contagion), a property observed in many real customer flows.
3) The two free parameters α and β of this process allow its adaptation to a wide variety of data sets.
Example 3.7 An insurance company analyzed the incoming flow of claims and found that the arrival intensity λ is subjected to random fluctuations, which can be modeled by the probability density f_L(λ) of a gamma distributed random variable L with mean value E(L) = 0.24 and variance Var(L) = 0.16 (unit: working hour). From

E(L) = 0.24 = α/β,  Var(L) = 0.16 = α/β²,

it follows that α = 0.36 and β = 1.5, so that

f_L(λ) = [(1.5)^{0.36}/Γ(0.36)] λ^{−0.64} e^{−1.5 λ},  λ > 0.

Hence, the insurance company modeled the incoming flow of claims by a Polya process {N_L(t), t ≥ 0} with the one-dimensional probability distribution

P(N_L(t) = i) = [Γ(i + 0.36)/(i! Γ(0.36))] (t/(1.5 + t))^i (1.5/(1.5 + t))^{0.36};  i = 0, 1, ...

According to (3.54), mean value and variance of this process are

E(N_L(t)) = 0.24 t,  Var(N_L(t)) = 0.24 t + 0.16 t².

Doubly Stochastic Poisson Process The mixed Poisson process generalizes the homogeneous Poisson process by replacing its parameter λ with a random variable L. Analogously, the nonhomogeneous Poisson process can be generalized: A doubly stochastic Poisson process {N_L(⋅)(t), t ≥ 0} can be thought of as a nonhomogeneous Poisson process the intensity function λ(t) of which has been replaced with a stochastic process {L(t), t ≥ 0}, called intensity process. The intensity process satisfies two conditions:
1) The sample paths of {L(t), t ≥ 0} are nonnegative.
2) Given a sample path λ(t) of {L(t), t ≥ 0}, the process {N_L(⋅)(t), t ≥ 0} evolves like a nonhomogeneous Poisson process with intensity function λ(t).

The absolute state probabilities of the doubly stochastic Poisson process at time t are

P(N_L(⋅)(t) = i) = (1/i!) E( [∫_0^t L(x) dx]^i e^{−∫_0^t L(x) dx} );  i = 0, 1, ...   (3.60)

In this formula, the mean value operation 'E' eliminates the randomness generated by the intensity process in [0, t]. The trend function of the process is

m(t) = E(N_L(⋅)(t)) = E( ∫_0^t L(x) dx ) = ∫_0^t E(L(x)) dx.

A nonhomogeneous Poisson process with intensity function λ(t) = E(L(t)) can be used as an approximation to the doubly stochastic Poisson process {N_L(⋅)(t), t ≥ 0}. The doubly stochastic Poisson process becomes
1. the homogeneous Poisson process if L(t) is a constant λ,
2. the mixed Poisson process if L(t) is a random variable L, which does not depend on t.
The term 'doubly stochastic Poisson process' was introduced by Cox [21], who was the first to investigate this class of point processes; such processes are therefore also called Cox processes. For detailed treatments and applications in engineering and insurance, respectively, see, for instance, Snyder [76] and Grandell [34].
3.2.4 Superposition and Thinning of Poisson Processes

3.2.4.1 Superposition

Assume that a service station recruits its customers from n different sources, for instance from n different towns, or from the owners of n different makes of cars. Each town or each make of cars, respectively, generates its own arrival process (flow of demands). Let {N_i(t), t ≥ 0}; i = 1, 2, ..., n, be the corresponding counting processes. Then the total flow of customers is given by the additive superposition

N(t) = N_1(t) + N_2(t) + ⋯ + N_n(t).

On condition that {N_i(t), t ≥ 0} is a homogeneous Poisson process with parameter λ_i; i = 1, 2, ..., n, and that these processes are independent, from example 1.22 (section 1.3.1) it is known that the z-transform of N(t) is

M_N(t)(z) = e^{−(λ_1 + λ_2 + ⋯ + λ_n) t (1 − z)},

which is the z-transform of a Poisson distribution with parameter (λ_1 + λ_2 + ⋯ + λ_n) t. Since the counting processes {N_i(t), t ≥ 0} have homogeneous and independent increments, their additive superposition {N(t), t ≥ 0} also has homogeneous and independent increments. This proves theorem 3.7:

Theorem 3.7 The additive superposition {N(t), t ≥ 0} of independent homogeneous Poisson processes {N_i(t), t ≥ 0} with intensities λ_i; i = 1, 2, ..., n, is a homogeneous Poisson process with intensity λ = λ_1 + λ_2 + ⋯ + λ_n.

More generally, let the {N_i(t), t ≥ 0} be independent nonhomogeneous Poisson processes with intensity functions λ_i(t); i = 1, 2, ..., n; then their additive superposition {N(t), t ≥ 0} is a nonhomogeneous Poisson process with intensity function

λ(t) = λ_1(t) + λ_2(t) + ⋯ + λ_n(t).
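Theorem 3.7 can be confirmed numerically for n = 2: the convolution of two Poisson distributions is again Poisson, with the intensities added. A short sketch (parameter values are arbitrary; function names are ours):

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """Poisson probability P(N = k) with mean mu."""
    return exp(-mu) * mu ** k / factorial(k)

def superposition_pmf(k, mu1, mu2):
    """Distribution of N(t) = N_1(t) + N_2(t) for independent Poisson counts."""
    return sum(poisson_pmf(j, mu1) * poisson_pmf(k - j, mu2) for j in range(k + 1))
```

For all k, superposition_pmf(k, μ_1, μ_2) coincides with poisson_pmf(k, μ_1 + μ_2) up to rounding error.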
3.2.4.2 Thinning

Frequently only a part of the events generated by a point process is of interest (thinning of the process). For instance, a cosmic particle counter registers only α-particles and ignores other types of particles. Formally, a marked point process {(T_1, M_1), (T_2, M_2), ...} is considered, where the mark M_i indicates the type of the event occurring at time T_i. It is assumed that the marks M_i are independent of each other and independent of {T_1, T_2, ...}, and that each M_i can take only two values m_1 and m_2. In this case, there are two different types of events, type 1-events (attached with mark m_1) and type 2-events (attached with mark m_2). Let

P(M_i = m_2) = p,  P(M_i = m_1) = 1 − p,

let {N(t), t ≥ 0} be a homogeneous Poisson process with intensity λ, and let Y denote the time to the first type 2-event. Note that if t < T_1, then there is surely no type 2-event in [0, t], and if T_n ≤ t < T_{n+1}, then there are exactly n events in [0, t] and (1 − p)^n is the probability that none of them is a type 2-event. Since P(T_n ≤ t < T_{n+1}) = P(N(t) = n),

P(Y > t) = e^{−λt} + Σ_{n=1}^∞ [(λt)^n/n!] e^{−λt} (1 − p)^n
= e^{−λt} + e^{−λt} Σ_{n=1}^∞ [λ(1 − p)t]^n/n!
= e^{−λt} + e^{−λt} [ e^{λ(1−p)t} − 1 ] = e^{−pλt}.

Hence, the interevent times between type 2-events have an exponential distribution with parameter pλ, i.e. the type 2-events form a homogeneous Poisson process with intensity pλ. By changing the roles of type 1 and type 2-events, the analogous statement for the type 1-events follows with intensity (1 − p)λ. This yields theorem 3.8:

Theorem 3.8 Let {N(t), t ≥ 0} be a homogeneous Poisson process with intensity λ, whose events are, independently of each other, of type 2 with probability p and of type 1 with probability 1 − p. Then N(t) can be represented in the form

N(t) = N_1(t) + N_2(t),   (3.61)

where {N_1(t), t ≥ 0} and {N_2(t), t ≥ 0} are independent homogeneous Poisson processes with respective intensities (1 − p)λ and pλ, which count only type 1 and type 2-events, respectively.
Nonhomogeneous Poisson Process Now the situation is somewhat generalized by assuming that the underlying counting process {N(t), t ≥ 0} is a nonhomogeneous Poisson process with intensity function λ(t) and that an event, occurring at time t, is of type 1 with probability 1 − p(t) and of type 2 with probability p(t). Let Y be the time to the first type 2-event, G(t) = P(Y ≤ t) its distribution function, and Ḡ(t) = 1 − G(t). Then the relationship

P(t < Y ≤ t + Δt | Y > t) = p(t) λ(t) Δt + o(Δt)

holds, i.e.

[G(t + Δt) − G(t)]/[Δt Ḡ(t)] = p(t) λ(t) + o(Δt)/Δt.

Letting Δt tend to 0 yields

G′(t)/Ḡ(t) = p(t) λ(t),

so that

Ḡ(t) = e^{−∫_0^t p(x) λ(x) dx},  t ≥ 0.   (3.62)

Theorem 3.9 Let {N(t), t ≥ 0} be a nonhomogeneous Poisson process with intensity function λ(t), whose events, occurring at time t, are of type 2 with probability p(t) and of type 1 with probability 1 − p(t), independently of all other events. Then N(t) can be represented in the form

N(t) = N_1(t) + N_2(t),

where {N_1(t), t ≥ 0} and {N_2(t), t ≥ 0} are independent nonhomogeneous Poisson processes with respective intensity functions

(1 − p(t))λ(t) and p(t)λ(t),

which count only type 1 and type 2-events, respectively.
2
...
3, some more sophisticated results will be needed: Let Z be the random number of type 1 - events to the occurrence of the first type 2-event
...
43):
1
t

− λ(x) dx
f (t) = λ(t) e ∫ 0
, t ≥ 0
...
49), for k ≥ 1,
∞ x k+1
...


By making use of the well-known formula
t xn
...
dx n = n! ⎡ ∫ 0 g(x) dx ⎤



n

,

n ≥ 2, (3
...


(3
...


(3
...
66)

and G(t) has structure
G(t) = [F(t)] p ;

t ≥ 0
...
67)

Now, let Z t be the random number of type 1-events in (0, min(Y, t)) and
r t (k) = P(Z t = k Y = t);

k = 0, 1,
...
6),
P(Z t = k ∩ t ≤ Y ≤ t + Δt)
P(t ≤ Y ≤ t + Δt)
Δt→0

r t (k) = lim

P(t ≤ Y = X k+1 ≤ t + Δt)

...
68)

140

STOCHASTIC PROCESSES

From (3
...
63), the numerator in (3
...
x 3 x 2
∫0
∫0 ∫0

= ∫t

k

Π i=1 p(x i ) λ(x i ) dx i p(x k+1 ) f(x k+1 ) dx k+1

t+Δt ⎛ y
⎞k
= 1 ∫t
⎝ ∫ 0 p(x) λ(x) dx ⎠ p(y) f(y) dy
...
68) by Δt and taking the limit as Δt → 0
yields
t

k − p(x) λ(x) dx
t
r t (k) = 1 ⎛ ∫ 0 p(x) λ(x) dx ⎞ e ∫ 0
;

k! ⎝

k = 0, 1,
...
69)

so that
t

E(Z t Y < t) = ∫ 0 E(Z x Y = x) dG(x)/G(t)
t x

= ∫ 0 ∫ 0 p(y) λ(y) dy dG(x)/G(t)

(3
...


(3
...

The result is
t

E(Z t ) = ∫ 0 G(x) λ(x) dx − G(t)
...
72)

For these and related results see Beichelt [5]
...
2
...
} be a marked point process, where {T i ; i = 1, 2,
...
Then the
stochastic process {C(t), t ≥ 0} defined by
N(t)

C(t) = Σ i=0 M i
with M 0 = 0 is called a compound (cumulative, aggregate) Poisson process
...

2) If T i is the time of the i th breakdown of a machine and M i the corresponding repair cost, then C(t) is the total repair cost in [0, t]
...
(For
the brake discs of a car, every application of the brakes is a shock, which increases
their degree of mechanical wear
...
)
In what follows, {N(t), t ≥ 0} is assumed to be a homogeneous Poisson process with
intensity λ
...
}, then {C(t), t ≥ 0} has the following properties:
1) {C(t), t ≥ 0} has independent, homogeneous increments
...
73)

where
M(s) = E ⎛ e −s M ⎞


is the Laplace transform of M
...
73) is straightforward: By (1
...
+M N(t) ⎞
C t (s) = E ⎛ e −s C(t) ⎞ = E ⎛ e





=



Σ


...


From C t (s) , all the moments of C(t) can be obtained by making use of (1
...
In particular, mean value and variance of C(t) are
E(C(t)) = λt E(M ),

Var(C(t)) = λt E(M 2 )
...
74)

These formulas also follow from (1
...
126)
...

0 with probability 1 − p

© 2006 by Taylor & Francis Group, LLC

142

STOCHASTIC PROCESSES

Then M 1 + M 2 +
...
2
...
2)
...
+ M n = k N(t) = n) P(N(t) = n)
=

n
⎛ n ⎞ p k (1 − p) n−k (λt) e −λt
...
Hence, by example 1
...
2
...

n!
Corollary If the marks of a compound Poisson process {C(t), t ≥ 0} have a Bernoulli
distribution with parameter p, then {C(t), t ≥ 0} is a thinned homogeneous Poisson
process with parameter λp
...
73) and (3
...


(3
...
75) are an immediate consequence of (1
...
126)
...
3
...


3
...
6

Applications to Maintenance

3
...
6
...
Maintenance policies prescribe
when to carry out (preventive) repairs, replacements or other maintenance measures
...
A minimal repair performed after a failure enables the system to continue its
work but does not affect the failure rate of the system
...
For example, if a failure of a complicated electronic system is caused by a
defective plug and socket connection, then removing this cause of failure can be con-

© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

143

sidered a minimal repair
...
Of course, preventive minimal repairs make no sense
...
The random lifetime T of the system has probability density f (t), distribution function F(t), survival probability F(t) = 1 − F(t) , and failure rate λ(t)
...
The following maintenance policy is
directly related to a nonhomogeneous Poisson process
...

Let T n be the random time point, at which the n th system failure (minimal repair)
occurs
...
; T 0 = 0
...
Given T 1 = t, the failure rate of the system after completion
of the repair is λ(t)
...
Therefore, from
(1
...

F(t)
According to (1
...
38), equivalent representations of F t (y) are
F t (y) = 1 − e −[Λ(t+y)−Λ(t)]

(3
...

Obviously, these equations are also valid if t is not the time point of the first failure,
but the time point of any failure, for instance the n th failure
...

The occurrence of system failures (minimal repairs) is, therefore, governed by the
same probability distribution as the occurrence of Poisson events generated by a
nonhomogeneous Poisson process with intensity function λ(t)
...
, T n ) has the joint probability density (3
...

Therefore, if N(t) denotes the number of system failures (minimal repairs) in [0, t] ,
then {N(t), t ≥ 0} is a nonhomogeneous Poisson process with intensity function λ(t)
...


© 2006 by Taylor & Francis Group, LLC

(3
...
} is an ingredient to a marked point process {(T 1 , M 1 ), (T 2 , M 2 ),
...
The corresponding compound process {M(t), t ≥ 0} is given by
N(t)

M(t) = Σ i=0 M i , M 0 = 0,
where M(t) is the total repair cost in [0, t]
...
are assumed to be independent of each other, independent of N(t), and identically distributed as M with
c m = E(M) < ∞
...


(3
...
2
...
2 Standard Replacement Policies with Minimal Repair
The basic policy discussed in the previous section provides the theoretical fundament
for analyzing a number of more sophisticated maintenance policies
...
To justify preventive replacements, the
assumption has to be made that the underlying system is aging (section 1
...
1), i
...
its failure rate λ(t) is increasing
...
The latter assumption is merely a matter of
convenience
...
To establish this criterion, the time axis is partitioned into replacement cycles, i
...
into the times between
two neighbouring replacements
...
It is assumed that the L 1 , L 2 ,
...
This assumption implies that a replaced system is statistically as
good as the previous one ('as good as new') from the point of view of its lifetime
...
are assumed to be independent, identically distributed as C , and independent on the L i
...

n→∞ Σ n L
i=1 i
The strong law of the large numbers implies that
E(C)
K=

...
79)

For the sake of brevity, K is referred to as the (long-run) maintenance cost rate
...
In what follows, c p denotes the cost of a preventive
replacement, and c m is the cost of a minimal repair; c p , c m constant
...
Failures between
replacements are removed by minimal repairs
...
With this policy, all cycle lengths are equal to τ , and, in view of
(3
...

Hence, the corresponding maintenance cost rate is
c p + c m Λ(τ)
K 1 (τ) =

...

If λ(t) tends to ∞ as t → ∞ , there exists a unique solution τ = τ∗ of this equation
...

Policy 2 A system is replaced at the first failure which occurs after a fixed time τ
...

This policy fully makes use of the system lifetime so that, from this point of view, it
is preferable to policy 1
...
Thus, in practice the
maintenance cost rate of policy 2 may actually exceed the one of policy 1
...
36), the mean value

μ(τ) = E(T τ ) = e Λ(τ) ∫ τ e −Λ(t) dt
...
80)

The mean maintenance cost per cycle is, from the notational point of view, equal to
that of policy 1
...
An optimal renewal interval τ = τ∗ satisfies
the necessary condition dK 2 (τ)/d τ = 0 , i
...

cp

(Λ(τ) + c m − 1) μ(τ) = τ
...

τ∗

146

STOCHASTIC PROCESSES

Example 3
...
The corresponding mean residual lifetime of the system after having
survived [0, τ] is
⎛ 2 ⎞⎤
2⎡
μ(τ) = θ π e (τ / θ) ⎢ 1 − Φ ⎜ θ τ ⎟ ⎥
...
0402
...
On the first failure after a given
time τ 1 , an unscheduled replacement is carried out
...

Under this policy, the random cycle length is
L = τ 1 + min(T τ 1 , τ 2 − τ 1 ) ,
so that the mean cycle length is

τ −τ 1

E(L) = τ 1 + μ(τ 1 , τ 2 ) with μ(τ 1 , τ 2 ) = ∫ 02

F τ 1 (t) dt
...

τ 1 + μ(τ 1 , τ 2 )

An optimal pair (τ 1 , τ 2 ) = (τ ∗ , τ ∗ ) is solution of the equation system
1 2
λ(τ 2 ) μ(τ 1 , τ 2 ) + F τ 1 (τ 2 − τ 1 ) − c m /(c r − c p ) = 0,
λ(τ 2 ) −

c m Λ(τ 1 ) + c r − c m
= 0
...

In this case, the minimal maintenance cost rate is
K 3 (τ ∗ , τ ∗ ) = (c r − c p ) λ(τ ∗ )
...
At the time point of
the n th failure, an (unscheduled) replacement is carried out
...
Hence, the maintenance cost rate is
c r + (n − 1)c m
,
E(T n )
where the mean cycle length E(T n ) is given by (3
...

K 4 (n) =

© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

147

By analyzing the behaviour of the difference K 4 (n) − K 4 (n − 1), an optimal n = n∗ is
seen to be the smallest integer k satisfying
E(T n ) − [n − 1 + c r /c m ] E(Y n+1 ) ≥ 0;

n = 1, 2,
...
81)

where the mean time E(Y n ) between the (n − 1)th and the n th minimal repair is given by formula (3
...

Example 3
...


(3
...
82) becomes
β n − [n − 1 + c r /c m ] ≥ 0
...
(If x < 0, then x = 0
...
2
...
3 Replacement Policies for Systems with two Failure Types
So far, it has been assumed that every system failure can be removed by a minimal
repair
...
For example, the restoration of the roadworthiness of a car after a serious traffic accident can surely not be achieved by a minimal
repair
...

Type 2: Failures of this type are removed by replacements
...
A failure occuring at system
age t is a type 2 failure with probability p(t) and a type 1 failure with probability
1 − p(t)
...
Obviously, this is the same situation as discussed in section 3
...
4
...

Policy 5 The system is maintained according to the failure type
...
Hence, according to (3
...


(3
...
65)
...

t
∞ −∫ 0 p(x)λ(x)dx
e
dt

(3
...
66) and (3
...

p

∫ 0 ⎡ F(t)⎤ dt



(3
...
In addition, preventive replacements are carried out τ time units after the previous replacement
...
Then
L τ = min(Y, τ)
is the random length of a replacement cycle (time between successive replacements
of any type) and, if Z τ denotes the random number of minimal repairs in a replacement cycle, the maintenance cost rate has structure
K 6 (τ) =

c m E(Z τ )) + c r G(τ) + c p G(τ)

...
72) and

τ

E(L τ ) = ∫ 0 G(t) dt ,
the maintenance cost rate becomes
K 6 (τ) =

c m ⎡ ∫ τ G(t)λ(t)dt − G(τ) ⎤ + c r G(τ) + c p G(τ)
⎣0


...
86)

In particular, for p(t) ≡ p,
K 6 (τ) =

{c r + [(1 − p)/p] c m }G(τ) + c p G(τ)
τ

∫ 0 G(t) dt


...
87)

If there exists an optimal preventive replacement interval τ = τ ∗ with regard to the
maintenance cost rate K 6 (τ) , then it is solution of the equation
τ

p λ(τ)∫ 0 G(t) dt − G(τ) =

p cp

...
If there is no preventive maintenance, i
...
τ = ∞ , then
(3
...
87) reduce to (3
...
85), respectively
...
Beginning with the papers of Uematsu and Nishida [83]
and Kijma, Morimura and Suzuki [49], approaches to modeling general degrees of
repairs have been suggested which take into account the intermediate stages
...


3
...
6
...
Different from the maintenance policies considered so far, repair cost limit
replacement policies explicitely take into account that repair costs are random variables
...

Policy 7 (Repair cost limit replacement policy) After a system failure, the necessary
repair cost is estimated
...

Otherwise, a minimal repair is carried out
...
Then the two failure
type model applies to policy 7 if the failure types are generated by C t in the following way: A system failure at time t is of type 1 (type 2) if
C t ≤ c(t) (C t > c(t))
...


(3
...
It is reasonable to
assume that, for all t ≥ 0,
0 < c(t) < c r and R t (x) =

1 if x ≥ c r

...
88), the length L of a replacement
cycle has, according to (3
...
86), the corresponding maintenance cost rate is

© 2006 by Taylor & Francis Group, LLC

t ≥ 0
...
89)

150

STOCHASTIC PROCESSES
t

⎡ ∞
−∫ R x (c(x))λ(x) dx
dt − 1
...

t
∞ −∫ 0 R x (c(x))λ(x) dx
dt
...
90)

The problem consists in finding a repair cost limit function c = c(t) which minimizes (3
...
Generally, an explicit analytical solution cannot be given
...
In particular, the system lifetime X is assumed to be
Weibull distributed:
β

F(t) = P(X ≤ t) = 1 − e −(t/θ) ,

t ≥ 0, β > 1, θ > 0
...
91)

The respective failure rate and integrated failure rate are given by (3
...

Constant Repair Cost Limit For the sake of comparison, next the case is considered that the repair cost limit is constant and that the cost of a repair C does not depend
on t, i
...

c(t) ≡ c and R t (x) = R(x) for all x and t
...


Hence, the mean cycle length is
−1/β
E(L) = θ Γ(1 + 1/β) ⎡ R(c)⎤

...
87):
K 7 (c) =

R(c)
cm + cr
R(c)

θ Γ(1 + 1/β) ⎡ R(c)⎤



−1/β


...
The value of y = R(c) minimizing K 7 (c) is easily seen to be
β−1
y ∗ = R(c ∗ ) =
with k = c r /c m
...
Hence, since 0 < y ∗ < 1, an additional assumption
has to be made:
1 < β < k
...
92)
Given (3
...


(3
...

θ Γ(1 + 1/β) ⎝ β − 1 ⎠

In particular, for β = 2 (Rayleigh distribution),
Γ1 + 1/β) = Γ(3/2) = π/4
so that
K 7 (c ∗ ) =

4 cm
θ

k − 1 ≈ 2
...

π
θ

(3
...

Thus, a decreasing repair cost limit c(t) is supposed to lead to a lower maintenance
cost rate than a constant repair cost limit or an increasing repair cost limit function
...

cr < x < ∞
⎩ 1,

(3
...
96)

Combining (3
...
96) gives the probability that a system failure, which occurs
at age t, implies a replacement:
⎧ 0,
0 ≤ t < d/(c r − c)

R(c(t)) = ⎨ c r −c
,
d
d
⎪ cr − cr t ,
c r −c ≤ t < ∞


0 ≤ c < cr
...
97)

yields
⎧ 0,
R(c(t)) = ⎨
⎩ s (1 − z/t) ,

0≤t
...
98)

Scheduling replacements based on (3
...
After this period, a system failure makes a replacement more and more likely with increasing system age
...
91) with β = 2) :
λ(t) = 2 t /θ 2 ,

© 2006 by Taylor & Francis Group, LLC

Λ(t) = (t /θ) 2
...
99)

152

STOCHASTIC PROCESSES

Under these assumptions, the maintenance cost rate (3
...
100)

2
x e −λ s x dx = 1
...

⎝ ⎠

2r + θ πs ⎝

In order to minimize K 7 (r, s) with respect to r and s, in a first step K 7 (r, s) is minimized with respect to r with s fixed
...

⎝ 2

4
Since, by assumption, k = c r /c m > 1, the right-hand side of this equation is positive
...



2
To make sure that r ∗ (s) > 0, an additional assumption has to be made:
k > π − 2 + 1
...

s

(3
...
102)

Since s ≤ 1, the function K 7 (r ∗ (s), s) assumes its minimum at s = 1
...

With s = 1, condition (3
...
57
...
95)
...
95) is
c = 0 and d ∗ = θ ⎡ 4k − π − π ⎤ c r ,

2⎣
and the corresponding minimal maintenance cost rate is
c
K 7 (d ∗ ) = m 4k − π
...
103)

3 POINT PROCESSES

∼ 15
K7

153


K 7 (c ∗ )

K 7 (d ∗ )

12
9
6
0

10

20

30

40

50

k

Figure 3
...

16 − 4π
Since 1
...
786, this restriction is slightly stronger than k > π/2 , but for the
same reasons as given above, will have no negative impact on practical applications
...
5 compares the relative cost criteria


K 7 (d ∗ ) = cθ K 7 (d ∗ ) and K 7 (c ∗ ) = cθ K 7 (c ∗ )
m
m

in dependence on k, k ≥ k ∗
...
However, it is more realistic to assume that on
average repair costs increase with increasing system age
...
Then,
⎧ 1 , 0 ≤ t < x−a

b
...
104)
x−a
⎪ bt ,
b

Constant repair cost limit For the sake of comparison, next a constant repair cost c
limit is applied
...
With the lifetime characteristics (3
...
100), the maintenance cost rate (3
...

r + θ π/4

The value r = r ∗ minimizing K 7 (r) is

r ∗ = θ ⎡ 4k − π − π ⎤ ,

2⎣
where, as before, k = c r /c m
...
Then corresponding optimal repair cost limit is c ∗ = a + b r ∗
...

(3
...

Then, from (3
...


y⎩ 1,
If d is assumed to be sufficiently small, then letting y = ∞ will only have a negligibly small effect on the maintenance cost rate (3
...
Moreover, the replacement probability R t (c(t)) has the same functional structure as (3
...
Thus, for small d the minimal maintenance cost rate is again given by (3
...

s

(3
...
97), now s > 1
...
105), one easily verifies that
K 7 (c ∗ ) > K 7 (r ∗ (s), s)
...
However, an optimal parameter s = s ∗ cannot
be constructed by minimizing (3
...
106) to be approximately valid, the assumption 's is sufficiently near to 1'
had to be made
...


© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

3
...
3
...
In this
context, the replacements of failed systems are also called renewals
...

Definition 3
...

Thus, Y i is the time between the (i − 1)th and the i th renewal; i = 1, 2,
...

Renewal processes do not only play an important role in engineering, but also in the
natural, economical and social sciences
...
In the latter context, Y i is the time between the arrival of the (i − 1)th and the
i th customer
...
5)
...

If the observation of a renewal process starts at time t = 0 and the process has
already been operating for a while, then the lifetime of the system operating at time
t = 0 is a 'residual lifetime' and will, therefore, usually not have the same probability
distribution as the lifetime of a system after a renewal
...
are identically distributed
...
6 Let {Y 1 , Y 2 ,
...
are identically distributed as Y with distribution function
F(t) = P(Y ≤ t), F 1 (t) ≡ F(t)
...
} is called a delayed renewal process
...


The random point process {T 1 , T 2 ,
...
The time intervals between two neighbouring renewals are renewal cycles
...
Note that N(t) is the random number of renewals
in (0, t]
...
107)

F T n (t) = P(T n ≤ t) = P(N(t) ≥ n)
...
108)

implies
Because of the independence of the Y i , the distribution function F T n (t) is the convolution of F 1 (t) with the (n − 1) th convolution power of F (see section 1
...
2):
F T n (t) = F 1 ∗ F ∗(n−1) (t), F ∗(0) (t) ≡ 1, t ≥ 0 ; n = 1, 2,
...
109)

If the densities
f 1 (t) = F 1 (t) and f (t) = F (t)
exist, then the density of T n is
f T n (t) = f 1 ∗ f ∗(n−1) (t), f ∗(0) (t) ≡ 1, t ≥ 0; n = 1, 2,
...
110)

Using (3
...

0

(3
...
10 Let {Y 1 , Y 2 ,
...

Then, by theorem 3
...
In particular, by (3
...


Apart from the homogeneous Poisson process, there are two other important ordinary
renewal processes for which the convolution powers of the renewal cycle length distributions explicitely exist so that the distribution functions of the renewal times T n
can be given:

© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

157

1) Erlang Distribution The renewal cycle length Y has an Erlang distribution with
parameters m and λ
...
23, section 1
...
2)
...


,

(3
...

2) Normal Distribution Let the renewal cycle length Y have a normal distribution
with parameters μ and σ , μ > 3σ
...
However, renewal theory has
been extended to negative 'cycle lengths'
...
24,
section 1
...
2), T n has distribution function
⎛ t − nμ ⎞
F ∗(n) (t) = P(T n ≤ t) = Φ ⎜
⎟,
⎝σ n ⎠

t ≥ 0
...
113)

This result also has a more general meaning: Since T n is the sum of n independent,
identically distributed random variables, then, by the central limit theorem 1
...
113) if n is sufficiently large, i
...

T n ≈ N(nμ, σ 2 n) if n ≥ 20
...
11 The distribution function of T n can be used to solve the so-called
spare part problem: How many spare parts (spare systems) are absolutely necessary
for making sure that the renewal process can be maintained over the intervall [0, t]
with probability 1 − α ?
This requires the computation of the smallest integer n satisfying
1 − F T n (t) = P(N(t) ≤ n) ≥ 1 − α
...

If t = 200 and 1 − α = 0
...
99
⎝ 5 n ⎠
is equivalent to
z 0
...
32 ≤ 8 n−200
...
99 every failed part can be replaced by a new one over the interval (0, 200]
...


3
...
2 Renewal Function
3
...
2
...

Definition 3
...

Thus, with the terminology and the notation introduced in section 2
...

However, to be in line with the majority of publications on renewal theory, in what
follows, the renewal functions belonging to an ordinary and a delayed renewal process are denoted as H(t) and H 1 (t) , respectively
...
Hence,
dF(t) = f (t) dt and dF 1 (t) = f 1 (t) dt
...

dt

The functions h 1 (t) and h(t) are the renewal densities of a delayed and of an ordinary renewal process, respectively
...
15), a sum representation of the renewal
function is

H 1 (t) = E(N(t)) = Σ n=1 P(N(t) ≥ n)
...
114)
In view of (3
...
109),


H 1 (t) = Σ n=1 F 1 ∗ F ∗(n−1) (t)
...
115)

In particular, the renewal function of an ordinary renewal process is


H(t) = Σ n=1 F ∗(n) (t)
...
116)

By differentiation of (3
...
115) with respect to t, one obtains sum represen-

© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

159

tations of the respective renewal densities:




h 1 (t) = Σ n=1 f 1 ∗ f ∗(n−1) (t) ,

h(t) = Σ n=1 f ∗(n) (t)
...

In view of (3
...




Again by (3
...
Hence, H 1 (t) satisfies
t

H 1 (t) = F 1 (t) + ∫ 0 H 1 (t − x) dF(x)
...
117)

According to (1
...
117) is the convolution H 1 ∗ f of the renewal
function H 1 with f
...


(3
...
118) can be done by conditioning with regard to
the time point of the first renewal: Given that the first renewal occurs at time x, the
mean number of renewals in [0, t] is
[1 + H(t − x)],

0 < x ≤ t
...
118)
...


(3
...
By partial integration of the convolutions, the renewal equations can be rewritten
...
117) is equivalent to
t

H 1 (t) = F 1 (t) + ∫ 0 F(t − x) dH 1 (x)
...
120)

STOCHASTIC PROCESSES

160

By differentiating the renewal equations (3
...
119) with respect to t, one obtains the following integral equations of renewal type for h 1 (t) and h(t) :
t

(3
...
122)

t

(3
...


Generally, solutions of integral equations of renewal type can only be obtained by
numerical methods
...
Then, by (1
...
121) and (3
...


The solutions are
h 1 (s) =

f 1 (s)
1 − f(s)

h(s) =

,

f (s)
1 − f (s)


...
124)

Thus, for ordinary renewal processes there is a one-to-one correspondence between
the renewal function and the probability distribution of the cycle length
...
29),
the Laplace transforms of the corresponding renewal functions are
H 1 (s) =

f 1 (s)
s (1 − f (s))

H(s) =

,

f (s)
s (1 − f (s))


...
125)

Integral Equations of Renewal Type The integral equations (3
...
119) and
the equivalent ones derived from these are called renewal equations
...
A function Z(t) is said to satisfy an integral equation of renewal type if for any function g(t), which is bounded
on intervals of finite length, and for any distribution function F(t) with probability
density f (t),
t

Z(t) = g(t) + ∫ 0 Z(t − x) f (x)dx
...
126)

The unique solution of this integral equation is
t

Z(t) = g(t) + ∫ 0 g(t − x)h(x) dx,

(3
...

For a proof, see Feller [28]
...
127) need not be the trend
function of a renewal counting process
...
12 Let
f 1 (t) = f (t) = λ e −λ t , t ≥ 0
...

s+λ

By (3
...

s+λ ⎝ s+λ⎠
s2

The corresponding pre-image is
H(t) = λ t
...

Example 3
...

Thus, F(t) = 1 − F(t) can be thought of the survival function of a parallel system consisting of two subsystems, whose lifetimes are independent, identically distributed
exponential random variables with parameter λ = 1
...

(s + 1)(s + 2)
From (3
...

s (s + 3)

By splitting the fraction into partial fractions, the pre-image of h(s) is seen to be
h(t) = 2 (1 − e −3 t )
...


⎠⎥

3⎣ 3 ⎝
Explicit formulas for the renewal function of ordinary renewal processes exist for the
following two classes of cycle length distributions:
1) Erlang Distribution Let the cycle lengths be Erlang distributed with parameters
m and λ
...
108), (3
...
116),
H(t) = e −λt

© 2006 by Taylor & Francis Group, LLC

(λt) i

...




4⎣
2 2
4⎠⎦

(homogeneous Poisson process)

2) Normal distribution Let the cycle lengths be normally distributed with mean value µ and variance σ 2 , μ > 3σ 2
...
108), (3
...
116),
H(t) =



⎛t − nμ⎞
Φ⎜

...
128)

This sum representation is very convenient for numerical computations, since only
the sum of the first few terms approximates the renewal function with sufficient accuracy
...
12, an ordinary renewal process has renewal function
H(t) = λ t = t /μ if and only if f (t) = λe −λt , t ≥ 0,
where μ = E(Y)
...

Theorem 3
...
} be a delayed renewal process with cycle lengths
Y 2 , Y 3 ,
...
If Y has finite mean value μ and distribution function F(t) = P(Y ≤ t) , then {Y 1 , Y 2 ,
...
129)

if and only if the length of the first renewal cycle Y 1 has density f 1 (t) ≡ f S (t) , where
1
f S (t) = μ (1 − F(t)) ,

t ≥ 0
...
130)

Equivalently, {Y 1 , Y 2 ,
...
129) if and only if Y 1 has distribution function F 1 (t) ≡ F S (t) with
1 t
F S (t) = μ ∫ 0 (1 − F(x)) dx, t ≥ 0
...
131)

Proof Let f (s) and f S (s) be the respective Laplace transforms of f (t) and f S (t)
...
130) and taking into
account (1
...

© 2006 by Taylor & Francis Group, LLC

3 POINT PROCESSES

163

Replacing in the first equation of (3
...

Retransformation of H S (s) gives the desired result: H S (t) = t /μ
...
130) (distribution function (3
...
3
...
Moreover, this distribution type already occurred in section 1
...
45)
...
24)
E(S) =

μ
μ2 + σ2
and E(S 2 ) = 3 ,



(3
...

Higher Moments of N(t) Apart from the renewal function, which is the first moment of N(t), higher moments of N(t) also have some importance, in particular when
investigating the behaviour of the renewal function as t → ∞
...
} an ordinary renewal process and {N(t), t ≥ 0} its corresponding renewal counting process
...
The binomial moment of the order n of N(t) is defined as
N(t)
E ⎛ n ⎞ = 1 E{[N(t)][N(t) − 1]
...

⎠ n!


(3
...



Specifically, for n = 2,
E⎛


N(t) ⎞ 1
= E{[N(t)][N(t) − 1]} = 1 E[N(t)] 2 − H(t) = H ∗(2) (t)
2 ⎠ 2
2

so that the variance of N(t) is equal to
t

Var(N(t)) = 2 ∫ 0 H(t − x) dH(x) + H(t) − [H(t)] 2
...


© 2006 by Taylor & Francis Group, LLC

STOCHASTIC PROCESSES

164

3
...
2
...
Hence, bounds on H(t), which only require information on one or more numerical parameters of the cycle length distribution, are of special interest
...

1) Elementary Bounds By definition of Tₙ,

    max_{1≤i≤n} Y_i ≤ Σ_{i=1}^{n} Y_i = Tₙ,

so that P(Tₙ ≤ t) ≤ [F(t)]ⁿ. Summing from n = 1 to ∞ on both sides of this inequality, the sum representation of the renewal function yields

    F(t) ≤ H(t) ≤ F(t) / (1 − F(t)).

Note that the left-hand side of this inequality is the first term of that sum. These 'elementary bounds' are only useful for small t.
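For exponential cycle lengths, where H(t) = λt is known exactly, the elementary bounds can be checked directly (a minimal Python sketch; the rate λ = 1 is an assumed example):

```python
import math

lam = 1.0
for t in (0.1, 0.5, 1.0, 2.0):
    F = 1.0 - math.exp(-lam * t)   # cycle-length distribution function
    H = lam * t                    # exact renewal function of the Poisson process
    assert F <= H <= F / (1.0 - F) # elementary bounds on H(t)
```

Note how the upper bound degrades with t: at t = 2 it equals e² − 1 ≈ 6.39, against the exact value H(2) = 2.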
...
131)
...


(3
...

Convolution of both sides with F ∗(n) (t) leads to
a 0 ⎡ F ∗(n) (t) − F ∗(n+1) (t) ⎤ ≤ F ∗(n+1) (t) − F S ∗ F ∗(n) (t) ≤ a 1 ⎡ F ∗(n) (t) − F ∗(n+1) (t) ⎤
...
116) and theorem 3
...
134)
...
134) implies a simpler lower bound on H(t) :
t
t
H(t) ≥ μ − F S (t) ≥ μ − 1
...

∫ t F(x) dx
Then a₀ and a₁ can be rewritten as follows:

    a₀ = (1/μ) inf_t [1/λ_S(t)] − 1  and  a₁ = (1/μ) sup_t [1/λ_S(t)] − 1,

and (3.134) becomes

    t/μ + (1/μ) inf_t [1/λ_S(t)] − 1 ≤ H(t) ≤ t/μ + (1/μ) sup_t [1/λ_S(t)] − 1.   (3.135)

Since

    inf_t λ(t) ≤ inf_t λ_S(t)  and  sup_t λ(t) ≥ sup_t λ_S(t),

the bounds (3.135) remain valid if λ_S(t) is replaced with λ(t):

    t/μ + (1/μ) inf_t [1/λ(t)] − 1 ≤ H(t) ≤ t/μ + (1/μ) sup_t [1/λ(t)] − 1.   (3.136)
3) Upper Bound If μ = E(Y) and μ 2 = E(Y 2 ) , then (Lorden [55])
μ
t
H(t) ≤ μ + 2 − 1
...
136)

(3
...
137) can be improved (Brown [14]):
μ
t
H(t) ≤ μ + 2 − 1
...


(3
...
Example 3.14 As in example 3.13, let F(t) = (1 − e^(−t))², t ≥ 0. In this case, μ = E(Y) = 3/2 and

    1 − F_S(t) = (1/μ) ∫ₜ^∞ (1 − F(x)) dx = (2/3)(2 − ½ e^(−t)) e^(−t),  t ≥ 0.

Figure 3.7 Bounds for the renewal function

Therefore, the failure rates belonging to F(t) and F_S(t) are (Figure 3.7)

    λ(t) = 2(1 − e^(−t)) / (2 − e^(−t)),   λ_S(t) = (4 − 2e^(−t)) / (4 − e^(−t)).

Both failure rates are strictly increasing in t and have, moreover, the properties

    λ(0) = 0, λ(∞) = 1,   λ_S(0) = 2/3, λ_S(∞) = 1.
...
Hence, the bounds (3.135) and (3.136) yield

    (2t − 1)/3 ≤ H(t) ≤ 2t/3.

Figure 3.7 compares these bounds with the exact graph of the renewal function given in example 3.13. The deviation of the lower bound from H(t) is negligibly small for t ≥ 3.
...
3.3.3 Asymptotic Behaviour of Renewal Processes
This section presents some of the most important results of renewal theory. The results allow the construction of estimates of the renewal function and of the probability distribution of N(t) if t is sufficiently large. Some of the key results require that the cycle length Y or, equivalently, its distribution function, is nonarithmetic, i.e. that there is no positive constant a with the property that the possible values of Y are multiples of a. (The set R consists of all possible values which Y can assume.)



A simple consequence of the strong law of the large numbers is

    P( lim_{t→∞} N(t)/t = 1/μ ) = 1.                           (3.139)

To avoid technicalities, the proof is given for an ordinary renewal process: The inequality

    T_{N(t)} ≤ t < T_{N(t)+1}

implies that

    T_{N(t)}/N(t) ≤ t/N(t) < T_{N(t)+1}/N(t) = [T_{N(t)+1}/(N(t)+1)] · [(N(t)+1)/N(t)]

or, equivalently, that

    (1/N(t)) Σ_{i=1}^{N(t)} Y_i ≤ t/N(t) < [ (1/(N(t)+1)) Σ_{i=1}^{N(t)+1} Y_i ] · (N(t)+1)/N(t).

Since N(t) → ∞ as t → ∞, the strong law of the large numbers yields (3.139).
The following theorem considers the corresponding limit behaviour of the mean value of N(t); compare also theorems 3.12 and 3.13 below.

Theorem 3.11 (elementary renewal theorem) If the cycle length Y has finite mean value μ = E(Y), then

    lim_{t→∞} H₁(t)/t = 1/μ.

Thus, for large t, H₁(t) ≈ t/μ. (For this property to be valid, the assumption E(Y₁) < ∞ had to be made.) An equivalent formulation of theorem 3.11 is

    lim_{t→∞} h₁(t) = 1/μ.

Note that (3.139) does not imply theorem 3.11. The following theorem was called the fundamental renewal theorem by its discoverer W. L. Smith.
Theorem 3.12 (fundamental renewal theorem) If F(t) is nonarithmetic and g(t) an integrable function on [0, ∞), then

    lim_{t→∞} ∫₀ᵗ g(t − x) dH₁(x) = (1/μ) ∫₀^∞ g(x) dx.

Theorem 3.13 is an equivalent formulation of the fundamental renewal theorem. It refers to the integral equation of renewal type (3.126).



Theorem 3
...
If Z(t) satisfies the equation of renewal type (3
...


t→∞

(3
...
11 to 3
...
The equivalence of the theorems 3
...
12 results from the structure (3
...

Blackwell's renewal theorem Let

    g(x) = 1 for 0 ≤ x ≤ h,  g(x) = 0 otherwise.

Then the fundamental renewal theorem yields, for any h > 0,

    lim_{t→∞} [H₁(t + h) − H₁(t)] = h/μ.                       (3.141)

Theorem 3.14 If μ₂ = E(Y²) < ∞, then

    lim_{t→∞} [ H₁(t) − t/μ ] = μ₂/(2μ²) − E(Y₁)/μ.            (3.142)
120) is equivalent to
t

H 1 (t) = F 1 (t) + ∫ 0 F 1 (t − x) dH(x)
...
143)

If F 1 (t) ≡ F S (t), then, by theorem 3
...


By subtracting integral equation (3
...
143),
t
t
t
H 1 (t) − μ = F S (t) − F 1 (t) + ∫ 0 F S (t − x) dH(x) − ∫ 0 F 1 (t − x) dH(x)
...


t→∞ ⎝ 1
Now the desired results follows from (1
...
132)
...
144)


For ordinary renewal processes, (3
...


⎠ 2⎝ 2
t→∞

μ

(3
...
14, the fundamental renewal theorem implies the elementary renewal theorem
...
15 For an ordinary renewal process, the integrated renewal function has
property
⎡ t2 ⎛ μ2
⎞ ⎤ ⎫ μ2
⎧ t
μ

lim ⎨ ∫ H(x) dx − ⎢
+
− 1⎟ t⎥ ⎬ = 2 − 3

3 6 μ2

⎢ 2μ ⎜ 2μ 2
t→∞ ⎩ 0

⎠ ⎦ ⎭ 4μ

with μ 2 = E(Y 2 ) and μ 3 = E(Y 3 )
...
The following theorem is basically a consequence of the central limit theorem (for details see Karlin and Taylor [45])
...
16 The random number N(t) of renewals in [0, t] satisfies


N(t) − t/μ
lim P ⎜
≤ x ⎟ = Φ(x)
...


(3
...
16 can be used to construct approximate intervals, which contain
N(t) with a given probability: If t is sufficiently large, then
⎛t

t
P μ − z α/2 σ t μ −3 ≤ N(t) ≤ μ + z α/2 σ t μ −3 = 1 − α
...
147)

As usual, z α/2 is the (1 − α/2) - percentile of the standard normal distribution
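Interval (3.147) is straightforward to compute; a small Python sketch (the numerical values are illustrative assumptions):

```python
import math

def count_interval(t, mu, sigma, z):
    # (3.147): t/mu +- z * sigma * sqrt(t * mu**(-3))
    half = z * sigma * math.sqrt(t * mu ** -3)
    return t / mu - half, t / mu + half

lo, hi = count_interval(1000.0, 10.0, 2.0, 1.96)   # z_{0.025} = 1.96
assert abs(lo - 96.08) < 0.01 and abs(hi - 103.92) < 0.01
```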
...
Example 3.15 Let t = 1000, μ = 10, σ = 2, and α = 0.05. Since z₀.₀₂₅ = 1.96, (3.147) yields the interval

    96.1 ≤ N(1000) ≤ 103.9.                                    (3.148)
Example 3.16 The same numerical parameters as in example 3.11 are used (t = 200, μ = 8, σ = 5), but now α = 0.01. Since z₀.₀₁ = 2.32,

    t/μ + z₀.₀₁ σ √(t μ⁻³) = 200/8 + 2.32 · 5 · √(200 · 8⁻³) = 32.3.

Hence, about 33 spare parts are needed to make sure that with probability 0.99 the demand for spare parts in [0, t] can be satisfied. (The approach of example 3.11 yielded n_min = 34.)
...
3
...
3) and (3
...
In
particular, if {Y 1 , Y 2 ,
...
} is the corresponding
process of renewal time points, then its (random) forward recurrence time A(t) is
A(t) = T N(t)+1 − t
and its (random) backward recurrence time B(t) is
B(t) = t − T N(t)
...

Figure 3.8 Illustration of the recurrence times

The stochastic processes

    {Y₁, Y₂, ...}, {T₁, T₂, ...}, {N(t), t ≥ 0}, {A(t), t ≥ 0}, and {B(t), t ≥ 0}

are statistically equivalent, since there is a one-to-one correspondence between their sample paths, i.e. each of these five processes can be used to define a renewal process (Figures 3.8 and 3.9).
Figure 3.9 Sample paths of the backward and forward recurrence times processes

Let
F A(t) (x) = P(A(t) ≤ x)

and F B(t) (x) = P(B(t) ≤ x)

be the distribution functions of the forward and the backward recurrence times, respectively
...
115),
F_{A(t)}(x) = P(T_{N(t)+1} − t ≤ x)
    = Σ_{n=0}^∞ P(T_{N(t)+1} ≤ t + x, N(t) = n)
    = F₁(t + x) − F₁(t) + Σ_{n=1}^∞ P(Tₙ ≤ t < T_{n+1} ≤ t + x)
    = F₁(t + x) − F₁(t) + Σ_{n=1}^∞ ∫₀ᵗ [F(x + t − y) − F(t − y)] dF_{Tₙ}(y)
    = F₁(t + x) − F₁(t) + ∫₀ᵗ [F(x + t − y) − F(t − y)] d( Σ_{n=1}^∞ F₁ ∗ F^{∗(n−1)}(y) )
    = F₁(t + x) − F₁(t) + ∫₀ᵗ [F(x + t − y) − F(t − y)] dH₁(y).
120)
...


(3
...


(3
...
Therefore, F A(t) (x) is sometimes called interval reliability
...
are independent and identically distributed as Y with μ = E(Y )
...
Formula (3.125) cannot be applied directly to obtain E(A(t)), since N(t) + 1 is surely not independent of the sequence Y₁, Y₂, ... However,

    "N(t) + 1 = n" = "N(t) = n − 1" = "Y₁ + ⋯ + Y_{n−1} ≤ t < Y₁ + ⋯ + Yₙ",

so that N(t) + 1 is a stopping time for the sequence Y₁, Y₂, ... Hence, by Wald's equation, E(T_{N(t)+1}) = μ[H(t) + 1]. Thus, the mean forward recurrence time of an ordinary renewal process is

    E(A(t)) = μ[H(t) + 1] − t.
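For exponential cycles this formula gives E(A(t)) = (1/λ)(λt + 1) − t = 1/λ, the memoryless property. A seeded Monte Carlo sketch (Python; the rate λ = 2 is an assumed illustration value):

```python
import random

def forward_recurrence(t, lam, rng):
    # A(t) = T_{N(t)+1} - t for an ordinary renewal process with Exp(lam) cycles
    s = 0.0
    while s <= t:
        s += rng.expovariate(lam)
    return s - t

rng = random.Random(2)
lam, t, runs = 2.0, 5.0, 20_000
mean_A = sum(forward_recurrence(t, lam, rng) for _ in range(runs)) / runs
# E(A(t)) = mu*[H(t) + 1] - t = (1/lam)*(lam*t + 1) - t = 1/lam = 0.5
assert abs(mean_A - 0.5) < 0.02
```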


The probability distribution of the backward recurrence time is obtained as follows:

F_{B(t)}(x) = P(t − x ≤ T_{N(t)})
    = Σ_{n=1}^∞ P(t − x ≤ Tₙ, N(t) = n)
    = Σ_{n=1}^∞ P(t − x ≤ Tₙ ≤ t < T_{n+1})
    = Σ_{n=1}^∞ ∫_{t−x}^{t} (1 − F(t − u)) dF_{Tₙ}(u)
    = ∫_{t−x}^{t} (1 − F(t − u)) dH₁(u),  x < t.


(3
...

for
t⎩ 0


(3
...

In view of the memoryless property of the exponential distribution (example 1.4), this result is not surprising. The distribution function F_S, given by (3.131), is the limiting distribution function of both backward and forward recurrence time as t tends to infinity:

    lim_{t→∞} F_{A(t)}(x) = lim_{t→∞} F_{B(t)}(x) = F_S(x),  x ≥ 0.
153)

Paradox of Renewal Theory In view of the definition of the forward recurrence time, one supposes that the following equation is true:

    lim_{t→∞} E(A(t)) = μ/2.

However, (3.153) and (3.132) yield

    lim_{t→∞} E(A(t)) = E(S) = (μ² + σ²)/(2μ) = μ/2 + σ²/(2μ) ≥ μ/2.

This 'contradiction' is known as the paradox of renewal theory.


3.3.5 Stationary Renewal Processes
By definition 3.1, a renewal process {Y₁, Y₂, ...} is stationary if for all k = 1, 2, ..., any selection of integers i₁, i₂, ..., i_k with 1 ≤ i₁ < i₂ < ⋯ < i_k, and every τ = 1, 2, ..., the joint distribution functions of the random vectors

    (Y_{i₁}, Y_{i₂}, ..., Y_{i_k})  and  (Y_{i₁+τ}, Y_{i₂+τ}, ..., Y_{i_k+τ})

coincide.
...
1, {Y 1 , Y 2 ,
...
A third way of defining the stationarity of a renewal process {Y 1 , Y 2 ,
...
} and
the corresponding process {A(t), t ≥ 0} of its forward recurrence times:
A renewal process is stationary if and only if the process of its forward
recurrence times {A(t), t ≥ 0} is strongly stationary
...

The stochastic process in continuous time {B(t), t ≥ 0} is a Markov process
...
By theorem 2
...
Hence, a renewal process is stationary if and only if there is a
distribution function F(x) so that
F A(t) (x) = P(A(t) ≤ x) = F(x) for all x ≥ 0 and t ≥ 0
...

Theorem 3.17 Let F₁(x) = P(Y₁ ≤ x) and F(x) = P(Y ≤ x) with μ = E(Y) < ∞. Then a delayed renewal process given by F₁(x) and F(x) is stationary if and only if

    H₁(t) = t/μ.                                               (3.154)

Equivalently, a delayed renewal process is stationary if and only if

    F₁(x) ≡ F_S(x).                                            (3.155)

Proof If (3.154) holds, then (3.149) yields

    F_{A(t)}(x) = (1/μ) ∫₀^{t+x} (1 − F(y)) dy − (1/μ) ∫₀ᵗ (1 − F(x + t − y)) dy
                = (1/μ) ∫₀^{t+x} (1 − F(y)) dy − (1/μ) ∫ₓ^{t+x} (1 − F(y)) dy
                = (1/μ) ∫₀ˣ (1 − F(y)) dy = F_S(x),

which does not depend on t. Conversely, if F_{A(t)}(x) does not depend on t, then (3.149) implies (3.154). This completes the proof of the theorem.
...
17 and the elementary renewal theorem: After a
sufficiently large time span (transient response time) every renewal process with nonarithmetic distribution function F(t) and finite mean cycle length μ = E(Y) behaves
as a stationary renewal process
...
3
...

In order to be able to model practical situations, in which this assumption is not fulfilled, the concept of a renewal process has to be generalized in the following way:
The renewal time of the system after its i th failure is assumed to be a positive random variable Z i ; i = 1, 2,
...

In this way, a sequence of two-dimensional random vectors {(Y i , Z i ); i = 1, 2,
...

Definition 3
...
} and {Z 1 , Z 2 ,
...
} is said to be an
alternating renewal process
...
,

are the time points at which failures occur, and the random variables
n−1

T n = Σ i=1 (Y i + Z i ); n = 1, 2,
...
If an operating system is assigned a '1' and a failed system a '0', then a binary indicator variable of the
system state is
⎧ 0,
X(t) = ⎨
⎩ 1,

if t ∈ [S n , T n ), n = 1, 2,
...
156)

Obviously, an alternating renewal process can equivalently be defined by the stochastic process in continuous time {X(t), t ≥ 0} with X(t) given by (3
...
10)
...
By agreement,
P(X(+0) = 1) = 1
...
Figure 3.10 Sample path of an alternating renewal process


Analogously to the concept of a delayed renewal process, the alternating renewal
process can be generalized by assigning the random lifetime Y 1 a probability distribution different from that of Y
...

Let N f (t) and N r (t) be the respective numbers of failures and renewals in (0, t]
...
109)),
F S n (t) = P(S n ≤ t) = P(N f (t) ≥ n) = F Y ∗ (F Y ∗ F Z ) ∗(n−1) (t),
F T n (t) = P(T n ≤ t) = P(N r (t) ≥ n) = (F Y ∗ F Z ) ∗(n) (t)
...
157)
(3
...
115) and (3
...

H f (t) and H r (t) are referred to as the renewal functions of the alternating renewal
process
...
117)
with
F 1 (t) ≡ F Y (t) and F(t) = F Y ∗ F Z (t)
...
Therefore, H r (t)
satisfies renewal equation (3
...

Let R t be the residual lifetime of the system if it is operating at time t
...
This probability is called interval availability (or interval reliability) and is
denoted as A x (t)
...



Hence,

    A_x(t) = F̄_Y(t + x) + ∫₀ᵗ F̄_Y(t + x − u) dH_r(u),          (3.159)

where F̄_Y(t) = 1 − F_Y(t).

Note In this section 'A' no longer refers to 'forward recurrence time'
...

(3
...
159) by letting there x = 0 :
t

A(t) = F Y (t) + ∫ 0 F Y (t − u) dH r (u)
...
161)

A(t) is called availability of the system (system availability) or, more exactly, point
availability of the system, since it refers to a specific time point t
...

The average availability of the system in the interval [0, t] is

    Ā(t) = (1/t) ∫₀ᵗ A(x) dx.                                  (3.162)

If U(t) denotes the total operating time of the system in [0, t], then

    E(U(t)) = ∫₀ᵗ A(x) dx = t Ā(t).
A proof of the assertions
need not be given since they are an immediate consequence of the fundamental renewal theorem 3
...

Theorem 3.18 If Y + Z is nonarithmetic, then

    A = lim_{t→∞} A(t) = lim_{t→∞} Ā(t) = E(Y) / (E(Y) + E(Z)).    (3.163)

Clearly, it is A = A 0
...

It should be mentioned that equation (3
...
As illustrated by the following example, in general,

    E( Y/(Y + Z) ) ≠ E(Y) / (E(Y) + E(Z)).                     (3.164)
Example 3
...


Application of the Laplace transform to (3
...

r ⎦
s+λ ⎣

(3
...

(s + λ) (s + v)

Hence, from the second equation of (3
...

s (s + λ + ν)

By inserting h r (s) into (3
...

s + λ s (s + λ) s (s + λ + ν)

Retransformation yields the point availability:
    A(t) = ν/(λ + ν) + [λ/(λ + ν)] e^(−(λ+ν)t),  t ≥ 0.        (3.166)
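Formula (3.166) is easily checked numerically (Python sketch; the failure rate λ and repair rate ν below are illustrative assumptions, not values from the text):

```python
import math

def point_availability(t, lam, nu):
    # (3.166): A(t) = nu/(lam+nu) + lam/(lam+nu) * exp(-(lam+nu)*t)
    return nu / (lam + nu) + lam / (lam + nu) * math.exp(-(lam + nu) * t)

lam, nu = 0.1, 1.0                       # assumed rates: E(Y) = 10, E(Z) = 1
assert abs(point_availability(0.0, lam, nu) - 1.0) < 1e-12   # system starts operating
A = (1.0 / lam) / (1.0 / lam + 1.0 / nu)                     # stationary value (3.163)
assert abs(point_availability(50.0, lam, nu) - A) < 1e-6     # converges to A
```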

Since
E(Y) = 1/λ and E(Z) = 1/ν,
taking in (3
...
163)
...
20,
E ⎛ Y ⎞ = ν ⎛ 1 + λ ln λ ⎞
...
25 E(Y) , then
A=

E(Y)
= 0
...
717
...
159) and (3
...
This is again due to the fact
that there are either no explicit or rather complicated representations of the renewal
function for most of the common lifetime distributions
...
159)
and (3
...
3
...
2 and 3
...
3
...
3
...
3
...
1 Definition and Properties
Compound stochastic processes arise by additive superposition of random variables
at random time points
...
2
...
)
Definition 3
...
} be a random marked point process with
property that {T 1 , T 2 ,
...
}, and let {N(t), t ≥ 0} be the corresponding renewal counting process
...

⎪ 0
if N(t) = 0


(3
...

The compound Poisson process defined in section 3
...
5 is a compound renewal
process with property that the renewal cycle lengths Y i = T i − T i−1 , i = 1, 2,
...
2)
...
In most applications,
Figure 3.11 Sample path of a compound renewal process
M i also can represent a 'loss' or 'gain' which accumulates over the ith renewal cycle (maintenance
cost, profit by operating the system)
...
The sample paths of a compound renewal
process are step functions
...
11)
...

2) The sequences are { M 1 , M 2 ,
...
} are independent of each other
and consist each of independent, nonnegative random variables, which are identically distributed as M and Y, respectively
...
e
...

3) The mean values of Y and M are finite and positive
...
125) yields the trend function
m(t) = E(C(t)) of a compound renewal process:
    m(t) = E(M) H(t).                                          (3.168)

Formula (3.168) and the elementary renewal theorem (theorem 3.11) imply an important asymptotic property of the trend function of compound renewal processes:

    lim_{t→∞} E(C(t))/t = E(M)/E(Y).                           (3.169)
Equation (3.169) refers to the mean value of C(t). The 'stochastic analogue' to (3.169) is

    lim_{t→∞} C(t)/t = E(M)/E(Y)  with probability 1.          (3.170)

To prove (3.170), consider the obvious relationship

    Σ_{i=1}^{N(t)} M_i ≤ C(t) ≤ Σ_{i=1}^{N(t)+1} M_i.

Dividing by t, the strong law of the large numbers and (3.139) imply (3.170). The relationships (3.169) and (3.170) are called renewal reward theorems.
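The renewal reward theorem (3.170) can be illustrated by a seeded simulation (Python; exponential cycle lengths and rewards are an assumed example):

```python
import random

rng = random.Random(3)
EY, EM = 2.0, 5.0                  # assumed means of cycle length Y and reward M
t, s, total = 10_000.0, 0.0, 0.0
while True:
    s += rng.expovariate(1.0 / EY)         # next renewal time point
    if s > t:
        break
    total += rng.expovariate(1.0 / EM)     # reward collected over that cycle
# renewal reward theorem (3.170): C(t)/t -> E(M)/E(Y) = 2.5
assert abs(total / t - EM / EY) < 0.3
```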
...
Hence, by the total probability rule,

(3
...
111)
...
2
...
) If
Y has an exponential distribution with parameter λ, then C(t) has distribution function

    F_{C(t)}(x) = e^(−λt) Σ_{n=0}^∞ G^{∗(n)}(x) (λt)ⁿ/n!;   G^{∗(0)}(x) ≡ 1,  x > 0, t > 0.   (3.172)

If, in addition, M has a normal distribution with E(M) ≥ 3 √Var(M), then

    F_{C(t)}(x) = e^(−λt) [ 1 + Σ_{n=1}^∞ Φ( (x − n E(M)) / √(n Var(M)) ) (λt)ⁿ/n! ],  x > 0, t > 0.   (3.173)

The distribution function F C(t) , for being composed of convolution powers of G and
F, is usually not tractable and useful for numerical applications
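Still, in the exponential case (3.173) the series truncates well numerically, since the Poisson weights decay quickly. A sketch (Python; the rate and mark parameters are assumed illustration values):

```python
import math

def Phi(x):
    # standard normal distribution function via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def F_C(x, t, lam, em, vm, nmax=200):
    # truncated series (3.173): N(t) ~ Poisson(lam*t), marks M ~ N(em, vm) i.i.d.;
    # sensible when em >= 3*sqrt(vm), so that the marks are essentially positive
    total = 1.0                      # n = 0 term: G^{*(0)}(x) = 1 for x > 0
    log_w = 0.0
    for n in range(1, nmax + 1):
        log_w += math.log(lam * t) - math.log(n)      # log of (lam*t)^n / n!
        total += Phi((x - n * em) / math.sqrt(n * vm)) * math.exp(log_w)
    return math.exp(-lam * t) * total

# assumed illustration: rate 2 per unit time, marks ~ N(10, 4)
lam, em, vm, t = 2.0, 10.0, 4.0, 5.0
p_mid = F_C(100.0, t, lam, em, vm)   # x equal to E(C(t)) = lam*t*em = 100
p_hi = F_C(300.0, t, lam, em, vm)
assert 0.3 < p_mid < 0.8
assert p_hi > 0.999
```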
...
For surveys, see, e
...
[67, 89]
...

Theorem 3.19 Let {C(t), t ≥ 0} be a compound renewal process satisfying

    γ² = Var[ M E(Y) − Y E(M) ] > 0.                           (3.174)

Then

    lim_{t→∞} P( [C(t) − (E(M)/E(Y)) t] / ([E(Y)]^(−3/2) γ √t) ≤ x ) = Φ(x),

where Φ(x) is the distribution function of the standardized normal distribution. That is, for large t, C(t) has approximately a normal distribution with mean value (E(M)/E(Y)) t and variance [E(Y)]⁻³ γ² t. If Y and M are independent, then

    γ² = [E(Y)]² Var(M) + [E(M)]² Var(Y),

so that in this case (3.174) is always fulfilled for nondegenerate Y and M. Condition (3.174) actually only excludes the case γ² = 0, i.e. linear dependence between Y and M.
19
...
4
...

Example 3.18 If {(Y_i, Z_i); i = 1, 2, ...} is an alternating renewal process, the total renewal time in (0, t] is given by (a possible renewal time running at time t is neglected)

    C(t) = Σ_{i=1}^{N(t)} Z_i,  where  N(t) = max {n, Tₙ < t}.

Hence, the development of the total renewal time is governed by a compound stochastic process. When applying theorem 3.19, M has to be replaced with Z and Y with Y + Z. Because of the independence of Y and Z,

    γ² = Var[Z E(Y + Z) − (Y + Z) E(Z)]
       = Var[Z E(Y) − Y E(Z)]
       = [E(Y)]² Var(Z) + [E(Z)]² Var(Y) > 0,

so that assumption (3.174) is fulfilled. In particular, let (all parameters in hours)

    E(Y) = 120, √Var(Y) = 40  and  E(Z) = 4, √Var(Z) = 2.

Then γ = √((120·2)² + (4·40)²) = 288.4, and the probability that C(10⁴) does not exceed a nominal value of 350 hours is

    P(C(10⁴) ≤ 350) = Φ( (350 − (4/124)·10⁴) / (124^(−3/2) · 288.4 · √10⁴) ) = Φ(1.313) = 0.905.
313)
...
3
...
2 First Passage Time
The previous example motivates an investigation of the random time L(x), at which
the compound renewal process {C(t), t ≥ 0} exceeds a given nominal value x for the
first time:
L(x) = inf {t, C(t) > x}
...
177)
t

If, for instance, x is the critical wear limit of an item, then crossing level x is commonly referred to as the occurrence of a drift failure
...
Since, by assumption 2, the M_i are nonnegative random variables, the compound renewal process {C(t), t ≥ 0} has nondecreasing sample paths. Hence (Figure 3.12):

    P(L(x) ≤ t) = P(C(t) > x).                                 (3.178)

Figure 3
...
172) and (3
...
The probability distribution of L(x) is generally not explicitly
available
...
The analogy of this theorem to theorem 3
...

Theorem 3.20 Under condition (3.174),

    lim_{x→∞} P( [L(x) − (E(Y)/E(M)) x] / ([E(M)]^(−3/2) γ √x) ≤ t ) = Φ(t).


Actually, in view of our assumption that the compound process {C(t), t ≥ 0} has
nondecreasing sample paths, condition (3
...
19 and 3
...

A consequence of theorem 3.20 is that, for large x, L(x) has approximately a normal distribution with mean value (E(Y)/E(M)) x and variance [E(M)]⁻³ γ² x.   (3.179)

The distribution function arising from (3.179) is called Birnbaum-Saunders distribution.
19 Mechanical wear of an item is caused by shocks
...
) After the i th
shock the degree of wear of the item increases by M i units
...
are
supposed to be independent random variables, which are identically normally distributed as M with parameters
E(M) = 9
...
8 [in 10 −4 mm ]
...
The item is replaced by an equivalent
new one if the total degree of wear exceeds a critical level of 0
...

(1) What is the probability p 100 that the item has to be replaced before or at the
occurrence of the 100 th shock? The degree of wear after 100 shocks is
100

C 100 = Σ i=1 M i
and has approximately the distribution function (unit of x: 10 −4 mm )


P(C 100 ≤ x) = Φ ⎜ x − 9
...



⎝ 28 ⎠


⎝ 2
...
86)
...
979
...

What is the probability that the nominal value of 0
...
20 can be applied since 0
...
Provided M and Y are independent, γ = 0
...
Hence,


600 − 6 10 3
9
...
848)
...
1) > 600) = 1 − Φ ⎜

⎜ (9
...
6 ⋅ 0
...
1) > 600) = 0
...


3.3.8 Regenerative Stochastic Processes
At the beginning of this chapter on renewal theory it has been pointed out that, apart
from its own significance, renewal theory provides mathematical foundations for
analyzing the behaviour of complicated systems which have renewal points imbedded in their running times
...

2) Within every regeneration cycle the operation of the system is governed by the
same stochastic rules
...
However, now it is not only the distance between regeneration points that is interesting, but also the behaviour of the system within a regeneration cycle
...
} is introduced, where L i is the random length of the i th regeneration cycle
...
The time
points
T n = Σ n L i ; n = 1, 2,
...
The i th regeneration cycle is
given by
{(L i , W i (x)), 0 ≤ x < L i },
where W i (x) denotes the state of the system at time x (with respect to the preceeding
regeneration point)
...
The probability distribution of the typical
regeneration cycle is called the cycle distribution
...
10 Let {N(t), t ≥ 0} be the renewal counting process, which belongs to
the ordinary renewal process {L 1 , L 2 ,
...
180)
is said to be a regenerative stochastic process
...
; are
its regeneration points
...
10 means that T N(t) , the regeneration point before t,
is declared to be the new origin
...

Thus, a regenerative process restarts at every regeneration point
...
20 The alternating renewal process {(Y i , Z i ); i = 1, 2,
...
In this special case the cycle length L i is given by
the sum of life- plus renewal time L i = Y i + Z i , where the random vectors (Y i , Z i )
are independent of each other and identically distributed as (Y, Z)
...

Y≤x
Therefore, the typical regeneration cycle is
{(L, W(x)), 0 ≤ x < L}
with L = Y + Z
...

Let B be a subset of the state space of {W(x), x ≥ 0} and H(t) be the renewal function belonging to the ordinary renewal process {L 1 , L 2 ,
...
159) it can be shown that the one-dimensional probability distribution
of the regenerative stochastic process {X(t), t ≥ 0} is given by
t

P(X(t) ∈ B) = Q(t, B) + ∫ 0 Q(t − x, B) dH(x) ,

(3
...

The following theorem considers the behaviour of the probability (3
...



Theorem 3
...
182)

and
t

lim 1 ∫ P(X(x) ∈ B) dx = 1 ∫ 0 Q(x, B) dx
...
183)

This theorem is an immediate consequence of the fundamental renewal theorem 3
...

The practical application of the stationary state probabilities (3
...
183) of a
regenerative stochastic process is illustrated by analyzing a standard maintenance
policy
...
2
...
3
...
21 (age replacement policy) The system is replaced upon failure or at
age τ by a preventive renewal, whichever occurs first
...
e
...
Unscheduled and preventive replacements require the constant times d r and d p , respectively
...

To specify an underlying regenerative stochastic process, the time points at which a
system starts resuming its work are declared to be the regeneration points
...

⎩ d p with probability F(τ)

E{min(T, τ)} = ∫ τ F(t) dt,
0

the mean length of a regeneration cycle is
τ

E(L) = ∫ 0 F(t) dt + d r F(τ) + d p F(τ)
...

otherwise
⎩ 0
Then, for B = {1},
⎧ 0
Q(x, B) = P(W(x) = 1, L > x) = ⎨
⎩ F(x)
© 2006 by Taylor & Francis Group, LLC

for τ < x ≤ L

...


Now (3
...

t→∞
∫ 0 F(x) dx + d e F(τ) + d p F(τ)

The age replacement policy can also be described by an alternating renewal process
...
163) would yield the same result
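The long-run availability of the age replacement policy can be evaluated numerically for any lifetime distribution. A sketch (Python; the Weibull lifetime and the replacement times d_r, d_p are hypothetical values chosen for illustration):

```python
import math

def age_replacement_availability(tau, F, d_r, d_p, n=2000):
    # long-run availability of the age replacement policy:
    # A(tau) = I / (I + d_r*F(tau) + d_p*(1 - F(tau))), I = integral_0^tau (1 - F(t)) dt
    h = tau / n
    integral = sum((1.0 - F(i * h)) * h for i in range(n))
    return integral / (integral + d_r * F(tau) + d_p * (1.0 - F(tau)))

# hypothetical Weibull(shape 2) lifetime with scale 100; d_r = 10, d_p = 2 time units
F = lambda t: 1.0 - math.exp(-((t / 100.0) ** 2))
a = age_replacement_availability(50.0, F, 10.0, 2.0)
assert 0.9 < a < 1.0
```

Scanning this function over a grid of τ values is a simple way to locate the optimal replacement age τ* numerically.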
...
Then τ ∗ satisfies the necessary condition
    λ(τ) ∫₀^τ (1 − F(x)) dx − F(τ) = d/(1 − d),  where d = d_p/d_r.

The corresponding maximum availability is

    A(τ*) = 1 / (1 + (d_r − d_p) λ(τ*)).

3.4 APPLICATIONS TO ACTUARIAL RISK ANALYSIS

3
...
1 Basic Concepts
Random point processes are key tools for quantifying risk in the insurance industry
...
) A risky situation for an insurance company arises if it has to
pay out a total claim amount, which tends to exceed the total premium income plus
its initial capital
...
Claims arrive at random time points T 1 , T 2 ,
...
Thus, the insurance company is
subjected to a random marked point process {(T 1 , M 1 ), (T 2 , M 2 ),
...
The two components of the risk process are the claim arrival process
{T 1 , T 2 ,
...
Let {N(t), t ≥ 0} be the random counting process which belongs to the claim arrival process
...

⎪ 0
if N(t) = 0



(3
...
With the terminology introduced in sections
3
...
5 and 3
...
7, {C(t), t ≥ 0} is a compound Poisson process if {N(t), t ≥ 0} is a Poisson process and a compound renewal process if {N(t), t ≥ 0} is a renewal process
...
Let κ(t) be the total premium income of
the insurance company in [0, t]
...
With an initial capital or an initial reserve of x, x ≥ 0, which the
company has at its disposal at the start, the risk reserve at time t is defined as
R(t) = x + κ(t) − C(t)
...
185)

The corresponding risk reserve process is
{R(t), t ≥ 0}
...
This leads to the definition of the ruin
probability p(x) of the company (Figure 3
...


(3
...

The probabilities p(x) and q(x) refer to an infinite time horizon
...


Figure 3.13 Sample path of a risk reserve process
Since ruin can only occur at the arrival times of claims (Figure 3
...
187)

and
p(x, τ) = P(there is a finite, positive integer n with T n ≤ τ so that R(T n ) < 0),

where R(T n ) is understood to be R(+T n ), i
...
the value of the risk reserve process
including the effect of the nth claim
...
)

3
...
2 Poisson Claim Arrival Process
In this section, the problem of determining the ruin probability is considered under
the following 'classical assumptions':
1) {N(t), t ≥ 0} is a homogeneous Poisson process with parameter λ = 1/μ
...
are independent, identically as M distributed random
variables
...

3) The premium income is a linear function in t:
κ(t) = κ t, κ > 0, t ≥ 0
...

4) The time horizon is infinite (τ = ∞)
...
e
...
For instance, consider a portfolio which comprises
policies covering burgleries in houses
...
Generally, an insurance company tries to
establish its portfolios in such a way that they are approximately homogeneous
...

By assumption 1, the interarrival time Y of claims has an exponential distribution
with parameter λ = 1/μ
...
Hence,
μ = E(Y) and ν = E(M)
...
188)
By (3
...
168), under the assumptions 1 and 2, the trend function of the total
claim size process {C(t), t ≥ 0} is a linear function in time:
ν
E(C(t)) = μ t , t ≥ 0
...
189)

This justifies assumption 3, namely a linear premium income in time
...
Hence, in what follows, let

    κ − ν/μ = (κμ − ν)/μ > 0.                                  (3.190)


To derive an integro-differential equation for q(x) , consider what may happen in the
time interval [0, Δt] :
1) No claim arrives in [0, Δt]
...


2) One claim arrives in [0, Δt], the risk reserve remains positive
...


3) One claim arrives in [0, Δt], the risk reserve becomes negative (ruin occurs)
...

4) At least 2 claims arrive in [0, Δt]
...

Therefore, given the initial capital x,
q(x) = [1 − λ Δt + o(Δt)] q(x + κ Δt)
x+κ Δt

+[λ Δt + o(Δt)] ∫ 0

q(x + κ Δt − y) b(y) dy + o(Δt)
...

κ
h
h

Assuming that q(x) is differentiable, letting h → 0 yields

    q′(x) = (λ/κ) [ q(x) − ∫₀ˣ q(x − y) b(y) dy ].             (3.191)

A solution can be obtained in terms of Laplace transforms: Let q̂(s) and b̂(s) be the Laplace transforms of q(x) and b(y), respectively. Applying the Laplace transformation to (3.191), using its properties (1.33), and replacing λ with 1/μ yields

    q̂(s) = q(0) / ( s − (1/(μκ)) (1 − b̂(s)) ).                 (3.192)

This representation of q(s) involves the survival probability q(0) on condition that
there is no initial capital
...
22 Let the claim size M have an exponential distribution with parameter
1/ν
...

νs + 1
Hence,
νs + 1
q(s) =
q(0) μ κ
...
193)

q(s) simplifies to

1
⎤ q(0)
...
29))
α
1 1 α
q(x) = ⎡ e − ν x + α − α e − ν x ⎤ q(0)
...
194)

so that the parameter α is the company's probability to survive without any initial
capital
...


(3
...


Renewal Equation for q(x) To be able to construct an approximation for q(x) for
large x, the integro-differential equation (3
...
e
...
126):
x

q(x) = a(x) + ∫ 0 q(x − y) g(y) dy ,

where g(y) is a probability density and a(x) an integrable function on [0, ∞)
...
196)

3 POINT PROCESSES

193

1) Firstly, an integral equation for q(x) will be constructed
...
191) from
x = 0 to x = t yields
t x
1 t
q(t) − q(0) = μκ ⎡ ∫ 0 q(x) dx − ∫ 0 ∫ 0 q(x − y) b(y) dy dx ⎤
...
197)



By partial integration and application of Dirichlet's formula (1
...
197) becomes
t x

∫ 0 ∫ 0 q(x − y) b(y) dy dx
t
t
t x
= ∫ 0 q(x)dx − q(0)∫ 0 B(x)dx − ∫ 0 ∫ 0 q (x − y) B(y) dy dx
t

t

t

t

t

t

= ∫ 0 q(x)dx − q(0)∫ 0 B(x)dx − ∫ 0 B(y) q(t − y) dy + q(0)∫ 0 B(x)dx
= ∫ 0 q(x)dx − ∫ 0 B(y) q(t − y) dy
...
197) and replacing t with x,
1 x
q(x) = q(0) + μκ ⎡ ∫ 0 q(x − y) B(y) dy ⎤
...
198)

Letting x → ∞ in (3
...

Since q(∞) = 1,

    q(0) = 1 − ν/(μκ) = α.                                     (3.199)

Interestingly, this probability depends on the probability distributions of the random variables involved only via their mean values.
...
194) and (3
...

2) To establish an integro-differential equation for the ruin probability p(x) , in formula (3
...

Hence,
x
1 x
1
p(x) = 1 − α − μκ ∫ 0 B(y) dy + ∫ 0 p(x − y) μκ B(y) dy
...
200)

Formally, this integral equation in the ruin probability p(x) looks like the integral
equation of renewal type (3
...


(3
...


However, g (y) can be thought of as characterizing a defective probability distribution with a defect of α
...
200) is called a defective integral
equation of renewal type
...
200) will be multiplied by a factor e r y
so that the product e r y g(y) is a probability density
...


(3
...
202) is called a Lundberg exponent
...
With a(x) and g(y) given by (3
...


Then, multiplying (3
...
202), gives an
integral equation of renewal type for the function p r (x) :
x

p r (x) = a r (x) + ∫ 0 p r (x − y) g r (y) dy
...
203)

This integral equation can easily be solved in the image space of the Laplace transformation (just as (3
...
When doing this, note that, for instance, the Laplace
transform of a r is given by
L(a r ) = a(s − r) ,
where a is the Laplace transform of a
...
13 to
the integral equation of renewal type (3
...

ν
Since 1 − α = μκ ,


1 x
∫ 0 a r (x) dx = ∫ 0 e rx ⎡ 1 − α − μκ ∫ 0 B(y) dy ⎤ dx




1
= ∫ 0 e rx ⎡ 1 − α − μκ ⎛ ν − ∫ x B(y) dy ⎞ ⎤ dx

⎠⎦


1 ∞
= μκ ∫ 0 e rx ⎛ ∫ x B(y) dy ⎞ dx
...
33) and
making use of (3
...




Hence,



∫0

a r (x) dx = α
...


(3
...
13 (the constant μ which occurs in theorem 3
...

x→∞
x→∞
Hence, for large values of the initial capital x,

    p(x) ≈ (α/(m r)) e^(−r x),                                 (3.205)

where r and m are given by (3.202) and (3.204). This approximation frequently yields excellent results even for small values of x. Approximation (3.205) is called the Cramér-Lundberg approximation to the ruin probability.


(3
...
A proof will be given in section 6
...
Both H
...
Lundberg did their pioneering research in collective risk analysis in the first third of the 20th century
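For exponential claim sizes the Lundberg exponent can also be computed numerically from its defining condition (written here in the equivalent form λ(E e^{rM} − 1) = κr), which gives a check against the closed form r = 1/ν − λ/κ. A sketch (Python; all parameter values are assumptions):

```python
def lundberg_exponent(lam, nu, kappa, iters=200):
    # solves lam * (E[e^{rM}] - 1) = kappa * r for exponential claims,
    # where E[e^{rM}] = 1/(1 - nu*r) for r < 1/nu; bisection on (0, 1/nu)
    def h(r):
        return lam * (1.0 / (1.0 - nu * r) - 1.0) - kappa * r
    lo, hi = 1e-12, 1.0 / nu - 1e-12
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if h(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# assumed values: claim rate lam = 1/mu, mean claim size nu, premium rate kappa;
# positive safety loading requires kappa > lam*nu (here 500 > 450)
lam, nu, kappa = 0.5, 900.0, 500.0
r = lundberg_exponent(lam, nu, kappa)
assert abs(r - (1.0 / nu - lam / kappa)) < 1e-9
```

For non-exponential claims only the function h changes; the bisection itself carries over unchanged as long as the moment generating function of M is finite on the search interval.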
...
22 It is interesting to evaluate the Cramér-Lundberg approximation to the ruin probability if the claim size M has an exponential distribution, since in this case the exact value of the ruin probability is known
...


According to (3
...


∫0
Hence,

r = α /ν
...
204), the parameter m is obtained as follows:
1
∞ 1
1 ∞
1
m = μκ ∫ 0 y e r y e −(1/v) y dy = μκ ⋅ ν ∫ 0 y ( ν − r) e −( ν −r) y dy
1 − νr

2
1
= μκ ⋅ ⎛ ν ⎞
...


By comparing these results with (3
...


3
...
3 Renewal Claim Arrival Process
Much effort has been put into determining the ruin probability under more general
assumptions than the 'classical' assumptions 1 to 4 stated in section 3
...
2
...
Thus, the interarrival times need no longer be exponentially distributed
...
[67]
...
} of the claim interarrival
times Y i be an ordinary renewal process
...


The Z 1 , Z 2 ,
...
Hence, the discrete-time stochastic process {S 1 , S 2 ,
...
207)

is a random walk with independent, identically distributed increments Z i
...

n=1,2,
...
Thus, determining the ruin probability is closely related to the first passage time behaviour of
random walks
...



As in section 3
...
2, to make sure that p(x) < 1, a positive safety loading σ = κ μ − ν is
required
...
168), the stochastic process {S 1 , S 2 ,
...


Let z(s) be the Laplace transform of Z = κ Y − M :
z(s) = Ee −s Z )

Since M and Y are independent,
z(s) = E(e −s κY ) E(e s M )
...


The Lundberg exponent r is now the positive solution of
(3
...

As under the assumption of a homogeneous Poisson claim arrival process, an explicit
formula for the ruin probability exists if M has an exponential distribution:
p(x) = (1 − rν) e −r x ,

x ≥ 0
...
209)

Given r as solution of (3
...
206):
p(x) ≤ e −r x
...


However, the value of the constant c cannot be given here (see the references given
above)
...
} of the claim interarrival
times Y i be a stationary renewal process
...
17, if the Y 2 , Y 3 ,
...
155)
...
17 (formula (3
...

⎝ Σ i=1 i ⎠

In what follows, the ruin probability referring to a stationary renewal claim arrival
process is denoted as p s (x), whereas p(x) refers to the ordinary renewal claim arrival process
...





(3
...

μ
μ
These probabilities do not depend on the type of the distributions of Y and M, but
only on their mean values (insensitivity)
...
194)
...
209) in (3
...


3.4 Normal Approximations for Risk Processes

Let the process of the claim interarrival times {Y_1, Y_2, ...} be a renewal process. Otherwise, assumptions 2 to 4 of section 3.2 will be retained. By theorem 3.19, if t is sufficiently large compared to μ, the total claim size in [0, t] has approximately a normal distribution with mean value (ν/μ)t and variance μ^(−3)γ²t:

C(t) ≈ N((ν/μ)t, μ^(−3)γ²t),   (3.211)

where γ² = ν²Var(Y) + μ²Var(M). The random profit the insurance company has made in [0, t] is given by

G(t) = R(t) − x = κt − C(t).

By (3.211), G(t) has approximately a normal distribution with parameters

E(G(t)) = (κ − ν/μ)t  and  Var(G(t)) = μ^(−3)γ²t.
Note that examples 3.23 and 3.24 refer to the situation that, when being 'in red numbers' (ruin has happened), the company continues operating until it reaches a profitable time period, and so on.
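Numerically, the approximation (3.211) is easy to evaluate. The sketch below uses the data of example 3.23 together with γ² = ν²Var(Y) + μ²Var(M); the premium rate (500 $ per hour), horizon (one year, 8760 h) and profit target ($100,000) are merely assumed figures for illustration:

```python
from math import sqrt, erf

def phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

mu, var_y = 2.0, 3.0             # E(Y) [h], Var(Y) [h^2]
nu, var_m = 900.0, 360_000.0     # E(M) [$], Var(M) [$^2]

gamma = sqrt(nu**2 * var_y + mu**2 * var_m)
print(round(gamma, 1))           # 1967.2, as in example 3.23

def prob_profit_at_least(kappa, t, g):
    """P(G(t) >= g) with G(t) = kappa*t - C(t) and
    C(t) approximately N((nu/mu) t, gamma^2 t / mu^3) by (3.211)."""
    mean = (kappa - nu / mu) * t
    std = sqrt(gamma**2 * t / mu**3)
    return 1.0 - phi((g - mean) / std)

print(prob_profit_at_least(500.0, 8760.0, 100_000.0))
```

With a premium rate of 455 $ per hour (safety loading σ = 10 $) the same function gives the probability of ending the year without a loss.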

Example 3.23 Let the claim interarrival times and the claim sizes be given by sequences {Y_1, Y_2, ...} and {M_1, M_2, ...} with

μ = E(Y) = 2 [h],   Var(Y) = 3 [h²],
ν = E(M) = 900 [$],   Var(M) = 360 000 [$²].

What minimum premium rate κ = κ_0.95 does the company have to take in order to reach its profit goal with probability 0.95?

Since γ = 1967.2, the desired probability is

Φ( ((κ_0.95 − 100) − 450) / … ).

Since the 0.95-quantile of the standard normal distribution is 1.64, κ_0.95 satisfies the equation

… = 1.64,

so that

κ_0.95 = …

Of course, this result does not take into account the fact that the premium size has an influence on the claim flow. With a premium rate of κ = 455 [$ per hour], the company has a positive safety loading of σ = κμ − ν = 10 [$]. The corresponding probability is

Φ( … ) = 0.…
The following example uses the approximate distribution of the first passage time L(a) of the compound claim size process {C(t), t ≥ 0} with respect to level a as given by theorem 3.…

Example 3.24 Let the parameters of the claim interarrival times and claim sizes be

μ = E(Y) = 5 [h],   Var(Y) = 25 [h²],
ν = E(M) = 1000 [$],   Var(M) = 640 000 [$²],

so that γ = 6403.1. … The desired probability is

Φ( … ) = Φ(2.…) = 0.993.
3.5 EXERCISES

Sections 3.1 and 3.2

3.1) …
(1) What is the probability p_≥2 that at least two catastrophic accidents will occur in the second half of the current year?
(2) Determine the same probability given that two catastrophic accidents have occurred in the first half of the current year.
3.2) By making use of the independence and homogeneity of the increments of a homogeneous Poisson process {N(t), t ≥ 0} with intensity λ, show that its covariance function is given by

C(s, t) = λ min(s, t).

3.3) The number of cars which pass a certain intersection daily between 12:00 and 14:00 follows a homogeneous Poisson process with intensity λ = 40 per hour. Among these there are 0.8% which disregard the STOP-sign. …

3.4) A Geiger counter is struck by radioactive particles according to a homogeneous Poisson process with intensity λ = 1 per 12 seconds. …
(1) What is the probability p_≥2 that the Geiger counter records at least 2 particles a minute?
(2) What are mean value [min] and variance [min²] of the random time Y between the occurrence of two successively recorded particles?
3.5) … with intensities λ_1 = 0.002 and λ_2 = 0.… A shock of type 1 always causes a system failure, whereas a shock of type 2 causes a system failure with probability 0.…
What is the probability of the event A that the system fails within 24 hours due to a shock?

3.6) … Determine the mean value of the random number of events of process 2 (type 2-events) which occur between any two successive events of process 1 (type 1-events).

3.7) Let {N(t), t ≥ 0} be a homogeneous Poisson process with intensity λ. …

3.8) Let … and {T_1, T_2, ...} be the associated point process. …
3.9) Let … {T_1, T_2, ...} be the associated random point process. …
Note that {X(t), t ≥ 0} is the same process as the one analyzed in example 3.…

3.10) … Let {T_1, T_2, ...} be … The car arriving at time T_i can immediately be resold by the dealer at price C_i, where the C_1, C_2, ... are … However, if a buyer acquires the car, which arrived at T_i, at time T_i + τ, then he only has to pay an amount of e^(−ατ) C_i with α > 0. … What will be the mean total price E(K) the car dealer achieves?

3.11) … between 12:00 a.m. and … a.m. …
(1) How many cars arrive on average between 12:00 a.m. and 4:00 a.m.?
(2) What is the probability that at least 40 cars arrive between 2:00 and 4:00 a.m.?
3.12)* Let {N(t), t ≥ 0} be a nonhomogeneous Poisson process with intensity function λ(t), trend function Λ(t) = ∫_0^t λ(x) dx and arrival time point T_i of the i-th Poisson event. Show that, on condition N(t) = n, the random vector (T_1, ..., T_n) has the same probability distribution as n ordered, independent, and identically distributed random variables with distribution function

F_t(x) = Λ(x)/Λ(t) for 0 ≤ x < t;   F_t(x) = 1 for x ≥ t.

3.13) Determine the optimal renewal interval τ* and the corresponding maintenance cost rate K(τ) for policy 1 (section 3.…) …

3.14) …
(1) Determine the state probabilities of this process at time t.
(2) …
(3) For what values of α and β are trend and variance function of a Pólya arrival process identical to the ones obtained under (2)?

3.15) … (3.55). …

3.16) … Under the same assumptions as in section 3.…, derive formulas (3.96) and (3.105). …

3.17) A system is maintained according to policy 7 with a constant repair cost limit c. … The cost of a minimal repair is assumed (quite naturally) to depend on c as follows: c_m = c_m(c) = E(C | C ≤ c). … (3.85) for any distribution function F(t) and for any distribution function R(x) = P(C ≤ x) with density r(x) and property R(c_r) = 1. … with F(t) given by (3.91) and R(x) given by (3.…).
Sections 3.3 and 3.4

In exercises 3.18 to 3.…, the functions f(t) and F(t) denote density and distribution function; the parameters μ and μ_2 are mean value and second moment of the cycle length Y.

3.18) … Its lifetime has approximately a normal distribution with mean value μ = 120 and standard deviation σ = 24 [hours]. … How many spare systems must be available in order to be able to maintain the replacement process over an interval of length 10,000 hours
(1) with probability 0.…,
(2) with probability 0.99?

3.19) …
(2) For λ = 1, sketch the exact graph of the renewal function and the bounds (3.…). (Make sure that the bounds (3.…) …)

3.20) … By means of the Laplace transformation, determine the associated renewal function.
3.21)* (1) Verify that the probability

p(t) = P(N(t) is odd)

satisfies the integral equation

p(t) = F(t) − ∫_0^t p(t − x) f(x) dx,   f(x) = F′(x).

…

3.22) … Determine the probability P(N(10) ≥ 2).

3.23)* Verify that H_2(t) = E(N²(t)) satisfies the integral equation

H_2(t) = 2H(t) − F(t) + ∫_0^t H_2(t − x) f(x) dx.

3.24) Given the existence of the first 3 moments of the cycle length Y, prove equations (3.…).

3.25) Let the cycle length Y be …
(1) Show that the corresponding renewal function H(n); n = 0, 1, ... satisfies

H(n) = q_n + p_n H(0) + p_(n−1) H(1) + ⋯ + p_0 H(n)

with q_n = P(Y ≤ n) = p_0 + p_1 + ⋯ + p_n.
(2) Consider the special cycle length distribution

P(Y = 0) = p,  P(Y = 1) = 1 − p

and determine the corresponding renewal function. (…)

3.26) …
(1) What is the statement of theorem 3.…?
(2) … theorem 3.14 (formula (3.…))?
3.27) The time intervals between the arrivals of successive particles at a counter generate an ordinary renewal process. … Particles arriving during a blocked period are not registered. …

3.28) Let A(t) be the forward and B(t) the backward recurrence time of an ordinary renewal process at time t. …

3.29)* … (3.145) by means of theorem 3.…
Hint Let Z(t) = H(t) − t/μ. …

3.30) Let (Y, Z) be the typical cycle of an alternating renewal process, where Y and Z have an Erlang distribution with joint parameter λ and parameters n = 2 and n = 1, respectively. …
Hint Process states as introduced in section 3.… …

3.31) The time intervals between successive repairs of a system generate an ordinary renewal process {Y_1, Y_2, ...}. The costs of repairs are mutually independent, independent of {Y_1, Y_2, ...}, … The random variables Y and M have parameters

μ = E(Y) = 180 [days],   √Var(Y) = 30,
ν = E(M) = $200,   √Var(M) = 40.

…

3.32) … example 3.21. …
3.33) … Contrary to example 3.…, … Further, let F(t) be the distribution function of the system lifetime T and λ(t) be the corresponding failure rate. … (Note 'Total maintenance cost' includes replacement and repair costs.) …
(3) Determine τ* if T has a uniform distribution over the interval [0, z].

3.34) A system is preventively renewed at fixed time points τ, 2τ, ... (This replacement policy is called block replacement.) …
(2) On condition that the system lifetime has distribution function

F(t) = (1 − e^(−λt))²,  t ≥ 0,

give a necessary condition for a renewal interval τ = τ* which is optimal with respect to K(τ). …

3.35) Under the model assumptions of example 3.…, … (3.206),
(3) under otherwise the same conditions, draw the respective graphs of the ruin probability p(x) for x = 20,000 and x = 0 (no initial capital) in dependence on κ over the interval 1600 ≤ κ ≤ 1800.

3.36) Under the same assumptions as in exercise 3.35 (1),
(1) determine the ruin probability if claims arrive according to an ordinary renewal process the typical cycle length of which has an Erlang distribution with parameters n = 2 and λ = 4,
(2) determine the ruin probability if claims arrive according to the corresponding stationary renewal process.

3.37) Under otherwise the same assumptions as made in example 3.…, …

3.38) Claims arrive at an insurance company according to an ordinary renewal process. The corresponding claim sizes M_1, M_2, ... are … Let the Y_i be distributed as Y; i.e. Y is the typical interarrival interval. … From historical observations it is known that

μ = E(Y) = 2 [h],  Var(Y) = 3,  ν = E(M) = $900,  Var(M) = 360,000.

(1) … with probability 0.95?
(2) What is the probability that the total claim amount hits level $4 ⋅ 10^6 in the interval [0, 7,000 hours]?
(Before possibly reaching its goals the insurance company may have experienced one or more ruins with subsequent 'red number periods'.)
4 DISCRETE-TIME MARKOV CHAINS

4.1 FOUNDATIONS AND EXAMPLES

This chapter is devoted to discrete-time stochastic processes {X_0, X_1, ...} … That is, on condition X_n = x_n, the random variable X_(n+1) is independent of all X_0, X_1, ..., X_(n−1). However, without this condition, X_(n+1) may very well depend on all the other X_i, i ≤ n.

Definition 4.1 Let {X_0, X_1, ...} be … Then {X_0, X_1, ...} is … if for all vectors x_0, x_1, ..., x_(n+1) with x_k ∈ Z and for all n = 1, 2, ...,

P(X_(n+1) = x_(n+1) | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) = P(X_(n+1) = x_(n+1) | X_n = x_n).   (4.1)

Condition (4.1) is called the Markov property. … times n − 1, ..., 1, 0 are in the past. …

Note that for the special class of stochastic processes considered in this chapter, definition 4.1 … (2.19) in chapter 2 … (4.1). … For instance, the final profit of a gambler usually depends on his present profit, but not on the way he has obtained it. … A car driver checks the tread depth of his tires after every 5000 km. … On the other hand, for predicting the future concentration of noxious substances in the air, it has proved necessary to take into account not only the present value of the concentration, but also the past development leading to this value.
...
… Hence, states will be denoted as i, j, k, … The conditional probabilities

p_ij(n) = P(X_(n+1) = j | X_n = i)

are the one-step transition probabilities of the Markov chain. … Thus, a Markov chain is homogeneous if and only if its one-step transition probabilities do not depend on n:

p_ij(n) = p_ij for all n = 0, 1, ...

For the sake of brevity, the attribute homogeneous is generally omitted.
All one-step transition probabilities can be comprised in the matrix of transition probabilities (transition matrix) P:

P = ((p_ij)) =
⎛ p_00  p_01  p_02  ⋯ ⎞
⎜ p_10  p_11  p_12  ⋯ ⎟
⎜  ⋮     ⋮     ⋮       ⎟
⎜ p_i0  p_i1  p_i2  ⋯ ⎟
⎝  ⋮     ⋮     ⋮       ⎠

With probability p_ii the Markov chain remains in state i for another time unit. The transition probabilities satisfy

p_ij ≥ 0,   Σ_(j∈Z) p_ij = 1;  i, j ∈ Z.   (4.2)

The m-step transition probabilities of the Markov chain are

p_ij^(m) = P(X_m = j | X_0 = i).   (4.3)
However, in between the Markov chain may already have arrived at state j; p_ij^(m) is simply the probability of being in state j after m steps. In particular, p_ij^(1) = p_ij. It is convenient to introduce the notation

p_ij^(0) = δ_ij = 1 if i = j,  0 otherwise.   (4.4)

δ_ij defined in this way is called the Kronecker symbol.
Theorem 4.1 (Chapman-Kolmogorov equations) For all i, j ∈ Z and r = 0, 1, ..., m,

p_ij^(m) = Σ_(k∈Z) p_ik^(r) p_kj^(m−r).   (4.5)
4 DISCRETE-TIME MARKOV CHAINS
The proof is easy: Conditioning with regard to the state which the Markov chain assumes after r time units, 0 ≤ r ≤ m, and making use of the Markov property yields

p_ij^(m) = P(X_m = j | X_0 = i)
  = Σ_(k∈Z) P(X_m = j, X_r = k | X_0 = i)
  = Σ_(k∈Z) P(X_m = j | X_r = k, X_0 = i) P(X_r = k | X_0 = i)
  = Σ_(k∈Z) P(X_m = j | X_r = k) P(X_r = k | X_0 = i)
  = Σ_(k∈Z) p_ik^(r) p_kj^(m−r).

This proves (4.5).

Let P^(m) = ((p_ij^(m))) denote the matrix of the m-step transition probabilities. Then the Chapman-Kolmogorov equations can be written in the elegant form

P^(m) = P^(r) P^(m−r);   r = 0, 1, ..., m.

This relationship implies that

P^(m) = P^m.
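The matrix form of the Chapman-Kolmogorov equations is easy to check numerically. The following sketch, with an arbitrary hypothetical two-state transition matrix, verifies P^(5) = P^(2) P^(3) in exact rational arithmetic and confirms that every power of P is again a stochastic matrix:

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Product of two square matrices with Fraction entries."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(P, m):
    """m-step transition matrix P^(m) = P^m (identity for m = 0)."""
    n = len(P)
    R = [[F(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(m):
        R = mat_mul(R, P)
    return R

# A hypothetical two-state transition matrix (rows sum to 1).
P = [[F(3, 4), F(1, 4)],
     [F(1, 2), F(1, 2)]]

lhs = mat_pow(P, 5)                            # P^(5)
rhs = mat_mul(mat_pow(P, 2), mat_pow(P, 3))    # P^(2) P^(3)
print(lhs == rhs)                              # True: Chapman-Kolmogorov
print(all(sum(row) == 1 for row in mat_pow(P, 7)))   # True: rows still sum to 1
```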

A probability distribution p^(0) of X_0 is said to be an initial distribution of the Markov chain:

p^(0) = { p_i^(0) = P(X_0 = i);  i ∈ Z }.   (4.6)

A Markov chain is completely characterized by its transition matrix P and an initial distribution p^(0).
To see this, consider for any states i_0, i_1, ..., i_n:

P(X_0 = i_0, X_1 = i_1, ..., X_n = i_n)
  = P(X_n = i_n | X_0 = i_0, ..., X_(n−1) = i_(n−1)) ⋅ P(X_0 = i_0, X_1 = i_1, ..., X_(n−1) = i_(n−1))
  = p_(i_(n−1) i_n) ⋅ P(X_0 = i_0, X_1 = i_1, ..., X_(n−1) = i_(n−1)).

The second factor in the last line is now treated in the same way. Continuing in this manner yields

P(X_0 = i_0, X_1 = i_1, ..., X_n = i_n) = p_(i_0)^(0) ⋅ p_(i_0 i_1) ⋅ p_(i_1 i_2) ⋯ p_(i_(n−1) i_n).   (4.7)
The absolute or one-dimensional state probabilities of the Markov chain after m steps are

p_j^(m) = P(X_m = j),  j ∈ Z.

Given an initial distribution p^(0) = { p_i^(0); i ∈ Z },

p_j^(m) = Σ_(i∈Z) p_i^(0) p_ij^(m).   (4.8)
Definition 4.2 An initial distribution { π_i = P(X_0 = i); i ∈ Z } is called stationary if it satisfies the system of linear equations

π_j = Σ_(i∈Z) π_i p_ij;  j ∈ Z.   (4.9)

Furthermore, it can be shown by induction that in this case even the absolute state probabilities after any number of steps are the same as in the beginning:

p_j^(m) = Σ_(i∈Z) π_i p_ij^(m) = π_j,  m = 1, 2, ...   (4.10)

Thus, state probabilities π_i satisfying (4.9) do not change with increasing time. They are also called equilibrium state probabilities of the Markov chain. … (4.7) of the n-dimensional state probabilities verifies theorem 2.…

Markov chains in discrete time virtually occur in all fields of science, engineering, operations research, economics, risk analysis and finance. … More examples will be given in the text.
Example 4.1 (random walk) A particle moves along the real axis in one step from an integer-valued coordinate i either to i + 1 or to i − 1 with equal probabilities. … If X_0 is the starting position of the particle and X_n the position of the particle after n steps, then {X_0, X_1, ...} is a Markov chain with state space Z = {0, ±1, ±2, ...} and one-step transition probabilities

p_ij = 1/2 for j = i + 1 or j = i − 1;   p_ij = 0 otherwise.
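The distribution of X_n in this random walk can be computed exactly by repeated convolution with the one-step law; a minimal sketch:

```python
from fractions import Fraction as F

def walk_distribution(n):
    """Exact distribution of the position X_n of the symmetric random walk of
    example 4.1 (start at 0), obtained by convolving the one-step law n times."""
    dist = {0: F(1)}
    for _ in range(n):
        nxt = {}
        for pos, pr in dist.items():
            for step in (-1, 1):          # p_{i,i+1} = p_{i,i-1} = 1/2
                nxt[pos + step] = nxt.get(pos + step, F(0)) + pr / 2
        dist = nxt
    return dist

d4 = walk_distribution(4)
print(d4[0])              # 3/8, i.e. C(4,2)/2^4
print(sum(d4.values()))   # 1
```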
Example 4.2 (random walk with absorbing barriers) Example 4.1 is modified as follows: There are absorbing barriers at x = 0 and x = 6, i.e. if the particle arrives at state 0 or at state 6, it cannot leave these states anymore. The state space of {X_0, X_1, ...} is Z = {0, 1, ..., 6}, and the one-step transition probabilities are

p_ij = 1/2 for j = i + 1 or j = i − 1;  i = 1, 2, ..., 5;
p_ij = 1 for i = j = 0 or i = j = 6;
p_ij = 0 otherwise.
The matrices of the one- and two-step transition probabilities are

P = ⎛  1    0    0    0    0    0    0  ⎞
    ⎜ 1/2   0   1/2   0    0    0    0  ⎟
    ⎜  0   1/2   0   1/2   0    0    0  ⎟
    ⎜  0    0   1/2   0   1/2   0    0  ⎟
    ⎜  0    0    0   1/2   0   1/2   0  ⎟
    ⎜  0    0    0    0   1/2   0   1/2 ⎟
    ⎝  0    0    0    0    0    0    1  ⎠ ,

P^(2) = ⎛  1    0    0    0    0    0    0  ⎞
        ⎜ 1/2  1/4   0   1/4   0    0    0  ⎟
        ⎜ 1/4   0   1/2   0   1/4   0    0  ⎟
        ⎜  0   1/4   0   1/2   0   1/4   0  ⎟
        ⎜  0    0   1/4   0   1/2   0   1/4 ⎟
        ⎜  0    0    0   1/4   0   1/4  1/2 ⎟
        ⎝  0    0    0    0    0    0    1  ⎠ .

Let the initial distribution be uniform on the interior states:

p_i^(0) = P(X_0 = i) = 1/5;  i = 1, 2, ..., 5.

Then, by (4.8), the absolute distribution of the position of the particle after 2 steps is

p^(2) = (1/20) (3, 2, 3, 4, 3, 2, 3).
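Both P^(2) and the absolute distribution p^(2) can be checked by direct matrix computation; the following sketch reproduces them with exact rational arithmetic:

```python
from fractions import Fraction as F

H = F(1, 2)
# One-step transition matrix of example 4.2 (absorbing barriers at 0 and 6).
P = [
    [1, 0, 0, 0, 0, 0, 0],
    [H, 0, H, 0, 0, 0, 0],
    [0, H, 0, H, 0, 0, 0],
    [0, 0, H, 0, H, 0, 0],
    [0, 0, 0, H, 0, H, 0],
    [0, 0, 0, 0, H, 0, H],
    [0, 0, 0, 0, 0, 0, 1],
]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P2 = mat_mul(P, P)               # two-step transition matrix P^(2)
print([str(x) for x in P2[1]])   # ['1/2', '1/4', '0', '1/4', '0', '0', '0']

# Uniform initial distribution on the interior states 1,...,5; formula (4.8):
p0 = [0, F(1, 5), F(1, 5), F(1, 5), F(1, 5), F(1, 5), 0]
p2 = [sum(p0[i] * P2[i][j] for i in range(7)) for j in range(7)]
print([str(x) for x in p2])
# ['3/20', '1/10', '3/20', '1/5', '3/20', '1/10', '3/20']
```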
Example 4.3 (random walk with reflecting barriers) For a given positive integer z, the state space of a Markov chain is Z = {0, 1, ..., 2z}. A particle moves from position i to position j in one step with probability

p_ij = (2z − i)/(2z) for j = i + 1;
p_ij = i/(2z) for j = i − 1;
p_ij = 0 otherwise.   (4.11)

Thus, the greater the distance of the particle from the central point z of Z, the greater the probability that the particle moves in the next step into the direction of the central point. … (Hence the terminology reflecting barriers.) … In this sense, the particle is at x = z in an equilibrium state. … The central point z attracts the particle: its attraction to a particle increases with the particle's distance from this point.
Example 4.4 (Ehrenfest's diffusion model) P. and T. Ehrenfest came across a random walk with reflecting barriers as early as 1907 whilst investigating the following diffusion model: In a closed container there are exactly 2z molecules of a particular type. … Let X_n be the random number of the molecules in one part of the container after n transitions of any molecule from one part of the container to the other. … Then {X_0, X_1, ...} behaves approximately as a Markov chain with transition probabilities (4.11). Thus, the more molecules are in one part of the container, the more they want to move into the other part, i.e. … The system of linear equations (4.9) becomes

π_j = ((2z − j + 1)/(2z)) π_(j−1) + ((j + 1)/(2z)) π_(j+1);  j = 1, ..., 2z − 1;
π_0 = (1/(2z)) π_1,   π_2z = (1/(2z)) π_(2z−1);

its solution is the binomial distribution

π_i = (2z over i) (1/2)^(2z);  i = 0, 1, ..., 2z.
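For the Ehrenfest chain it is easy to verify numerically that the binomial distribution B(2z, 1/2) solves the stationarity equations (4.9) for the transition probabilities (4.11); a small exact check with z = 3:

```python
from fractions import Fraction as F
from math import comb

z = 3                        # 2z = 6 molecules
n = 2 * z

def p(i, j):
    """Transition probabilities (4.11) of the Ehrenfest chain."""
    if j == i + 1:
        return F(n - i, n)
    if j == i - 1:
        return F(i, n)
    return F(0)

# Candidate stationary distribution: binomial B(2z, 1/2).
pi = [F(comb(n, i), 2**n) for i in range(n + 1)]

# Stationarity equations (4.9): pi_j = sum_i pi_i p(i, j), checked exactly.
ok = all(pi[j] == sum(pi[i] * p(i, j) for i in range(n + 1))
         for j in range(n + 1))
print(ok)                    # True
```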

Example 4.5 … The one-step transition from trajectory i to trajectory j occurs with probability

p_ij = a_i e^(−b|i−j|),  b > 0.

The a_i cannot be chosen arbitrarily. … By (4.2), they must satisfy the condition

a_i ( e^(−b(i−1)) + e^(−b(i−2)) + ⋯ + e^(−b) + 1 + e^(−b) + e^(−2b) + ⋯ ) = 1,

i.e., summing the two geometric series,

a_i ( e^(−b)(1 − e^(−b(i−1)))/(1 − e^(−b)) + 1/(1 − e^(−b)) ) = 1.

Therefore,

a_i = (e^b − 1)/(1 + e^b − e^(−b(i−1)));  i = 1, 2, ...
Example 4.6 Let Y_1, Y_2, ... be … Then,

X_n = Σ_(i=1)^n Y_i.

… Then {X_1, X_2, ...} is a Markov chain with state space {0, 1, ...} and transition probabilities

p_ij = q_k if j = i + k; k = 0, 1, ..., where q_k = P(Y_1 = k);   p_ij = 0 otherwise.
Example 4.7 Let {Y_0, Y_1, ...} be a sequence of independent, identically distributed binary random variables with

P(Y_i = 1) = P(Y_i = −1) = 1/2,

and let X_n be the moving average

X_n = (Y_(n−1) + Y_n)/2;  n = 1, 2, ...

X_n has range {−1, 0, +1} and probability distribution

P(X_n = −1) = 1/4,  P(X_n = 0) = 1/2,  P(X_n = +1) = 1/4.

The matrix of the one-step transition probabilities p_ij = P(X_(n+1) = j | X_n = i) is

P^(1) = P = ⎛ 1/2 1/2  0  ⎞
            ⎜ 1/4 1/2 1/4 ⎟
            ⎝  0  1/2 1/2 ⎠ .

However, given the present, the future is not independent of the past: for instance, P(X_(n+1) = 1 | X_n = 0, X_(n−1) = 1) = 0, whereas P(X_(n+1) = 1 | X_n = 0) = 1/4. Therefore, the sequence of moving averages {X_1, X_2, ...} does not have the Markov property.
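That the moving averages fail to be Markovian can be confirmed by exact enumeration of the underlying Y-sequence. The sketch below, assuming X_n = (Y_(n-1) + Y_n)/2 as above, compares P(X_(n+1) = 1 | X_n = 0) with the same probability conditioned additionally on X_(n-1) = 1:

```python
from itertools import product
from fractions import Fraction as F

def cond_prob(condition):
    """P(X_{n+1} = 1 | condition on (X_{n-1}, X_n)), by exact enumeration of
    the equally likely tuples (Y_{n-2}, Y_{n-1}, Y_n, Y_{n+1}) in {-1,+1}^4."""
    hits = total = 0
    for ys in product((-1, 1), repeat=4):
        x_prev = F(ys[0] + ys[1], 2)     # X_{n-1}
        x_now = F(ys[1] + ys[2], 2)      # X_n
        x_next = F(ys[2] + ys[3], 2)     # X_{n+1}
        if condition(x_prev, x_now):
            total += 1
            hits += (x_next == 1)
    return F(hits, total)

print(cond_prob(lambda xp, xn: xn == 0))               # 1/4
print(cond_prob(lambda xp, xn: xn == 0 and xp == 1))   # 0
```

The extra piece of past information changes the conditional probability from 1/4 to 0, so the Markov property (4.1) is violated.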
4.2 CLASSIFICATION OF STATES

4.2.1 Closed Sets of States

A nonempty set C of states is called closed if

p_ij = 0 for all i ∈ C, j ∉ C.   (4.12)

If a Markov chain is in a closed set of states, then it cannot leave this set, since (4.12) implies p_ij^(m) = 0 for all m ≥ 1, i ∈ C, j ∉ C. Furthermore, (4.12) is equivalent to

Σ_(j∈C) p_ij = 1 for all i ∈ C.   (4.13)

That p_ij^(m) = 0 for all m can be proved as follows: From (4.5), for i ∈ C and j ∉ C,

p_ij^(m) = Σ_(k∈Z) p_ik^(m−1) p_kj = Σ_(k∈C) p_ik^(m−1) p_kj = 0

by induction on m.
A closed set of states is called minimal if it does not contain a proper closed subset. A Markov chain is called irreducible if its state space Z contains no proper closed subset. Otherwise the Markov chain is reducible. A state i is called absorbing if p_ii = 1, i.e. {i} is a minimal closed set. Thus, if a Markov chain has arrived in an absorbing state, it cannot leave this state anymore.
...
Absorbing barriers of a random walk (example 4.2) are absorbing states.

Example 4.8 Let a Markov chain have the state space Z = {0, 1, ..., 4} and the transition matrix

P = ⎛ 0.2   0    …   0.3   0 ⎞
    ⎜  0    …   0.9   0    0 ⎟
    ⎜  0    1    0    0    0 ⎟
    ⎜ 0.4   0    …   0.2   … ⎟
    ⎝  …                     ⎠ .

Figure 4.1 Transition graph of the Markov chain of example 4.8

It is helpful to illustrate the possible transitions between the states of a Markov chain by transition graphs. … A directed edge from node i to node j exists if and only if p_ij > 0, that is, if a one-step transition from state i to state j is possible. Figure 4.1 … (4.12) is not fulfilled for i = 4. … This Markov chain is, therefore, reducible.

4.2.2 Essential and Inessential States
State j is said to be accessible from state i, written i ⇒ j, if there exists an m ≥ 1 with p_ij^(m) > 0. The relation '⇒' is transitive: If i ⇒ k and k ⇒ j, there exist m > 0 and n > 0 with p_ik^(m) > 0 and p_kj^(n) > 0. Hence, by (4.5),

p_ij^(m+n) = Σ_(r∈Z) p_ir^(m) p_rj^(n) ≥ p_ik^(m) p_kj^(n) > 0.

Consequently, i ⇒ k and k ⇒ j imply i ⇒ j, that is, the transitivity of '⇒'. Let M(i) be the set of all states which are accessible from i; M(i) is closed. In order to prove this assertion, it is to be shown that k ∈ M(i) and j ∉ M(i) imply that j is not accessible from k. If it were, then i ⇒ k and k ⇒ j would yield i ⇒ j, i.e. j ∈ M(i). But this contradicts the definition of M(i).
Communication '⇔ ' is an equivalence relation since it satisfies the three
characteristic properties:

© 2006 by Taylor & Francis Group, LLC

216

STOCHASTIC PROCESSES

(1) i ⇔ i
...

(3) If i ⇔ j and j ⇔ k , then i ⇔ k
...
To
verify property (3), note that i ⇔ j and j ⇔ k imply the existence of m and n so that
(n)
(m)
p i j > 0 and p j k > 0 , respectively
...
5),
(m+n)

=

(m) (n)

Σ

(m) (n)

p i r p r k ≥ p ij p j k > 0
...

The equivalence relation '⇔ ' partitions state space Z into disjoint, but not necessarily closed classes in the following way: Two states i and j belong to the same class if
and only if they communicate
...
Clearly, any state in a class can be used to characterize this class
...
e
...

A state i is called essential if any state j which is accessible from i has the property
that i is also accessible from j
...

A state i is called inessential if it is not essential
...
If i is inessential, then there exists a state j for which i ⇒ j and j ⇒ i
...
In example 4
...

Theorem 4
...
(2) Inessential classes
are not closed
...

/
(2) If i is inessential, then there is a state j with i ⇒ j and j ⇒ i
...

(m)

Assuming C(i) is closed implies that p k j = 0 for all m ≥ 1, k ∈ C(i) and j ∉ C(i)
...
(According to the definition of the relation i ⇒ j ,
(m)

there exists a positive integer m with p i j > 0
...


4 DISCRETE-TIME MARKOV CHAINS

217

Furthermore, let C w and C u be the sets of all essential and inessential states of a
Markov chain
...
This theorem
justifies the notation essential and inessential states
...

Theorem 4
...
Then,
(m)
lim p (C u ) = 0
...
9 If the number of states in a Markov chain is small, the essential and
inessential states can immediately be identified from the transition matrix
...






By changing the order of rows and columns, an equivalent representation of P is


P=⎜





3/5
1/3
0
0

2/5
2/3
0
0

0
0
3/4
1/2

0
0
1/4
1/2



⎟ =





⎛ Q 11 0 ⎞

⎟,
⎝ 0 Q 22 ⎠

where Q 11 and Q 22 are square matrices of order 2 and 0 is a square matrix with all
elements equal to zero
...
Its state space (in new
notation) consists of two essential classes C(0) = {0, 1} and C(2) = {2, 3} with transition matrices Q 11 and Q 22 , respectively
...
10 Let Z = {0, 1,
...
4 0 0
...
1 0
...
2
0
...
2 0
...
2 0
...
1

© 2006 by Taylor & Francis Group, LLC











⎛ Q 11 0
0

= ⎜ 0 Q 22 0

⎝ Q 31 Q 32 Q 33



⎟,



218

STOCHASTIC PROCESSES

where the symbolic representation of the transition matrix, introduced in the previous
example, is used
...

It is evident that, from the class of inessential states, transitions both to essential and
inessential states are possible
...
2, the Markov chain
will sooner or later leave the inessential class for one of the essential classes and
never return
...
2
...

(m)
Then d i is said to be the period of state i
...
A state i is said to be aperiodic if d i = 1
...

Hence, returning to state i is only possible after such a number of steps which is a
multiple of d i
...

Theorem 4
...

(m)

(n)

Proof Let i ⇔ j
...
If
(r)
the inequality p i i > 0 holds for a positive integer r, then, from (4
...


Since
(2 r)

(r)

(r)

≥ pi i ⋅ pi i > 0 ,
this inequality also holds if r is replaced with 2 r :
pi i

(n+2 r+m)

pj j

> 0
...

(r)

Since this holds for all r for which p i i > 0 , d j must divide d i
...
Thus, d i = d j
...
11 Let a Markov chain
matrix
⎛ 1/3 2/3
⎜ 1/3 1/3

⎜ 1 0
P = ⎜ 0 1/3


⎜ 0 0
⎜ 0 0

⎝ 0 0

219

have state space Z = {0, 1,
...








Clearly, {0, 1, 2} is a closed set of essential states
...
Having once arrived in a closed set of states the Markov chain
cannot leave it again
...
When starting in one of
its sets of inessential states, the Markov chain will at some stage leave this set and
never return
...

Theorem 4
...
, Z d with
Z=

d
k=1

Zk

such that from any state i ∈ Z k a transition can only be made to a state j ∈ Z k+1
...

This theorem implies a characteristic structure of the transition matrix of a periodic
Markov chain
...
(Q i and 0 refer to the notation introduced in example 4
...
) According to the definition of a period, if a Markov chain with period d
starts in Z i , it will again be in Z i after d transitions
...




220

STOCHASTIC PROCESSES

This structure of the transition matrix allows the following interpretation: A Markov
chain {X 0 , X 1 ,
...
, Z d if, with respect to transitions within the Markov
chain {X 0 , X 1 ,
...

Example 4
...
, 5} and transition
matrix
⎛ 0 0 2/5 3/5 0 0 ⎞
⎜ 0 0 1 0 0 0 ⎟


⎜ 0 0 0 0 1/2 1/2 ⎟
P=⎜

...
One-step transitions are possible in the order
Z 1 = {0, 1} → Z 2 = {2, 3} → Z 1 = {4, 5} → Z 1
...






4
...
4 Recurrence and Transience
This section deals with the return of a Markov chain to an initial state
...
, m − 1 X 0 = i ⎞ ; i, j ∈ Z


(m)

Thus, f i j is the probability that the Markov chain, starting from state i, makes its
(m)
first transition into state j after m steps
...
For m = 1,
(1)

(1)

fi j = pi j = pi j
...
14)

(0)

where, by convention, p j j = 1 for all j ∈ Z
...

k=1

(4
...
,
is a first-passage time
...

The probability of ever making a transition into state j if the process starts in state i
is
(m)



f i j = Σ m=1 f i j
...
16)

In particular, f i i is the probability of ever returning to state i
...

Clearly, if state i is transient, then μ i i = ∞
...
Therefore, recurrent states are classified as follows:
A recurrent state i is said to be positive recurrent if μ i i < ∞ and null-recurrent
if μ i i = ∞
...

The random time points T i,n ; n = 1, 2,
...
10, section
3
...
8)
...
The time spans between neighbouring regeneration
points T i,n − T i,n−1 ; n = 1, 2,
...
They are independent
and identically distributed as L i i
...
Let
N i (t) = max(n; T i,n ≤ t),
N i (∞) = lim N i (t),
t→∞

© 2006 by Taylor & Francis Group, LLC

H i (t) = E(N i (t)),

H i (∞) = lim H i (t)
...
5 State i is recurrent if and only if
(1) H i (∞) = ∞ , or


(m)

(2) Σ m=1 p i i = ∞
...
The limit N i (∞) is
finite if and only if there is an n with T i,n = ∞
...

Thus, assumption f i i = 1 implies N i (∞) = ∞ and, therefore, H i (∞) = ∞ is true with
probability 1
...
In this case, N i (∞) has a geometric distribution with mean
value (section 1
...
2
...

1 − fi i
Both results together prove part (1) of the theorem
...

⎩ 0 for X m ≠ i
Then,



N i (∞) = Σ m=1 I m,i
...

Now assertion (2) follows from (1)
...
15) from m = 1 to ∞ and changing the order of summation according to formula (1
...
5 implies the following corollary
...

m→∞ ij

© 2006 by Taylor & Francis Group, LLC

(4
...
6 Let i be a recurrent state and i ⇔ j
...

Proof By definition of the equivalence relation '⇔ ', there are integers m and n with
(m)

(n)

p i j > 0 and p j i > 0
...
5),
(n) (r) (m)

p n+r+m ≥ p j i p i i p i j ,
jj
so that

Σ r=1

(m) (n)

p n+r+m ≥ p i j p j i
jj


Σ r=1

(r)

pi i = ∞
...
5
...
Hence, an irreducible
Markov chain is either recurrent or transient
...

An irreducible Markov chain with finite state space is recurrent
...
Therefore, each recurrent state is
essential
...
This assertion is proved by the following example
...
13 (unbounded random walk) Starting from x = 0, a particle jumps a
unit distance along the x-axis to the right with probability p or to the left with probability 1 − p
...
Let X n denote the
location of the particle after the nth jump
...
} with
X 0 = 0 has period d = 2
...


In order to be back in state x = 0 after 2m steps, the particle must jump m times to
the left and m times to the right
...
Hence,
(2m)
p 00 = ⎛ 2m ⎞ p m (1 − p) m ; m = 1, 2,
...


224

STOCHASTIC PROCESSES

Thus, the sum
(m)


Σ m=0 p 00

is finite for all p ≠ 1/2
...
5, state 0 is transient
...
6, the Markov chain is transient, since it is irreducible
...


(4
...
However, for any p with 0 < p < 1, all
states are essential since there is always a positive probability of making a transition
to any state irrespective of the starting position
...
In the 3-dimensional
Euclidian space, the particle jumps one unit to the West, South, East, North, upward,
or downward, respectively, each with probability 1/6
...
Thus, there is a positive probability that somebody who randomly chooses one of the six possibilities in
a 3-dimensional labyrinth, each with probability 1/6, will never return to its starting
position
...
14 A particle jumps from x = i to x = 0 with probability p i or to i + 1
with probability 1 − p i ; 0 < p i < 1, i = 0, 1,
...
Let X n denote the position of the particle after the n th jump
...
} is




P=⎜





p0 1 − p0
0
0
0
1 − p1
0
p1
0
0
1 − p2
p2

...


...


...


...


...


...


...

0
pi

...


...


...


...


...


...


0
0
0

...


...


...



...
0

...
0

...


...


...


...



...


...


...


...







...
} is irreducible and aperiodic
...
It is not difficult to determine f 00 :

© 2006 by Taylor & Francis Group, LLC

4 DISCRETE-TIME MARKOV CHAINS

225

(1)

f 00 = p 0 ,

(m) ⎛ m−2
f 00 = ⎜ Π (1 − p i ) ⎟ p m−1 ;
⎝ i=0


m = 2, 3,
...


Hence,
m+1 (n)
⎛m

Σ f 00 = 1 − ⎜ Π (1 − p i ) ⎟ , m = 1, 2,
...


(4
...
19) is true if and only if

Σ i=0

pi = ∞
...
20)

To prove this proposition, note that
1 − p i ≤ e −p i ; i = 0, 1,
...



Letting m → ∞ proves that (4
...
20)
...
19) is true and
(4
...

By induction,
m

m

Π i=k (1 − p i ) > 1 − p k − p k+1 −
...


Therefore,
lim

m→∞

m

m

lim ⎝
Π i=k (1 − p i ) > m→∞ ⎛ 1 − Σ i=k

pi ⎞ > 0
...
19) is true, and, hence, completes
the proof of the proposition
...
20) is true
...


© 2006 by Taylor & Francis Group, LLC

226

STOCHASTIC PROCESSES

4
...
7 Let state i and j communicate, i
...
i ⇔ j
...

n
n→∞ m=1
jj

(4
...
5 it can be shown that, given the Markov chain is in state i at time t = 0, the sum
n

(m)

Σ m=1 p i j

is equal to the mean number of transitions into state j in the time interval (0, n]
...
11)
...
)
(m)

Theorem 4
...

instance, the case if
(1)

(2)

has no limit
...

However,

n (m) 1
lim 1 Σ p
=
...
21) (indirect proof)
...
7 implies theorem 4
...
8 Let p i j be the m-step transition probabilities of an irreducible, aperiodic Markov chain
...

m→∞ i j
jj

If state j is transient or null-recurrent, then
(m)
lim p
= 0
...

m→∞ i j
jj

© 2006 by Taylor & Francis Group, LLC

4 DISCRETE-TIME MARKOV PROCESSES

227

Theorem 4
...
Then a stationary distribution
does not exist
...
Then there exists a unique stationary
distribution {π j , j ∈ Z} , which for any i ∈ Z is given by
(m)
π j = lim p i j = μ1
...

(1) By (4
...
} satisfies for any m = 1, 2,
...


(4
...
}, which is solution of (4
...

(2) Next the existence of a stationary distribution is shown
...
,
M

Σ j=0

(m)



(m)

p i j < Σ j=0 p i j = 1
...


(4
...


(4
...
24) is a proper inequality, then, by
summing up the inequalities (4
...


© 2006 by Taylor & Francis Group, LLC

228

STOCHASTIC PROCESSES

But this is a contradiction to the fact that, by (4
...
Therefore

π j = Σ k=0 π k p k j ; j = 0, 1,
...
} where
pj =

πj
,

Σ i=0 π i

j ∈ Z
...
8, letting m → ∞ in (4
...
}




p j = Σ i=0 p i π j = π j Σ i=0 p i = π j ,

j ∈ Z
...
} with π j = 1/μ j j is the only stationary distribution
...
15 A particle moves along the real axis
...
When the particle arrives at state 0, it remains there for a further time unit
with probability q or jumps to state 1 with probability p
...
Under which condition has the Markov
chain {X 0 , X 1 ,
...
, the system (4
...

By recursively solving this system of equations,
p i
π i = ⎛ q ⎞ π 0 ; i = 0, 1,
...
In
this case,
q−p p i
π i = q ⎛ q ⎞ ; i = 0, 1,
...
25)
⎝ ⎠
The necessary condition p < 1/2 for the existence of a stationary distribution is intuitive, since otherwise the particle would tend to drift to infinity
...

Theorem 4
...
} be an irreducible, recurrent Markov chain with state
space Z and stationary state probabilities π i , i ∈ Z
...


4 DISCRETE-TIME MARKOV PROCESSES

229

For example, if c i = g(i) is the profit which accrues from the Markov chain by making a transition to state i, then
Σ i∈Z π i c i
is the mean profit resulting from a state change of the Markov chain
...
10 is the analogue to the renewal reward theorem (3
...
In particular, let
1 for i = k
g(i) =

...
By theorem 4
...
This property of the stationary state distribution illustrates once more that it refers to an equilibrium state of the Markov chain
...
10 under weaker assumptions can be found in [81]
...
Example 4.16 A system can be in one of the three states 1, 2, and 3: In state 1 it operates most efficiently; in state 2 its efficiency is reduced. State 3 is the down state; the system is no longer operating and has to be maintained. The state of the system is observed at equidistant time points. Transitions into the same state are allowed. The sequence of observed states {X 0 , X 1 , ...} is assumed to be a Markov chain with transition matrix

          1    2    3
     1 ⎛ 0.8  0.1  0.1 ⎞
P =  2 ⎜ 0.6  0.2  0.2 ⎟
     3 ⎝ 0.8  0    0.2 ⎠

Note that from state 3 the system most likely makes a transition to state 1, but it may also stay in state 3 for one or more time units (for example, if a maintenance action has not been successful). The stationary state probabilities satisfy the system of linear equations

π 1 = 0.8 π 1 + 0.6 π 2 + 0.8 π 3 ,
π 2 = 0.1 π 1 + 0.2 π 2 ,
π 3 = 0.1 π 1 + 0.2 π 2 + 0.2 π 3 .

Only two of these equations are linearly independent. Combined with the normalizing condition π 1 + π 2 + π 3 = 1, the solution is

π 1 = 32/41 ≈ 0.7805,  π 2 = 4/41 ≈ 0.0976,  π 3 = 5/41 ≈ 0.1220.    (4.26)

Let c 1 and c 2 be the profits the system makes per unit time in states 1 and 2. According to theorem 4.10, the mean profit per unit time is π 1 c 1 + π 2 c 2 .

Now, let Y be the random time in which the system is in the profitable states 1 and 2, and let Z be the random time in which the system is in the unprofitable state 3. The random vector (Y, Z) characterizes the typical cycle of an alternating renewal process. According to (3.163), the ratio

E(Y) / [E(Y) + E(Z)]

is equal to the mean percentage of time the system is in states 1 or 2. By theorem 4.10, this percentage must be equal to π 1 + π 2 :

E(Y) / [E(Y) + E(Z)] = π 1 + π 2 .    (4.27)

Since the mean time between transitions into state 3 is equal to E(Y) + E(Z), the ratio 1/[E(Y) + E(Z)] is equal to the rate of transitions to state 3. Hence,

1/[E(Y) + E(Z)] = π 1 p 13 + π 2 p 23 .    (4.28)

From (4.27) and (4.28),

E(Y) = (π 1 + π 2 ) / (π 1 p 13 + π 2 p 23 ).

Substituting the numerical values (4.26) into (4.27) and (4.28) yields E(Y) = 9, E(Y) + E(Z) = 10.25, and E(Z) = 1.25.
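The computation in example 4.16 can be sketched numerically. The following snippet (a minimal illustration, assuming the transition matrix as given above; the power-iteration approach is a generic technique, not the book's method) recovers the stationary distribution and the mean cycle lengths E(Y) and E(Z):

```python
# Stationary distribution of the chain in example 4.16 by power iteration,
# then the mean up time E(Y) and mean down time E(Z) per cycle.

P = [[0.8, 0.1, 0.1],
     [0.6, 0.2, 0.2],
     [0.8, 0.0, 0.2]]

def stationary(P, steps=500):
    """Approximate the stationary distribution by repeated multiplication."""
    pi = [1.0, 0.0, 0.0]                # any initial distribution converges here
    for _ in range(steps):
        pi = [sum(pi[i] * P[i][j] for i in range(3)) for j in range(3)]
    return pi

pi = stationary(P)
# Rate of transitions into the down state 3 (index 2): pi_1*p13 + pi_2*p23
rate_to_3 = pi[0] * P[0][2] + pi[1] * P[1][2]
cycle = 1.0 / rate_to_3                  # E(Y) + E(Z)
EY = (pi[0] + pi[1]) * cycle             # mean time per cycle in states 1, 2
EZ = cycle - EY                          # mean time per cycle in state 3
print(round(pi[0], 4), round(EY, 2), round(EZ, 2))  # 0.7805 9.0 1.25
```

The iteration converges quickly because the chain is irreducible and aperiodic; the results agree with π 1 = 32/41, E(Y) = 9 and E(Z) = 1.25 derived above.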


© 2006 by Taylor & Francis Group, LLC

4.4 BIRTH- AND DEATH PROCESSES

In some of the examples considered so far only direct transitions to 'neighbouring' states were possible. In these cases, the positive one-step transition probabilities have the structure (Figure 4.6)

p i,i+1 = p i ,  p i,i−1 = q i ,  p i,i = r i  with  p i + q i + r i = 1.    (4.29)

A discrete-time Markov chain with state space Z = {0, 1, ..., n}, n ≤ ∞, and transition probabilities (4.29) is called a discrete-time birth- and death process. (The state space implies that q 0 = 0.) The random walk of example 4.9 is a special birth- and death process with

p i = p for i = 0, 1, ... ;  q i = q, r i = r for i = 1, 2, ... ;  q 0 = 0, r 0 = q = 1 − p.

The unbounded random walk considered earlier also has one-step transition probabilities of structure (4.29), but with state space Z = {0, ±1, ±2, ...}.
Figure 4.6 Transition graph of a discrete-time birth- and death process

Example 4.17 (random walk with absorbing barriers) A random walk with absorbing barriers 0 and s can be modeled by a birth- and death process. In addition to (4.29), its transition probabilities satisfy the conditions

r 0 = r s = 1 ;  p i > 0 and q i > 0 for i = 1, 2, ..., s − 1.    (4.30)

Let p(k) denote the probability that the random walk, starting from state k, is absorbed at the barrier 0; k = 1, 2, ..., s − 1. In view of the total probability rule,

p(k) = p k p(k + 1) + q k p(k − 1) + r k p(k),

or, replacing r k with r k = 1 − p k − q k ,

p(k) − p(k + 1) = (q k /p k ) [ p(k − 1) − p(k) ] ;  k = 1, 2, ..., s − 1.

Repeated application of this difference equation yields

p(j) − p(j + 1) = Q j [ p(0) − p(1) ] ;  j = 0, 1, ..., s − 1,    (4.31)

232

STOCHASTIC PROCESSES

where p(0) = 1, p(s) = 0 and

Q 0 = 1 ;  Q j = (q j q j−1 ... q 1 )/(p j p j−1 ... p 1 ) ;  j = 1, 2, ..., s − 1.

Summing the equations (4.31) from j = k to j = s − 1 gives

p(k) = [ p(0) − p(1) ] Σ (j = k to s−1) Q j .

In particular, for k = 0,

1 = [ p(0) − p(1) ] Σ (j = 0 to s−1) Q j .

Hence,

p(k) = ( Σ (j = k to s−1) Q j ) / ( Σ (j = 0 to s−1) Q j ) ;  k = 1, 2, ..., s − 1 ;  p(s) = 0.    (4.32)
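Formula (4.32) is easy to evaluate numerically. The following sketch (an illustration under the simplifying assumption of constant probabilities p i = p, q i = 1 − p, which is one special case, not the general setting) computes the absorption probability:

```python
# Absorption ("ruin") probability p(k) from (4.32) for a random walk on
# {0, 1, ..., s} with absorbing barriers and constant up/down probabilities.

def ruin_probability(k, s, p):
    q = 1.0 - p
    Q = [1.0]                       # Q_0 = 1; Q_j = (q/p)**j in the constant case
    for j in range(1, s):
        Q.append(Q[-1] * q / p)
    return sum(Q[k:]) / sum(Q)

# Fair game (p = 1/2): the ruin probability from stake k is 1 - k/s.
print(ruin_probability(3, 10, 0.5))   # 0.7
```

For p = q the Q j are all 1, reproducing the classical result p(k) = (s − k)/s.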

Besides the interpretation of this birth- and death process as a random walk with absorbing barriers, the following application may be more interesting: Two gamblers begin a game with stakes of $k and $(s − k), respectively; k, s integers. After each round, the first gambler's fortune has increased by $1, decreased by $1, or remained constant. These possibilities are governed by transition probabilities satisfying (4.29) and (4.30). The game ends when the first gambler's fortune reaches $0 or $s, i.e. when one of the gamblers is ruined. Hence this birth- and death process is also called the gambler's ruin problem.
If the state space of the birth- and death process is Z = {0, 1, ...}, conditions (4.29) have to be supplemented by

p i > 0 for i = 0, 1, ... ;  q i > 0 for i = 1, 2, ...    (4.33)

Theorem 4.11 Under the additional assumptions (4.33), the birth- and death process is recurrent if and only if

Σ (j = 1 to ∞) (q 1 q 2 ... q j )/(p 1 p 2 ... p j ) = ∞.    (4.34)

This can be established by using the result (4.32) of example 4.17, since

lim (s → ∞) p(k) = f k0 ;  k = 1, 2, ...,

where f k0 is the probability that the process ever reaches state 0 from state k. However, f k0 = 1 for all k if and only if (4.34) holds. Conversely, let (4.34) hold. Then f 10 = 1 and, by the total probability rule,

f 00 = p 00 + p 01 f 10 = r 0 + p 0 ⋅ 1 = 1,

so that state 0, and with it every state of the irreducible chain, is recurrent.

The notation birth- and death process results from the application of these processes to describing the development in time of biological populations: a transition from i to i + 1 is a 'birth', a transition from i to i − 1 a 'death'. Correspondingly, the p i are called birth- and the q i death probabilities. Continuous-time birth- and death processes are the subject of section 5.6.

4.5 EXERCISES
4
...
} has state space Z = {0, 1, 2} and transition matrix
⎛ 0
...
5

P = ⎜ 0
...
2 0
...
4 0
...




(1) Determine P(X 2 = 2 | X 1 = 0, X 0 = 1) and P(X 2 = 2, X 1 = 0 | X 0 = 1).

(2) Determine P(X 2 = 2, X 1 = 0 | X 0 = 0) and, for n > 1,

P(X n+1 = 2, X n = 0 | X n−1 = 0).

(3) Assuming the initial distribution
P(X 0 = 0) = 0
...
3,
determine P(X 1 = 2) and P(X 1 = 1, X 2 = 2)
...
2) A Markov chain {X 0 , X 1 ,
...
2 0
...
5 ⎞


P = ⎜ 0
...
2 0 ⎟
...
6 0 0
...

(2) Given the initial distribution
P(X 0 = i) = 1/3 ; i = 0, 1, 2 ;
determine the probabilities
P(X 2 = 0) and P(X 0 = 0, X 1 = 1, X 2 = 2)
...
3) A Markov chain {X 0 , X 1 ,
...
4 0
...
8 0 0
...



⎝ 0
...
5 0 ⎠
(1) Given the initial distribution
P(X 0 = 0) = P(X 0 = 1) = 0
...
2,
determine P(X 3 = 2)
...

(3) Determine the stationary distribution
...
4) Let {Y 0 , Y 1 ,
...

Define a sequence of random variables {X 1 , X 2 ,
...


Check whether the random sequence {X 1 , X 2 ,
...

4
...
} has state space Z = {0, 1, 2, 3} and transition matrix
⎛ 0
...
2 0
...
3 ⎞
⎜ 0
...
3 0
...
4 ⎟

...
4 0
...
3 0
...
3 0
...
2 0
...

(2) Determine the stationary distribution of this Markov chain
...
6) Let {X 0 , X 1 ,
...
, n}, n < ∞,
and with the doubly stochastic transition matrix P = ((p ij )), i
...


Σ

j∈Z

p i j = 1 for all i ∈ Z and

Σ

i∈Z

p i j = 1 for all j ∈ Z
...
} is given by
πj = 1 ,
n

j ∈ Z
...
} be a transient Markov chain?


4
...
Random noises
S 1 , S 2 ,
...
Let X 0 = 0 or X 0 = 1 denote whether the
source has emitted a '0' or a '1' for transmission
...
The
random sequence {X 0 , X 1 ,
...

⎝  q   1−q ⎠

(1) Verify: On condition 0 < p + q ≤ 1, the m-step transition matrix is given by

P^(m) = 1/(p+q) ⎛ q  p ⎞ + (1 − p − q)^m /(p+q) ⎛  p  −p ⎞
                ⎝ q  p ⎠                        ⎝ −q   q ⎠

(2) Determine the probability that a signal emitted by the source arrives undistorted at the sink after having passed the random noises S 1 , S 2 , ..., S 5 .
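The closed form in part (1) can be checked numerically. The sketch below (with illustrative values p = 0.3, q = 0.2, m = 5, which are assumptions for the demonstration, not values from the exercise) compares the m-fold matrix product with the closed form:

```python
# Numerical check of the m-step transition matrix formula for the
# two-state chain P = [[1-p, p], [q, 1-q]].

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

p, q, m = 0.3, 0.2, 5
P = [[1 - p, p], [q, 1 - q]]

Pm = [[1.0, 0.0], [0.0, 1.0]]        # start from the identity matrix
for _ in range(m):
    Pm = matmul(Pm, P)

s = p + q
closed = [[(q + (1 - s) ** m * p) / s, (p - (1 - s) ** m * p) / s],
          [(q - (1 - s) ** m * q) / s, (p + (1 - s) ** m * q) / s]]

assert all(abs(Pm[i][j] - closed[i][j]) < 1e-12 for i in range(2) for j in range(2))
```

Note that for m → ∞ the second term vanishes, leaving the stationary distribution (q/(p+q), p/(p+q)) in both rows.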

4
...
For the town of Musi, a fairly reliable prediction of
tomorrow's weather can only be made on the basis of today's and yesterday's
weather
...
Based on historical observations it is known that, given
the constellation (S,S) today, the weather tomorrow will be sunny with probability
0
...
2; given (S,C) today, the weather tomorrow will be
sunny with probability 0
...
6; given (C,S) today, the
weather tomorrow will be sunny with probability 0
...
4; given (C,C) today, the weather tomorrow will be cloudy with probability 0
...
2
...

(2) Determine the matrix of the transition probabilities of the corresponding discretetime Markov chain and its stationary state distribution
...
9)* An area (e
...
a stiffy disc) is partitioned into n segments S 1 , S 2 ,
...
, O n (e
...
pieces of information) are stored in these
segments so that each segment contains exactly one object
...

one of the objects is needed
...
This is done in the following way: The segments are checked in increasing order of their indices
...
, S k−1 will
be moved in this order to S 2 , S 3 ,
...

Let p i be the probability that at a time point t object O i is needed; i = 1, 2,
...
It is
assumed that these probabilities do not depend on t
...
e
...

(2) What is the stationary distribution of the location of O 1 given that
p 1 = α and p 2 = p 3 =
...
10) A supplier of toner cartridges of a certain brand checks his stock every Monday
...
The weekly
demands of cartridges D are independent and identically distributed according to
p i = P(D = i); i = 0, 1,
...

(1) Is {X 1 , X 2 ,
...

4
...
5
0
...
1
0
...
4
0
0
0
0

0
0
0
0
...
1
0





...

(2) Check, whether inessential states exist
...
12) A Markov chain has state space Z = {0, 1, 2, 3} and transition matrix


P=⎜





0
1
0
...
1

0
0
0
...
4

1
0
0
0
...
3




...



4
...
2
0
0
0
0

0
...
9
0
...
1
0
...





(1) Draw the transition graph
...

(3) Determine the stationary distribution
...
14) A Markov chain has state space Z = {0, 1, 2, 3, 4} and transition matrix



P=⎜




0
1
0
...
2
0
...
2
0
...
1

0
0
0
...
1

0
0
0
...
4





...

(2) Find the recurrent and transient states
...
15) Determine the stationary distribution of the random walk considered in example 4
...

4
...
; p 0 = 1
...

4
...
Show that both i and j are
recurrent
...
18) The respective transition probabilities of two irreducible Markov chains (1)
and (2) with common state space Z = {0, 1,
...
;
i+2
i+2
(2) p i i+1 = i + 1 , p i 0 = 1 ; i = 0, 1,
...



4
...

Determine E(N i ) and Var(N i )
...
20) A haulier operates a fleet of trucks
...
There
are 3 premium levels: λ 1 , λ 2 and λ 3 with λ 3 < λ 2 < λ 1
...
If a claim had been made in the previous year,
the premium level in the current year is λ 1
...
In case of a claim, the insurance company will cover the full amount
minus a profitincreasing amount of a i , 0 ≤ a i < c i
...

Given a vector of claim limits (c 1 , c 2 , c 3 ), determine the haulier's long-run mean
loss cost a year
...
} , where X n = i if the premium level at
the beginning of year n is λ i and make use of theorem 4
...

(Loss cost = premium plus total damage not refunded by the insurance company
.)

5 CONTINUOUS-TIME MARKOV CHAINS

5.1 BASIC CONCEPTS AND EXAMPLES

This chapter deals with Markov processes which have parameter set T = [0, ∞) and state space Z = {0, ±1, ±2, ...} or a subset of it. According to the terminology introduced in chapter 2, these processes are continuous-time Markov chains.

Definition 5.1 A stochastic process {X(t), t ≥ 0} with state space Z is called a continuous-time Markov chain if for all n; all sequences of time points {t 0 , t 1 , ..., t n+1 } with t 0 < t 1 < ... < t n+1 ; and all sequences of states {i 0 , i 1 , ..., i n+1 }, i k ∈ Z, the following relationship holds:

P(X(t n+1 ) = i n+1 | X(t n ) = i n , ..., X(t 0 ) = i 0 )
= P(X(t n+1 ) = i n+1 | X(t n ) = i n ).    (5.1)

The interpretation of the Markov property (5.1) is the same as for discrete-time Markov chains: The future development of a continuous-time Markov chain depends only on its present state and not on its evolution in the past.

The conditional probabilities p i j (s, t) = P(X(t) = j | X(s) = i), s ≤ t, are the transition probabilities of the Markov chain. A Markov chain is said to be homogeneous if for all s, t ∈ T and i, j ∈ Z the transition probabilities p i j (s, t) depend only on the difference t − s:

p i j (s, t) = p i j (0, t − s).

Note This chapter only considers homogeneous Markov chains. Hence the notation can be simplified: p i j (t) = P(X(s + t) = j | X(s) = i) for all s ≥ 0.

The transition probabilities are comprised in the matrix of transition probabilities P(t) (simply: transition matrix):
P(t) = (( p i j (t) )) ;  i, j ∈ Z, t ≥ 0.    (5.2)

The transition probabilities are assumed to satisfy

Σ (j ∈ Z) p i j (t) = 1 ;  t ≥ 0, i ∈ Z.    (5.3)

(There exist processes whose 'transition probabilities' sum to less than 1 for finite t; such processes may pass through infinitely many states in a finite time. This situation approximately applies to nuclear chain reactions and population explosions of certain species of insects (e.g. locusts); compare theorem 5.2.)

Moreover, the transition probabilities are assumed to be continuous at t = 0:

lim (t → +0) P(X(t) = X(0)) = 1.    (5.4)

In view of (5.3), this assumption is equivalent to

p i j (0) = lim (t → +0) p i j (t) = δ i j ;  i, j ∈ Z.    (5.5)

The Kronecker symbol δ i j is defined by (4
...

Analogously to the discrete-time case, the transition probabilities of a homogeneous Markov chain satisfy the Chapman-Kolmogorov equations

p i j (t + τ) = Σ (k ∈ Z) p i k (t) p k j (τ)    (5.6)

for any t ≥ 0, τ ≥ 0, and i, j ∈ Z. (5.6) is proved as follows:

p i j (t + τ) = P(X(t + τ) = j | X(0) = i)

= P(X(t + τ) = j, X(0) = i) / P(X(0) = i)

= Σ (k ∈ Z) P(X(t + τ) = j, X(t) = k, X(0) = i) / P(X(0) = i)

= Σ (k ∈ Z) P(X(t + τ) = j | X(t) = k, X(0) = i) P(X(t) = k, X(0) = i) / P(X(0) = i)

= Σ (k ∈ Z) P(X(t + τ) = j | X(t) = k) P(X(t) = k | X(0) = i) P(X(0) = i) / P(X(0) = i)

= Σ (k ∈ Z) P(X(τ) = j | X(0) = k) P(X(t) = k | X(0) = i)

= Σ (k ∈ Z) p i k (t) p k j (τ),

where the Markov property and the homogeneity of the chain have been used.
Let p i (t) = P(X(t) = i); p i (t) is called the absolute state probability (of the Markov chain) at time t. In particular, {p i (0); i ∈ Z} is called an initial (probability) distribution of the Markov chain. By the total probability rule,

p j (t) = Σ (i ∈ Z) p i (0) p i j (t) ;  j ∈ Z.    (5.7)

For any sequence of time points t 0 , t 1 , ..., t n with 0 ≤ t 0 < t 1 < ... < t n and any states i 0 , i 1 , ..., i n , the joint distribution of the process at these time points is determined by an initial distribution and the transition probabilities. This can be proved by repeated application of the formula of the conditional probability and the Markov property:

P(X(t 0 ) = i 0 , X(t 1 ) = i 1 , ..., X(t n ) = i n )
= p i 0 (t 0 ) p i 0 i 1 (t 1 − t 0 ) p i 1 i 2 (t 2 − t 1 ) ... p i n−1 i n (t n − t n−1 ).    (5.8)

Definition 5.2 An initial distribution {π i = p i (0), i ∈ Z} is said to be stationary if

π j = Σ (i ∈ Z) π i p i j (t) ;  j ∈ Z, t ≥ 0.    (5.9)

Thus, if at time t = 0 the initial state is determined by a stationary initial distribution, then the absolute state probabilities p j (t) do not depend on t and are equal to π j . Moreover, it follows from (5.8) and (5.9) that for all h > 0,

P(X(t 1 + h) = i 1 , X(t 2 + h) = i 2 , ..., X(t n + h) = i n )
= P(X(t 1 ) = i 1 , X(t 2 ) = i 2 , ..., X(t n ) = i n ),    (5.10)

i.e. the Markov chain is strictly stationary. (This verifies the corresponding more general statement of chapter 2.) Moreover, it is justified to call {π i , i ∈ Z} a stationary (probability) distribution of the Markov chain.
Example 5.1 A homogeneous Poisson process {N(t), t ≥ 0} with intensity λ is a homogeneous Markov chain with state space Z = {0, 1, ...} and transition probabilities

p i j (t) = ((λ t)^(j−i) /(j − i)!) e^(−λ t) ;  j ≥ i.

The sample paths of the process {N(t), t ≥ 0} are nondecreasing step-functions. Thus, a stationary initial distribution cannot exist. (Compare the corresponding discussion of the Poisson process in chapter 3.)

Example 5.2 At time t = 0, n systems start operating. Their lifetimes are independent, identically distributed exponential random variables with parameter λ. Let X(t) denote the number of systems still operating at time t. Then {X(t), t ≥ 0} is a homogeneous Markov chain with state space Z = {0, 1, ..., n} and transition probabilities

p i j (t) = ⎛ i ⎞ (1 − e^(−λ t))^(i−j) (e^(−λ t))^j ;  n ≥ i ≥ j ≥ 0.
           ⎝ j ⎠

The structure of these transition probabilities is based on the memoryless property of the exponential distribution. Of course, this Markov chain cannot be stationary.
Example 5.3 Let Z = {0, 1} be the state space and

        ⎛ 1/(t+1)   t/(t+1) ⎞
P(t) =  ⎜                   ⎟
        ⎝ t/(t+1)   1/(t+1) ⎠

the transition matrix of a stochastic process {X(t), t ≥ 0}. Assuming the initial distribution

p 0 (0) = P(X(0) = 0) = 1

and applying formula (5.7) yields the absolute state probability p 0 (t) = p 00 (t) = 1/(t + 1). On the other hand, applying (5.6) to p 00 (t + τ) would require

1/(t + τ + 1) = (1/(t+1))(1/(τ+1)) + (t/(t+1))(τ/(τ+1)) = (1 + tτ)/((t+1)(τ+1)),

which is false for t, τ > 0. Therefore, Chapman-Kolmogorov's equations (5.6) are not satisfied, so that {X(t), t ≥ 0} cannot be a homogeneous Markov chain.
Classification of States The classification concepts already introduced for discrete-time Markov chains can analogously be defined for continuous-time Markov chains.

A state set C ⊆ Z is called closed if

p i j (t) = 0 for all t > 0, i ∈ C and j ∉ C.

The state j is accessible from i if there exists a t with p i j (t) > 0. States i and j communicate if j is accessible from i and i is accessible from j. Thus, equivalence classes, essential and inessential states as well as irreducible and reducible Markov chains can be defined as in section 4.

5 CONTINUOUS-TIME MARKOV CHAINS

243

State i is recurrent (transient) if

∫ (0 to ∞) p i i (t) dt = ∞   ( < ∞ ).

Since it can easily be shown that p i j (t 0 ) > 0 implies p i j (t) > 0 for all t > t 0 , introducing the concept of a period analogously to section 4.3 makes no sense for continuous-time Markov chains.
5.2 TRANSITION PROBABILITIES AND RATES

This section discusses some structural properties of continuous-time Markov chains which are fundamental to mathematically modeling real systems.

Theorem 5.1 On condition (5.5), the transition probabilities p i j (t) are uniformly continuous in t.

Proof For any h > 0, the Chapman-Kolmogorov equations (5.6) yield

p i j (t + h) − p i j (t) = Σ (k ∈ Z) p i k (h) p k j (t) − p i j (t)
= −(1 − p i i (h)) p i j (t) + Σ (k ≠ i) p i k (h) p k j (t).

Thus,

−(1 − p i i (h)) ≤ −(1 − p i i (h)) p i j (t) ≤ p i j (t + h) − p i j (t)
≤ Σ (k ≠ i) p i k (h) p k j (t) ≤ Σ (k ≠ i) p i k (h) = 1 − p i i (h),

so that | p i j (t + h) − p i j (t) | ≤ 1 − p i i (h). The uniform continuity of the transition probabilities and, therefore, their differentiability for all t ≥ 0 is now a consequence of assumption (5.4).

Transition Rates The following limits play an important role in all future derivations:

q i = lim (h → 0) (1 − p i i (h))/h ,    (5.11)

q i j = lim (h → 0) p i j (h)/h ,  i ≠ j.    (5.12)

These limits exist, and by theorem 5.1 they are the derivatives of the transition probabilities at t = 0:

p i i ′ (0) = d p i i (t)/dt at t = 0 = −q i ,    (5.13)

p i j ′ (0) = d p i j (t)/dt at t = 0 = q i j ,  i ≠ j.    (5.14)

Equivalently, (5.13) and (5.14) can be written in the form

p i i (h) = 1 − q i h + o(h),    (5.15)

p i j (h) = q i j h + o(h),  i ≠ j,    (5.16)

as h → 0. The parameters q i and q i j are the transition rates of the Markov chain. According to (5.3) and (5.15),

Σ (j ∈ Z, j ≠ i) q i j = q i ,  i ∈ Z.    (5.17)
The Kolmogorov differential equations are now derived. For this purpose, the system of Chapman-Kolmogorov equations is written in the form

p i j (t + h) = Σ (k ∈ Z) p i k (h) p k j (t).

Subtracting p i j (t) on both sides and dividing by h gives

( p i j (t + h) − p i j (t) )/h = −((1 − p i i (h))/h) p i j (t) + Σ (k ≠ i) (p i k (h)/h) p k j (t).

By (5.11) and (5.12), letting h → 0 yields Kolmogorov's backward equations for the transition probabilities:

p i j ′ (t) = Σ (k ≠ i) q i k p k j (t) − q i p i j (t),  t ≥ 0.    (5.18)

Analogously, starting with

p i j (t + h) = Σ (k ∈ Z) p i k (t) p k j (h)

yields Kolmogorov's forward equations for the transition probabilities:

p i j ′ (t) = Σ (k ≠ j) p i k (t) q k j − q j p i j (t),  t ≥ 0.    (5.19)


Let { p i (0), i ∈ Z } be any initial distribution. Multiplying (5.19) by p i (0) and summing with respect to i yields

Σ (i ∈ Z) p i (0) p i j ′ (t) = Σ (k ≠ j) q k j Σ (i ∈ Z) p i (0) p i k (t) − q j Σ (i ∈ Z) p i (0) p i j (t).

In view of (5.7), the absolute state probabilities satisfy the system of linear differential equations

p j ′ (t) = Σ (k ≠ j) q k j p k (t) − q j p j (t),  t ≥ 0, j ∈ Z.    (5.20)

In future, the absolute state probabilities are assumed to satisfy

Σ (i ∈ Z) p i (t) = 1.    (5.21)

This normalizing condition is always fulfilled if Z is finite.

Transition Times and Transition Rates It is only possible to exactly model real systems by continuous-time Markov chains if the lengths of the time periods between changes of states are exponentially distributed, since in this case the 'memoryless property' of the exponential distribution guarantees the Markov property. If the times between transitions have known exponential distributions, then it is no problem to determine the transition rates. Assume, for instance, that the sojourn time of the Markov chain in state 0 is exponentially distributed with parameter λ 0 . Then, by (5.11), the unconditional rate of leaving this state is given by

q 0 = lim (h → 0) (1 − p 00 (h))/h = lim (h → 0) (1 − e^(−λ 0 h))/h
= lim (h → 0) (λ 0 h + o(h))/h = λ 0 + lim (h → 0) o(h)/h = λ 0 .    (5.22)

Now assume that, when leaving state 0, the Markov chain makes a transition to state 1 if Y 01 < Y 02 and to state 2 if Y 01 > Y 02 , where Y 01 and Y 02 are independent, exponentially distributed random variables with parameters λ 1 and λ 2 , respectively. Then, by (5.12), the conditional transition rate from state 0 to state 1 is

q 01 = lim (h → 0) p 01 (h)/h = lim (h → 0) ((1 − e^(−λ 1 h)) e^(−λ 2 h) + o(h))/h
= lim (h → 0) λ 1 h (1 − λ 2 h)/h + lim (h → 0) o(h)/h
= lim (h → 0) (λ 1 − λ 1 λ 2 h) = λ 1 .    (5.23)

Analogously, q 02 = λ 2 . Representations of the type (5.22) and (5.23) will frequently be used in what follows.

Transition Graphs These graphs are constructed analogously to the transition graphs for discrete-time Markov chains: The nodes of a transition graph represent the states of the Markov chain. Two nodes i and j are connected by a directed edge (i, j) if and only if q i j > 0. The edges are weighted by their corresponding transition rates. The unconditional transition rate q i equals the sum of the weights of all those edges leaving node i.

Example 5.4 (one-unit system with renewal) The lifetime of a system is exponentially distributed with parameter λ. After a failure the system is replaced by an equivalent new one. A replacement takes a random time, which is exponentially distributed with parameter μ. All life- and replacement times are assumed to be independent. Consider the Markov chain {X(t), t ≥ 0} defined by

X(t) = 1 if the system is operating at time t,  X(t) = 0 otherwise.

The absolute state probability p 1 (t) = P(X(t) = 1) of this Markov chain is the point availability of the system. The transition rates are (Figure 5.1)

q 01 = q 0 = μ ,  q 10 = q 1 = λ.

Figure 5.1 Transition graph for example 5.4

The corresponding Kolmogorov differential equations (5.20) are

p 0 ′ (t) = −μ p 0 (t) + λ p 1 (t),
p 1 ′ (t) = +μ p 0 (t) − λ p 1 (t).

These two equations are linearly dependent. (Their sum vanishes.) Replacing p 0 (t) in the second equation by 1 − p 1 (t) yields a first-order nonhomogeneous differential equation with constant coefficients for p 1 (t):

p 1 ′ (t) + (λ + μ) p 1 (t) = μ.

Given that the system starts operating at time t = 0, i.e. p 1 (0) = 1, its solution is

p 1 (t) = μ/(λ + μ) + (λ/(λ + μ)) e^(−(λ+μ) t) ,  t ≥ 0.

The corresponding stationary availability is

π 1 = lim (t → ∞) p 1 (t) = μ/(λ + μ).

(The same results have been obtained earlier by applying the Laplace transform.)
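The closed-form availability of example 5.4 can be cross-checked by directly integrating the differential equation. The sketch below (with illustrative rates λ = 0.5, μ = 2, an assumption for the demonstration) compares the exact solution with a simple Euler scheme:

```python
import math

# Point availability p1(t) of the one-unit system: closed-form solution of
# p1'(t) + (lam + mu) p1(t) = mu with p1(0) = 1, versus Euler integration.

lam, mu = 0.5, 2.0          # illustrative failure and replacement rates

def p1_exact(t):
    s = lam + mu
    return mu / s + (lam / s) * math.exp(-s * t)

# Euler scheme for p1' = -(lam + mu) p1 + mu
p1, dt = 1.0, 1e-4
for _ in range(int(3.0 / dt)):          # integrate up to t = 3
    p1 += dt * (-(lam + mu) * p1 + mu)

print(abs(p1 - p1_exact(3.0)) < 1e-3)   # True
```

Both approaches converge to the stationary availability μ/(λ + μ) = 0.8 as t grows.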
Example 5
...
The system is available if and only if at least one of its units is
available
...
After the failure of a
unit, the other one (if available) is immediately switched from the redundancy state
to the operating state and the replacement of the failed unit begins
...
Otherwise it immediately
resumes its work
...
L and Z are assumed to be exponentially distributed with respective parameters λ and μ
...
e
...
A system failure occurs,
when a unit fails whilst the other unit is being replaced
...
Let Y i be the unconditional sojourn time
of the system in state i and Y i j be the conditional sojourn time of the system in state i
given that the system makes a transition from state i into state j
...
2 Transition graph for example 5
...
Hence, Y 0 = Y 01 = L
...
22), the corresponding transition rate is given by
q 0 = q 01 = λ
...
The unconditional sojourn time of the system in state 1 is
Y 1 = min (L, Z)
...
23), the corresponding transition rates are
q 12 = λ, q 10 = μ and q 1 = λ + μ
...

a) Survival probability In this case, only the time to entering state 2 (system failure) is of interest. Hence state 2 is treated as absorbing (Figure 5.2), so that q 20 = q 21 = 0. The corresponding system of differential equations (5.20) is

p 0 ′ (t) = −λ p 0 (t) + μ p 1 (t),
p 1 ′ (t) = +λ p 0 (t) − (λ + μ) p 1 (t),    (5.24)
p 2 ′ (t) = +λ p 1 (t).

Combining the first two differential equations in (5.24) yields a homogeneous second-order differential equation with constant coefficients for p 0 (t):

p 0 ′′ (t) + (2λ + μ) p 0 ′ (t) + λ² p 0 (t) = 0.

The corresponding characteristic equation is

x² + (2λ + μ) x + λ² = 0

with the roots x 1,2 = −(2λ + μ)/2 ± c/2, where c = √(4λμ + μ²). Hence, since p 1 (0) = 0, p 1 (t) has the form

p 1 (t) = a e^(−((2λ+μ)/2) t) sinh (c t/2).

The initial condition p 0 (0) = 1 together with (5.24) yields a = 2λ/c and

p 0 (t) = e^(−((2λ+μ)/2) t) ( (μ/c) sinh (c t/2) + cosh (c t/2) ) ,  t ≥ 0.

(For a definition of the hyperbolic functions sinh and cosh, see chapter 3.) The survival probability of the system is

F s (t) = P(L s > t) = p 0 (t) + p 1 (t)
= e^(−((2λ+μ)/2) t) ( cosh (c t/2) + ((2λ + μ)/c) sinh (c t/2) ) ,  t ≥ 0,

and, integrating the survival probability, the mean system lifetime is

E(L s ) = 2/λ + μ/λ².    (5.25)

For the sake of comparison, in case of no replacement (μ = 0), the system lifetime L s has an Erlang distribution with parameters 2 and λ:

F s (t) = (1 + λ t) e^(−λ t) ,  E(L s ) = 2/λ.
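Formula (5.25) can be verified numerically by integrating the survival probability. The sketch below (with illustrative rates λ = 1, μ = 2, an assumption for the demonstration) uses the trapezoidal rule:

```python
import math

# Numerical check of (5.25): integrate the survival probability
# Fs(t) = exp(-(2λ+μ)t/2) [cosh(ct/2) + ((2λ+μ)/c) sinh(ct/2)], c = sqrt(4λμ+μ²),
# and compare with E(Ls) = 2/λ + μ/λ².

lam, mu = 1.0, 2.0
c = math.sqrt(4 * lam * mu + mu * mu)

def survival(t):
    a = (2 * lam + mu) / 2
    return math.exp(-a * t) * (math.cosh(c * t / 2)
                               + ((2 * lam + mu) / c) * math.sinh(c * t / 2))

dt, T = 0.001, 100.0                       # step size and truncation point
mean_life = dt * sum(survival(k * dt) for k in range(1, int(T / dt))) \
            + dt * (survival(0.0) + survival(T)) / 2   # trapezoidal rule
print(round(mean_life, 3))   # 4.0  (= 2/lam + mu/lam**2)
```

The truncation error beyond T = 100 is negligible here, since the survival probability decays exponentially.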
In this case, the transition rate q 21 from state 2
to state 1 is positive
...
Assuming that a
mechanic cannot replace two failed units at the same time, then (Figure 5
...

For r = 2, the sojourn time of the system in state 2 is given by Y 2 = min(Z 1 , Z 2 ),
where Z 1 and Z 2 are independent and identically as Z distributed
...


Figure 5.3 Transition graph for example 5.5 b)


Hence, the transition rates q 10 and q 12 have the same values as under a)
...
20) becomes, when replacing the last
differential equation by the normalizing condition (5
...

The solution is left as an exercise to the reader.

Figure 5.4 Transition graph for example 5.5 b)

Example 5.6 (two-unit system, parallel redundancy) Now assume that both units of
the system operate at the same time when they are available
...
In particular, the system is
available if and only if at least one unit is available
...
Y 0 has an exponential distribution with parameter 2λ and from
state 0 only a transition to state 1 is possible
...

When the system is in state 1, then it behaves as in example 5
...

a) Survival probability As in the previous example, state 2 has to be thought of as absorbing: q 20 = q 21 = 0. Hence, from (5.20) and (5.21),

p 0 ′ (t) = −2λ p 0 (t) + μ p 1 (t),
p 1 ′ (t) = +2λ p 0 (t) − (λ + μ) p 1 (t),
1 = p 0 (t) + p 1 (t) + p 2 (t).

The solution is

p 0 (t) = e^(−((3λ+μ)/2) t) ( cosh (c t/2) + ((μ − λ)/c) sinh (c t/2) ),

p 1 (t) = (4λ/c) e^(−((3λ+μ)/2) t) sinh (c t/2),

where c = √(λ² + 6λμ + μ²). The survival probability of the system is

F s (t) = P(L s > t) = p 0 (t) + p 1 (t)
= e^(−((3λ+μ)/2) t) ( cosh (c t/2) + ((3λ + μ)/c) sinh (c t/2) ) ,  t ≥ 0.    (5.26)

The mean system lifetime is

E(L s ) = 3/(2λ) + μ/(2λ²).



Figure 5.5 Transition graph for example 5.6 b)

b) Availability If r (r = 1 or r = 2 ) mechanics replace failed units, then
q 2 = q 21 = r μ
...
5 b)
...

Solving this system of linear differential equations is left to the reader
...
5.3 STATIONARY STATE PROBABILITIES

If {π j , j ∈ Z} is a stationary distribution of the Markov chain {X(t), t ≥ 0}, then this special absolute distribution must satisfy Kolmogorov's equations (5.20). Since the π j are constant, all the left-hand sides of these equations are equal to 0. Therefore, (5.20) simplifies to a system of linear algebraic equations in the unknowns π j :

0 = Σ (k ∈ Z, k ≠ j) q k j π k − q j π j ,  j ∈ Z.    (5.27)

This system of equations is frequently written in the form

q j π j = Σ (k ∈ Z, k ≠ j) q k j π k ,  j ∈ Z.    (5.28)

This form clearly illustrates that the stationary state probabilities refer to an equilibrium state of the Markov chain:

The mean intensity per unit time of leaving state j, which is q j π j , is equal to the mean intensity per unit time of arriving in state j.

In view of (5.21), only those solutions {π j , j ∈ Z} of (5.27) are of interest which satisfy

Σ (j ∈ Z) π j = 1.    (5.29)

If the Markov chain is irreducible and positive recurrent, then the stationary state probabilities exist and are the unique solution of (5.27) and (5.29). (Recall that an irreducible Markov chain with finite state space Z is always positive recurrent.) Moreover, in this case the limits

p j = lim (t → ∞) p i j (t)    (5.30)

exist and are independent of i. It follows that then also

lim (t → ∞) p j ′ (t) = 0.

Otherwise, p j (t) would unboundedly increase as t → ∞, contradictory to p j (t) ≤ 1. Hence, letting t → ∞ in (5.20) and (5.21), the limits (5.30) are seen to satisfy the system of equations (5.27) and (5.29), i.e. they coincide with the stationary state probabilities. For a detailed discussion of the relationship between the solvability of (5.27), (5.29) and the existence of a stationary distribution, we refer to the literature.

Continuation of example 5.5 The stationary state probabilities of the two-unit system with standby redundancy are obtained by substituting the transition rates from Figure 5.3 into (5.27) and (5.29).

Case r = 1

π 0 = μ²/[(λ + μ)² − λμ] ,  π 1 = λμ/[(λ + μ)² − λμ] ,  π 2 = λ²/[(λ + μ)² − λμ] ,

A = π 0 + π 1 = (μ² + λμ)/[(λ + μ)² − λμ] .
Continuation of example 5.6 For the two-unit system with parallel redundancy, the transition rates of Figure 5.5 yield the stationary state equations

−2λ π 0 + μ π 1 = 0,
+2λ π 0 − (λ + μ) π 1 + r μ π 2 = 0,
π 0 + π 1 + π 2 = 1.

Case r = 1

π 0 = μ²/[(λ + μ)² + λ²] ,  π 1 = 2λμ/[(λ + μ)² + λ²] ,  π 2 = 2λ²/[(λ + μ)² + λ²] ,

A = π 0 + π 1 = (μ² + 2λμ)/[(λ + μ)² + λ²] .
Case r = 2

π 0 = μ²/(λ + μ)² ,  π 1 = 2λμ/(λ + μ)² ,  π 2 = λ²/(λ + μ)² ,

A = π 0 + π 1 = 1 − (λ/(λ + μ))² .
6 shows a) the mean lifetimes and b) the stationary availabilities of the twounit system for r = 1 as functions of ρ = λ/μ
...
With parallel redundancy, this switching problem
does not exist since an available spare unit is also operating
...
8

parallel

5
parallel
0
...
5

1

ρ

0

0
...
6 Mean lifetime a) and stationary availability b

Example 5.7 A system is subject to two types of failures, type 1 and type 2. After a type i-failure the system is said to be in failure state i; i = 1, 2. The times to type 1- and type 2-failures are independent, exponentially distributed random variables L 1 and L 2 with respective parameters λ 1 and λ 2 . Thus, if at time t = 0 a new system starts working, the time to its first failure is Y 0 = min (L 1 , L 2 ). A type 1-failure is converted into the more serious failure state 2 (for instance, by switching the system off); the time required for this is exponentially distributed with parameter ν. From failure state 2 the system is completely renewed; the renewal time is exponentially distributed with parameter μ. A renewed system immediately starts working. All the random times involved are independent. This process continues to infinity. This model is, for example, of importance in traffic safety engineering: When the red signal in a traffic light fails (type 1-failure), then the whole traffic light is switched off (type 2-failure).

Figure 5.7 Transition graph for example 5.7

Consider the following system states:

0  system is operating
1  type 1-failure state
2  type 2-failure state

If X(t) denotes the state of the system at time t, then {X(t), t ≥ 0} is a homogeneous Markov chain with state space Z = {0, 1, 2} and transition rates (Figure 5.7)

q 01 = λ 1 ,  q 02 = λ 2 ,  q 0 = λ 1 + λ 2 ,  q 12 = q 1 = ν,  q 20 = q 2 = μ.

The corresponding stationary state equations (5.28) are

(λ 1 + λ 2 ) π 0 = μ π 2 ,
ν π 1 = λ 1 π 0 ,
μ π 2 = λ 2 π 0 + ν π 1 .

The solution is

π 0 = μν/D ,  π 1 = λ 1 μ/D ,  π 2 = (λ 1 + λ 2 )ν/D ,

where D = (λ 1 + λ 2 )ν + (λ 1 + ν)μ.
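The stationary probabilities of example 5.7 can be checked against the balance equations (5.28). The sketch below uses illustrative rate values (an assumption for the demonstration, not values from the text):

```python
# Check that the stationary probabilities of example 5.7 satisfy the
# balance equations (5.28) and the normalizing condition (5.29).

lam1, lam2, nu, mu = 0.2, 0.1, 1.5, 0.8
D = (lam1 + lam2) * nu + (lam1 + nu) * mu
pi0 = mu * nu / D
pi1 = lam1 * mu / D
pi2 = (lam1 + lam2) * nu / D

assert abs(pi0 + pi1 + pi2 - 1) < 1e-12
assert abs((lam1 + lam2) * pi0 - mu * pi2) < 1e-12        # balance for state 0
assert abs(nu * pi1 - lam1 * pi0) < 1e-12                 # balance for state 1
assert abs(mu * pi2 - (lam2 * pi0 + nu * pi1)) < 1e-12    # balance for state 2
```

Each assertion states that the mean intensity of leaving a state equals the mean intensity of arriving in it, which is exactly the equilibrium interpretation of (5.28).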
5.4 SOJOURN TIMES IN PROCESS STATES

So far the fact has been used that independent, exponentially distributed times between changes of system states allow for modeling system behaviour by homogeneous Markov chains. Conversely, the sojourn times of a homogeneous Markov chain in its states are necessarily exponentially distributed. To see this, let Y i denote the sojourn time of the Markov chain in state i. By (5.8) and (5.15),

P(Y i > t | X(0) = i) = lim (n → ∞) P( X(k t/n) = i; k = 1, 2, ..., n | X(0) = i )
= lim (n → ∞) [ p i i (t/n) ]^n
= lim (n → ∞) [ 1 − q i (t/n) + o(1/n) ]^n
= e^(−q i t) ,    (5.31)

since e^x can be represented by the limit

e^x = lim (n → ∞) (1 + x/n)^n .    (5.32)

Thus, Y i has an exponential distribution with parameter q i .

Moreover, for j ≠ i, consider the joint probability that the Markov chain stays in state i longer than t and then makes a transition to state j. Let m(nt) denote the greatest integer m satisfying the inequality m/n ≤ t or, equivalently, nt − 1 < m(nt) ≤ nt. Approximating this joint probability by a discrete skeleton with step 1/n as above and using (5.15) and (5.16), one obtains

P(Y i > t, the chain jumps from i to j | X(0) = i) = (q i j /q i ) e^(−q i t) ;  j ≠ i.    (5.33)

Summing (5.33) with respect to j ∈ Z again verifies (5.31). Two other important conclusions are:


1) Letting t → 0 in (5.33) yields the probability that the Markov chain, on leaving state i, makes a transition to state j:

p i j = q i j / q i ;  i ≠ j.    (5.34)

2) Since (5.33) is the product of (5.31) and (5.34), the sojourn time in state i and the state visited next are independent.

Knowledge of the transition probabilities p i j suggests to observe a continuous-time Markov chain {X(t), t ≥ 0} only at those discrete time points at which state changes take place. Let X n denote the state of the Markov chain immediately after the n-th change of state. Then {X 0 , X 1 , ...} is a discrete-time Markov chain with the one-step transition probabilities (5.34):

p i j = P(X n = j | X n−1 = i) = q i j /q i ,  i, j ∈ Z.    (5.35)

The discrete-time Markov chain {X 0 , X 1 , ...} is said to be embedded in the continuous-time Markov chain {X(t), t ≥ 0}.
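The pair of results (5.31) and (5.35) yields a standard simulation scheme: draw an exponential sojourn time with parameter q i , then jump according to the embedded chain. The sketch below applies this to the two-state system of example 5.4 (rates λ = 1, μ = 2 are illustrative assumptions); the long-run fraction of time spent in the up state should approach μ/(λ + μ):

```python
import random

# Simulate a CTMC via its embedded chain (5.35) plus exponential sojourn
# times (5.31), for the two-state availability model (0 = down, 1 = up).

random.seed(1)
lam, mu = 1.0, 2.0
q = {0: mu, 1: lam}            # q_i: rate of leaving state i
state, clock, up_time = 1, 0.0, 0.0

for _ in range(200_000):
    stay = random.expovariate(q[state])   # exponential sojourn in `state`
    if state == 1:
        up_time += stay
    clock += stay
    state = 1 - state          # embedded chain: only one possible target here

print(round(up_time / clock, 3))   # close to mu/(lam + mu) = 2/3
```

In a chain with more than one exit per state, the jump target would be drawn with probabilities q i j /q i according to (5.35).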
Embedded Markov chains can also be found in stochastic processes which are not Markov processes. In these cases, they may facilitate the investigation of non-Markov processes. Examples for the application of the method of embedded Markov chains to analyzing queueing systems are given in section 5.7. Section 5.8 deals with semi-Markov chains, the framework of which is an embedded Markov chain.
...
5.5 CONSTRUCTION OF MARKOV SYSTEMS

In a Markov system, state changes are controlled by a Markov process. A homogeneous Markov system can be constructed as follows: If the system has arrived in state i, its sojourn time there is

Y i = min (Y i 1 , Y i 2 , ..., Y i n i ),

where the Y i j are independent, exponentially distributed random variables with parameters λ i j ; j = 1, 2, ..., n i . A transition from state i to state j is made if and only if Y i = Y i j . Then the conditional transition rates are q i j = λ i j , and the unconditional rate of leaving state i is

q i = λ i 1 + λ i 2 + ... + λ i n i .

This representation of q i results from (5.22) and (5.17).

Example 5.8 (repairman problem) n machines with lifetimes L 1 , L 2 , ..., L n start operating at time t = 0. The lifetimes are independent, identically distributed exponential random variables with parameter λ. Failed machines are repaired. There is one mechanic who can only handle one failed machine at a time. The repair times are assumed to be mutually independent and identically distributed as an exponential random variable Z with parameter µ; they are also independent of the lifetimes. Immediately after completion of its repair, a machine resumes its work. Let X(t) denote the number of machines which are down at time t. Then {X(t), t ≥ 0} is a Markov chain with state space Z = {0, 1, ..., n}. The system stays in state 0 for a random time

Y 0 = min (L 1 , L 2 , ..., L n ),

which is exponentially distributed with parameter nλ. The corresponding transition rate is

q 0 = q 01 = nλ.

From state 1, the sojourn time is Y 1 = min (L 1 , L 2 , ..., L n−1 , Z). The Markov chain makes a transition to state 2 if Y 1 = min (L 1 , L 2 , ..., L n−1 ), i.e. if a further machine fails before the repair is finished, and a transition to state 0 if Y 1 = Z.

In general (Figure 5.8),

q j,j+1 = (n − j)λ ;  j = 0, 1, ..., n − 1,
q j+1,j = μ ;  j = 0, 1, ..., n − 1,
q i j = 0 for |i − j| > 1 ;
q 0 = nλ ;  q j = (n − j)λ + μ ;  j = 1, 2, ..., n.

Figure 5.8 Transition graph for the repairman problem (example 5.8)

The corresponding system (5.28) is

μ π 1 = nλ π 0 ,
(n − j + 1)λ π j−1 + μ π j+1 = ((n − j)λ + μ) π j ;  j = 1, 2, ..., n − 1,
λ π n−1 = μ π n .

Its solution is

π j = (n!/(n − j)!) ρ^j π 0 ;  j = 1, 2, ..., n,

where ρ = λ/μ. By the normalizing condition (5.29),

π 0 = [ Σ (i = 0 to n) (n!/(n − i)!) ρ^i ]^(−1) .
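The stationary distribution of the repairman problem is easy to evaluate and to verify. The sketch below (n = 5 machines and ρ = 0.2 are illustrative assumptions) checks the detailed-balance relations that the solution satisfies:

```python
from math import factorial

# Stationary distribution of the repairman problem (example 5.8),
# pi_j = n!/(n-j)! * rho**j * pi_0, checked against detailed balance.

n, rho = 5, 0.2                     # rho = lambda/mu
lam, mu = rho, 1.0                  # choose mu = 1 without loss of generality

w = [factorial(n) // factorial(n - j) * rho ** j for j in range(n + 1)]
pi = [x / sum(w) for x in w]

assert abs(sum(pi) - 1) < 1e-12
# detailed balance: (n - j) lam * pi_j = mu * pi_{j+1}
for j in range(n):
    assert abs((n - j) * lam * pi[j] - mu * pi[j + 1]) < 1e-12
```

Detailed balance holds because the chain is a birth- and death process; it implies the full balance equations (5.28).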
Markov modeling remains possible when times between state changes are Erlang distributed. This is due to the fact that a random variable, which is Erlang distributed with parameters n and μ, can be represented as a sum of n independent exponential random variables with parameter μ. This observation is the basis of Erlang's phase method: If the time interval, which the system stays in state i, is Erlang distributed with parameters n i and μ i , then this interval is partitioned into n i disjoint subintervals (phases), the lengths of which are independent, identically distributed exponential random variables with parameter μ i . Using j 1 , j 2 , ..., j n i to label these phases, the original non-Markov system becomes a Markov system.

Example 5.9 As in example 5.6, a two-unit system with parallel redundancy is considered. The replacement times of the units are identically distributed as Z, where Z has an Erlang distribution with parameters n = 2 and µ. All other assumptions and model specifications are as in example 5.6. The following system states are introduced:

0  both units are operating
1  one unit is operating, the replacement of the other one is in phase 1
2  one unit is operating, the replacement of the other one is in phase 2
3  no unit is operating, the replacement of the one being maintained is in phase 1
4  no unit is operating, the replacement of the one being maintained is in phase 2

The transition rates can be read from the transition graph (Figure 5.9).

Figure 5.9 Transition graph for example 5.9

Let {π 0 , π 1 , ..., π 4 } be the stationary distribution of this Markov chain. The stationary probabilities of the original three system states are then

π 0 * = π 0 ,  π 1 * = π 1 + π 2 ,  π 2 * = π 3 + π 4 .

Letting ρ = E(Z)/E(L) = 2λ/μ, they are

π 0 * = [ 1 + 2ρ + (3/2) ρ² + (1/4) ρ³ ]^(−1) ,
π 1 * = [ 2ρ + (1/2) ρ² ] π 0 * ,
π 2 * = [ ρ² + (1/4) ρ³ ] π 0 * .
Unfortunately, applying Erlang's phase method to structurally complicated systems leads to rather complex Markov systems.

5.6 BIRTH- AND DEATH PROCESSES

In this section, continuous-time Markov chains with the property that only transitions to 'neighbouring' states are possible are discussed in more detail. In the economical sciences, birth- and death processes are among other things used for describing the development of the number of enterprises in a particular area and manpower fluctuations. Their name, however, comes from applications in biology, where they have been used to stochastically model the development in time of the number of individuals in populations of organisms.
6
...
, n} is called a (pure)
birth process if, for all i = 0, 1,
...
State n is absorbing if n < ∞
...
In what
follows, they will be called birth rates and denoted as
λ i = q i,i+1 , i = 0, 1,
...

The sample paths of birth processes are nondecreasing step functions with jump
height 1
...
In this case, λ i = λ , i = 0, 1,
...
e
...
The p j (t) are identically equal to 0 for j < m and, according to (5
...


p n (t) = +λ n−1 p n−1 (t) ,

(5
...


From the first differential equation,
p m (t) = e −λ m t ,

t ≥ 0
For j > m, the differential equations (5.36) are equivalent to

e^(λ j t) ( p j ′ (t) + λ j p j (t) ) = λ j−1 e^(λ j t) p j−1 (t),

i.e.

d/dt ( e^(λ j t) p j (t) ) = λ j−1 e^(λ j t) p j−1 (t).    (5.38)

Starting with p m (t), the transition probabilities p j (t), j > m, can be recursively determined from (5.37) and (5.38). For instance, on conditions p 0 (0) = 1 and λ 0 ≠ λ 1 ,

p 1 (t) = λ 0 e^(−λ 1 t) ∫ (0 to t) e^(λ 1 x) e^(−λ 0 x) dx
= λ 0 e^(−λ 1 t) ∫ (0 to t) e^(−(λ 0 − λ 1 ) x) dx
= (λ 0 /(λ 0 − λ 1 )) ( e^(−λ 1 t) − e^(−λ 0 t) ) ,  t ≥ 0.

If all the birth rates are pairwise distinct, repeated application of (5.38) yields by induction:

p j (t) = Σ (i = 0 to j) C i j λ i e^(−λ i t) ,  C i j = (1/λ j ) Π (k = 0 to j, k ≠ i) λ k /(λ k − λ i ) ;  j = 0, 1, ...

Linear Birth Process A birth process is called a linear birth process or a Yule-Furry process if its birth rates are given by

λ i = iλ ;  i = 0, 1, 2, ...

Linear birth processes occur, for instance, if in the interval [t, t + h] each member of a population (bacterium, physical particle) independently of each other splits with probability λh + o(h) as h → 0. For a linear birth process, system (5.36) becomes

p j ′ (t) = −λ [ j p j (t) − (j − 1) p j−1 (t) ] ;  j = 1, 2, ...    (5.39)

with the initial condition

p 1 (0) = 1 ,  p j (0) = 0 ;  j = 2, 3, ...

The solution of (5.39) under this initial condition is

p j (t) = e^(−λ t) (1 − e^(−λ t))^(j−1) ;  j = 1, 2, ...    (5.40)

Thus, X(t) has a geometric distribution with mean e^(λ t). Hence, the trend function of the linear birth process is

m(t) = e^(λ t) ,  t ≥ 0.
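The geometric form (5.40) and the trend function m(t) = e^(λt) can be checked numerically. The sketch below truncates the infinite state space at an illustrative cutoff (λ = t = 1 are assumptions for the demonstration):

```python
import math

# (5.40): for the Yule-Furry process started in state 1, X(t) is geometric
# with parameter exp(-lam*t); hence E[X(t)] = exp(lam*t).

lam, t = 1.0, 1.0
p = [math.exp(-lam * t) * (1 - math.exp(-lam * t)) ** (j - 1)
     for j in range(1, 400)]                      # truncated state space

print(abs(sum(p) - 1) < 1e-9)                                   # True
print(abs(sum(j * pj for j, pj in enumerate(p, start=1))
          - math.exp(lam * t)) < 1e-6)                           # True
```

The truncation at 400 states is harmless here because the geometric tail decays exponentially.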
It cannot be taken for granted that there is a solution of (5.36) which satisfies

Σ (i ∈ Z) p i (t) = 1.    (5.41)

In case of an infinite state space Z = {0, 1, ...}, the following theorem gives a necessary and sufficient condition for a solution of (5.36) to have property (5.41). Without loss of generality, the theorem is proved on condition p 0 (0) = 1.

Theorem 5.2 A solution {p 0 (t), p 1 (t), ...} of the system of differential equations (5.36) satisfies (5.41) if and only if the series

Σ (i = 0 to ∞) 1/λ i     (5.42)

diverges.
Proof Let

s k (t) = p 0 (t) + p 1 (t) + ... + p k (t).

Summing the differential equations (5.36) for j = 0, 1, ..., k yields

s k ′ (t) = −λ k p k (t).

By integration, taking into account s k (0) = 1,

1 − s k (t) = λ k ∫ (0 to t) p k (x) dx.    (5.43)

Since s k (t) is monotonically increasing as k → ∞, the following limit exists:

r(t) = lim (k → ∞) (1 − s k (t)).

By (5.43),

λ k ∫ (0 to t) p k (x) dx ≥ r(t),

so that

∫ (0 to t) p k (x) dx ≥ r(t)/λ k ;  k = 0, 1, ...

Summing these inequalities for k = 0, 1, ..., K and using s K (x) ≤ 1 gives

t ≥ ∫ (0 to t) s K (x) dx ≥ r(t) ( 1/λ 0 + 1/λ 1 + ... + 1/λ K ).

If the series (5.42) diverges, then this inequality implies that r(t) = 0 for all t > 0, which is equivalent to (5.41).

Conversely, from (5.43),

λ k ∫ (0 to t) p k (x) dx ≤ 1,

so that

∫ (0 to t) s k (x) dx ≤ 1/λ 0 + 1/λ 1 + ... + 1/λ k .

By passing to the limit as k → ∞,

∫ (0 to t) (1 − r(x)) dx ≤ Σ (i = 0 to ∞) 1/λ i .

If (5.41) holds, then r(x) ≡ 0 and the left-hand side equals t. Since t can be arbitrarily large, the series (5.42) must diverge. This result completes the proof.
The probability of such an explosive growth is

1 − Σ i=0 p i (t)
...
42) converges
...

since
∞ 1

Σ

i=1 λ i


2
= 1 Σ 1 = π < ∞
...
42) does not depend on t
...
6
...
} is called a (pure)
death process if, for all i = 1, 2,
...

State 0 is absorbing
...

In what follows, they will be called death rates and denoted as
μ 0 = 0, μ i = q i,i−1 ; i = 1, 2,
...
For pure death
processes, on condition
p n (0) = P(X(0) = n) = 1,
the system of differential equations (5
...
, n − 1
...
44)

The solution of the first differential equation is
p n (t) = e −μ n t ,

t ≥ 0
...
44) yields
p j (t) = μ j+1 e

−μ j t t μ j x
∫ 0 e p j+1 (x) dx ;

j = n − 1,
...


(5
...
, 0,
can be recursively determined from (5
...
For instance, assuming μ n ≠ μ n−1 ,
t

p n−1 (t) = μ n e −μ n−1 t ∫ 0 e −(μ n −μ n−1 ) x dx
μn
⎛ e −μ n−1 t − e −μ n t ⎞
...
46)

1
D nn = μ
...

Under the initial distribution
p n (0) = P(X(0) = n) = 1
the process stays in state n an exponentially with parameter nλ distributed time:
p n (t) = e −n λ t , t ≥ 0
...
45) or simply from (5
...
, n
...


© 2006 by Taylor & Francis Group, LLC

266

STOCHASTIC PROCESSES

Example 5
...

The lifetimes of the subsystems are independent, exponentially with parameter λ distributed random variables
...


5
...
3

Birth- and Death Processes

5
...
3
...
, n}, n ≤ ∞,
is called a birth- and death process if from any state i only a transition to i − 1 or
i + 1 is possible, provided that i − 1 ∈ Z and i + 1 ∈ Z, respectively
...

The transition rates λ i = q i,i+1 and μ i = q i,i−1 are called birth rates and death rates,
respectively
...
10)
...
If a birth- and death
process describes the number of individuals in a population of organisms, then, when
arriving in state 0, the population is extinguished
...

λ0

0

λ1

1
μ1

μ2


...


λ i−1

λi

i
μi

μ i+1

Figure 5
...
20), the absolute state probabilities p j (t) = P(X(t) = j), j ∈ Z, of a
birth- and death process satisfy the system of linear differential equations
p 0 (t) = −λ 0 p 0 (t) + μ 1 p 1 (t),
p j (t) = +λ j−1 p j−1 (t) − (λ j + μ j ) p j (t) + μ j+1 p j+1 (t) ,
p n (t) = +λ n−1 p n−1 (t) − μ n p n (t) ,

© 2006 by Taylor & Francis Group, LLC

n < ∞
...
,

(5
...
} of two important birth- and death processes are determined via their respective z -transforms


M(t, z) = Σ i=0 p i (t) z i
under initial conditions of type
p n (0) = P(X(0) = n) = 1
...

Furthermore, partial derivatives of the z-transforms will be needed:

∂M(t, z) ∞
∂M(t, z) ∞
= Σ p i (t) z i and
= Σ i p i (t) z i−1
...
48)

Partial differential equations for M(t, z) will be established and solved by applying
the characteristic method
...
11 (linear birth- and death process) {X(t), t ≥ 0} is called a linear birthand death process if it has transition rates
λ i = i λ , μ i = i μ , i = 0, 1,
...

Assuming p 0 (0) = 1 would make no sense since state 0 is absorbing
...
20) becomes

p 0 (t) = μ p 1 (t),
p j (t) = (j − 1)λ p j−1 (t) − j (λ + μ) p j (t) + (j + 1)μ p j+1 (t) ;

j = 1, 2,
...
49)

Multiplying the j th differential equation by z j and summing from j = 0 to j = ∞, taking into account (5
...

(5
...

(5
...
51) can be written in the form

dz
= −dt
...

λ−μ ⎝ z−1 ⎠



The general solution z = z(t) of the characteristic differential equation in implicit
form is, therefore, given by
λz − μ⎞
c = (λ − μ) t − ln ⎛
,
⎝ z−1 ⎠
where c is an arbitrary constant
...
50) has
structure
λz − μ⎞ ⎞
M(t, z) = f ⎛ (λ − μ)t − ln ⎛

⎝ z−1 ⎠ ⎠,
where f can be any function with a continuous derivative
...
Since
⎛ ⎛
⎞⎞
M(0, z) = f ln z − 1
= z,
⎝ ⎝λz − μ⎠ ⎠

f must have structure
f (x) =

μe x − 1

...


λz−μ ⎞
λ exp (λ − μ)t − ln ⎛
⎝ z−1 ⎠ − 1

After simplification, M(t, z) becomes

M(t, z) =

μ ⎡ 1 − e (λ−μ)t ⎤ − ⎡ λ − μe (λ−μ)t ⎤ z

⎦ ⎣


⎡ μ − λe (λ−μ)t ⎤ − λ ⎡ 1 − μe (λ−μ)t ⎤ z






...
The coefficient of z j is the desired absolute state probability p j (t)
...


Since state 0 is absorbing, p 0 (t) is the probability that the population is extinguished at time t
...

t→∞

for λ > μ
⎩λ
Thus, if λ > μ , the population will survive to infinity with positive probability μ/λ
...
In the latter
case, the distribution function of the lifetime L of the population is
(λ−μ)t
P(L ≤ t) = p 0 (t) = 1 − e
,
1 − ρe (λ−μ)t

t ≥ 0
...

From this, applying (1
...

μ⎠
μ−λ ⎝

The trend function m(t) = E(X(t)) is principally given by


m(t) = Σ j=0 j p j (t)
...
23), m(t) can also be obtained from the z-transform:

m(t) =

∂M(t, z)

...
47)
...
49) by j and
summing from j = 0 to ∞ yields the following first-order differential equation:

m (t) = (λ − μ)m(t)
...
52)

Taking into account the initial condition p 1 (0) = 1 , its solution is

m(t) = e (λ−μ)t
...
47) by j 2 and summing from j = 0
to ∞ , a second order differential equation in Var(X(t)) is obtained
...


λ−μ ⎣

Of course, since M(t, z) is known, Var(X(t)) can be obtained from (1
...

If the linear birth- and death process starts in states s = 2, 3,
...
But it will be more complicated
© 2006 by Taylor & Francis Group, LLC

270

STOCHASTIC PROCESSES

to expand M(t,z) as a power series in z
...
52) with the initial condition p s (0) = 1 :

m(t) = s e (λ−μ)t ,

t ≥ 0
...
51) simplifies to

dz
= − dt
...
Therefore, M(t, z) has structure

M(t, z) = f ⎛ λ t − 1 ⎞ ,

z−1 ⎠
where f is a continuously differentiable function
...

⎝ z −1⎠
Hence, the desired function f is given by

f (x) = 1 − 1 ,
x

x ≠ 0
...

1 + λt − λt z

Expanding M(t, z) as a power series in z yields the absolute state probabilities:

p 0 (t) =

λt ,
1 + λt

p j (t) =

(λ t) j−1
(1 + λ t) j+1

;

j = 1, 2,
...


An equivalent form of the absolute state probabilities is

p 0 (t) =

λt ,
1 + λt

p j (t) = ⎡ 1 − p 0 (t) ⎤ 2 ⎡ p 0 (t) ⎤ j−1 ;

⎦ ⎣


Mean value and variance of X(t) are
E(X(t)) = 1,

j = 1, 2,
...


Var(X(t)) = 2 λ t
...

Example 5
...

and initial distribution and p 0 (0) = P(X(0) = 0) = 1
...
47) is

p 0 (t) = μ p 1 (t) − λ p 0 (t),
p j (t) = λ p j−1 (t) − (λ + μ j) p j (t) + (j + 1)μ p j+1 (t) ;

j = 1, 2,
...
53)

Multiplying the j th equation by z j and summing from j = 0 to ∞ yields a homogeneous linear partial differential equation for the moment generating function:

∂M(t, z)
∂M(t, z)
+ μ(z − 1)
= λ(z − 1) M(t, z)
...
54)

The corresponding system of characteristic differential equations is

dz = μ (z − 1) ,
dt

dM(t, z)
= λ(z − 1) M(t, z)
...
By combining both differential equations and letting
ρ = λ/μ,
dM(t, z)
= ρ dz
...
As a solution of (5
...
e
...

Therefore,

M(t, z) = exp { f (ln(z − 1) − μt) + ρ z}
...

Hence, the explicit representation of f is

f (x) = −ρ (e x + 1)
...



Equivalently,

© 2006 by Taylor & Francis Group, LLC

272

STOCHASTIC PROCESSES

M(t, z) = e − ρ (1−e

−μ t )

⋅ e +ρ (1−e

−μ t ) z


...
The coefficients of z j are
⎛ ρ (1 − e −μ t ) ⎞ j

⎠ − ρ (1−e −μ t )
p j (t) =
e
;
j!

j = 0, 1,
...
55)

This is a Poisson distribution with intensity function ρ (1 − e −μ t )
...

For t → ∞ the absolute state probabilities p j (t) converge to the stationary state probabilities:

ρ j −ρ
π j = lim p j (t) =
e ;
j!
t→∞

j = 0, 1,
...
In this case this distribution has a rather complicated structure, which will
not be presented here
...
53) can
be used to establish ordinary differential equations for the trend function m(t) and the
variance of X(t)
...
, their respective
solutions are
m(t) = ρ (1 − e −μ t ) + s e −μ t ,

Var (X(t)) = (1 − e −μ t ) ⎛ ρ + s e −μ t ⎞
...
7)
...
13 ( birth- and death process with immigration) For positive parameters λ , μ and ν, let transition rates be given by

λ i = i λ + ν,

μi = i μ ;

i = 0, 1,
...
Moreover, due to
immigration from outside, the population will increase by one individual in [t, t + Δt]
with probability ν t + o(Δt)
...
These probabilities do not depend on t and refer to Δt → 0
...
The differential equations (5
...

Analogously to the previous examples, the z-transformation M(t, z) of the probability
distribution {p 0 (t), p 1 (t),
...

∂t
∂z

(5
...
56) is
dz = −(λ z − μ)(z − 1) ,
dt
dM(t, z)
= ν(z − 1) M(t, z)
...

1 + λt

Generally it is not possible to expand M(t, z) as a power series in z
...

z=0

The trend function

m(t) = E(X(t)) =

∂M(t, z)
∂z
z=1

of this birth- and death process is
m(t) = ν ⎡ e (λ−μ) t − 1 ⎤

λ−μ ⎣

m(t) = ν t

for λ ≠ μ ,

for λ = μ
...

lim M(t, z) = ⎛ 1 − μ ⎞



μz⎠
t→∞
For λ < μ, the trend function (5
...

μ−λ
t→∞

© 2006 by Taylor & Francis Group, LLC

(5
...
6
...
2 Stationary State Probabilities
By (5
...
} of a birthand death process satisfies the following system of linear algebraic equations
λ0π0 − μ1π1 = 0

λ j−1 π j−1 − (λ j + μ j )π j + μ j+1 π j+1 = 0 ,
λ n−1 π n−1 − μ n π n = 0 ,

j = 1, 2,
...
58)

n < ∞
...

μ n π n = λ n−1 π n−1 ,

(5
...


Provided its existence, it is possible to obtain the general solution of (5
...

Then the system (5
...

h n−1 = 0,

n < ∞
...
, n
...
60)

1) If n < ∞ , then the stationary state probabilities satisfy the normalizing condition
n

Σ i=0 π i = 1
...




i ⎥
j=1 i=1



(5
...
61) shows that the convergence of the series
j λ
i−1
μi
j=1 i=1


Σ Π

(5
...
A sufficient condition for
the convergence of this series is the existence of a positive integer N such that

λ i−1
μ i ≤ α < 1 for all i > N
...
63)

5 CONTINUOUS-TIME MARKOV PROCESSES

275

Intuitively, this condition is not surprising: If the birth rates are greater than the corresponding death rates, the process will drift to infinity with probability 1
...
For a proof of the
following theorem see Karlin and Taylor [45]
...
3 The convergence of the series (5
...
64)
Σ Π λi
j=1 i=1 i
is sufficient for the existence of a stationary state distribution
...
64) is, moreover, sufficient for the existence of such a time-dependent solution
{p 0 (t), p 1 (t)
...
47) which satisfies the normalizing condition (5
...

Example 5
...
8 is considered once more
...
A failed machine can be attended only by
one mechanic
...
14
...
8
...


...


...
11 Transition graph of the general repairman problem

Let X(t) denote the number of failed machines at time t
...
, n}
...
11)
...
If the service rate ρ = λ/μ is introduced, formulas (5
...
58) yield the stationary state probabilities
⎧ ⎛n⎞ j
1≤j≤r
⎪ ⎝ j ⎠ ρ π0 ;
πj = ⎨
,
n!
ρ j π0 ; r ≤ j ≤ n
⎪ j−r
⎩ r r! (n−j )!
n
⎡ r ⎛n⎞ j
⎤ −1
n!
j⎥
...
65)

276

STOCHASTIC PROCESSES
Table 5
...
14
Policy 1:
j
0
1
2
3
4
5
6
7
8
9
10

n=10, r = 2
π j,1

Policy 2:
j

0
...
1022
0
...
1655
0
...
1564
0
...
0704
0
...
0095
0
...
1450
0
...
2611
0
...
1410
0
...
65) is illustrated by a
numerical example: Let n = 10, ρ = 0
...
The efficiencies of the following
two maintenance policies will be compared:
1) Both mechanics are in charge of the repair of any of the 10 machines
...

Let X n,r be the random number of failed machines and Z n,r the random number of
mechanics which are busy with repairing failed machines, dependent on the number
n of machines and the number r of available mechanics
...
1, for policy 1,
10

E(X 10,2 ) = Σ j=1 j π j,1 = 3
...
8296
...
011
5

E(Z 5,1 ) = 1 ⋅ π 1,2 + Σ j=2 π j,2 = 0
...

Hence, when applying policy 2, the average number of failed machines out of 10 and
the average number of busy mechanics out of 2 are
2 E(X 5,1 ) = 4
...
710
...
Hence, policy 1 should be preferred if there are no other relevant performance criteria
...
15 The repairman problem of example 5
...
Thus, if only one machine
has failed, then all r units are busy with repairing this machine
...
This adaptation is repeated after each failure of a machine and after
each completion of a repair
...

If j machines have failed, then the repair rate of each failed machine is
rμ/j
...
e
...

j
The birth rates are the same as in example 5
...

Thus, the stationary state probabilities are according to (5
...
61):
⎡ n
π 0 = ⎢ Σ n! ⎛ rλ ⎞

⎢ j=1 (n − j)! ⎝ μ ⎠


πj =

n! ⎛ λ ⎞ j π ;
(n − j)! ⎝ r μ ⎠ 0

j⎤






−1

,

j = 1, 2 ,
...
65), it is apparent that
in case r = 1 the uniform distribution of the repair capacity to the failed machines has
no influence on the stationary state probabilities
...

Many of the results presented so far in section 5
...


5
...
3
...
They are
characterized by transition rates which do not depend on time
...
2
...
Its birth rates are
λ i (t) = λ(t); i = 0, 1,
...

© 2006 by Taylor & Francis Group, LLC

278

STOCHASTIC PROCESSES

2) Mixed Poisson process If certain conditions are fulfilled, mixed Poisson processes
(section 3
...
3) belong to the class of nonhomogeneous birth processes
...

dt
Equivalently, a pure birth process {X(t), t ≥ 0} with transition rates λ i (t) and with absolute state distribution
{p i (t) = P(X(t) = i); i = 0, 1,
...

i
(see also Grandel [35])
...
11, now a birth- and death process {X(t), t ≥ 0} is considered which has transition rates
λ i (t) = λ(t) i , μ i (t) = μ(t) i ; i = 0, 1,
...

Thus, λ(t) can be interpreted as the transition rate from state 1 into state 2 at time t,
and μ(t) is the transition rate from state 1 into the absorbing state 0 at time t
...
47), the absolute state probabilities p j (t) satisfy
p 0 (t) = μ(t) p 1 (t),
p j (t) = (j − 1)λ(t) p j−1 (t) − j (λ(t) + μ(t)) p j (t) + (j + 1)μ(t) p j+1 (t) ;

j = 1, 2,
...
}
is given by the partial differential equation (5
...

∂t
∂z

(5
...
51)):
dz = −λ(t) z 2 + [λ(t) + μ(t)] z − μ
...

ϕ 3 (t) − z ϕ 4 (t)

Hence, for all differentiable functions g(⋅) , the general solution of (5
...

⎝ ϕ 3 (t) − z ϕ 4 (t) ⎠
From this and the initial condition M(0, z) = z it follows that there exist two functions a(t) and b(t) so that
M(t, z) =

a(t) + [1 − a(t) − b(t)] z

...
67)

By expanding M(t, z) as a powers series in z ,
p 0 (t) = a(t),
p i (t) = [1 − a(t)][1 − b(t)][ b(t)] i−1 ; i = 1, 2,
...
68)

Inserting (5
...
66) and comparing the coefficients of z yields a system of differential equations for a(t) and b(t) :
(a b − ab ) + b = λ (1 − a) (1 − b)
a = μ (1 − a) (1 − b)
...
69)

A = −μ A B
...
70)

The first differential equation is of Bernoulli type
...
69)
y (t) = 1/B(t)
gives a linear differential equation in y:
y + (μ − λ) y = μ
...
Hence the solution of (5
...


© 2006 by Taylor & Francis Group, LLC

(5
...
70) and (5
...

y
y
A
Therefore, the desired functions a and b are
a(t) = 1 − 1 e −ω(t)
y(t)
b(t) = 1 − 1 ,
y(t)

t ≥ 0
...
68) of the
nonhomogeneous birth- and death process {X(t), t ≥ 0} is completely characterized
...

∫ 0 e ω(x) μ(x) dx + 1
Hence, the process {X(t), t ≥ 0} will reach state 0 with probability 1 if the integral
t

∫ 0 e ω(x) μ(x) dx
...
72)

diverges as t → ∞
...
e
...

t

Since state 0 is absorbing, it is justified to call L the lifetime of the process
...
72) diverges as t → ∞ , L has distribution function
F L (t) = P(L ≤ t) = p 0 (t) , t ≥ 0
...


(5
...
74)

If the process {X(t), t ≥ 0} starts at s = 2, 3,
...
e
...

then the corresponding z -transform is
M(t, z) =

⎛ a(t) + [1 − a(t) − b(t)] z ⎞ s

...
73) and
(5
...


© 2006 by Taylor & Francis Group, LLC

5 CONTINUOUS-TIME MARKOV CHAINS

5
...
7
...
The basic situation is the following: Customers
arrive at a service system (queueing system) according to a random point process
...
Otherwise, an available server takes care of the customer
...
The arriving customers
constitute the input (input flow, traffic, flow of demands) and the leaving customers
the output (output flow) of the queueing system
...
These customers leave the system immediately after
arrival and are said to be lost
...
A waiting-loss system has only limited waiting
capacity for customers
...
A multi-server queueing system has s > 1 servers
...
Of course, 'customers' or 'servers'
need not be persons
...


...

s

waiting

loss

1
2

...


...
12 Scheme of a standard queueing system

Supermarkets are simple examples of queueing systems
...
Filling stations also can be thought of as queueing systems
with petrol pumps being the servers
...
In this case, the parking lots are the 'servers' and the 'service times'
are generated by the customers themselves
...
During recent years the stochastic
modeling of communication systems, in particular computer networks, has stimulated
the application of standard queueing models and the creation of new, more sophisticated ones
...
K
...
14 also fits into the framework of a queueing system
...
This
example is distinguished by a particular feature: each demand (customer) is produced
by one of a finite number n of different sources 'inside the system', namely by one of
the n machines
...

The global objective of queueing theory is to provide theoretical tools for the design
and the quantitative analysis of service systems
...

Managers of service systems do not want to 'employ' more servers than necessary for
meeting given performance criteria
...

2) The mean waiting time of a customer for service
...
In this code, A characterizes the input and B the
ser-vice, s is the number of servers, and waiting capacity is available for m
customers
...

A = GI (general independent): Customers arrive in accordance with an ordinary
renewal process (recurrent input)
...

B = M (Markov) The service times are independent, identically distributed
exponential random variables
...

For instance, M/M/1/0 is a loss system with Poisson input, one server, and
exponential service times
...
For queueing systems with an infinite
number of servers, no waiting capacity is necessary
...

In waiting systems and waiting-loss systems there are several ways of choosing waiting customers for service
...
The most important ones are:
1) FCFS (first come-first served) Waiting customers are served in accordance with
their order of arrival
...

2) LCFS (last come-first served) The customer which arrived last is served first
...

© 2006 by Taylor & Francis Group, LLC

5 CONTINUOUS-TIME MARKOV CHAINS

283

3) SIRO (service in random order) A server, when having finished with a customer,
randomly picks one of the waiting customers for service
...

A customer with higher priority is served before a customer with lower priority, but
no interruption of service takes place (head of the line priority discipline)
...

System Parameter The intensity of the input flow (mean number of arriving customers per unit time) is denoted as λ and referred to as arrival rate or arrival intensity
...
The service intensity or service rate of the servers is denoted as μ
...

The traffic intensity of a queueing system is defined as the ratio
ρ = λ/μ ,
and the degree of server utilisation is η = E(S) /s , where S is the random number of
busy servers in the steady state
...
Note that here and in what
follows in the steady state refers to stationarity
...
In
what follows, if not stated otherwise, X(t) denotes the total number of customers at a
service station (either wait- ing or being served) at time t
...
, s + m; s, m ≤ ∞
...
7
...
7
...
1 M/M/∞ - System
Strictly speaking, this system is neither a loss nor a waiting system
...
}
and transition rates (example 5
...


The corresponding time-dependent state probabilities p j (t) of this queueing system
are given by (5
...
The stationary state probabilities are obtained by passing to the
© 2006 by Taylor & Francis Group, LLC

284

STOCHASTIC PROCESSES

limit as t → ∞ in these p j (t) or by inserting the transition rates λ i = λ and μ i = i μ
with n = ∞ into (5
...
61):
πj =

ρ j −ρ
e ;
j!

j = 0, 1,
...
75)

This is a Poisson distribution with parameter ρ
...


5
...
2
...
, s} and
λ i = λ ; i = 0, 1,
...
, s
...
60) and (5
...
, s
...
76)

The probability π 0 is called vacant probability
...
e
...
77)


...
The following recursive formula for the loss
probability as a function of s can easily be verified:
s
1
π 0 = 1 for s = 0; π = ρ π 1 + 1 ;
s
s−1
The mean number of busy servers is
E(X) =

s

Σ

i=1

i πi =

s

s−1 ρ i
ρi
π0 = ρ Σ
π0
...
77),
E(X) = ρ(1 − π s )
...


© 2006 by Taylor & Francis Group, LLC

s = 1, 2,
...

1+ρ
1+ρ
Since ρ = E(Z)/E(Y),
π0 =

285

(5
...

E(Y) + E(Z)
E(Y) + E(Z)

Hence, π 0 (π 1 ) is formally equal to the stationary availability (nonavailability) of a
system with mean lifetime E(Y) and mean renewal time E(Z) the operation of which
is governed by an alternating renewal process (section 3
...
6, formula (3
...

Example 5
...
Assume that the input (calls of subscribers wishing to be connected) has
intensity λ = 2 [min −1 ]
...
5 [min]
...

1) What is the loss probability in case of s = 7 lines? The corresponding traffic
intensity is ρ = λ /μ = 6
...
185
...
185) = 4
...
89/7 = 0
...

2) What is the minimal number of lines which have to be provided in order to make
sure that at least 95% of the desired connections can be made? The respective loss
probabilities for s = 9 and s = 10 are
π 9 = 0
...
043
...

However, in this case the degree of server utilization is smaller than with s = 7 lines:
η = η(10) = 0
...

It is interesting and practically important that the stationary state probabilities of the
queueing system M/G/s/0 also have the structure (5
...
That is, if the respective
traffic intensities ρ of the systems M/M/s/0 and M/G/s/0 are equal, then their station-

© 2006 by Taylor & Francis Group, LLC

286

STOCHASTIC PROCESSES

ary state probabilities coincide: for both systems they are given by (5
...
A
corresponding result holds for the queueing systems M/M/∞ and M/G/∞
...
75) with the stationary state probabilities (3
...
5 for the M/G/∞ -system
...
An analogous property can be defined with regard to the input
...
78), the M/M/1/0 - system is insensitive both with regard to arrival and service
time distributions ( full insensitivity)
...

5
...
2
...
The service times are independent, exponentially distributed random variables with parameter µ
...
14: during the repair of a machine, this
machine cannot produce another demand for repair
...
Let X(t) denote the number of customers being served at
time t
...
, s}
...
Therefore, the transition rates of this birth- and death process are
λ j = (n − j)λ ;
μj = j μ ;

j = 0, 1, 2,
...
, s
...
13 Engset's loss system in state X(t)=j

© 2006 by Taylor & Francis Group, LLC

5 CONTINUOUS-TIME MARKOV CHAINS

287

Inserting these transition rates into (5
...
61) with n = s yields the stationary
state distribution for Engset's loss system:
⎛n⎞ρj
⎝j⎠

πj = s

Σ

i=0

⎛ n ⎞ ρi
⎝i⎠

j = 0, 1,
...


;

In particular, π 0 and the loss probability π s are
π0 = s

Σ

i=0

1
,
⎛ n ⎞ ρi
⎝i⎠

⎛ n ⎞ ρs
⎝s⎠
πs = s

...
14, a
closed queueing system
...
7
...
7
...
1 M/M/s/∞ - System
The Markov chain {X(t), t ≥ 0} which models this system is defined as follows: If
X(t) = j with 0 ≤ j ≤ s, then j servers are busy at time t
...
In either case, X(t) is the
total number of customers in the queueing system at time t
...
} and transition rates
λj = λ ;

j = 0, 1,
...
, s ;

μ j = s μ for j > s
...
79)

In what follows it is assumed that
ρ = λ/μ < s
...
Hence, no
equilibrium state between arriving and leaving customers is possible
...
62) converges and
condition (5
...
Inserting the transition rates (5
...
60) yields
πj =

ρj
π
j! 0
πj =

© 2006 by Taylor & Francis Group, LLC

for j = 0, 1,
...


(5
...



The probability π w that an arriving customer finds all servers busy is


π w = Σ i=s π i
...
Making again use of the geometrical series yields a simple
formula for π w :
πs
πw =

...
81)
1 − ρ/s
In what follows, all derivations refer to the system in the steady state
...


(5
...


(5
...
83) are left as an exercise to the reader
...
83) holds for any GI/G/s/∞ -system
...
By making use of (5
...




(s − ρ) 2 ⎥



(5
...

Then the mean queue length is




E(L) = Σ i=s (i − s) π i = Σ i=s i π i − s π w
...
82)-(5
...

(s − ρ) 2

(5
...
By the total probability rule,


P(W > t) = Σ i=s P(W > t X = i) π i
...
86)

If a customer enters the system when it is in state X = i ≥ s , then all servers are busy
so that the current output is a Poisson process with intensity sμ
...
Therefore, the probability that the service of
precisely k customers, 0 ≤ k ≤ i − s , will be finished in this interval of length t is
(s μ t) k −s μ t
e

...
86),
P(W > t) = e −s μ t



i−s (s μ t) k

i=s

k=0

Σ πi Σ

k!

= π 0 e −s μ t



Σ

ρi

i−s (s μ t) k

Σ

i=s s!s i−s k=0

k!


...
25), and making use of both the power series of e x and the
geometrical series yields
P(W > t) = π 0

ρ s −s μ t ∞ ⎛ ρ ⎞ j j (s μ t) k
e
Σ ⎝ s ⎠ Σ k!
s!
j=0
k=0

= π s e −s μ t
= π s e −s μ t

∞ (λt) k

Σ

k=0

k!

∞ (s μ t) k ∞ ρ j
Σ ⎛s⎞
⎝ ⎠
k!
k=0
j=k

Σ



1
...


Note that P(W > 0) is the waiting probability (5
...

The mean waiting time of a customer is

s
E(W) = ∫ 0 P(W > t) dt =
πs
...
87)

A comparison of (5
...
87) yields Little's formula or Little's law:
E(L) = λ E(W)
...
88)

Little's formula can be motivated as follows: The mean value of the sum of the waiting times arising in an interval of length τ is τ E(L)
...
Hence,
τ E(L) = λτ E(W),
which is Little's formula
...
84), an equivalent representation of Little's formula is
E(X ) = λ E(T) ,
(5
...
e
...
Hence, the mean value of T is
E(T) = E(W) + 1/μ
...
For a proof of this proposition and
other 'Little type formulas' see Franken et al
...

5
...
3
...
Hence, the corresponding
stochastic process {X(t), t ≥ 0} describing the development in time of the number of
customers in the system need no longer be a homogeneous Markov chain as in the
previous queuing models
...
4)
...
Customers arrive according to a homogeneous Poisson process with positive intensity λ
...
}
be its probability distribution
...

i!
Hence,
ai =



(λ t) i −λt
e g(t) dt ,
i!
0



i = 0, 1,
...


Consequently, if g(⋅) denotes the Laplace transform of g(t) , then
M A (z) = g(λ − λz)
...
90)

By (1
...

z=1 = −λ dr
dz
r=0

(5
...
If X n denotes the number of customers in the system immediately after T n , then {X 1 , X 2 ,
...
} and one-step transition probabilities
⎧ aj
if i = 0 and j = 0, 1, 2,
...


otherwise
⎩ 0

(5
...
; X 0 = 0
...

The discrete-time Markov chain {X 0 , X 1 ,
...
Hence, on
condition ρ = λ/μ < 1 it has a stationary state distribution {π 0 , π 1 ,
...
9): Inserting
the transition probabilities p i j given by (5
...
9) gives
π 0 = a 0 (π 0 + π 1 ) ,
j+1

π j = π 0 a j + Σ i=1 π i a j−i+1 ; j = 1, 2,
...
93)

Let M X (z) be the z-transform of the state X of the system in the steady state:


M X (z) = Σ j=0 π j z j
...
93) by z j and summing up from j = 0 to ∞ yields




j+1

M X (z) = π 0 Σ j=0 a j z j + Σ j=0 z j Σ i=1 π i a j−i+1
= π 0 M A (z) + M A (z)


Σ i=1 π i z i−1 a j−i+1

= π 0 M A (z) + M A (z)

M X (z) − π 0

...


To determine π 0 , note that
M A (1) = M X (z) = 1
and
M A (z) − z
M (z) − 1 ⎞
dM A (z)

= lim ⎜ 1 + A
⎟ =1−
z=1 = 1 − ρ
...
94)

292

STOCHASTIC PROCESSES

Therefore, by letting z ↑ 1 in (5
...


(5
...
90), (5
...
95) yields the Formula of Pollaczek-Khinchin :
M X (z) = (1 − ρ)

1−

1−z
,
z
g(λ − λz)

z < 1
...
96)

According to its derivation, this formula gives the z-transform of the stationary
distribution of the random number X of customers in the system immediately after
the completion of a customer's service
...
Thus, X is the random number of customers at the
system in its steady state
...
} exists and is a
solution of (5
...
Hence, numerical parameters as mean value and variance of the
number of customers in the system in the steady state can be determined by (5
...
23)
...

z=1 = ρ +
2 (1 − ρ)
dz

(5
...
Then T has structure
T = W + Z,
where W is the time a customer has to wait for service (waiting time)
...
Since W and Z
are independent,
f T (r) = f W (r) g(r)
...
98)

The number of customers in the system after the departure of a served one is equal to
the number of customers which arrived during the sojourn time of this customer
...

i!
0



The corresponding z-transform M X (z) of X or, equivalently, the z-transform of the
stationary distribution {π 0 , π 1 ,
...
90))
M X (z) = f T (λ − λ z)
...
98),
M X (z) = f W (λ − λz) g(λ − λz)
...
96) yields the Laplace transform of f W (r) :
r
f W (r) = (1 − ρ)

...
19) and (1
...
99)

λ E(Z 3 )

...


Thus,
E(S) = ρ
...
Hence, by (5
...

2 (1 − ρ)

(5
...
97) and (5
...
88):
E(L) = λ E(W)
...
7
...
3 GI/M/1/∞ - System
In this single-server system, the interarrival times are given by an ordinary renewal
process {Y 1 , Y 2 ,
...
The service times are identically
exponential distributed with parameter μ
...
If an arriving customer finds the server busy, it joins
the queue
...
However, as in the previous section, an embedded homogeneous discrete-time
Markov chain can be identified: The n th customer arrives at time
n

T n = Σ i=1 Y i ; n = 1, 2,
...
Then, 0 ≤ X n ≤ n; n = 0, 1,
...
} is a Markov chain with parameter space
T = {0, 1,
...
Given that the system starts operating at
time t = 0, the initial distribution of this discrete-time Markov chain is
P(X 0 = 0) = 1
...
}, let D n be the number of customers leaving the station in the interval [T n , T n+1 ) of length Y n+1
...
,
By theorem 3
...
Hence, for i ≥ 0 and 1 ≤ j ≤ i + 1,
(μ t) i+1−j −μt
P(X n = j X n−1 = i, Y n+1 = t ⎞ =
⎠ (i + 1 − j)! e ; n = 1, 2,
...

of the Markov chain {X 0 , X 1 ,
...


The normalizing condition yields p i 0 :
i+1

p i0 = 1 − Σ j=1 p i j
...
} is a homogeneous Markov chain
...

Based on the embedded Markov chain {X 0 , X 1 ,
...


5
...
4

Waiting-Loss Systems

5
...
4
...
A customer
which at arrival finds no idle server and the waiting capacity occupied is lost, that is
it leaves the system immediately after arrival
...
, s + m} and transition rates
λ j = λ,

0 ≤ j ≤ s + m − 1,

⎧ j μ for
1≤j≤s
μj = ⎨

...
60) and (5
...


π0 = ⎢ Σ 1 ρ j + Σ


⎢ j=0 j!

j=s s! s j−s



The second series in π 0 can be summed up to obtain




π0 = ⎨





⎡ s−1 1 j 1 s 1−(ρ /s) m+1 ⎤ −1
⎢Σ

ρ + ρ
for ρ ≠ s

s!
1−ρ /s ⎥
⎢ j=0 j!



−1
⎡ s−1 1 j
ss ⎤

⎢Σ
ρ + (m + 1) ⎥

s! ⎥
⎢ j=0 j!




...
e
...
The respective probabilities π f and π w that an arriving customer finds a
free (idle) server or waits for service are
s−1

π f = Σ i=0 π i ,

s+m−1

π w = Σ i=s

πi
...

Thus, the degree of server utilisation is
η = ρ (1 − π s+m ) /s
...

Example 5
...
On average, 1
...
The mean time a car
occupies a petrol pump is 5 minutes
...
Since λ = 1
...
2, the traffic intensity is ρ = 6
...
0167
...
00225
...
0167) = 5
...

After having obtained these figures, the owner of the filling station considers 2 out of
the 8 petrol pumps superfluous and has them pulled down
...
The corresponding loss probability π 12 = π 12 (6, 6) becomes
6
π 12 (6, 6) = 6 π 0 (6, 6) = 0
...

6!

Thus, about 10% of all arriving cars leave the station without having filled up
...
The corresponding loss probability π 16 = π 16 (6, 10) is
6
π 16 (6, 10) = 6 π 0 (6, 10) = 0
...

6!

Formula
−1
6 ⎡ 5
6⎤
π 6+m (6, m) = 6 ⎢ Σ 1 6 j + (m + 1) 6 ⎥


6! ⎢ j=0 j!
6! ⎥


yields that additional waiting capacity for 51 cars has to be provided to equalize the
loss caused by reducing the number of pumps from 8 to 6
...
7
...
2 M/M/s/∞ -System with Impatient Customers
Even if there is waiting capacity for arbitrarily many customers, some customers
might leave the system without having been served
...
If the service of a
customer does not begin before its patience time expires, the customer leaves system
...
Real time monitoring and control systems have memories for data to be processed
...
Bounded waiting
times are also typical for packed switching systems, for instance in computer-aided
booking systems
...
Of the many available models dealing with such situations, the following one is considered in some detail: Customers arriving at an M/M/s/∞-system have independent patience times, exponentially distributed with parameter ν. The corresponding death rates are

μ_j = j μ  for j = 1, 2, ..., s ,
μ_j = s μ + (j − s) ν  for j = s, s + 1, ...

Thus, for j > s the death rates increase without bound. Hence the sufficient condition for the existence of a stationary distribution stated in theorem 5.6 is fulfilled. That is, the system is self-regulating, aiming at reaching the equilibrium state. From (5.60) and (5.61), the stationary state probabilities are

π_j = (1/j!) ρ^j π_0  for j = 1, 2, ..., s ,
π_j = (ρ^s/s!) [ λ^{j−s} / Π_{i=1}^{j−s} (s μ + i ν) ] π_0  for j = s + 1, s + 2, ... ,

where

π_0 = [ Σ_{j=0}^{s} (1/j!) ρ^j + (ρ^s/s!) Σ_{j=s+1}^{∞} λ^{j−s} / Π_{i=1}^{j−s} (s μ + i ν) ]^{−1} .

Let L denote the random length of the waiting queue in the steady state. Then

E(L) = Σ_{j=s+1}^{∞} (j − s) π_j .

In this model, the loss probability π_v is not strictly associated with the number of customers in the system: a customer is lost if its patience time expires before its service begins. Therefore, 1 − π_v is the probability that a customer leaves the system after having been served. Averaging the abandonment rate over the states j ∈ {s+1, s+2, ...} one obtains

ν E(L) = λ π_v .

(Compare to Little's formula.)
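The infinite sums above are easily evaluated by truncation. The following sketch (function name and truncation level are my own choices) builds the distribution from the birth-and-death ratios; note that for ν = μ the death rates reduce to jμ for all j, so the distribution must be Poisson with mean ρ — a convenient consistency check:

```python
def impatient_stationary(lam, mu, nu, s, jmax=200):
    """Stationary state probabilities of the M/M/s/inf system with
    exponential patience times (parameter nu), truncated at jmax."""
    x = [1.0]
    for j in range(1, jmax + 1):
        death = j * mu if j <= s else s * mu + (j - s) * nu
        x.append(x[-1] * lam / death)
    norm = sum(x)
    return [v / norm for v in x]

pi = impatient_stationary(lam=3.0, mu=1.0, nu=1.0, s=2)
EL = sum((j - 2) * p for j, p in enumerate(pi) if j > 2)  # mean queue length
pi_v = 1.0 * EL / 3.0        # loss probability via nu * E(L) = lam * pi_v
```

The truncation at jmax is harmless because the death rates grow linearly, so the tail of the distribution decays faster than geometrically.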

© 2006 by Taylor & Francis Group, LLC

298

STOCHASTIC PROCESSES

Variable Arrival Intensity  Finite waiting capacities and patience times imply that in the end only a 'thinned flow' of potential customers will be served. However, those customers which actually enter the system do not leave it without service. Alternatively, reluctant customers can be modeled by arrival intensities which decrease with increasing queue length. For example, the following birth rates have this property:
λ_j = λ  for j = 0, 1, ..., s − 1 ;    λ_j = s λ/(j + α)  for j = s, s + 1, ... ;    α ≥ 0 .
5.7.6.1 System with Priorities
A single-server queueing system with waiting capacity for m = 1 customer is subject to two independent Poisson inputs 1 and 2 with respective intensities λ_1 and λ_2. Type 1-customers have absolute (preemptive) priority, i.e. when a type 1- and a type 2-customer are in the system, the type 1-customer is being served. Thus, a type 1-customer arriving while a type 2-customer is being served interrupts that service. The displaced customer will occupy the waiting facility if it is empty; otherwise it is lost. A waiting type 2-customer also has to leave the system when a type 1-customer arrives, since the newcomer will occupy the waiting facility. An arriving type 1-customer is lost only when both server and waiting facility are occupied by other type 1-customers.

The service times of type 1- and type 2-customers are assumed to have exponential distributions with respective parameters μ_1 and μ_2. Note that if X(t) denotes the system state at time t, the stochastic process {X(t), t ≥ 0} can be treated as a one-dimensional Markov chain, since scalars can be assigned to the six possible system states, which are given as two-component vectors.

Figure 5.14 Transition graph for a single-server priority queueing system with m = 1

According to (5.9), the stationary state probabilities satisfy

(λ 1 + λ 2 ) π (0,0) = μ 1 π (1,0) + μ 2 π (0,1)
(λ 1 + λ 2 + μ 1 ) π (1,0) = λ 1 π (0,0) + μ 1 π (2,0)
(λ 1 + λ 2 + μ 2 ) π (0,1) = λ 2 π (0,0) + μ 1 π (1,1) + μ 2 π (0,2)
(λ 1 + μ 1 ) π (1,1) = λ 2 π (1,0) + λ 1 π (0,1) + λ 1 π (0,2)
μ 1 π (2,0) = λ 1 π (1,0) + λ 1 π (1,1)
(λ 1 + μ 2 ) π (0,2) = λ 2 π (0,1)
π (0,0) + π (1,0) + π (0,1) + π (1,1) + π (2,0) + π (0,2) = 1
m = 0  Since there is no waiting capacity, each customer, notwithstanding its type, is lost if the server is busy with a type 1-customer. The state space is Z = {(0, 0), (0, 1), (1, 0)}. Figure 5.15 shows the transition rates. The corresponding system (5.9) for the stationary state probabilities is
(λ 1 + λ 2 ) π (0,0) = μ 1 π (1,0) + μ 2 π (0,1)
μ 1 π (1,0) = λ 1 π (0,0) + λ 1 π (0,1)
1 = π (0,0) + π (1,0) + π (0,1)
The solution is

π_(0,0) = μ_1 (λ_1 + μ_2) / [(λ_1 + μ_1)(λ_1 + λ_2 + μ_2)] ,
π_(0,1) = λ_2 μ_1 / [(λ_1 + μ_1)(λ_1 + λ_2 + μ_2)] ,
π_(1,0) = λ_1 / (λ_1 + μ_1) .

Figure 5.15 Transition graph for a 1-server priority loss system

π_(1,0) is the loss probability for type 1-customers. It is simply the probability that the service time of type 1-customers is greater than their interarrival time. Given that at the arrival time of a type 2-customer the server is idle, this customer is lost if and only if during its service a type 1-customer arrives; the probability of this event is

∫_0^∞ (1 − e^{−λ_1 t}) μ_2 e^{−μ_2 t} dt = λ_1/(λ_1 + μ_2) .

Therefore, the (total) loss probability for type 2-customers is

π_l = [λ_1/(λ_1 + μ_2)] π_(0,0) + π_(0,1) + π_(1,0) .
Example 5.18  Let λ_1 = 0.1, λ_2 = 0.2, and μ_1 = μ_2 = 0.2. Solving the system above yields the stationary state probabilities for m = 1:

π_(0,0) = 0.2105, π_(1,0) = 0.0992, π_(0,1) = 0.2165, π_(1,1) = 0.1865, π_(2,0) = 0.1429, π_(0,2) = 0.1444 .

If there is no waiting capacity (m = 0), then

π_(0,0) = 0.4000 , π_(1,0) = 0.3333 , π_(0,1) = 0.2667 ,

and the loss probability for type 2-customers is π_l = 0.7333.
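The closed-form solution of the m = 0 priority loss system can be checked numerically. A minimal sketch (the function name is mine; l1, l2, m1, m2 stand for λ_1, λ_2, μ_1, μ_2):

```python
def priority_loss_m0(l1, l2, m1, m2):
    """Stationary probabilities of the single-server priority loss system
    (m = 0), where type-1 customers preempt type-2 customers."""
    d = (l1 + m1) * (l1 + l2 + m2)
    p00 = m1 * (l1 + m2) / d      # server idle
    p01 = l2 * m1 / d             # type-2 customer in service
    p10 = l1 / (l1 + m1)          # type-1 customer in service
    # total loss probability for type-2 customers
    loss2 = l1 / (l1 + m2) * p00 + p01 + p10
    return p00, p01, p10, loss2
```

With the data of the example above, `priority_loss_m0(0.1, 0.2, 0.2, 0.2)` reproduces the probabilities 0.4000, 0.2667, 0.3333 and the loss probability 0.7333.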
5.7.6.2 M/M/1/m-System with Unreliable Server
If the implications of server failures on the system performance are not negligible, server failures have to be taken into account when building up the mathematical model. As an example, consider the M/M/1/m-system. The lifetime of the server is assumed to have an exponential distribution with parameter α, both in its busy phase and in its idle phase, and the subsequent renewal time of the server is assumed to be exponentially distributed with parameter β. When the server fails, all customers leave the system, i.e. the customer being served and the waiting customers, if there are any, are lost. Customers arriving during a renewal of the server are lost, too.
Figure 5.16 Transition graph of the M/M/1/m-system with unreliable server

The system state is defined by

X(t) = j if at time t there are j customers in the system and the server is operating; j = 0, 1, ..., m + 1,
X(t) = m + 2 if at time t the server is being renewed.

The transition rates are (Figure 5.16):

q_{j,j+1} = λ ;  j = 0, 1, ..., m ,
q_{j,j−1} = μ ;  j = 1, 2, ..., m + 1 ,        (5.100)
q_{j,m+2} = α ;  j = 0, 1, ..., m + 1 ,
q_{m+2,0} = β .

According to (5.100), the stationary state probabilities satisfy the balance equations

(λ + α) π_0 = μ π_1 + β π_{m+2} ,
(λ + μ + α) π_j = λ π_{j−1} + μ π_{j+1} ;  j = 1, 2, ..., m ,        (5.101)
(μ + α) π_{m+1} = λ π_m ,
β π_{m+2} = α π_0 + α π_1 + ... + α π_{m+1} .

The last equation is equivalent to

β π_{m+2} = α (1 − π_{m+2}) .

Hence,

π_{m+2} = α / (α + β) .        (5.103)

Now, starting with the first equation in (5.101), the state probabilities π_1, π_2, ..., π_{m+1} can be successively determined.
Modification of the Model  It makes sense to assume that the server can only fail when it is busy. In this case the failure rates q_{j,m+2} = α apply only for j = 1, 2, ..., m + 1, while the other rates in (5.100) remain valid. The transition graph is that of Figure 5.16 with the arrow from node 0 to node m + 2 deleted. The first equations of (5.101) remain unchanged, whereas the last one becomes

β π_{m+2} = α π_1 + α π_2 + ... + α π_{m+1} .

The last equation is equivalent to

β π_{m+2} = α (1 − π_0 − π_{m+2}) ,

so that

π_{m+2} = [α/(α + β)] (1 − π_0) .

Starting with the first equation in (5.101), the state probabilities π_1, π_2, ..., π_{m+1} can be obtained as above. In particular, for m = 0,

π_0 = β(α + μ) / [β(α + μ) + λ(α + β)] .
Comment  It is interesting that this queueing system with unreliable server can be interpreted as a queueing system with priorities and an absolutely reliable server: a server failure corresponds to the arrival of a customer with absolute priority. The service provided to this 'customer' consists in the renewal of the server. Hence it is not surprising that the theory of queueing systems with priorities also provides solutions for more complicated queueing systems with unreliable servers than the one considered in this section.
7
...
7
...
1 Introduction
Customers frequently need several kinds of service so that, after leaving one service
station, they have to visit one or more other service stations in a fixed or random
order
...
12
...

Typical examples are technological processes for manufacturing (semi-)finished products. Queueing systems are frequently subject to several inputs, i.e. customers with different service requirements have to be attended. Examples of such situations are computer and communication networks. If technical systems have to be repaired, then, depending on the nature and the extent of the damage, service of different production departments in a workshop is needed.

Using a concept from graph theory, the service stations of a queueing network are called nodes. Each node may have its own external input. Thus, in an open network, each node may have to serve external and internal customers, where internal customers are the ones which arrive from other nodes. A closed network has no external input; consequently, no customer departs from the network. The directed edges between the nodes symbolize the possible transitions of customers from one node to another. In what follows, the nodes are denoted as 1, 2, ..., n.
5.7.7.2 Open Queueing Networks

The exact analysis of general queueing networks is difficult. Hence, this section is restricted to a rather simple class of queueing networks, the Jackson queueing networks. They have the following properties:

1) The network consists of n nodes; node i has s_i identical servers and unbounded waiting capacity; i = 1, 2, ..., n.

2) The service times of all servers at node i are independent, identically distributed exponential random variables with parameter (intensity) μ_i.



3) External customers arrive at node i in accordance with a homogeneous Poisson process with intensity λ_i. The external arrival processes are independent of each other and of the service times.

4) When the service of a customer at node i has been finished, the customer makes a transition to node j with probability p_ij or leaves the network with probability a_i. Let P = ((p_ij)) be the matrix of these transition probabilities and let I be the identity matrix. According to the definition of the a_i and p_ij,

a_i + Σ_{j=1}^{n} p_ij = 1 .        (5.105)

In a Jackson queueing network, each node is principally subjected to both external and internal input. Let α_j denote the total input intensity to node j. In the steady state, α_j must be equal to the total output intensity from node j. Thus,

Σ_{i=1}^{n} α_i p_ij

is the total internal input intensity to node j, so that the α_j satisfy the traffic equations

α_j = λ_j + Σ_{i=1}^{n} α_i p_ij ;  j = 1, 2, ..., n .        (5.106)

By introducing the vectors α = (α_1, α_2, ..., α_n) and λ = (λ_1, λ_2, ..., λ_n), the relationship (5.106) can be written as α = λ + α P. Since I − P is assumed to be nonsingular, the vector of the total input intensities α is

α = λ (I − P)^{−1} .        (5.107)
Even under the assumptions stated, the total inputs at the nodes and the outputs from the nodes are in general not homogeneous Poisson processes. Let X_i(t) be the random number of customers at node i at time t; its realizations are denoted as x_i, x_i = 0, 1, ... The state of the network at time t is given by the vector X(t) = (X_1(t), X_2(t), ..., X_n(t)) with realizations x = (x_1, x_2, ..., x_n). The set of all these vectors x forms the state space of the Markov chain {X(t), t ≥ 0}: Z = {0, 1, ...}^n, i.e. Z is the set of all those n-dimensional vectors the components of which assume nonnegative integers.

To determine the transition rates of {X(t), t ≥ 0}, the n-dimensional vector e_i = (0, ..., 0, 1, 0, ..., 0) is introduced, whose ith component is 1 and whose other components are 0; i = 1, 2, ..., n. Since the components of any state vector x are nonnegative integers, each x can be represented as a linear combination of all or some of the e_1, e_2, ..., e_n. In particular, x + e_i (x − e_i) is the vector which arises from x by increasing (decreasing) the ith component by 1.

1) When an external customer arrives at node i, the Markov chain makes a transition from state x to state x + e_i.

2) When a service at node i is finished, x_i > 0, and the served customer leaves the network, the Markov chain makes a transition to state x − e_i.

3) When a service at node i is finished, x_i > 0, and the served customer moves to node j, the Markov chain makes a transition to state x − e_i + e_j.

Therefore, starting from state x = (x_1, x_2, ..., x_n), the transition rates are

q(x, x + e_i) = λ_i ,
q(x, x − e_i) = μ_i(x_i) a_i ,        (5.108)
q(x, x − e_i + e_j) = μ_i(x_i) p_ij ;  j ≠ i ,

where μ_i(x_i) = min(x_i, s_i) μ_i is the total service intensity at node i when x_i customers are present. Note that, in view of (5.105),

Σ_{j, j≠i} p_ij = 1 − p_ii − a_i .
According to (5.80), define for node i

ϕ_i(j) = (1/j!) ρ_i^j ϕ_i(0)  for j = 1, 2, ..., s_i ,
ϕ_i(j) = (1/(s_i! s_i^{j−s_i})) ρ_i^j ϕ_i(0)  for j > s_i ,    ρ_i < s_i ,

where

ϕ_i(0) = [ Σ_{j=0}^{s_i−1} (1/j!) ρ_i^j + ρ_i^{s_i} / ((s_i − 1)! (s_i − ρ_i)) ]^{−1} ,  ρ_i < s_i .

(These are the stationary state probabilities of the queueing system M/M/s_i/∞.) The stationary state probabilities of the queueing network are simply obtained by multiplying the corresponding state probabilities of the queueing systems M/M/s_i/∞; i = 1, 2, ..., n: Let α = (α_1, α_2, ..., α_n) be given by (5.107) and ρ_i = α_i/μ_i < s_i; i = 1, 2, ..., n; then the stationary probability of state x = (x_1, x_2, ..., x_n) is

π_x = ϕ_1(x_1) ϕ_2(x_2) ... ϕ_n(x_n) .        (5.110)
This implies that each node of the network behaves like an M/M/s_i/∞-system. However, this is only a formal analogy. In particular, the total input into a node need not be a homogeneous Poisson process. The product form (5.110) of the stationary state probabilities proves that the queue lengths at the nodes in the steady state are independent random variables. To verify that the stationary state distribution indeed has the product form (5.110), one substitutes (5.110) into the corresponding system of balance equations. Using (5.105) and (5.106), one obtains an identity after some tedious algebra.
Example 5.19  The simplest Jackson queueing network arises if n = 1. This leads to a queueing system with feedback.

Figure 5.17 Queueing system with feedback

For instance, when servers have done a bad job, then the affected customers will soon return to exercise possible guarantee claims. Roughly speaking, a single-node Jackson queueing network is a mixture between an open and a closed waiting system (Figure 5.17). A customer leaves the system with probability a or reenters the system with probability p_11 = 1 − a. From (5.106), the total input rate α into the system satisfies

α = λ + α(1 − a) .

Thus,

α = λ/a .
If ρ = λ/μ satisfies ρ/a < s, a stationary distribution exists. In this case the stationary state probabilities are

π_j = (1/j!) (ρ/a)^j π_0  for j = 1, 2, ..., s − 1 ,
π_j = (1/(s! s^{j−s})) (ρ/a)^j π_0  for j = s, s + 1, ... ,

where

π_0 = [ Σ_{j=0}^{s−1} (1/j!) (ρ/a)^j + (ρ/a)^s / ((s − 1)! (s − ρ/a)) ]^{−1} .



Example 5.20  Customers arrive at node 1 of a sequential queueing network in accordance with a homogeneous Poisson process with intensity λ_1; no other node has an external input. For example, a 'customer' may be a car being manufactured on an assembly line. They subsequently visit in this order the nodes 1, 2, ..., n and then leave the network (Figure 5.18).

Figure 5.18 Sequential queueing network

Thus,

p_{i,i+1} = 1 ;  i = 1, 2, ..., n − 1 ;  a_1 = a_2 = ... = a_{n−1} = 0 ,  a_n = 1 .

According to (5.106), α_1 = α_2 = ... = α_n = λ_1. Hence, if all nodes are single-server nodes with service intensities μ_1, μ_2, ..., μ_n, a stationary state distribution exists if

ρ_i = λ_1/μ_i < 1 ;  i = 1, 2, ..., n .

The stationary probability of state x = (x_1, x_2, ..., x_n) is

π_x = Π_{i=1}^{n} ρ_i^{x_i} (1 − ρ_i) .

Of course, the sequential network can be generalized by taking feedback into account.

Example 5.21  Failed robots arrive at the admissions department (1) of a workshop in accordance with a homogeneous Poisson process with intensity λ_1 = 0.2 [h^{−1}]. In this department a failure diagnosis is made. Depending on the result, the robots will have to visit other departments of the workshop. The failure diagnosis in department (1) sends 60% of the robots to department (2), 20% to department (3), and 20% to department (4).
Figure 5.19 Transition graph of the workshop network in example 5.21
After having been maintained in department (2), 60% of the robots leave the workshop, 30% are sent to department (3), and 10% to department (4). After a repair in department (3), 20% of the robots go back to department (2), 10% are sent to department (4), and 70% leave the workshop. After elimination of possible software failures all robots leave the workshop.

The following transition probabilities result from the transfer of robots between the departments:

p_12 = 0.6 , p_13 = 0.2 , p_14 = 0.2 , p_23 = 0.3 , p_24 = 0.1 , p_32 = 0.2 , p_34 = 0.1 ,

a_1 = 0 , a_2 = 0.6 , a_3 = 0.7 , a_4 = 1 .

The service intensities of the departments are

μ_1 = 1 , μ_2 = 0.45 , μ_3 = 0.4 , μ_4 = 0.1 [h^{−1}] .

Figure 5.19 illustrates the possible transitions between the departments.
The system of equations (5.106) for the total input intensities is

α_1 = 0.2
α_2 = 0.6 α_1 + 0.2 α_3
α_3 = 0.2 α_1 + 0.3 α_2
α_4 = 0.2 α_1 + 0.1 α_2 + 0.1 α_3 .

The solution is

α_1 = 0.20 , α_2 = 0.14 , α_3 = 0.08 , α_4 = 0.06 .
The corresponding traffic intensities ρ_i = α_i/μ_i are

ρ_1 = 0.2 , ρ_2 = 0.3 , ρ_3 = 0.2 , ρ_4 = 0.6 .

According to (5.110), the stationary probability of state x = (x_1, x_2, x_3, x_4) for single-server nodes is

π_x = Π_{i=1}^{4} ρ_i^{x_i} (1 − ρ_i)

or

π_x = 0.1792 ⋅ (0.2)^{x_1} (0.3)^{x_2} (0.2)^{x_3} (0.6)^{x_4} ,  x ∈ {0, 1, ...}^4 ,

since π_{x_0} = (1 − 0.2)(1 − 0.3)(1 − 0.2)(1 − 0.6) = 0.1792, where x_0 = (0, 0, 0, 0). The probability that, in the steady state, there is at least one robot in the admissions department, is

P(X_1 > 0) = 0.8 Σ_{i=1}^{∞} (0.2)^i = 0.2 .

Analogously,

P(X_2 > 0) = 0.3 , P(X_3 > 0) = 0.2 , and P(X_4 > 0) = 0.6 .

Thus, when there is a delay in servicing defective robots, the cause is most probably department (4) in view of the comparatively high amount of time necessary for finding and removing software failures.
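The product-form computations of this example reduce to a few lines. A minimal sketch with the traffic intensities found above (variable names are mine):

```python
# traffic intensities of the four single-server nodes of the workshop example
rho = [0.2, 0.3, 0.2, 0.6]

# pi_{(0,0,0,0)} = prod_i (1 - rho_i)
p_empty = 1.0
for r in rho:
    p_empty *= 1.0 - r

# for an M/M/1 node, P(X_i > 0) = rho_i
p_busy = rho
```

Running this gives p_empty = 0.1792 and p_busy = [0.2, 0.3, 0.2, 0.6], matching the values derived above.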
5.7.7.3 Closed Queueing Networks

Analogously to the closed queueing system, customers cannot enter a closed queueing network 'from outside', and no customer departs from it. Hence, the number of customers in a closed queueing network is a constant N.

When the service of a customer at node i is finished, then the customer moves with probability p_ij to node j for further service, where

Σ_{j=1}^{n} p_ij = 1 ;  i = 1, 2, ..., n ,        (5.111)

since no customer leaves the network. Provided the discrete Markov chain given by transition matrix P = ((p_ij)) and state space Z = {1, 2, ..., n} is irreducible, it has a stationary distribution {π_1, π_2, ..., π_n} which, by the results of chapter 4, satisfies

π_j = Σ_{i=1}^{n} π_i p_ij ;  j = 1, 2, ..., n ,  Σ_{i=1}^{n} π_i = 1 .        (5.112)

Let X_i(t) be the random number of customers at node i at time t and X(t) = (X_1(t), X_2(t), ..., X_n(t)). The state space of the Markov chain {X(t), t ≥ 0} is

Z = { x = (x_1, x_2, ..., x_n) : x_1 + x_2 + ... + x_n = N } ,        (5.113)

where the x_i are nonnegative integers. Let μ_i = μ_i(x_i) be the service intensity of all servers at node i if there are x_i customers at this node, μ_i(0) = 0. The transition rates are defined analogously to (5.108). In view of (5.111), the rate of leaving state x is

q_x = Σ_{i=1}^{n} μ_i(x_i)(1 − p_ii) .
According to (5.28), the stationary distribution {π_x, x ∈ Z} of the Markov chain {X(t), t ≥ 0} satisfies

Σ_{i=1}^{n} μ_i(x_i)(1 − p_ii) π_x = Σ_{i,j=1, i≠j}^{n} μ_j(x_j + 1) p_ji π_{x−e_i+e_j} ,        (5.114)

x = (x_1, x_2, ..., x_n) ∈ Z .
Let ϕ_i(0) = 1 and

ϕ_i(j) = Π_{k=1}^{j} ( π_i / μ_i(k) ) ;  i = 1, 2, ..., n ;  j = 1, 2, ..., N .

Then the stationary probability of state x = (x_1, x_2, ..., x_n) ∈ Z is

π_x = h Π_{i=1}^{n} ϕ_i(x_i) ,   h = [ Σ_{y∈Z} Π_{i=1}^{n} ϕ_i(y_i) ]^{−1} ,        (5.115)

where y = (y_1, y_2, ..., y_n). This representation is verified by substituting (5.115) into (5.114).

Figure 5.20 Closed sequential queueing network

Example 5.22  Consider the closed sequential queueing network of Figure 5.20. There is only N = 1 customer in the system. Hence, with vectors e_i as defined above, the state space is Z = {e_1, e_2, ..., e_n}. The transition probabilities of the embedded Markov chain are

p_{i,i+1} = 1 ;  i = 1, 2, ..., n − 1 ;  p_{n,1} = 1 .

Its stationary distribution, obtained from (5.112), is a uniform distribution:

π_1 = π_2 = ... = π_n = 1/n .

Let μ_i = μ_i(1) be the service rate at node i. Then ϕ_i(1) = 1/(n μ_i); i = 1, 2, ..., n;

h = n [ Σ_{i=1}^{n} 1/μ_i ]^{−1} ,

so that the stationary state probabilities (5.115) are

π_{e_i} = (1/μ_i) / Σ_{i=1}^{n} (1/μ_i) ;  i = 1, 2, ..., n .

In particular, if μ_i = μ; i = 1, 2, ..., n, all states are equally probable. For arbitrary N, the analogous reasoning yields

π_x = (1/μ_1)^{x_1} (1/μ_2)^{x_2} ... (1/μ_n)^{x_n} / Σ_{y∈Z} Π_{i=1}^{n} (1/μ_i)^{y_i} ,

where x = (x_1, x_2, ..., x_n) ∈ Z. Given μ_i = μ; i = 1, 2, ..., n, all states are again equally probable.



Example 5.23  A computer system consists of a disc drive (1), two central processors (2) and (3), and a printer (4); N programs circulate in the system. A new program starts in the central processor 2. With probability 1 − α it requests data from the disc drive, and with probability α it moves directly to central processor 3. From the disc drive the program goes to central processor 3 with probability 1. Here it terminates and is sent to the printer with probability 1 − β, or goes back to central processor 2 with probability β. Hence, a program formally goes from the printer to the central processor 2 with probability 1 (Figure 5.21).

Figure 5.21 Transition graph of the computer system in example 5.23

The state space Z of this system and the matrix P of the transition probabilities p_ij are

Z = { y = (y_1, y_2, y_3, y_4); y_i = 0, 1, ..., N; y_1 + y_2 + y_3 + y_4 = N } ,

P = ( (0, 0, 1, 0), (1 − α, 0, α, 0), (0, β, 0, 1 − β), (0, 1, 0, 0) ) .

The stationary distribution of the embedded Markov chain, obtained from (5.112), is

π_1 = (1 − α)/(4 − α − β) ,  π_2 = π_3 = 1/(4 − α − β) ,  π_4 = (1 − β)/(4 − α − β) .

Then, for single-server nodes,

ϕ_i(x_i) = (π_i/μ_i)^{x_i} ;  i = 1, 2, 3, 4 ,

and the stationary state probabilities of the network follow from (5.115).

Application-oriented treatments of queueing networks are, for instance, Gelenbe and Pujolle [32] and Walrand [86].
5.8 SEMI-MARKOV CHAINS

Transitions between the states of a continuous-time homogeneous Markov chain are controlled by its transition probabilities. As shown in section 5.4, the sojourn time in a state has an exponential distribution and depends on the current state, but not on the history of the process. A natural generalization is to admit arbitrarily distributed sojourn times. This approach leads to the semi-Markov chains. A semi-Markov chain {X(t), t ≥ 0} with state space Z = {0, 1, ...} evolves in the following way: Transitions between the states are governed by a discrete-time homogeneous Markov chain {X_0, X_1, ...} with state space Z and transition matrix P = ((p_ij)). If the process starts at time t = 0 in state i_0, then the subsequent state i_1 is determined according to the transition matrix P, while the process stays in state i_0 a random time Y_{i_0 i_1}, which may depend on both i_0 and i_1. The process stays in state i_1 a random time Y_{i_1 i_2} and so on. The conditional sojourn times Y_{ij} are assumed to be independent. The sample paths of a semi-Markov chain are piecewise constant functions which, by convention, are continuous on the right.

Let T_0, T_1, ... denote the sequence of jump points of the semi-Markov chain. Then

X_n = X(T_n) ;  n = 0, 1, ...        (5.116)

where X_0 = X(0) is the initial state (X_n = X(T_n + 0)). In view of (5.116), the discrete-time Markov chain {X_0, X_1, ...} is embedded in the (continuous-time) semi-Markov chain {X(t), t ≥ 0}. As already pointed out, the future development of a semi-Markov chain from a jump point T_n is independent of the entire history of the process before T_n. Let F_ij(t) = P(Y_ij ≤ t) denote the distribution function of the conditional sojourn time Y_ij. By the total probability rule, the distribution function of the unconditional sojourn time Y_i of the chain in state i is

F_i(t) = P(Y_i ≤ t) = Σ_{j∈Z} p_ij F_ij(t) ,  i ∈ Z .        (5.117)

Special cases  1) An alternating renewal process is a semi-Markov chain with state space Z = {0, 1} and transition probabilities

p_00 = p_11 = 0 and p_01 = p_10 = 1 .

In this case, F_01(⋅) and F_10(⋅) are in this order the distribution functions of the renewal time and the system lifetime.

2) A homogeneous continuous-time Markov chain {X(t), t ≥ 0} is a semi-Markov chain with the same state space and the transition probabilities of its embedded Markov chain; by (5.117), its unconditional sojourn times are exponentially distributed.
In what follows, semi-Markov processes are considered under the following three
assumptions:


1) The embedded homogeneous Markov chain {X_0, X_1, ...} has a unique stationary state distribution {π_i, i ∈ Z}, i.e.

π_j = Σ_{i∈Z} π_i p_ij ,  Σ_{i∈Z} π_i = 1 .        (5.118)

By the results of chapter 4, a unique stationary state distribution exists if the Markov chain {X_0, X_1, ...} is irreducible and positive recurrent.

2) The distribution functions F_i(t) = P(Y_i ≤ t) are nonarithmetic. (A distribution function is arithmetic if it is concentrated on a lattice {0, d, 2d, ...} for some d > 0; otherwise, the distribution function is nonarithmetic.)

3) The mean sojourn times μ_i = E(Y_i) are finite. Note that μ_i no longer denotes an intensity, but a mean sojourn time.
Let N_k(t) be the random number of transitions into state k (k-transitions) occurring in [0, t] and H_k(t) = E(N_k(t)). Then

lim_{t→∞} H_k(t)/t = π_k / Σ_{i∈Z} π_i μ_i .        (5.119)

Strictly speaking, the right-hand side of (5.119) is the long-run k-transition rate; the following formulas and the analysis of the examples are based on (5.119). From (5.119), the mean time between two successive k-transitions is (1/π_k) Σ_{i∈Z} π_i μ_i. Hence the portion of time the chain is in state k is

A_k = π_k μ_k / Σ_{i∈Z} π_i μ_i .        (5.120)

Consequently, in the long run, the fraction of time the chain is in a set of states Z_0, Z_0 ⊆ Z, is

A_{Z_0} = Σ_{k∈Z_0} π_k μ_k / Σ_{i∈Z} π_i μ_i .        (5.121)

In other words, A_{Z_0} is the probability that a visitor, who arrives at a random time from 'outside', finds the semi-Markov chain in a state belonging to Z_0.
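Formula (5.120) is a simple weighted normalisation and can be sketched in a few lines (function name mine). For the alternating renewal process of special case 1), π = (1/2, 1/2) and the formula reduces to the classic availability μ_0/(μ_0 + μ_1):

```python
def state_fractions(pi, mu):
    """Long-run fractions of time, A_k = pi_k mu_k / sum_i pi_i mu_i,
    computed from the embedded-chain distribution pi and the mean
    unconditional sojourn times mu."""
    w = [p * m for p, m in zip(pi, mu)]
    total = sum(w)
    return [v / total for v in w]
```

For example, `state_fractions([0.5, 0.5], [2.0, 1.0])` gives the fractions 2/3 and 1/3 for the two states.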
Suppose that each transition into state k (k-transition) causes a mean cost c_k, k ∈ Z. Then the mean total (transition) cost per unit time is

C = Σ_{k∈Z} π_k c_k / Σ_{i∈Z} π_i μ_i .        (5.122)

Note that the formulas (5.119) to (5.122) depend only on the unconditional sojourn times of a semi-Markov chain in its states.

Figure 5.22 Transition graph for example 5.24

Example 5.24  A system is renewed on failure (emergency renewal); in addition, it is preventively renewed as soon as it reaches age τ. To determine the stationary system availability, system states have to be introduced:

0  operating
1  emergency renewal
2  preventive renewal

Let L be the random system lifetime, F(t) = P(L ≤ t) its distribution function, and P(L > t) = 1 − F(t) its survival probability. The transition probabilities are (Figure 5.22)

p_01 = F(τ) ,  p_02 = 1 − F(τ) ,  p_10 = p_20 = 1 .

Let Z_e and Z_p be the durations of an emergency and a preventive renewal, respectively. Then the conditional sojourn times of the system in the states are

Y_01 = L ,  Y_02 = τ ,  Y_10 = Z_e ,  Y_20 = Z_p .


The system behaviour can be described by a semi-Markov chain {X(t), t ≥ 0} with state space Z = {0, 1, 2} and the transition probabilities and sojourn times given above. The equations (5.118) in the stationary probabilities of the embedded Markov chain are

π_0 = π_1 + π_2 ,  π_1 = F(τ) π_0 ,  1 = π_0 + π_1 + π_2 .

The solution is

π_0 = 1/2 ,  π_1 = F(τ)/2 ,  π_2 = (1 − F(τ))/2 .

According to (5.120), the stationary availability of the system is

A(τ) = μ_0 / [ μ_0 + F(τ) E(Z_e) + (1 − F(τ)) E(Z_p) ] ,

where μ_0 = E(min(L, τ)) = ∫_0^τ (1 − F(t)) dt is the mean sojourn time in state 0. Let c_e and c_p denote the mean costs of an emergency and a preventive renewal, respectively. By (5.122), neglecting the renewal times against the operating times, the mean renewal cost per unit time in the steady state is

K(τ) = (c_e π_1 + c_p π_2) / (π_0 μ_0) = [ c_e F(τ) + c_p (1 − F(τ)) ] / ∫_0^τ (1 − F(t)) dt .

If λ(t) is the failure rate of the system, a cost-optimal renewal interval τ = τ* satisfies the necessary condition

λ(τ) ∫_0^τ (1 − F(t)) dt − F(τ) = c/(1 − c)

with c = c_p/c_e. A unique solution τ* exists if c < 1 and λ(t) strictly increases to infinity. Since K(τ) has the same functional structure as 1/A(τ) − 1, maximizing A(τ) and minimizing K(τ) leads to the same type of equation for determining the corresponding optimal renewal intervals.
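The stationary availability of this age-replacement policy is easy to evaluate numerically for an arbitrary lifetime distribution. A sketch under the assumptions of the example (function name and the Riemann-sum step are my choices; surv(t) is the survival probability P(L > t)):

```python
from math import exp

def stationary_availability(surv, tau, mean_e, mean_p, n=10000):
    """A(tau) = E[min(L,tau)] / (E[min(L,tau)] + F(tau)*mean_e
                                 + (1 - F(tau))*mean_p),
    where E[min(L,tau)] is the integral of surv over [0, tau]."""
    h = tau / n
    e_min = h * sum(surv(i * h) for i in range(n))   # left Riemann sum
    f_tau = 1.0 - surv(tau)
    return e_min / (e_min + f_tau * mean_e + (1.0 - f_tau) * mean_p)
```

For an exponential lifetime and a very large τ (so that preventive renewals practically never occur), the value approaches E(L)/(E(L) + E(Z_e)), the availability of the plain alternating renewal process.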
Example 5.25  A series system consists of n subsystems e_1, e_2, ..., e_n. The lifetimes of the subsystems L_1, L_2, ..., L_n are exponential with respective parameters λ_1, λ_2, ..., λ_n; let G_i(t) = P(L_i ≤ t) and g_i(t) = G_i'(t); i = 1, 2, ..., n. A failed subsystem is renewed; μ_i denotes the mean renewal time of subsystem e_i. As soon as the renewal of the failed subsystem is finished, the system continues operating. As long as a subsystem is being renewed, the other subsystems cannot fail, i.e. during such a time period they are in the cold-standby mode.

System states: 0 (the system is operating) and i (subsystem e_i is being renewed); i = 1, 2, ..., n; the state space is Z = {0, 1, ..., n}. The system returns from state i to state 0 with probability 1; i = 1, 2, ..., n, and its unconditional sojourn time in state 0 is

Y_0 = min{L_1, L_2, ..., L_n} .

Thus, Y_0 has distribution function

F_0(t) = 1 − (1 − G_1(t))(1 − G_2(t)) ... (1 − G_n(t)) .

Letting λ = λ_1 + λ_2 + ... + λ_n, Y_0 is exponential with parameter λ, so that E(Y_0) = 1/λ. The system makes a transition from state 0 into state i with probability

p_0i = P(Y_0 = L_i)
     = ∫_0^∞ (1 − G_1(x)) ... (1 − G_{i−1}(x))(1 − G_{i+1}(x)) ... (1 − G_n(x)) g_i(x) dx
     = ∫_0^∞ e^{−(λ_1 + ... + λ_{i−1} + λ_{i+1} + ... + λ_n) x} λ_i e^{−λ_i x} dx
     = ∫_0^∞ e^{−λx} λ_i dx = λ_i/λ ;  i = 1, 2, ..., n .

The system of equations (5.118) becomes

π_0 = π_1 + π_2 + ... + π_n ,  π_i = (λ_i/λ) π_0 ;  i = 1, 2, ..., n .

Since π_1 + ... + π_n = 1 − π_0, the solution is

π_0 = 1/2 ;  π_i = λ_i/(2λ) ;  i = 1, 2, ..., n .

With these ingredients, formula (5.120) yields the stationary system availability

A = (1/λ) / [ 1/λ + Σ_{i=1}^{n} (λ_i/λ) μ_i ] = 1 / ( 1 + Σ_{i=1}^{n} λ_i μ_i ) .
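The final availability formula of this example is a one-liner; a hedged sketch (function and argument names are mine):

```python
def series_availability(failure_rates, mean_renewal_times):
    """Stationary availability A = 1 / (1 + sum_i lambda_i * mu_i) of the
    series system with cold standby during renewals (example above)."""
    s = sum(l * m for l, m in zip(failure_rates, mean_renewal_times))
    return 1.0 / (1.0 + s)
```

For instance, two subsystems with failure rates 0.1 and 0.2 and mean renewal times 1 and 2 give A = 1/1.5 ≈ 0.667.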

Example 5.26  Customers arrive at a single-server loss system in accordance with a homogeneous Poisson process with intensity λ. Hence, their interarrival times are identically distributed as an exponential random variable Y with parameter λ. The server can fail; its lifetime is L_0 when it is idle and L_1 when it is busy: L_0 is exponential with parameter λ_0 and L_1 is exponential with parameter λ_1. When at the time point of a server failure a customer is being served, then this customer is lost, i.e. it has to leave the system. After a failure the server is renewed; customers arriving during a renewal are lost (compare section 5.7.6.2). The service time Z of a customer has density b(t). The following system states are introduced:

State 0  The server is idle.
State 1  The server is busy.
State 2  The server is being renewed.

To determine the steady state probabilities of the states 0, 1 and 2, the transition probabilities p_ij are needed:

p_00 = p_11 = p_22 = p_21 = 0 ,  p_20 = 1 ,
p_01 = P(L_0 > Y) = ∫_0^∞ e^{−λ_0 t} λ e^{−λt} dt = λ/(λ + λ_0) ,
p_02 = 1 − p_01 = P(L_0 ≤ Y) = λ_0/(λ + λ_0) ,
p_10 = P(L_1 > Z) = ∫_0^∞ e^{−λ_1 t} b(t) dt ,
p_12 = 1 − p_10 = P(L_1 ≤ Z) = ∫_0^∞ [1 − e^{−λ_1 t}] b(t) dt .

The stationary state probabilities of the embedded Markov chain {X_0, X_1, ...} can be obtained from (5.118):

π_0 = (λ + λ_0) / [2(λ + λ_0) + λ p_12] ,
π_1 = λ / [2(λ + λ_0) + λ p_12] ,
π_2 = (λ_0 + λ p_12) / [2(λ + λ_0) + λ p_12] .

The sojourn times in the states 0, 1 and 2 are

Y_0 = min(L_0, Y) ,  Y_1 = min(L_1, Z) ,  Y_2 = the renewal time of the server.

With these parameters, the stationary state probabilities of the semi-Markov process are given by (5.120).
The time-dependent behaviour of semi-Markov chains is discussed, for instance, in Kulkarni [52].
5.9 EXERCISES

5.1)
Check
whether {X(t), t ≥ 0} is a homogeneous Markov chain
...
5.2) A system fails after a random lifetime L. Then it waits a random time for its renewal, and the renewal takes another random time Z. On completion of a renewal, the system immediately resumes its work; this process continues indefinitely. All life, waiting, and renewal times are assumed to be independent.

(1) Draw the transition graph of the corresponding Markov chain {X(t), t ≥ 0}.
...

5
...
e
...
When a subsystem fails, the other one continues to
work
...
On its completion,
both subsystems resume their work at the same time
...
The joint renewal time is exponential
with parameter µ
...
Let X(t)
be the number of subsystems operating at time t
...

(2) Given P(X(0) = 2) = 1, determine the time-dependent state probabilities
p i (t) = P(X(t) = i); i = 0, 1, 2
...


Hint Consider separately the cases

(λ + μ + ν) 2 (=)(<)(>) 4(λμ + λν + μν)
...
4) A launderette has 10 washing machines which are in constant use
...
There are two mechanics who repair failed machines
...
During this time,
the second mechanic is busy repairing another failed machine, if there is any, or this
mechanic is idle
...
All random variables involved are independent
...

1) What is the average percentage of operating machines?
2) What is the average percentage of idle mechanics?
5
...
5
a) on condition that the lifetimes of the units are exponential with respective parameters λ 1 and λ 2
...
5 remain valid
...

5
...
6
on condition that the lifetimes of the units are exponential with parameters λ 1 and
λ 2 , respectively
...
6 remain valid
...

5
...
7 is generalized as follows: If the system
makes a direct transition from state 0 to the blocking state 2, then the subsequent renewal time is exponential with parameter μ 0
...


(1) Describe the behaviour of the system by a Markov chain and draw the transition
graph
...
8) Consider a two-unit system with standby redundancy and one mechanic
...

Apart from this, the other model assumptions listed in example 5
...

(1) Describe the behaviour of the system by a Markov chain and draw the transition
graph
...

(3) Sketch the stationary availability of the system as a function of
ρ = λ/μ
...
9) When being in states 0, 1, and 2 a (pure) birth process {X(t), t ≥ 0} with state
space Z = {0, 1, 2,
...

Given X(0) = 0, determine the time-dependent state probabilities
p i (t) = P(X(t) = i) for the states i = 0, 1, 2
...
10) Consider a linear birth process with birth rates
λ j = j λ, j = 0, 1,
...

(1) Given X(0) = 1, determine the distribution function of the random time point T 3
at which the process enters state 3
...

5
...
Its splits into two particles of
the same type after an exponential random time Y with parameter λ (its lifetime)
...
e
...
All lifetimes of the particles are assumed to be independent
...

Determine the absolute state probabilities
p j (t) = P(X(t) = j) ; j = 1, 2,
...

5
...
} has death rates
μ 0 = 0, μ 1 = 2, and μ 2 = μ 3 = 1
...

5
...

(1) Given X(0) = 2, determine the distribution function of the time to entering state 0
('lifetime' of the process)
...

5
...
After an exponential random time


with parameter µ any molecule of type b combines, independently of the others, with
a molecule of type a to form a molecule ab
...
15) At time t = 0 a cable consists of 5 identical, intact wires
...

Given a load of w kp per wire, the time to breakage of a wire (its lifetime) is exponential with mean value
1000 [weeks]
...
For any fixed number of wires, their lifetimes are assumed
to be independent and identically distributed
...
16)* Let {X(t), t ≥ 0} be a death process with X(0) = n and positive death rates
μ 1 , μ 2 ,
...

Prove: If Y is an exponential random variable with parameter λ and independent of
the death process, then
n
μi
P(X(Y) = 0) = Π

...
17) Let a birth- and death process have state space Z = {0, 1,
...
, n
...

5
...
,
j+2
has a stationary state distribution
...
19) A birth- and death process has transition rates
λ j = (j + 1)λ and μ j = j 2 μ; j = 0, 1,
...

Confirm that this process has a stationary state distribution and determine it
...
20) A computer is connected to three terminals (for example, measuring devices)
...
When the computer is processing two data records and in the meantime another data record has
been produced, then this new data record has to wait in a buffer when the buffer is
empty
...
(The buffer can store only one data record
...
The
terminals produce data records independently according to a homogeneous Poisson
process with intensity λ
...
They are assumed to be independent of the input
...

(1) Verify that {X(t), t ≥ 0} is a birth- and death process, determine its transition rates
and draw the transition graph
...
e
...

5
...
20, it is assumed that a
data record which has been waiting in the buffer a random patience time, will be deleted as being no longer up to date
...
They
are also independent of all arrival and processing times of the data records
...

5
...
21, it is assumed that a
data record will be deleted when its total sojourn time in the buffer and computer exceeds a random time Z, where Z has an exponential distribution with parameter α
...

Determine the stationary loss probability
...
23) A small filling station in a rural area provides diesel for agricultural machines
...
On average, 8 machines
per hour arrive for diesel
...
The mean time a machine occupies the pump is 5 minutes
...

(1) Determine the stationary loss probability
...

5
...
Customers arrive according to a homogeneous Poisson process with intensity λ
...
e
...
The service times of both servers are iid exponential random variables with
parameter μ
...

Determine the stationary state probabilities of the stochastic process {X(t), t ≥ 0}
...
25) A 2-server loss system is subject to a homogeneous Poisson input with intensity λ
...
Otherwise, a customer goes to the idle server (if there is any)
...
All arrival and service times are independent
...

5
...

If there are not more than 3 customers in the system, the service times have an exponential distribution with mean 1/μ = 2 [min]
...
All arrival
and service times are independent
...

(2) Determine the mean length of the waiting queue in the steady state
...
27) Taxis and customers arrive at a taxi rank in accordance with two independent
homogeneous Poisson processes with intensities λ_1 = 4 per hour and λ_2 = 3 per hour,
respectively
...
(Groups of customers who will use the same
taxi are considered to be one customer
...

What is the average number of customers waiting at the rank?
5
...
There are 2 maintenance
teams for repairing the trucks after a failure
...
The times between failures
of a truck (lifetime) is exponential with parameter λ
...
All life and repair times are assumed to be independent
...
2
...
29) Ferry boats and customers arrive at a ferry station in accordance with two independent homogeneous Poisson processes with intensities λ and μ , respectively
...
If k > n, then the remaining
k − n customers wait for the next boat
...

Model the situation by a suitable homogeneous Markov chain {X(t), t ≥ 0} and draw
the transition graph
...
30) The life cycle of an organism is controlled by shocks (e
...
virus attacks, accidents) in the following way: A healthy organism has an exponential lifetime L with
parameter λ h
...

However, a sick organism may recover and return to the healthy state
...
If during a period of sickness another shock
occurs, the organism cannot recover and will die a random time D after the occurrence of the second shock
...

The random variables L, S, R, and D are assumed to be independent
...

(2) Determine the mean lifetime of the organism
...
31) Customers arrive at a waiting system of type M/M/1/∞ with intensity λ
...
As soon
as the n th customer arrives, the server resumes its work and stops working only then,
when all customers (including newcomers) have been served
...
Let 1/μ be the
mean service time of a customer and X(t) be the number of customers in the system
at time t
...

(2) Given that n = 2 , compute the stationary state probabilities
...
)
5
...
As soon as
a computer fails, it is separated from the system by an automatic switching device
with probability 1 − p
...
The lifetimes of the computers are independent and have an exponential distribution with parameter λ
...
Provided the switching device has

operated properly when required, the system is available as long as there is at least one computer available. Let X(t) denote the number of available computers at time t. By convention, if, due to the switching device, the entire system has failed in [0, t), then X(t) = 0.

(2) Given n = 2, determine the mean lifetime E(X_s) of the system.
5.33) A waiting-loss system of type M/M/1/2 is subject to two independent Poisson inputs 1 and 2 with respective intensities λ_1 and λ_2 (type 1- and type 2-customers).

When a type 1-customer and a type 2-customer are waiting, then the type 1-customer will always be served first, regardless of the order of their arrivals.

Describe the behaviour of the system by a homogeneous Markov chain, determine the transition rates, and draw the transition graph.
5.34) A queueing network consists of two servers 1 and 2 in series. A customer is lost if server 1 is busy. After having been served by server 1, a customer moves on to server 2. If server 2 is busy, the customer is lost.

All arrival and service times are independent.

Model the network by a suitable Markov chain and draw the transition graph.
5.35) A queueing network consists of three nodes (queueing systems) 1, 2 and 3, each of type M/M/1/∞. The respective mean service times at the nodes are 4, 2 and 1 [min]. From node 1, a customer goes to node 2 with probability 0.4 or leaves the system with probability 0.6. From node 2, a customer goes to node 3 with probability 0.1 or leaves the system with probability 0.9. From node 3, a customer goes to node 1 with probability 0.2 or leaves the system with probability 0.8. The external inputs and the service times are independent.

(2) Determine the stationary state probabilities of the network.
5.36) A closed queueing network consists of 3 nodes. There are 2 customers in the network. All service times are independent random variables and have an exponential distribution with parameter μ.
5.37) Depending on demand, a conveyor belt operates at 3 different speed levels 1, 2, and 3. The level changes are governed by a Markov chain with transition probabilities p_12 = 0.8, p_13 = 0.2, p_21 = 0.5, p_23 = 0.5, p_31 = 0.4, p_32 = 0.6.

Determine the stationary percentages of time in which the conveyor belt operates at levels 1, 2, and 3 by modeling the situation as a semi-Markov chain.
5.38) The mean lifetime of a system is 620 hours. 20% of all failures are type 2-failures. Upon each repair the system is 'as good as new'. This process is continued indefinitely.

(1) Describe the situation by a semi-Markov chain with 3 states and draw the transition graph of the underlying discrete-time Markov chain.

5
...
After a type i-failure the system is said to be in failure state i; i = 1, 2
...

Thus, if at time t = 0 a new system starts working, the time to its first failure is
Y 0 = min (L 1 , L 2 )
...
After a type 1-failure, the system is switched from failure state 1 into failure state 2
...
When in state 2,
the system is being renewed
...
A renewed system immediately starts working, i
...
the system makes a

© 2006 by Taylor & Francis Group, LLC

330

STOCHASTIC PROCESSES

transition from state 2 to state 0 with probability 1
...

(For motivation, see example 5
...

(1) Describe the system behaviour by a semi-Markov chain and draw the transition
graph of the embedded discrete-time Markov chain
...

5
...
26, determine
the stationary probabilities of the states 0, 1, and 2 introduced there on condition that
the service time B is a constant μ; i
...
determine the stationary state probabilities of
the loss system M/D/1/0 with unreliable server
...
1

6.1 DISCRETE-TIME MARTINGALES

6.1.1 Definition and Examples

Martingales are important tools for solving prestigious problems in probability theory and its applications. Heuristically, martingales are stochastic models for 'fair games' in a wider sense, i.e. games in which each 'participant' has the same chance to win and to lose. Martingales were introduced as a special class of stochastic processes by P. Lévy; the systematic theory of martingales was later developed by J. L. Doob. Martingales as stochastic processes are defined for discrete and continuous parameter spaces T. The definition of a martingale relies heavily on the concept of the conditional mean value of a random variable given values of other random variables or, more generally, on the concept of the (random) conditional mean value of a random variable given other random variables (chapter 1).
Definition 6.1 A sequence of random variables {X_0, X_1, ...} with state space Z, which satisfies E(|X_n|) < ∞, n = 0, 1, 2, ..., is called a (discrete-time) martingale if for all vectors (x_0, x_1, ..., x_n) with x_i ∈ Z and n = 0, 1, ...,

E(X_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) = x_n.      (6.1)

Under the same assumptions, {X_0, X_1, ...} is called a supermartingale if

E(X_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) ≤ x_n,      (6.2)

and a submartingale if

E(X_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) ≥ x_n.      (6.3)

If, for instance, the X_n are continuous random variables, then multiplying (6.1) to (6.3) by the joint density of (X_0, X_1, ..., X_n) and integrating over its range yields:
Martingale:       E(X_{n+1}) = E(X_n);  n = 0, 1, ...

Supermartingale:  E(X_{n+1}) ≤ E(X_n);  n = 0, 1, ...

Submartingale:    E(X_{n+1}) ≥ E(X_n);  n = 0, 1, ...      (6.4)

Hence, a martingale has a constant trend function. The trend function of a supermartingale (submartingale) is nonincreasing (nondecreasing).

The defining properties (6.1) to (6.3) are equivalent to

E(X_{n+1} − X_n | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) = 0,      (6.5)

E(X_{n+1} − X_n | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) ≤ 0,      (6.6)

E(X_{n+1} − X_n | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) ≥ 0,      (6.7)

respectively, for all sequences of random variables {X_0, X_1, ...} with finite absolute first moments E(|X_n|), n = 0, 1, ...

If {X_0, X_1, ...} is a martingale and X_n is interpreted as the random fortune of a gambler at time n, then, on condition X_n = x_n, the conditional mean fortune of the gambler at time n + 1 is also x_n, and this is independent of the development in time of the fortune of the gambler before n (fair game).

Example 6.1 (sum martingale) Let {Y_0, Y_1, ...} be a sequence of independent random variables with E(Y_i) = 0 for i = 1, 2, ... and E(|Y_0|) < ∞. Then the sequence {X_0, X_1, ...} defined by

X_n = Y_0 + Y_1 + ... + Y_n

is a martingale: Since X_{n+1} = X_n + Y_{n+1},

E(X_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0)
= E(X_n + Y_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0)
= x_n + E(Y_{n+1}) = x_n.

The constant trend function is m = E(X_n) = E(Y_0), n = 0, 1, ...
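The constant-trend property of the sum martingale can be checked numerically. The following sketch estimates the trend function m(n) = E(X_n) over many simulated paths; the choice of Y_i uniform on [−1, 1] (so that E(Y_i) = 0) is an illustrative assumption, not taken from the text.

```python
import random

random.seed(1)

def sum_martingale_path(n_steps):
    """One path of X_n = Y_0 + Y_1 + ... + Y_n with E(Y_i) = 0
    (illustrative choice: Y_i uniform on [-1, 1])."""
    x, path = 0.0, []
    for _ in range(n_steps + 1):
        x += random.uniform(-1.0, 1.0)
        path.append(x)
    return path

# Estimate the trend function m(n) = E(X_n) over many sample paths;
# for a martingale it is constant (here m = E(Y_0) = 0).
n_steps, n_paths = 20, 20000
means = [0.0] * (n_steps + 1)
for _ in range(n_paths):
    for n, x in enumerate(sum_martingale_path(n_steps)):
        means[n] += x / n_paths
```

All entries of `means` stay near 0, in agreement with m(n) ≡ E(Y_0).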



Example 6.2 (product martingale) Let {Y_0, Y_1, ...} be a sequence of independent, positive random variables with E(Y_0) < ∞ and μ = E(Y_i) < ∞ for i = 1, 2, ... Consider the sequence {X_0, X_1, ...} with

X_n = Y_0 Y_1 ... Y_n.

Since X_{n+1} = X_n Y_{n+1},

E(X_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0)
= x_n E(Y_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0)
= x_n E(Y_{n+1}) = μ x_n.

Thus, {X_0, X_1, ...} is a supermartingale if μ ≤ 1 and a submartingale if μ ≥ 1. For μ = 1, the random sequence {X_0, X_1, ...} is a martingale.

This martingale seems to be a realistic model for describing the development in time of share prices or derivatives of these, since, from historical experience, the share price at a future time point is usually proportional to the present share price level X_n.

Important special cases are:

1) Discrete Black-Scholes model:

Y_i = e^{U_i} with U_i = N(μ, σ²), i = 1, 2, ...      (6.8)

2) Binomial model:

Y_i = r with probability α, Y_i = 1/r with probability 1 − α; i = 1, 2, ...

In this case, with a random integer N, |N| ≤ n, the share price at time n has the structure

X_n = Y_0 r^N; n = 0, 1, 2, ...

If α r + (1 − α)/r = 1, then {X_0, X_1, ...} is a martingale.
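The martingale condition of the binomial model can be illustrated numerically. In the sketch below the parameter α = 1/(r + 1) is chosen so that αr + (1 − α)/r = 1; the values r = 2 and Y_0 = 1 are illustrative assumptions.

```python
import random

random.seed(2)

r = 2.0
alpha = 1.0 / (r + 1.0)   # chosen so that E(Y_i) = alpha*r + (1-alpha)/r = 1
y0 = 1.0                  # deterministic start X_0 = Y_0 (illustrative)

def product_path(n):
    """One realization of X_n = Y_0 * Y_1 * ... * Y_n in the binomial model."""
    x = y0
    for _ in range(n):
        x *= r if random.random() < alpha else 1.0 / r
    return x

n, paths = 10, 200000
est = sum(product_path(n) for _ in range(paths)) / paths
# martingale => E(X_n) = E(X_0) = 1 for every n
```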


Example 6.3 (exponential martingale) Let {Z_1, Z_2, ...} be a sequence of independent, identically as Z distributed random variables and θ be a real number with

m(θ) = E(e^{θZ}) < ∞.

The random sequence {Y_0, Y_1, ...} is defined by

Y_n = Y_0 + Z_1 + ... + Z_n; n = 1, 2, ...

Then the sequence of random variables {X_0, X_1, ...} with

X_n = e^{θ Y_n} / (m(θ))^n = e^{θ Y_0} Π_{i=1}^n [e^{θ Z_i} / m(θ)]      (6.9)

is a martingale. The proof is easily established, since, in view of the independence of the Z_i,

E(X_{n+1} | X_n = x_n, X_{n−1} = x_{n−1}, ..., X_0 = x_0)
= E(X_n e^{θ Z_{n+1}} / m(θ) | X_n = x_n, ..., X_0 = x_0)
= x_n E(e^{θ Z_{n+1}}) / m(θ) = x_n.

In particular, if Z is a binary random variable with probability distribution

Z = +1 with probability p,  Z = −1 with probability 1 − p,

then {Y_0, Y_1, ...} is a discrete random walk. In this case,

m(θ) = E(e^{θZ}) = p e^θ + (1 − p) e^{−θ},

so that

X_n = e^{θ Y_n} / (p e^θ + (1 − p) e^{−θ})^n

is a martingale for any θ with m(θ) < ∞.
the random variables Y 0 , Y 1 ,
...

On condition {y, ϕ (y) > 0} = {y, ψ(y) > 0}, the ratio
⎧ ϕ (y) /ψ(y),
r(y) = ⎨
⎩ 0,

ψ(y) > 0
ψ(y) = 0

is introduced
...
} with
X n = r(Y 0 ) r(Y 1 )
...
In view of example 6
...


Example 6.5 (branching process) Let X_n be the number of individuals of the n th generation of a population, and let the numbers of children of the individuals be independent random variables with mean value μ. Since each of the X_n individuals of the n th generation will have on average μ children of its own,

E(X_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) = μ x_n.      (6.14)

Hence, {X_0, X_1, ...} is a martingale if μ = 1, a supermartingale if μ ≤ 1, and a submartingale if μ ≥ 1. In any case, the sequence {Z_0, Z_1, ...} with

Z_n = X_n / μ^n

is a martingale. Indeed,

E(Z_{n+1} | Z_n = z_n, ..., Z_1 = z_1, Z_0 = z_0)
= E(X_{n+1}/μ^{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0)
= (1/μ^{n+1}) μ x_n = x_n/μ^n = z_n.
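The martingale property of Z_n = X_n/μ^n can be checked by simulating the branching process. The offspring distribution below, P(0) = 0.2, P(1) = 0.3, P(2) = 0.5 with mean μ = 1.3, is an illustrative assumption.

```python
import random

random.seed(4)

# illustrative offspring distribution: P(0)=0.2, P(1)=0.3, P(2)=0.5
mu = 0.2 * 0 + 0.3 * 1 + 0.5 * 2    # mean number of children = 1.3

def offspring():
    u = random.random()
    return 0 if u < 0.2 else (1 if u < 0.5 else 2)

def z_n(n):
    """One realization of Z_n = X_n / mu**n, starting from X_0 = 1."""
    x = 1
    for _ in range(n):
        x = sum(offspring() for _ in range(x))
    return x / mu ** n

n, paths = 10, 50000
est = sum(z_n(n) for _ in range(paths)) / paths
# martingale => E(Z_n) = X_0 = 1 for every n
```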
1
...
} as introduced
in definition 6
...
}, which is usually related to {X 0 , X 1 ,
...
5) to (6
...

Definition 6.2 Let {X_0, X_1, ...} and {Y_0, Y_1, ...} be two sequences of random variables. If E(|X_n|) < ∞ for all n = 0, 1, ..., then {X_0, X_1, ...} is a martingale with regard to {Y_0, Y_1, ...} if for all vectors (y_0, y_1, ..., y_n) with y_i elements of the state space of {Y_0, Y_1, ...},

E(X_{n+1} − X_n | Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0) = 0.      (6.11)

{X_0, X_1, ...} is a supermartingale with regard to {Y_0, Y_1, ...} if

E(X_{n+1} − X_n | Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0) ≤ 0,

and a submartingale with regard to {Y_0, Y_1, ...} if

E(X_{n+1} − X_n | Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0) ≥ 0.
1
...

Definition 6.3 Let {Y_0, Y_1, ...} be a discrete-time Markov chain (not necessarily homogeneous) with state space Z = {..., −1, 0, +1, ...} and transition probabilities

p_n(y, z) = P(Y_{n+1} = z | Y_n = y); y, z ∈ Z; n = 0, 1, ...

A function h(y, n); y ∈ Z; n = 0, 1, ...; is said to be concordant with {Y_0, Y_1, ...} if

Σ_{z∈Z} p_n(y, z) h(z, n + 1) = h(y, n).      (6.13)

Theorem 6.1 Let {Y_0, Y_1, ...} be a discrete-time Markov chain with state space Z = {..., −1, 0, +1, ...}. If the function h(y, n) is concordant with {Y_0, Y_1, ...}, then

a) the sequence of random variables {X_0, X_1, ...} with

X_n = h(Y_n, n)      (6.14)

is a martingale with regard to {Y_0, Y_1, ...}, and

b) the sequence {X_0, X_1, ...} is a martingale.
Proof a) By the Markov property and the concordance of h with {Y_0, Y_1, ...},

E(X_{n+1} | Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0) − E(X_n | Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0)
= E(h(Y_{n+1}, n + 1) | Y_n = y_n) − E(h(Y_n, n) | Y_n = y_n)
= Σ_{z∈Z} p_n(y_n, z) h(z, n + 1) − h(y_n, n)
= h(y_n, n) − h(y_n, n) = 0.

Hence, {X_0, X_1, ...} is a martingale with regard to {Y_0, Y_1, ...}.

b) For any vector (x_0, x_1, ..., x_n), let the random event A be defined as the 'martingale condition'

A = {X_n = x_n, ..., X_1 = x_1, X_0 = x_0}.

Since the X_n are fully determined by the random variables Y_n, there exists a set Y of vectors y = (y_n, ..., y_1, y_0) such that the occurrence of any event

A_y = {Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0}, y ∈ Y,

implies the occurrence of event A:

A = ∪_{y∈Y} A_y.

With this representation, the martingale property of {X_0, X_1, ...} is easily established:

E(X_{n+1} | X_n = x_n, ..., X_1 = x_1, X_0 = x_0) = x_n.

Hence, {X_0, X_1, ...} is a martingale.
...
6 (variance martingale ) Let {Z 1 , Z 2 ,
...
, −1, 0, +1,
...

i
i
With an integer-valued constant z 0 , a discrete-time Markov chain {Y 0 , Y 1 ,
...
, −1, 0, +1,
...
+ Zn
...
and Var(Y n ) = Σ i=1 σ 2 for n = 1, 2,
...
15)

is concordant with {Y 0 , Y 1 ,
...
} at time n
...

Therefore,

Σ

z∈Z

p n (y, z) h(z, n + 1) =
=

=
=

Σ

z∈Z

(n+1)

Σ

z∈Z

Σ

z∈Z

(n+1)

q z−y h(z, n + 1)

(n+1)
n+1
q z−y ⎛ z 2 − Σ i=1 σ 2 ⎞

i⎠

(n+1)
n+1
q z−y ⎡ (z − y + y) 2 − Σ i=1 σ 2 ⎤
i⎦

z∈Z

Σ

q z−y ( z − y) 2 + 2 y

Σ

z∈Z

(n+1)

q z−y (z − y) +

Σ

z∈Z

(n+1)

n+1

q z−y y 2 − Σ i=1 σ 2
i

n+1

= σ 2 + 2y ⋅ 0 + 1 ⋅ y 2 − Σ i=1 σ 2
n+1
i
n

= y 2 − Σ i=1 σ 2 = h(y, n)
...
Thus, by theorem 6
...
} with X n generated by
2
X n = Y n − Var(Y n )

is a martingale
...
16)


Example 6.7 Let Y_n be the price of a share at time n, and let S_i be the amount of shares an investor holds in the interval [i, i + 1), S_i ≥ 0. Hence, his total profit up to time t = n is

X_n = Σ_{i=0}^{n−1} S_i (Y_{i+1} − Y_i);  n = 1, 2, ...      (6.17)

It makes sense to assume that the investor's choice, what amount of shares to hold in [n, n + 1), does not depend on the profit made in this and later intervals, but only on the profits made in the previous intervals. Hence, S_n is assumed to be a function of Y_0, Y_1, ..., Y_n.

If {Y_0, Y_1, ...} is a supermartingale, then {X_1, X_2, ...} is a supermartingale with regard to {Y_0, Y_1, ...}: Since S_n is constant on condition 'Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0',

E(X_{n+1} − X_n | Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0)
= E(S_n (Y_{n+1} − Y_n) | Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0)
= S_n E(Y_{n+1} − Y_n | Y_n = y_n, ..., Y_1 = y_1, Y_0 = y_0) ≤ 0,

because {Y_0, Y_1, ...} is a supermartingale. Hence, no matter how well-considered the investor fixes the amount of shares to be held in an interval, in the long run he cannot expect to make positive profit if the share price develops unfavourably.
Example 6.8 (doubling strategy) The martingale (6.17) includes as a special case the net profit development when applying the 'doubling strategy': A gambler bets $1 on the first game. If he wins, his net profit is $1. But if he loses, he suffers a loss of $1 and will bet $2 on the next play. If he then wins, his net profit is again $1. But if he loses, he will bet $4 on the next game, and so on. Let N be the number of the game at which the gambler wins for the first time; his stakes are S_i = 2^i for i = 0, 1, ..., N − 1, and the outcomes of the games satisfy

Y_{i+1} − Y_i = +1; i = N, N + 1, ...,
Y_{i+1} − Y_i = −1; i = 0, 1, ..., N − 1.

Hence, when assuming a win occurs with probability p and a loss with probability 1 − p, the Y_1, Y_2, ... can be represented in the form

Y_i = Z_1 + Z_2 + ... + Z_i,  Y_0 = 0,      (6.18)

where the Z_1, Z_2, ... are independent, identically as Z distributed binary random variables:

Z = +1 with probability p,  Z = −1 with probability 1 − p.

With these S_i, the net winnings of the gambler at time n, 1 ≤ n ≤ N, are given by (6.17). Now, on condition that after every win the game starts anew and the S_i are adjusted accordingly, (6.17) gives the gambler's net winnings for all n. Note, if the gambler loses at time N + 1, his total winnings become 0.

In contrast to example 6.7, the S_i in (6.17) are now random as well. Nevertheless, {X_1, X_2, ...} is a supermartingale if p ≤ 1/2. (If p > 0, then P(N < ∞) = 1.) Thus, although each completed doubling cycle yields a sure win of $1, in the long run the gambler cannot expect a positive mean profit: an arbitrarily long run of losses Z_1 = Z_2 = ... = −1 has a positive probability to occur, and the casino must allow arbitrarily large bets.
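The mechanics of the doubling strategy are easy to simulate. In the sketch below (p = 1/2, with a cap on the number of rounds standing in for the casino's willingness to accept arbitrarily large bets), every completed cycle ends with a net win of exactly $1 although the stakes grow geometrically.

```python
import random

random.seed(5)

def doubling_until_first_win(p=0.5, max_rounds=64):
    """Net winnings of one doubling cycle: bet 1, 2, 4, ... until the first win."""
    stake, total = 1, 0
    for _ in range(max_rounds):
        if random.random() < p:
            return total + stake     # first win recovers all losses plus $1
        total -= stake
        stake *= 2
    return total                     # never won within max_rounds (prob. 2**-64)

wins = [doubling_until_first_win() for _ in range(10000)]
# with p = 1/2, P(N < infinity) = 1, so (up to probability 2**-64 per run)
# every simulated cycle ends with a net win of exactly $1
```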


6
...
3 Martingale Stopping Theorem and Applications
As pointed out in the beginning of this chapter, martingales are suitable stochastic models for fair games, i.e. games in which the chances to win or to lose are equal. Hence, a proper time for finishing a game seems to be a stopping time N for {X_0, X_1, ...}. According to the definition given in chapter 1, a stopping time for {X_0, X_1, ...} is a positive, integer-valued random variable N with the property that the occurrence of the random event 'N = n' is fully determined by the random variables X_0, X_1, ..., X_n. However, the martingale stopping theorem (also called optional stopping theorem or optional sampling theorem) excludes the possibility of winning in the long run if finishing the game is controlled by a stopping time (see also examples 6.7 and 6.8).
...
2 (martingale stopping theorem for discrete-time Markov chains) Let
N be a stopping time for the martingale {X 0 , X 1 ,
...
19)

if at least one of the following three conditions is fulfilled:
1) N is finite and there exists a finite constant C 1 with
X min(N,n) ≤ C 1 for all n = 0, 1,
...
e
...

3) E(N) is finite and there exists a finite constant C 3 so that
E ( X n+1 − X n | X 1 , X 2 ,
...

Hint When comparing formulas (6
...
19), note that in (6
...

Example 6.9 (Wald's identity) Theorem 6.2 implies Wald's identity: If Y_1, Y_2, ... are independent, identically distributed random variables with finite mean E(Y), and N is a stopping time for this sequence with E(N) < ∞, then

E(Σ_{i=1}^N Y_i) = E(N) E(Y).      (6.21)

To see this, let

X_n = Σ_{i=1}^n (Y_i − E(Y)); n = 1, 2, ...

By example 6.1, the sequence {X_1, X_2, ...} is a martingale. Hence, theorem 6.2 yields E(X_N) = E(X_1) = 0. On the other hand,

E(X_N) = E(Σ_{i=1}^N (Y_i − E(Y)))
= E(Σ_{i=1}^N Y_i − N E(Y))
= E(Σ_{i=1}^N Y_i) − E(N) E(Y).

Combining both results proves (6.21).
10 ( fair game) Let {Z 1 , Z 2 ,
...

Z=⎨
⎩ −1 with probability P(Z = −1) = 1/2
Since E(Z i ) = 0, the sequence {Y 1 , Y 2 ,
...
+ Z n ; n = 1, 2,
...
1)
...
The gambler finishes
the game as soon he has won $ a or lost $ b
...


(6
...
Since E(N) is finite,
by theorem 6
...

Combining this relationship with
P(Y N = a) + P(Y N = −b) = 1,
yields the desired probabilities
P(Y N = a) =

b ,
a+b

P(Y N = −b) =

a
...
} with
X n = Y 2 − Var(Y n ) = Y 2 − n
n
n
is used (example 6
...
By theorem 6
...


Therefore,
2
E(N) = E(Y N ) = a 2 P(Y N = a) + b 2 P(Y N = −b)
...

a+b
a+b
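The formulas P(Y_N = a) = b/(a + b) and E(N) = ab can be checked by Monte Carlo simulation; the values a = 3, b = 5 below are illustrative, giving P(Y_N = a) = 5/8 and E(N) = 15.

```python
import random

random.seed(7)

def play(a, b, p=0.5):
    """Run the symmetric random walk until it hits a or -b; return (Y_N, N)."""
    y, n = 0, 0
    while -b < y < a:
        y += 1 if random.random() < p else -1
        n += 1
    return y, n

a, b, paths = 3, 5, 40000
hits_a = steps = 0
for _ in range(paths):
    y, n = play(a, b)
    hits_a += (y == a)
    steps += n
p_a = hits_a / paths          # theory: b/(a+b) = 5/8 = 0.625
mean_n = steps / paths        # theory: a*b = 15
```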
Example 6.11 (unfair game) As in example 6.10, a gambler wins or loses $1 per play, but now with

Z = +1 with probability p,  Z = −1 with probability 1 − p;  p ≠ 1/2.

Thus, the win and loss probabilities on a play are different. The mean value of Z_i is

E(Z_i) = 2p − 1.

Let N be defined as in example 6.10. By introducing Y_n = Z_1 + Z_2 + ... + Z_n, the sequence {X_1, X_2, ...} with

X_n = Y_n − n(2p − 1)

is a martingale (example 6.1). If this martingale is stopped at time N given by (6.22), theorem 6.2 yields

0 = E(X_N) = E(Y_N) − (2p − 1) E(N),      (6.23)

where, as in example 6.10,

E(Y_N) = a P(Y_N = a) − b P(Y_N = −b).

For establishing another equation in the three unknowns

P(Y_N = a), P(Y_N = −b), and E(N),

the exponential martingale (example 6.3) is used. Let θ be given by

θ = ln [(1 − p)/p].

Then m(θ) = p e^θ + (1 − p) e^{−θ} = (1 − p) + p = 1. Hence, the sequence {U_1, U_2, ...} with

U_n = e^{θ Y_n}

is a martingale. By theorem 6.2,

1 = E(U_1) = E(U_N) = e^{θa} P(Y_N = a) + e^{−θb} P(Y_N = −b).      (6.24)

Equations (6.23) and (6.24) together with P(Y_N = a) + P(Y_N = −b) = 1 yield the 'hitting' probabilities

P(Y_N = a) = [1 − (p/(1−p))^b] / [((1−p)/p)^a − (p/(1−p))^b],

P(Y_N = −b) = [((1−p)/p)^a − 1] / [((1−p)/p)^a − (p/(1−p))^b],

and the mean duration of a game

E(N) = [a P(Y_N = a) − b P(Y_N = −b)] / (2p − 1).
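The hitting probabilities and the mean duration of the unfair game can be evaluated from the closed formulas and compared with a simulation; the parameters p = 0.4, a = 3, b = 4 below are illustrative.

```python
import random

random.seed(8)

p, a, b = 0.4, 3, 4
q = (1 - p) / p               # = e**theta with theta = ln((1-p)/p); here q = 1.5

# hitting probabilities and mean duration from (6.23), (6.24)
p_a = (1 - q ** (-b)) / (q ** a - q ** (-b))
p_b = (q ** a - 1) / (q ** a - q ** (-b))
mean_n = (a * p_a - b * p_b) / (2 * p - 1)

def play():
    """Run the biased random walk until it hits a or -b; return (Y_N, N)."""
    y, n = 0, 0
    while -b < y < a:
        y += 1 if random.random() < p else -1
        n += 1
    return y, n

paths = 40000
hits = steps = 0
for _ in range(paths):
    y, n = play()
    hits += (y == a)
    steps += n
```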
1
...
} are listed
...
Then there exists a random variable X ∞ with
property that the random sequence X 0 , X 1 ,
...
9
...


n→∞

2) Let sup E(X 2 ) < ∞
...
converges in mean square towards X ∞ :
lim E((X n − X ∞ ) 2 ) = 0
...
and X 0 = μ
...
;
then, for for all n = 1, 2,
...



Hence, if the increments X i+1 − X i of the martingale {X 1 , X 2 ,
...

4) (Doob's inequalities) For all n = 1, 2,
...

λα
⎝ i=0,1,
...

⎠ ⎝α−1⎠
⎝ i=0,1,
...

⎝ ⎠
⎝ ⎠
⎝ i=0,1,
...
6.2 CONTINUOUS-TIME MARTINGALES
This section summarizes some results on continuous-time martingales. The following definition of continuous-time martingales is based on the concept of the conditional mean value of a random variable given one or more other random variables (chapter 1).

Definition 6.4 A stochastic process {X(t), t ≥ 0} with E(|X(t)|) < ∞ for all t ≥ 0 is called a martingale if for all integers n = 0, 1, ..., all sequences t_0, t_1, ..., t_n with 0 ≤ t_0 < t_1 < ... < t_n, and all t > t_n,

E(X(t) | X(t_n), ..., X(t_1), X(t_0)) = X(t_n).      (6.25)

Thus, to predict the mean value of a martingale at a time t, only the last observation point before t is relevant. Hence, regardless how large the difference t − t_n is, on average no increase or decrease of the process {X(t), t ≥ 0} can be expected in [t_n, t]. The defining relationship (6.25) of a martingale is, under the assumptions made, frequently written in the form

E(X(t) | X(y), y ≤ s) = X(s), s < t.      (6.26)

{X(t), t ≥ 0} is a supermartingale (submartingale) if in (6.25) and (6.26) '=' is replaced with '≤' ('≥'). If Z is the state space of {X(t), t ≥ 0}, then, as a consequence of (6.25),

E(X(t) | X(t_n) = x_n, ..., X(t_1) = x_1, X(t_0) = x_0) = x_n

for all (x_0, x_1, ..., x_n) with x_i ∈ Z. This conditional mean value, analogously to definition 6.4, can be used to define continuous-time martingales with regard to another stochastic process, analogously to discrete-time martingales.

Definition 6.5 A nonnegative random variable L is called a stopping time for {X(t), t ≥ 0} if, for every s ≥ 0, the occurrence of the random event 'L ≤ s' is fully determined by {X(t), t ≤ s}. Therefore, the occurrence of the random event 'L ≤ s' is independent of all X(t) with t > s.

In what follows, the indicator function of the event 'L > t' is needed:

I_{L>t} = 1 if L > t occurs, and I_{L>t} = 0 otherwise.
Theorem 6.3 (martingale stopping theorem) If {X(t), t ≥ 0} is a continuous-time martingale and L a stopping time for this martingale, then

E(X(L)) = E(X(0))      (6.27)

if at least one of the following conditions is fulfilled:

1) L is bounded.

2) P(L < ∞) = 1, E(|X(L)|) < ∞, and lim_{t→∞} E(|X(t)| I_{L>t}) = 0.

For proofs of theorems 6.1 to 6.3 see, for instance, Kannan [43] and Rolski et al.

Example 6.12 (Lundberg's inequality) By means of theorem 6.3, a proof of Lundberg's inequality can be given. Consider the risk reserve process {R(t), t ≥ 0} of section 3.4,

R(t) = x + κt − C(t),

where x is the initial capital, κ the constant premium rate, and C(t) = Σ_{i=1}^{N(t)} M_i the compound claim process; {N(t), t ≥ 0} is a homogeneous Poisson process with intensity λ, and the claim sizes M_1, M_2, ... are independent, identically as M distributed random variables with density b(·), independent of {N(t), t ≥ 0}. Let

Y(t) = e^{−r R(t)} and h(r) = E(e^{rM}) = ∫_0^∞ e^{rx} b(x) dx

for any positive r with the property

h(r) < ∞.      (6.28)

Then

E(Y(t)) = e^{−r(x+κt)} E(e^{+r C(t)})
= e^{−r(x+κt)} Σ_{n=0}^∞ E(e^{+r C(t)} | N(t) = n) P(N(t) = n)
= e^{−r(x+κt)} Σ_{n=0}^∞ [h(r)]^n (λt)^n e^{−λt} / n!
= e^{−r(x+κt)} e^{λt[h(r)−1]}.

Now define

X(t) = Y(t) / E(Y(t)).

Since {C(t), t ≥ 0} has independent increments, the process {X(t), t ≥ 0} has independent increments, too. Moreover, E(X(t)) ≡ 1. Thus, {X(t), t ≥ 0} is a martingale.

Let L be the first time point at which the risk reserve process becomes negative (time to ruin),

L = inf {t, R(t) < 0}.      (6.29)

Then L is a stopping time for {X(t), t ≥ 0}. Therefore, for any finite z > 0,

L ∧ z = min(L, z)

is a bounded stopping time for {X(t), t ≥ 0} (exercise 6.11). Hence, theorem 6.3 is applicable:

1 = E(X(0)) = E(X(L ∧ z))
= E(X(L ∧ z) | L < z) P(L < z) + E(X(L ∧ z) | L ≥ z) P(L ≥ z)
≥ E(X(L) | L < z) P(L < z).

By (6.29), R(L) < 0, so that Y(L) = e^{−rR(L)} > 1. Thus, from the first and the last line of this derivation,

1 > E(e^{r(x+κL) − λL(h(r)−1)} | L < z) P(L < z),

or, equivalently,

1 > e^{rx} E(e^{[rκ − λ(h(r)−1)]L} | L < z) P(L < z).      (6.30)

If the parameter r is chosen in such a way that

rκ − λ[h(r) − 1] = 0,      (6.31)

then (6.30) simplifies to

P(L < z) < e^{−rx}, and, letting z → ∞, P(L < ∞) ≤ e^{−rx}.      (6.32)

In terms of section 3.4, the probability P(L < ∞) is nothing but the ruin probability p(x). It remains to show that (6.31) is equivalent to the corresponding condition of section 3.4. To verify this by partial integration of

E(e^{rM}) = ∫_0^∞ e^{rx} b(x) dx,

note that condition (6.28) implies

lim_{x→∞} e^{rx} (1 − B(x)) = 0.

Thus, (6.32) is equivalent to the Lundberg inequality (3.161) for the ruin probability.

They are quite analogous to the corresponding ones for discrete-time martingales. Let {X(t), t ≥ 0} be a continuous-time martingale.

1) If sup_t E(|X(t)|) < ∞, then there exists a random variable X_∞ with the property that X(t) converges both with probability one and in mean towards X_∞ as t → ∞:

P(lim_{t→∞} X(t) = X_∞) = 1,  lim_{t→∞} E(|X(t) − X_∞|) = 0.

2) If sup_t E(X²(t)) < ∞, then X(t) converges in mean square towards X_∞ as t → ∞.

3) Let [a, b] ⊆ [0, ∞). Then, for λ > 0,

P(sup_{t∈[a,b]} |X(t)| ≥ λ) ≤ E(|X(b)|) / λ.

4) (Doob's inequalities) Let [a, b] ⊆ [0, ∞). Then, for α > 1,

E(|X(b)|^α) ≤ E([sup_{t∈[a,b]} |X(t)|]^α) ≤ (α/(α−1))^α E(|X(b)|^α).

For proofs and a more prestigious treatment of martingales see, for instance, Rolski et al.

6.3 EXERCISES

6.1) Let Y_0, Y_1, ... be a sequence of independent random variables with … Is the discrete-time stochastic process {X_0, X_1, ...} with X_n = … a martingale?

6.2) Let Y_0, Y_1, ... be a sequence of independent random variables with finite mean values E(Y_i). Verify that the sequence {X_0, X_1, ...} generated by the sums X_n = Σ_{i=0}^n (Y_i − E(Y_i)) is a martingale.

6.3) Let a discrete-time stochastic process {X_0, X_1, ...} be defined by X_n = Y_0 · Y_1 · … · Y_n, where the random variables Y_i are independent and have a uniform distribution over the interval [0, T]. Under which condition is {X_0, X_1, ...} (1) a martingale, (2) a submartingale, (3) a supermartingale?

6.4) Let {X_0, X_1, ...} be the discrete Black-Scholes model defined by X_n = Y_0 · Y_1 · … · Y_n with Y_i = e^{U_i}, U_i = N(μ, σ²). Under which condition is {X_0, X_1, ...} a martingale?

6.5) Starting at value 0, the profit of an investor increases per week by one unit with probability p, p > 1/2, or decreases per week by one unit with probability 1 − p. Let N be the random number of weeks until the investor's profit reaches for the first time a given positive integer n. Determine E(N).

6.6) Let Z_1, Z_2, ..., Z_n be a sequence of independent, identically as Z distributed random variables with

Z = +1 with probability p, Z = −1 with probability 1 − p; 0 < p < 1,

and Y_n = Z_1 + Z_2 + … + Z_n. Show that the sequence {X_1, X_2, ...} with X_n = h(Y_n), where, for any real y,

h(y) = [(1 − p)/p]^y,

is a martingale with regard to {Y_1, Y_2, ...}.

6.7) Starting at value 0, the fortune of an investor increases per week by $200 with probability 3/8, remains constant with probability 3/8 and decreases by $200 with probability 2/8. The investor stops the 'game' as soon as he has made a total fortune of $2000 or a loss of $1000, whichever occurs first. Determine the probability that the investor finishes the 'game' with a fortune of $2000, and the mean duration of the game.

6.8) … (1) Prove that the sequence {X_0, X_1, ...} is a martingale. (2) Show that E(X_k) = T/(k + 1); k = 0, 1, ...

6.9) Let {X_1, X_2, ...} be a discrete-time Markov chain with state space Z = {0, 1, ..., n} and transition probabilities

p_ij = P(X_{k+1} = j | X_k = i) = C(n, j) (i/n)^j (1 − i/n)^{n−j}; i, j ∈ Z.

Show that {X_1, X_2, ...} is a martingale.

6.10) …

6.11) Let L be a first passage time of the stochastic process {X(t), t ∈ T}. Verify that L ∧ z = min(L, z) is a stopping time for {X(t), t ∈ T}.

6.12)* The ruin problem described in section 3.4 is modified in the following way: The risk reserve process {R(t), t ≥ 0} is only observed at the end of each year,

R(n) = x + κn − Σ_{i=0}^n M_i; n = 0, 1, ...,

where x is the initial capital, κ is the constant premium income per year, and M_i is the total claim size the insurance company has to cover in year i, M_0 = 0. The M_1, M_2, ... are assumed to be independent and identically distributed as M = N(μ, σ²) with κ > μ > 3σ. Determine an upper bound for the ruin probability of the company (i.e. for the probability that there is an n so that R(n) < 0).

CHAPTER 7

Brownian Motion

7.1 INTRODUCTION

In 1828, the English botanist Robert Brown published a paper in which he summarized his observations on the ceaseless, seemingly chaotic zigzag movement of microscopically small particles in fluids and tried to find its physical explanation. However, at that time Brown could only speculate on the causes of this phenomenon, and at an early stage of his research he was even convinced that he had found an elementary form of life which is common to all particles. Although this movement had already been detected before Brown, it is generally called Brownian motion.

The first approaches to mathematically modeling the Brownian motion are due to L. Bachelier (1900) and A. Einstein (1905). Both found the normal distribution to be an appropriate model for describing the Brownian motion and gave a physical explanation of the observed phenomenon: The chaotic movement of sufficiently small particles in fluids and in gases is due to the huge number of impacts with the surrounding molecules, even in small time intervals. More precisely, Einstein showed that water molecules can momentarily form a compact conglomerate which has sufficient energy to move a particle when banging into it. These bunches of molecules hit the 'giant' particles from random directions at random times, causing their apparently irregular zigzag motion. As a 'byproduct', his theory of the Brownian motion and its experimental confirmation yielded another argument for the existence of atoms.

N. Wiener defined and analysed a stochastic process which has served up till now as a stochastic model of Brownian motion; frequently, this process is also referred to as the Wiener process. The Brownian motion process is an essential ingredient in stochastic calculus, plays a crucial role in mathematics of finance, is basic for defining one of the most important classes of Markov processes, the diffusion processes, and for solving large sample estimation problems in mathematical statistics.

This chapter only deals with the one-dimensional Brownian motion
...
Definition 7.1 (Brownian motion) A continuous-time stochastic process {B(t), t ≥ 0} with state space Z = (−∞, +∞) is called a (one-dimensional) Brownian motion process or simply Brownian motion if it has the following properties:

1) B(0) = 0.

2) {B(t), t ≥ 0} has homogeneous and independent increments.

3) B(t) has a normal distribution with

E(B(t)) = 0 and Var(B(t)) = σ²t, t > 0.

The condition B(0) = 0 is only a normalization. Actually, in what follows situations will arise in which a Brownian motion is required to start at B(0) = u ≠ 0. The process {B_u(t), t ≥ 0} with

B_u(t) = u + B(t)      (7.1)

is called a shifted Brownian motion. If σ = 1, then {B(t), t ≥ 0} is called a standardized (or standard) Brownian motion, denoted {S(t), t ≥ 0}. Note that

σ² = Var(B(1)).      (7.2)

Figure 7.1 (sample path b(t) of the Brownian motion over t)
For any Brownian motion with parameter σ,

B(t) = σ S(t).      (7.3)

Laplace Transform Since B(t) = N(0, σ²t), the Laplace transform of B(t) is

E(e^{−αB(t)}) = e^{+α²σ²t/2}.      (7.4)
7.2 PROPERTIES OF THE BROWNIAN MOTION
The first problem which has to be addressed is whether there exists a stochastic process having properties 1 to 3. The existence of such a process was proved by N. Wiener in 1923. Here only a heuristic motivation is given. This is done by showing that Brownian motion can be represented as the limit of a discrete-time random walk, where the size of the steps tends to 0 and the number of steps per unit time is speeded up.

Modifying the random walk described in chapter 4, assume that a particle, starting from x = 0, jumps a distance of Δx to the right or to the left every Δt time units, each with probability 1/2. Thus, if X(t) is the position of the particle at time t and X(0) = 0,

X(t) = (X_1 + X_2 + ... + X_{[t/Δt]}) Δx,      (7.5)

where

X_i = +1 if the i th jump goes to the right, X_i = −1 if the i th jump goes to the left,

and [t/Δt] denotes the greatest integer less than or equal to t/Δt. Since E(X_i) = 0 and Var(X_i) = 1, (7.5) yields

E(X(t)) = 0,  Var(X(t)) = (Δx)²[t/Δt].

Now let Δx = σ√Δt. Then, taking the limit as Δt → 0 in (7.5), the central limit theorem implies that X(t) is normally distributed with E(X(t)) = 0 and Var(X(t)) = σ²t.

Due to its construction, {X(t), t ≥ 0} has independent and homogeneous increments. Therefore, the stochastic process of the 'infinitesimal random walk' {X(t), t ≥ 0} is a Brownian motion. The Brownian motion has a number of remarkable properties; some of them will be considered in this chapter.
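The limit construction can be imitated numerically: with Δx = σ√Δt and a small Δt, the scaled random walk (7.5) has mean about 0 and variance about σ²t. The parameters σ = 2, t = 1.5 and Δt = 1/300 in the sketch below are illustrative choices.

```python
import random

random.seed(9)

sigma, t = 2.0, 1.5
dt = 1.0 / 300.0
dx = sigma * dt ** 0.5          # step size Δx = σ√Δt, as in the limit construction

def x_t():
    """Position of the random walk at time t: [t/Δt] jumps of size ±Δx."""
    steps = int(t / dt)
    return sum(dx if random.random() < 0.5 else -dx for _ in range(steps))

paths = 5000
samples = [x_t() for _ in range(paths)]
mean = sum(samples) / paths
var = sum((s - mean) ** 2 for s in samples) / paths
# theory: E(X(t)) = 0 and Var(X(t)) = (Δx)²[t/Δt] ≈ σ²t = 6
```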

Theorem 7.1 A Brownian motion {B(t), t ≥ 0} has the following properties:

a) {B(t), t ≥ 0} is mean-square continuous.

b) {B(t), t ≥ 0} is a martingale.

c) {B(t), t ≥ 0} is a Markov process.

d) {B(t), t ≥ 0} is a Gaussian process.

Proof a) Since the increments of {B(t), t ≥ 0} are homogeneous,

E((B(t) − B(s))²) = Var(B(t) − B(s)) = σ²|t − s|.      (7.6)

Hence,

lim_{h→0} E([B(t + h) − B(t)]²) = lim_{h→0} σ²h = 0,

which proves the mean-square continuity of {B(t), t ≥ 0}.

b) Since a Brownian motion {B(t), t ≥ 0} has independent increments, for s < t,

E(B(t) | B(y), y ≤ s) = E(B(s) + B(t) − B(s) | B(y), y ≤ s)
= B(s) + E(B(t) − B(s) | B(y), y ≤ s)
= B(s) + E(B(t) − B(s))
= B(s) + 0 − 0 = B(s).

c) Any stochastic process with independent increments is a Markov process.

d) Let t_1, t_2, ..., t_n be any sequence of real numbers with 0 < t_1 < t_2 < ... < t_n. It has to be shown that for all n = 1, 2, ... the random vector (B(t_1), B(t_2), ..., B(t_n)) has an n-dimensional normal distribution. This follows from theorem 1.2, since each B(t_i) can be represented as a sum of independent, normally distributed random variables (increments) in the following way:

B(t_i) = B(t_1) + (B(t_2) − B(t_1)) + ... + (B(t_i) − B(t_{i−1})); i = 1, 2, ..., n.
...
Theorem 7.2 Let {S(t), t ≥ 0} be the standardized Brownian motion. Then, for any real α, the process {X(t), t ≥ 0} with

X(t) = e^{αS(t) − α²t/2}

is a martingale (exponential martingale of the Brownian motion).

Proof For s < t,

E(e^{αS(t) − α²t/2} | S(y), y ≤ s) = E(e^{α[S(s) + S(t) − S(s)] − α²t/2} | S(y), y ≤ s)
= e^{αS(s) − α²t/2} E(e^{α[S(t) − S(s)]} | S(y), y ≤ s)
= e^{αS(s) − α²t/2} E(e^{α[S(t) − S(s)]}).

By (7.4) with σ = 1,

E(e^{α[S(t) − S(s)]}) = e^{+α²(t − s)/2}.

Hence,

E(X(t) | S(y), y ≤ s) = e^{αS(s) − α²t/2} e^{α²(t − s)/2} = e^{αS(s) − α²s/2} = X(s),      (7.7)

which proves the martingale property.

There is an obvious analogy between the exponential and the variance martingale defined in examples 6.3 and 6.6. The relationship (7.7) can be used to generate further martingales: differentiating (7.7) with regard to α once and twice, respectively, and letting α = 0, 'proves' once more that {S(t), t ≥ 0} and {S²(t) − t, t ≥ 0} are martingales. Differentiating (7.7) three and four times generates the martingales

{S³(t) − 3tS(t), t ≥ 0} and {S⁴(t) − 6tS²(t) + 3t², t ≥ 0},

and so on.

More exactly, the probability that a sample path of a Brownian motion is continuous is equal to 1. In view of this, it may surprise that the sample paths of a Brownian motion are nowhere differentiable. A heuristic explanation is based on (7.6): For any sample path b = b(t) and any sufficiently small, but positive Δt, the difference

Δb = b(t + Δt) − b(t)

is approximately equal to σ√Δt. Hence,

Δb/Δt ≈ σ√Δt/Δt = σ/√Δt.

Thus, for Δt → 0, the difference quotient Δb/Δt is likely to tend to infinity for any nonnegative t. (For proofs, see e.g. Kannan [43].) Moreover, with probability 1 the sample paths of a Brownian motion are of unbounded variation on any finite interval.      (7.8)

What geometric structure is such a sample path supposed to have? The most intuitive explanation is that the sample paths of any Brownian motion are strongly dentate (in the sense of the structure of leaves), but this structure must continue to the infinitesimal. The numerous and rapid bombardments of particles in liquids or gases by the surrounding molecules cannot lead to a smooth sample path. Hence, the Brownian motion process cannot be a mathematically exact model for describing the movement of particles in these media. (For modeling the velocity of particles in liquids or gases, the Ornstein-Uhlenbeck process has been developed; see section 7.5.2.)

7.3 MULTIDIMENSIONAL DISTRIBUTIONS OF THE BROWNIAN MOTION

From property 3 of definition 7.1, the density of B(t) is

f_t(x) = (1/(√(2πt) σ)) exp(−x²/(2σ²t)), t > 0.      (7.9)

Next the joint density f_{s,t}(x_1, x_2) of (B(s), B(t)), s < t, will be derived. Given B(s) = x_1, the increment B(t) − B(s) = x_2 − x_1 is independent of B(s). Hence,

f_{s,t}(x_1, x_2) = f_s(x_1) f_{t−s}(x_2 − x_1), 0 < s < t.      (7.10)

(This derivation can easily be made rigorous.) Substituting (7.9) into (7.10) and comparing with the two-dimensional normal density (1.66) shows that (B(s), B(t)) has a joint normal distribution with correlation coefficient

ρ = +√(s/t), 0 < s < t.      (7.11)

Since the roles of s and t can be interchanged, the covariance function of {B(t), t ≥ 0} is

C(s, t) = σ² min(s, t).      (7.12)

However, in view of the independence of the increments of the Brownian motion, it is easier to determine the covariance function of {B(t), t ≥ 0} directly: For 0 < s ≤ t,

C(s, t) = Cov(B(s), B(t)) = Cov(B(s), B(s) + B(t) − B(s))
= Cov(B(s), B(s)) + Cov(B(s), B(t) − B(s))
= Cov(B(s), B(s)) = σ²s.

Let 0 < s < t. By (1.59), the conditional density of B(s) given B(t) = b is

f_{B(s)}(x | B(t) = b) = f_{s,t}(x, b) / f_t(b).      (7.13)

Substituting (7.9) and (7.10) into (7.13) yields

f_{B(s)}(x | B(t) = b) = (1/√(2πσ²(s/t)(t − s))) exp{−(x − (s/t)b)² / (2σ²(s/t)(t − s))}.      (7.14)

This is the density of a normally distributed random variable with parameters

E(B(s) | B(t) = b) = (s/t)b,  Var(B(s) | B(t) = b) = (s/t)(t − s)σ².      (7.15)

Let f_{t_1,t_2,...,t_n}(x_1, x_2, ..., x_n) be the n-dimensional density of the random vector (B(t_1), B(t_2), ..., B(t_n)), 0 < t_1 < t_2 < ... < t_n < ∞. Analogously to (7.10), by induction,

f_{t_1,t_2,...,t_n}(x_1, x_2, ..., x_n) = f_{t_1}(x_1) f_{t_2−t_1}(x_2 − x_1) ... f_{t_n−t_{n−1}}(x_n − x_{n−1}).

With f_t(x) given by (7.9),

f_{t_1,t_2,...,t_n}(x_1, x_2, ..., x_n)
= exp{−(1/(2σ²))[x_1²/t_1 + (x_2 − x_1)²/(t_2 − t_1) + ... + (x_n − x_{n−1})²/(t_n − t_{n−1})]}
/ [(2π)^{n/2} σ^n √(t_1(t_2 − t_1) ... (t_n − t_{n−1}))].      (7.16)

The density (7.16) has the form of an n-dimensional normal density. This proves once more that the Brownian motion is a Gaussian process. Actually, since the trend function of a Brownian motion is identically zero, the Brownian motion is completely characterized by its covariance function.

Example 7.1 (Brownian bridge) The Brownian bridge {B(t), 0 ≤ t ≤ 1} is the Brownian motion on [0, 1] conditioned on B(1) = 0. Letting in (7.14) and (7.15) t = 1 and b = 0 shows that B(s), given B(1) = 0, has a normal distribution with

E(B(s)) = 0, Var(B(s)) = σ²s(1 − s), 0 ≤ s ≤ 1.

Hence, mean value and variance of the Brownian bridge are

E(B(t)) = 0, Var(B(t)) = σ²t(1 − t), 0 ≤ t ≤ 1.

For 0 < s < t < 1, the joint density of (B(s), B(t)) is

f_{(B(s),B(t))}(x_1, x_2) = (1/(2πσ²√(s(t − s)(1 − t))))
× exp{−(1/(2σ²))[ (t/(s(t − s))) x_1² − (2/(t − s)) x_1 x_2 + ((1 − s)/((t − s)(1 − t))) x_2² ]}.

A comparison with (1.66) shows that correlation and covariance function of the Brownian bridge are

ρ(s, t) = √(s(1 − t)/(t(1 − s))),  C(s, t) = σ²s(1 − t), 0 < s < t ≤ 1.

The Brownian bridge is a Gaussian process. Hence, it is uniquely determined by its trend and covariance function.
7.4 FIRST PASSAGE TIMES

By definition, the Brownian motion {B(t), t ≥ 0} starts at B(0) = 0. Let L(x) denote the first time point at which the process reaches level x. Since the sample paths of the Brownian motion are continuous functions, L(x) is uniquely characterized by B(L(x)) = x and can, therefore, be defined as
$$L(x)=\min\{t,\;B(t)=x\},\qquad x\in(-\infty,+\infty).$$

Figure 7.2 Illustration of the first passage time and the reflection principle

Next the probability distribution of L(x) is derived on condition x > 0. Application of the total probability rule yields
$$P(B(t)\ge x)=P(B(t)\ge x\mid L(x)\le t)\,P(L(x)\le t)+P(B(t)\ge x\mid L(x)>t)\,P(L(x)>t).\qquad(7.17)$$
The second term on the right-hand side of this formula vanishes since, by definition of the first passage time,
$$P(B(t)\ge x\mid L(x)>t)=0\quad\text{for all }t>0.$$
Moreover,
$$P(B(t)\ge x\mid L(x)\le t)=\tfrac{1}{2}.\qquad(7.18)$$
This is made plausible by Figure 7.2: two sample paths of the Brownian motion which coincide up to reaching level x and which after L(x) are mirror-symmetric with respect to the straight line b(t) ≡ x have the same chance of occurring. This heuristic argument is known as the reflection principle. From (7.9), (7.17), and (7.18),
$$F_{L(x)}(t)=P(L(x)\le t)=2\,P(B(t)\ge x)=\frac{2}{\sqrt{2\pi t}\,\sigma}\int_x^\infty e^{-u^2/(2\sigma^2 t)}\,du,\qquad t>0.$$
Substituting u = σ√t v, the distribution function of the first passage time L(x) can be written as
$$F_{L(x)}(t)=2\left[1-\Phi\!\left(\frac{x}{\sigma\sqrt{t}}\right)\right],\qquad t>0.\qquad(7.19)$$
Differentiation with respect to t yields the probability density of L(x):
$$f_{L(x)}(t)=\frac{x}{\sqrt{2\pi}\,\sigma\,t^{3/2}}\,\exp\left\{-\frac{x^2}{2\sigma^2 t}\right\},\qquad t>0.\qquad(7.20)$$
The parameters E(L(x)) and Var(L(x)) do not exist, i.e. they are infinite.

Let M(t) be the maximum of the Brownian motion in [0, t]:
$$M(t)=\max_{0\le s\le t}B(s).\qquad(7.21)$$
From (7.19), the probability distribution of M(t) is obtained as follows:
$$1-F_{M(t)}(x)=P(M(t)\ge x)=P(L(x)\le t).$$
Thus, by (7.19), distribution function and probability density of M(t) are, for t > 0,
$$F_{M(t)}(x)=2\,\Phi\!\left(\frac{x}{\sigma\sqrt{t}}\right)-1,\qquad x\ge 0,\qquad(7.22)$$
$$f_{M(t)}(x)=\frac{2}{\sqrt{2\pi t}\,\sigma}\,e^{-x^2/(2\sigma^2 t)},\qquad x\ge 0.\qquad(7.23)$$
An implication of (7.22): for all finite x,
$$\lim_{t\to\infty}P(M(t)<x)=0.\qquad(7.24)$$
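The first-passage distribution (7.19) can be checked against simulated paths. The following sketch (hypothetical concrete choices: σ = 1, level x = 1) discretizes Brownian paths on a time grid, records the first grid time at which each path exceeds x, and compares the empirical probabilities with 2[1 − Φ(x/(σ√t))]; the small remaining discrepancy stems from crossings missed between grid points.

```python
import math
import random

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def first_passage_cdf(x, t, sigma=1.0):
    """F_{L(x)}(t) = 2[1 - Phi(x/(sigma*sqrt(t)))], formula (7.19)."""
    return 2.0 * (1.0 - phi(x / (sigma * math.sqrt(t))))

def simulate_first_passage(x=1.0, sigma=1.0, horizon=2.0, dt=0.004,
                           n_paths=4000, seed=1):
    """Empirical first passage times (math.inf if x is not reached)."""
    rng = random.Random(seed)
    step_sd = sigma * math.sqrt(dt)
    n_steps = int(horizon / dt)
    times = []
    for _ in range(n_paths):
        pos, hit = 0.0, math.inf
        for k in range(1, n_steps + 1):
            pos += rng.gauss(0.0, step_sd)
            if pos >= x:           # first grid point at or above level x
                hit = k * dt
                break
        times.append(hit)
    return times
```

The discrete monitoring slightly underestimates F_{L(x)}(t), which is why the comparison below uses a loose tolerance.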

Example 7.2 A sensor continuously measures the temperature of a liquid. At the start, the measurement is absolutely correct, but in the course of time the indicated value randomly deviates from the true temperature. Let B(t) be the random deviation of the temperature indicated by the sensor at time t from the true temperature. It is assumed that {B(t), t ≥ 0} is a Brownian motion with parameter σ = 0.1 (in [°C], with the time unit 24 h).

If the maximal deviation M(t) may exceed the level x = 5 °C only with probability 0.05 during the operating time, then the sensor has to be exchanged by a new one after a time τ₀.₀₅ given by
$$P(M(\tau_{0.05})\ge 5)=0.05.$$
According to (7.22), τ₀.₀₅ satisfies the equation
$$2\left[1-\Phi\!\left(\frac{5}{0.1\sqrt{\tau_{0.05}}}\right)\right]=0.05.$$
Since Φ⁻¹(0.975) = 1.96,
$$\frac{5}{0.1\sqrt{\tau_{0.05}}}=1.96.$$
Thus, τ₀.₀₅ = (50/1.96)² ≈ 651 [days].
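The replacement time of example 7.2 can also be obtained by a numerical root search on (7.22) instead of using the quantile 1.96; a minimal sketch with the example's values σ = 0.1 and x = 5:

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def exceedance_prob(tau, x=5.0, sigma=0.1):
    """P(M(tau) >= x) = 2[1 - Phi(x/(sigma*sqrt(tau)))], formula (7.22)."""
    return 2.0 * (1.0 - phi(x / (sigma * math.sqrt(tau))))

def replacement_time(p=0.05, x=5.0, sigma=0.1):
    """Solve P(M(tau) >= x) = p for tau by bisection;
    the exceedance probability is increasing in tau."""
    lo, hi = 1e-6, 1e6
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if exceedance_prob(mid, x, sigma) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

The result agrees with the closed-form value (50/1.96)² ≈ 650.8 days.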
The following example presents another, more prominent application of the probability distribution of M(t).

Example 7.3 Let p(1, d] be the probability that the Brownian motion {B(t), t ≥ 0} crosses the t-axis at least once in the interval (1, d], 1 < d. By (7.22) and the symmetry of the Brownian motion, for any b > 0,
$$P(B(t)=0\text{ for a }t\text{ with }1<t\le d\mid B(1)=b)=P(B(t)=0\text{ for a }t\text{ with }1<t\le d\mid B(1)=-b)$$
$$=P(B(t)\le -b\text{ for a }t\text{ with }0<t\le d-1)=P(B(t)\ge b\text{ for a }t\text{ with }0<t\le d-1)$$
$$=P(M(d-1)\ge b)=\frac{2}{\sqrt{2\pi(d-1)}\,\sigma}\int_b^\infty e^{-u^2/(2\sigma^2(d-1))}\,du.\qquad(7.25)$$
Since b is a value the random variable B(1) can assume, the mean value of the random probability P(B(t) = 0 for a t with 1 < t ≤ d | B(1)) is the desired probability p(1, d]. Hence, from (7.25) and the density (7.9) of B(1),
$$p(1,d]=\frac{2}{\pi\sqrt{d-1}\,\sigma^2}\int_0^\infty\int_b^\infty e^{-\frac{u^2}{2\sigma^2(d-1)}}\,e^{-\frac{b^2}{2\sigma^2}}\,du\,db.$$
By substituting u = xσ√(d−1) and y = b/σ in the inner and outer integral, respectively,
$$p(1,d]=\frac{2}{\pi}\int_0^\infty\int_{y/\sqrt{d-1}}^\infty e^{-\frac{x^2+y^2}{2}}\,dx\,dy.$$
Next the double integral is transformed into polar coordinates (r, φ). Then the domain of the (x, y)-integration has to be transformed as follows:
$$\left\{0<y<\infty,\;\frac{y}{\sqrt{d-1}}<x<\infty\right\}\;\rightarrow\;\left\{0<r<\infty,\;\arctan\frac{1}{\sqrt{d-1}}<\varphi<\frac{\pi}{2}\right\}.$$
Since
$$\int_0^\infty r\,e^{-r^2/2}\,dr=1,$$
the desired probability becomes
$$p(1,d]=\frac{2}{\pi}\int_{\arctan\frac{1}{\sqrt{d-1}}}^{\pi/2}\int_0^\infty e^{-r^2/2}\,r\,dr\,d\varphi=\frac{2}{\pi}\left[\frac{\pi}{2}-\arctan\frac{1}{\sqrt{d-1}}\right]=1-\frac{2}{\pi}\arctan\frac{1}{\sqrt{d-1}}=\frac{2}{\pi}\arccos\frac{1}{\sqrt{d}}.$$
By introducing the time unit c, 0 < c < d, i.e. replacing d with d/c, this formula yields the probability p(c, d] that the Brownian motion crosses the t-axis at least once in the interval (c, d]:
$$p(c,d]=\frac{2}{\pi}\arccos\sqrt{\frac{c}{d}}.\qquad(7.26)$$
Now let τ be the time point of the last crossing of the t-axis before d. Then the random event 'τ ≤ c' with c < d occurs if and only if there is no time point t in (c, d] satisfying B(t) = 0. Hence, by (7.26), for 0 < c < d,
$$P(\tau\le c)=1-p(c,d]=\frac{2}{\pi}\arcsin\sqrt{\frac{c}{d}}.$$
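The derivation of example 7.3 can be verified numerically: averaging the conditional crossing probability 2[1 − Φ(b/(σ√(d−1)))] over the distribution of |B(1)| must reproduce (2/π) arccos(1/√d). A sketch using plain trapezoidal quadrature (standard case σ = 1 assumed):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def crossing_prob_exact(d):
    """p(1, d] = (2/pi) arccos(1/sqrt(d))."""
    return (2.0 / math.pi) * math.acos(1.0 / math.sqrt(d))

def crossing_prob_quadrature(d, b_max=12.0, n=120000):
    """Average 2[1 - Phi(b/sqrt(d-1))] over the density of |B(1)|
    (sigma = 1) by the trapezoidal rule."""
    h = b_max / n
    total = 0.0
    for k in range(n + 1):
        b = k * h
        cond = 2.0 * (1.0 - phi(b / math.sqrt(d - 1.0)))
        dens = 2.0 * math.exp(-0.5 * b * b) / math.sqrt(2.0 * math.pi)
        w = 0.5 if k in (0, n) else 1.0
        total += w * cond * dens
    return total * h
```

For d = 4 the exact value is (2/π) arccos(1/2) = 2/3, for d = 2 it is 1/2.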

Example 7.4 Let a and b with b < 0 < a be two levels, and let
$$L(a,b)=\min(L(a),\,L(b))$$
be the first passage time of {B(t), t ≥ 0} with regard to the pair of levels a, b. Then the probability p_{a,b} that {B(t), t ≥ 0} assumes value a before value b is
$$p_{a,b}=P(L(a)<L(b))=P(L(a,b)=L(a))$$
(Figure 7.3). To determine p_{a,b}, note that L(a,b) is a stopping time for {B(t), t ≥ 0}. In view of (7.24), L(a,b) is finite with probability 1. Hence, theorem 6.3 is applicable and yields
$$0=E(B(L(a,b)))=a\,p_{a,b}+b\,(1-p_{a,b}),$$
so that
$$p_{a,b}=\frac{-b}{a-b}=\frac{|b|}{a+|b|}.\qquad(7.27)$$
For determining the mean value of L(a,b), the martingale {Y(t), t ≥ 0} with
$$Y(t)=\frac{1}{\sigma^2}\,B^2(t)-t$$
is used. In this case, theorem 6.3 yields E(Y(L(a,b))) = 0, i.e.
$$E(L(a,b))=E\left(\frac{1}{\sigma^2}\,B^2(L(a,b))\right)=\frac{1}{\sigma^2}\left[p_{a,b}\,a^2+(1-p_{a,b})\,b^2\right].$$
By (7.27),
$$E(L(a,b))=\frac{|a\,b|}{\sigma^2}.\qquad(7.28)$$

Figure 7.3 First-passage times with regard to an interval
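Both results of example 7.4 can be checked with a symmetric random walk, for which the exit probability and the mean exit time coincide exactly with the Brownian values; a Monte Carlo sketch (hypothetical choice a = 2, b = −1, σ = 1, so p_{a,b} = 1/3 and E(L(a,b)) = 2):

```python
import random

def exit_interval(a_steps, b_steps, h, n_paths=2000, seed=7):
    """Symmetric random walk with spatial step h approximating a standard
    Brownian motion; each step advances time by h*h.  Returns the fraction
    of paths hitting +a_steps*h before -b_steps*h and the mean exit time."""
    rng = random.Random(seed)
    hits_a = 0
    total_time = 0.0
    for _ in range(n_paths):
        pos, steps = 0, 0
        while -b_steps < pos < a_steps:
            pos += 1 if rng.random() < 0.5 else -1
            steps += 1
        if pos == a_steps:
            hits_a += 1
        total_time += steps * h * h
    return hits_a / n_paths, total_time / n_paths
```

With a_steps = 20, b_steps = 10, h = 0.1 the levels are a = 2 and b = −1.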
As an application of the situation considered in this example, assume that the total profit which a speculator makes with a certain investment develops according to a Brownian motion process {B(t), t ≥ 0}, i.e. B(t) is the cumulative 'profit' the speculator has achieved at time t (possibly negative). If the speculator quits as soon as either a profit of a or a loss of |b| is reached, the mean duration of the speculation is E(L(a,b)) = |ab|/σ². Or, if in example 7.2 the tolerance region for B(t) is [−5 °C, +5 °C], then B(t) on average leaves this region for the first time after
$$E(L)=25/0.01=2500\ [days].$$

7.5 TRANSFORMATIONS OF THE BROWNIAN MOTION

7.5.1 Identical Transformations

Some transformations of the Brownian motion again lead to the Brownian motion. Theorem 7.3 compiles three transformations of this type.

Theorem 7.3 If {S(t), t ≥ 0} is the standard Brownian motion, then each of the following stochastic processes is also the standard Brownian motion:
(1) {X(t), t ≥ 0} with X(t) = c S(t/c²), c > 0,
(2) {Y(t), t ≥ 0} with Y(t) = S(t + h) − S(h), h > 0,
(3) {Z(t), t ≥ 0} with Z(t) = t S(1/t) for t > 0 and Z(0) = 0.

Proof Since the Brownian motion has independent, normally distributed increments, the processes (1) to (3) have the same property. Therefore, it remains to show that the increments of the processes (1) to (3) are homogeneous. In view of (7.1), it suffices to prove that the variances of the increments of the processes (1) to (3) in any interval [s, t] with s < t are equal to t − s. This is done by making use of (7.12):

(1)
$$Var(X(t)-X(s))=E\left(\left[c\,S(t/c^2)-c\,S(s/c^2)\right]^2\right)=c^2\left[\frac{t}{c^2}-2\,\frac{s}{c^2}+\frac{s}{c^2}\right]=t-s.$$
(2)
$$Var(Y(t)-Y(s))=E\left([S(t+h)-S(s+h)]^2\right)=(t+h)-2(s+h)+(s+h)=t-s.$$
(3) Since Cov(S(1/s), S(1/t)) = 1/t for s < t,
$$Var(Z(t)-Z(s))=E\left([t\,S(1/t)-s\,S(1/s)]^2\right)=t^2\,\frac{1}{t}-2\,t\,s\,\frac{1}{t}+s^2\,\frac{1}{s}=t-s.$$
Thus, the theorem is proved. ∎

It is well known that
$$P\left(\lim_{t\to\infty}\frac{S(t)}{t}=0\right)=1.\qquad(7.29)$$
(If t is replaced with 1/t, then taking the limit as t → ∞ is equivalent to taking the limit as t → 0. Hence,
$$P\left(\lim_{t\to 0}\,t\,S(1/t)=0\right)=1,\qquad(7.30)$$
which makes sure that Z(t) = t S(1/t) is continuous at t = 0.)

An implication of (7.29) is that any Brownian motion {B(t), t ≥ 0} crosses the t-axis with probability 1 at least once in the interval [s, ∞), s > 0, and, therefore, even countably infinitely many times. By transformation (3) of theorem 7.3, this property carries over to the interval (0, s]. Therefore, for any s > 0, no matter how small s is, a Brownian motion {B(t), t ≥ 0} crosses the t-axis in (0, s] countably infinitely many times with probability 1.

7.5.2 Reflected Brownian Motion

The stochastic process {X(t), t ≥ 0} with X(t) = |B(t)| is called reflected Brownian motion. Its trend and variance function are
$$m(t)=E(X(t))=\frac{2}{\sqrt{2\pi t}\,\sigma}\int_0^\infty x\,e^{-\frac{x^2}{2\sigma^2 t}}\,dx=\sigma\sqrt{\frac{2t}{\pi}},\qquad t\ge 0,$$
$$Var(X(t))=E(X^2(t))-[E(X(t))]^2=\sigma^2 t-\sigma^2\,\frac{2t}{\pi}=(1-2/\pi)\,\sigma^2 t.$$
The reflected Brownian motion is a Markov process. This can be seen as follows: for 0 ≤ t₁ < t₂ < ⋯ < tₙ < t,
$$P(X(t)\le y\mid X(t_1)=x_1,\ldots,X(t_n)=x_n)=P(-y\le B(t)\le +y\mid B(t_1)=\pm x_1,\ldots,B(t_n)=\pm x_n).$$
Hence, for 0 ≤ s < t, the transition probabilities P(X(t) ≤ y | X(s) = x) of the reflected Brownian motion are determined by the increment of the Brownian motion in [s, t] if it starts at time s at state x:
$$P(X(t)\le y\mid X(s)=x)=\frac{1}{\sqrt{2\pi\tau}\,\sigma}\int_{-y}^{+y}e^{-\frac{(u-x)^2}{2\sigma^2\tau}}\,du.$$
Equivalently,
$$P(X(t)\le y\mid X(s)=x)=\Phi\!\left(\frac{y-x}{\sigma\sqrt{\tau}}\right)-\Phi\!\left(-\frac{y+x}{\sigma\sqrt{\tau}}\right);\qquad x,y\ge 0;\;\tau=t-s.$$

7.5.3 Geometric Brownian Motion
A stochastic process {X(t), t ≥ 0} with
$$X(t)=e^{B(t)}\qquad(7.31)$$
is called geometric Brownian motion. Unlike the Brownian motion, the sample paths of a geometric Brownian motion cannot become negative. Since B(t) = N(0, σ²t), the moment generating function of B(t) is
$$E\left(e^{\alpha B(t)}\right)=e^{\frac{1}{2}\alpha^2\sigma^2 t}.\qquad(7.32)$$
Replacing in (7.32) the parameter α with an integer n yields all the moments of X(t):
$$E(X^n(t))=e^{\frac{1}{2}n^2\sigma^2 t};\qquad n=1,2,\ldots\qquad(7.33)$$
In particular, mean value and second moment of X(t) are
$$E(X(t))=e^{\frac{1}{2}\sigma^2 t},\qquad E(X^2(t))=e^{2\sigma^2 t}.\qquad(7.34)$$
Hence,
$$Var(X(t))=e^{t\sigma^2}\left(e^{t\sigma^2}-1\right).\qquad(7.35)$$
7.5.4 Ornstein-Uhlenbeck Process

The sample paths of the Brownian motion are nowhere differentiable, so that the Brownian motion itself is not suited for modeling the velocity of a diffusing particle. To overcome this unrealistic situation, Ornstein and Uhlenbeck developed a stochastic process for modeling the velocity of tiny particles in liquids and gases.

Definition 7.2 Let {B(t), t ≥ 0} be a Brownian motion with parameter σ. Then the stochastic process {U(t), t ≥ 0} with
$$U(t)=e^{-\alpha t}\,B\!\left(e^{2\alpha t}\right)\qquad(7.36)$$
is said to be an Ornstein-Uhlenbeck process with parameters σ and α, α > 0.

The one-dimensional density of U(t) follows from (7.9):
$$f_{U(t)}(x)=\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-x^2/(2\sigma^2)},\qquad -\infty<x<\infty.$$
In particular, the trend function of the Ornstein-Uhlenbeck process is identically 0, and U(t) is standard normal if {B(t), t ≥ 0} is the standard Brownian motion. As a scaled and deterministically time-transformed Gaussian process, the Ornstein-Uhlenbeck process is itself a Gaussian process. Hence, the multidimensional distributions of the Ornstein-Uhlenbeck process are multidimensional normal distributions. Thus, the Ornstein-Uhlenbeck process, like the Brownian motion, is a Markov process.

Its covariance function is obtained from (7.12): for s ≤ t,
$$C(s,t)=Cov(U(s),U(t))=E(U(s)\,U(t))=e^{-\alpha(s+t)}\,Cov\!\left(B(e^{2\alpha s}),B(e^{2\alpha t})\right)=e^{-\alpha(s+t)}\,\sigma^2 e^{2\alpha s}=\sigma^2\,e^{-\alpha(t-s)}.\qquad(7.37)$$
Hence, the Ornstein-Uhlenbeck process is weakly stationary. Therefore, as a Gaussian process, it is also strongly stationary. In contrast to the Brownian motion, the Ornstein-Uhlenbeck process has the following properties:
1) The increments of the Ornstein-Uhlenbeck process are not independent.
2) It is stationary.
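The covariance (7.37) can be checked by sampling pairs (U(s), U(t)) directly through the defining transformation (7.36); a sketch for the standard case σ = 1 with hypothetical values α = 1, s = 0.5, t = 1:

```python
import math
import random

def ou_cov_mc(alpha=1.0, s=0.5, t=1.0, n=200000, seed=11):
    """Sample U(s), U(t) via U(t) = exp(-alpha t) B(exp(2 alpha t)) for a
    standard Brownian motion B and estimate Cov(U(s), U(t))."""
    rng = random.Random(seed)
    ts, tt = math.exp(2 * alpha * s), math.exp(2 * alpha * t)
    acc = 0.0
    for _ in range(n):
        b_s = rng.gauss(0.0, math.sqrt(ts))                 # B(e^{2 alpha s})
        b_t = b_s + rng.gauss(0.0, math.sqrt(tt - ts))      # independent increment
        acc += (math.exp(-alpha * s) * b_s) * (math.exp(-alpha * t) * b_t)
    return acc / n   # trend is 0, so E(U(s)U(t)) is the covariance

def ou_cov_exact(alpha=1.0, s=0.5, t=1.0):
    """Formula (7.37) with sigma = 1: C(s,t) = exp(-alpha (t - s))."""
    return math.exp(-alpha * (t - s))
```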
7.5.5 Brownian Motion with Drift

7.5.5.1 Definitions and First Passage Times

Definition 7.3 A stochastic process {D(t), t ≥ 0} is called Brownian motion with drift if it has the following properties:
1) D(0) = 0.
2) {D(t), t ≥ 0} has homogeneous and independent increments.
3) Every increment D(t) − D(s) has a normal distribution with mean value μ(t − s) and variance σ²(t − s).

Equivalently, {D(t), t ≥ 0} is a Brownian motion with drift parameter μ if and only if it has the representation
$$D(t)=\mu t+B(t),\qquad(7.38)$$
where {B(t), t ≥ 0} is the Brownian motion. Thus, a Brownian motion with drift arises by superimposing a Brownian motion on a deterministic function, here the trend function m(t) = μt.

If properties 2) and 3) are fulfilled, but the process starts at time t = 0 at level u, u ≠ 0, then the resulting stochastic process {D_u(t), t ≥ 0} with D_u(t) = u + D(t) is called a shifted Brownian motion with drift.

The one-dimensional density functions of the Brownian motion with drift are
$$f_{D(t)}(x)=\frac{1}{\sqrt{2\pi t}\,\sigma}\,e^{-\frac{(x-\mu t)^2}{2\sigma^2 t}};\qquad -\infty<x<\infty,\;t>0.\qquad(7.39)$$
Brownian motion processes with drift are, amongst other applications, used for modeling wear parameters, maintenance cost rates, productivity criteria and capital increments over given time periods as well as for modeling physical noise.



Figure 7.4 Sample path d(t) of a Brownian motion with drift and its trend function m(t) = μt

Let L(x) be the first passage time of the Brownian motion with drift {D(t), t ≥ 0} with regard to level x:
$$L(x)=\min\{t,\;D(t)=x\},\qquad x\in(-\infty,+\infty).$$
For x > 0, the probability density of L(x) is
$$f_{L(x)}(t)=\frac{x}{\sqrt{2\pi}\,\sigma\,t^{3/2}}\,\exp\left\{-\frac{(x-\mu t)^2}{2\sigma^2 t}\right\},\qquad t>0.\qquad(7.40)$$
(For more general assumptions guaranteeing the validity of this formula, see Franz [30].) For symmetry reasons, the probability density of the first passage time L(x) of a Brownian motion with drift starting at u can be obtained from (7.40) by replacing x with x − u.

The probability distribution given by the density (7.40) is called inverse Gaussian distribution. For μ > 0, its mean value and variance are
$$E(L(x))=\frac{x}{\mu},\qquad Var(L(x))=\frac{x\,\sigma^2}{\mu^3}.\qquad(7.41)$$
If μ = 0, (7.40) simplifies to the first passage time density (7.20). If x < 0 and μ < 0, formula (7.40) remains valid with x and μ replaced by |x| and |μ|.

Let
$$F_{L(x)}(t)=P(L(x)\le t)\quad\text{and}\quad \overline F_{L(x)}(t)=1-F_{L(x)}(t),\qquad t\ge 0.$$
Integrating (7.40) yields
$$\overline F_{L(x)}(t)=\Phi\!\left(\frac{x-\mu t}{\sqrt{t}\,\sigma}\right)-e^{2x\mu/\sigma^2}\,\Phi\!\left(-\frac{x+\mu t}{\sqrt{t}\,\sigma}\right),\qquad t>0.\qquad(7.42)$$
If the second term on the right-hand side of (7.42) is negligible, the resulting approximate distribution of L(x) is of the type (3.179), which arose as a limit distribution of first passage times of compound renewal processes.

After some tedious algebra, the Laplace transform of f_{L(x)}(t) is seen to be
$$E\left(e^{-s\,L(x)}\right)=\int_0^\infty e^{-st}\,f_{L(x)}(t)\,dt=\exp\left\{-\frac{x}{\sigma^2}\left(\sqrt{2\sigma^2 s+\mu^2}-\mu\right)\right\}.\qquad(7.43)$$
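The inverse Gaussian density (7.40) and its parameters (7.41) can be verified by direct quadrature; a sketch with hypothetical values x = 1, μ = 1, σ = 1, for which total mass, mean x/μ and variance xσ²/μ³ are all equal to 1:

```python
import math

def ig_density(t, x=1.0, mu=1.0, sigma=1.0):
    """First passage time density (7.40) of a Brownian motion with drift."""
    return (x / (math.sqrt(2.0 * math.pi) * sigma * t ** 1.5)
            * math.exp(-(x - mu * t) ** 2 / (2.0 * sigma * sigma * t)))

def ig_moments(x=1.0, mu=1.0, sigma=1.0, t_max=60.0, n=200000):
    """Trapezoidal integration of (7.40): total mass, mean, variance.
    The density vanishes as t -> 0+, so the left endpoint contributes 0."""
    h = t_max / n
    m0 = m1 = m2 = 0.0
    for k in range(1, n + 1):
        t = k * h
        w = 0.5 if k == n else 1.0
        f = w * ig_density(t, x, mu, sigma)
        m0 += f
        m1 += f * t
        m2 += f * t * t
    m0 *= h; m1 *= h; m2 *= h
    return m0, m1, m2 - m1 * m1
```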
Theorem 7
...

t∈(0,∞)

Then,
⎧ 1
for x > 0 and μ > 0


...
44)

Proof In view of (7
...
44) for μ < 0
...
2) is stopped at time L(x)
...

σ
σ



Hence,
α

E(Y(L(x))) = e σ x E ⎛ exp

α

+e σ x E ⎛ exp


α μ
2
σ − α /2 L(x) L(x) < ∞) P(L(x) < ∞)

α μ
2
σ − α /2 L(x) L(x) = ∞) P(L(x) = ∞)
...
Then the second term disappears and theorem 6
...



Since P(M > x) = P(L(x) < ∞), letting α ↓ 2 μ /σ yields the desired result
...


(7
...
5 (Leaving an interval ) Analogously to example 7
...

Thus, p a,b is the probability that {D(t), t ≥ 0} hits level a before level b
...
2 with
S(t) =

D(t) − μt
σ

is stopped at time L(a, b)
...
3,


α 2 L(a,b) ⎫ ⎞
1 = E ⎜ exp ⎨ α (D( L(a, b)) − μ L(a, b)) −
⎬⎟
...

σ
2 ⎦




Let α = −2μ /σ
...




Solving this equation for p a,b yields
p a,b =

1 − e −2μb/σ
2

2

e −2μa/σ − e −2μb/σ

2


...
46)

If μ < 0 and b tends to −∞ in (7
...
44) with x = a
...
46) yields the corresponding probability p a,b by replacing a and b with
a − u and b − u, respectively (u can be negative):
2

2

−2μu/σ − e −2μb/σ
p a,b = P(L(a) < L(b) D u (0)) = e

...
Then the stochastic process {X(t), t ≥ 0} with
X(t) = e D(t)

(7
...
If the drift μ is 0, then {X(t), t ≥ 0} is
simply the geometric Brownian motion as defined by (7
...

The Laplace transform of D(t) is obtained by multiplying (7
...


(7
...


(7
...
19),
2
2
Var(X(t)) = e t (2μ+σ ) (e t σ − 1)
...


7
...
5
...
The concept of a
risky security comprises all risky assets, e
...
shares and precious metals
...
A
call (put) option gives its holder the right to buy (to sell)
...
An American option can be exercised at any time
point to its expiration, a European option can only be exercised at the time point of
its expiration
...
Hence,
the following examples focus on determining the mean (expected) payoff of a holder
...
If
X(τ) ≤ x s , then the owner will not exercise because this would make no financial
© 2006 by Taylor & Francis Group, LLC

7 BROWNIAN MOTION

375

sense
...
Thus, owners of European call or put options
will achieve the respective random payoffs (notation: z + = max(z, 0))
(X(τ) − x s ) + and (x s − X(τ)) +
...
Due to interest and inflation rates, the
value which a certain amount of money has today, will not be the value which the
same amount of money has tomorrow
...

The following examples deal with option pricing under rather simplistic assumptions
...
g
...

x(t)
x

payoff
xs
x0

m(t) = μt

0

L( x)

t

Figure 7
...
6 The price of a share at time t is given by a shifted Brownian motion
{X(t) = D x 0 (t), t ≥ 0} with negative drift μ and volatility σ 2 = Var(B(1)) :
X(t) = x 0 + D(t) = x 0 + μt + B(t)
...
50)

Thus, x 0 is the initial price of the share: x 0 = X(0)
...
The option has no finite
expiry date
...
He makes up his mind to exercise the option at that time point, when the share price for the first time reaches value
x with x > x s
...
5)
...
Equivalently,
p(x) is the probability that the Brownian motion with drift {D(t), t ≥ 0} will ever
reach level x − x 0
...
44) if there x is replaced with x − x 0
...


(7
...
52)
x ∗ = x s + 1/λ
...
53)


...
The discounted payoff from exercising the option at time t on condition that the share has at
time t price x with x > x s is e −αt (x − x s )
...

Hence, the holder's mean discounted payoff is


G α (x) = (x − x s )∫ 0 e −αt f L (x−x ) (t) dt ,
D
0

(7
...
40) with x replaced by x − x 0
...
54) is equal to the Laplace transform of f L (x−x ) (t) with parameter
D

s = α
...
43),

0

⎧ x − x0 ⎛
⎞⎫
G α (x) = (x − x s ) exp ⎨ −
2 σ2α + μ2 − μ ⎬
...
55)

The functional structures of the mean undiscounted payoff and the mean discounted
payoff as given by (7
...
55), respectively, are identical
...
52) and (7
...
56)
γ= 1
2 σ2α + μ2 − μ
...

Example 7
...
50) eventually become negative with probability
one, the share price model (7
...
In such a situation it seems to be more realistic to model the
share price development, apart from a constant factor, by a geometric Brownian
motion with drift:
X(t) = x 0 e D(t) , t ≥ 0
...
In particular, the price of the share at
time t = 0 is again equal to x 0
...

Therefore, by (7
...

⎝ ⎠
If the holder exercises the option as soon as the share price is x , his mean payoff is
x0 λ
G(x) = (x − x s ) ⎛ x ⎞
...

s
λ−1

(7
...
58)

To ensure that x ∗ > x s > 0, an additional assumption has to be made:
λ = 2 μ /σ 2 > 1
...

G(x ∗ ) = ⎛ λx 1 ⎞
⎝λ⎠
⎝ s ⎠

(7
...
e
...
Using this and processing
as in the previous example, the mean discounted payoff is seen to be
x0 γ
G α (x) = (x − x s ) ⎛ x ⎞
⎝ ⎠

(7
...
56)
...
57)
and the mean discounted payoff (7
...
Hence, the corresponding optimal values x ∗ and G α (x∗) are given by (7
...
59) if in these formulas λ is replaced with γ
...

As in the previous example, a positive drift parameter µ need not be excluded
...
8 (Formula of Black-Scholes-Merton) A European call option is considered with strike price x s and expiration date τ
...


© 2006 by Taylor & Francis Group, LLC

378

STOCHASTIC PROCESSES

The holder will buy if X(τ) > x s
...

The holder's mean discounted profit is denoted as
G α (τ, μ, σ) = E([e −α τ (X(τ) − x s )] + )
...
61)

In view of D(τ) = N(μτ, σ 2 τ),
G α (τ; μ, σ) = e −α τ

Substituting u =





ln(x s /x 0 )

(x 0 e y − x s )

y − μτ 2 ⎫

exp ⎨ − 1 ⎛ σ ⎞ ⎬ dy

⎠ ⎭
⎩ 2τ
2πσ 2 τ
1

[ln(x s /x 0 ) − μτ]
y − μτ
and letting c =
yields
σ τ
σ τ




2
2
G α (τ; μ, σ) = x 0 e (μ−α)τ 1 ∫ e u σ τ e −u /2 du − x s e −ατ 1 ∫ e −u /2 du
...


c−σ τ

Hence,
2
G α (τ; μ, σ) = x 0 e (μ−α+σ /2)τ



2
2
1
e −y /2 dy − x s e −α τ 1 ∫ e −u /2 du

2π c−σ τ
2π c

2
= x 0 e (μ−α+σ /2)τ Φ(σ τ − c) − x s e −α τ (Φ(−c))
...
In view of theorem 7
...

Under this condition, the mean discounted payoff of the holder is given by the Formula of Black-Scholes-Merton

(7
...


(Black and Scholes [10], Merton [61])
...
The formula of

© 2006 by Taylor & Francis Group, LLC

7 BROWNIAN MOTION

379

Black-Scholes-Merton gives the fair price of the option
...
Of course, this statement is only theory, since
the price development of the underlying risky security will never strictly follow a
geometric Brownian motion with drift
...

7
...
5
...
9 and 7
...
It is a formal disadvantage of this model assumption that cumulative
repair costs modeled in this way do not have nondecreasing sample paths
...
Both have 'reasonable' properties with respect to the application considered
...

In all examples, the following basic situation is considered: A system starts working
at time t = 0
...
The sample paths of the stochastic process {X(t), t ≥ 0} are assumed to
be continuous and its trend function m(t) = E(X(t)), t ≥ 0, to be progressively (faster
than linear) increasing
...
With regard to
cost and length, all replacement cycles are independent of each other
...

In this section, replacement policies based on limiting the cumulative repair cost X(t)
and the cumulative repair cost per unit time (in what follows called repair cost rate)
R(t) = X(t) /t are considered
...
The repair-replacement process continues to infinity
...

Let K 1 (τ) be the maintenance cost rate if the system is always replaced after τ time
units
...
79),
m(τ) + c
K 1 (τ) =

...
63)
τ
That value of τ minimizing K 1 (τ) is called the economic lifetime of the underlying
system and denoted as τ ∗
...

© 2006 by Taylor & Francis Group, LLC

380

STOCHASTIC PROCESSES

Policy 2 The system is replaced by a new one as soon as the cumulative repair cost
X(t) reaches a given positive level x
...
Under policy 2, the maintenance cost rate has structure
K 2 (x) = x + c
...
64)

Policy 3 The system is replaced by a new one as soon as the repair cost rate
R(t) = X(t) /t
reaches a given positive level r
...
65)

where L R (r) is the first passage time of the stochastic process {R(t), t ≥ 0} with regard to level r
...
65) and (7
...
8)
...
9 The cumulative repair cost X(t) is assumed to have structure
X(t) = x 0 ⎡ e D(t) − 1 ⎤ ,



(7
...
Since for a level x with 0 < x 0 < x 0 ,
⎛x+x ⎞
X(t) = x if and only if D(t) = ln x 0 ,
⎝ 0 ⎠
by (7
...

⎝ 0 ⎠
Therefore, under policy 2,
K 2 (x) =

x + c μ
...

⎝ 0 ⎠ x + x0
A unique solution x = x ∗ exists and the corresponding maintenance cost rate is
K 2 (x ∗ ) = (x ∗ + x 0 ) μ
...
49) yields
2
m(t) = E(X(t)) = x 0 ⎛ e (μ+σ /2) t − 1 ⎞ , t ≥ 0
...
63) is
2
x 0 ⎡ e (μ+σ /2)τ − 1 ⎤ + c



K 1 (τ) = ⎣

...
67)

There exists a unique τ = τ ∗ mimimizing K 1 (τ)
...

τ

Since m(τ, σ) ≥ m(τ, 0) for all σ , there holds
K 1 (τ, σ) ≥ K 1 (τ, 0)
...
Hence,
K 1 (τ ∗ (σ), σ) ≥ K 1 (τ ∗ (0), 0) = K 2 (x ∗ )
...
Thus, policy 2 equalizes the cost-increasing influence of random fluctuations of individual repair costs, which are ignored under
policy 1
...
Moreover, the
efficiency of policy 2 relative to policy 1 increases with increasing σ
...
10 Let the repair cost rate R(t) = X(t) /t be given by
R(t) = r 0 B 4 (t); r 0 > 0, t ≥ 0 ,
where {B(t), t ≥ 0} is the Brownian motion with parameter σ
...

⎝ 0⎠

Hence, the mean value of the first passage time of the stochastic process {R(t), t ≥ 0}
with regard to level r is given by (7
...


Thus, when applying policy 3, the corresponding maintenance cost rate (7
...

r

(7
...
89 ⎛ c 2 r 0 σ 4 ⎞

...
69)

Comparison to policy 1 Since B(t) = N(0, σ 2 t), the trend function of the cumulative
repair cost process {X(t), t ≥ 0} with X(t) = r 0 t B 4 (t) is
m(t) = r 0 t E(B 4 (t)) = 3 r 0 σ 4 t 3 , t ≥ 0
...
63) is
K 1 (τ) = 3 r 0 σ 4 τ 2 + c
...
70)

Minimizing (7
...
73 ⎛ c 2 r 0 σ 4 ⎞
...
71)

With K 3 (r ∗ ) given by (7
...
71),
K 3 (r ∗ )
= 0
...

K 1 (τ ∗ )
Hence, applying the optimal repair cost rate limit r ∗ instead of the economic lifetime
τ ∗ reduces the total maintenance cost on average by 31%
...

Example 7
...
Then,
P(X(t) ≤ x) = P(L X (x) ≥ t)
...

In what follows, policy 2 is applied on condition that X(t) has a Rayleigh distribution with probability density
f t (x) =

2 x exp ⎧ − ⎛ x ⎞ 2 ⎫ ;
⎨ ⎝ y⎠ ⎬
λ2 t2 y
⎩ λt


Then,

x ≥ 0, y > 1, λ > 0
...

Integration yields
1 1/y
E(L X (x)) = ⎛ λ ⎞ Γ ⎛ 1 − 1 ⎞ x 1/y = k 1 x 1/y
...
64) yields the
optimal limit x ∗ and the corresponding maintenance cost rate K 2 (x ∗ ) :
x∗ =

c ,
y−1

K 2 (x ∗ ) =

y ⎛ c ⎞ (y−1)/y

...

2

Minimizing the corresponding maintenance cost rate (7
...


For all y > 1, the inequality K 2 (x ∗ ) < K 1 (τ ∗ ) is equivalent to
2 < U(x),
π

0
...
72)

where
1

U(x) = [Γ(x)] 2(1−x)
...
5 ≤ x < 1] with
U(0
...
5772 is the Euler number
...
72) holds for all y > 1 so
that, as in example 7
...
In particular, if 1
...


© 2006 by Taylor & Francis Group, LLC

384

STOCHASTIC PROCESSES

The examples analyzed indicate that policies 2 and 3 belong to the most cost efficient
replacement policies
...
A great advantage to the 'repair cost limit
replacement policy' considered in section 3
...
6
...
Hence, from the modeling point of view
and with regard to their applicability, policies 2 and 3 are superior to the 'repair cost
limit replacement policy'
...


7
...
5
...
But if a random variable X is the first passage time of a
Brownian motion process with drift, then X has an inverse Gaussian distribution and
the parameters of this distribution can also be estimated on the basis of samples
generated by scanning sample paths of the underlying process
...

Let {D u (t), t ≥ 0} be a shifted Brownian motion with drift which starts at value
D u (0) = u
and let
d i = d i (t); i = 1, 2,
...
The sample path d i = d i (t) is scanned at time points
t i1 , t i2 ,
...
< t m i and m i ≥ 2, i = 1, 2,
...

i
The outcomes are
d ij = d(t ij ) ; j = 1, 2,
...
, n
...

Further, let
Δd ij = d ij − d i j−1 ; Δt ij = t ij − t ij−1
with j = 2, 3,
...
, n
...
, n
...

ti 1
Δt i j
⎪ i =1

i =1 j =2



Unfortunately, these estimators are biased
...

i

i

If u is random, then the maximum-likelihood estimator of its mean value is
n

u=

n

n

Σ i=1 d i 1 t −1 − n Σ i=1 d i m i ⎛ Σ i=1 t i m i ⎞


i1
−1
n
n
Σ i=1 t −1 − n 2 ⎛ Σ i=1 t i m i ⎞


i1

−1


...
73)

The following maximum-likelihood estimators were derived on condition that u is a
random variable
...
Let
the time points at which the sample path is scanned and the corresponding outcomes
be t 1 , t 2 ,
...
, d m , respectively
...

Σ


t1
Δt j
m−2 ⎪

j =2



Special case m i = 1; i = 1, 2,
...
This requires to drop
the assumption m i ≥ 2 stated above
...
The bias-corrected maximumlikelihood estimators of µ and σ 2 are
m

μ=

Σ i=1 d i − m u
m
Σ i=1 t i

© 2006 by Taylor & Francis Group, LLC

,

σ2 =

2
1 m (d i − μ t i − u )
...
74)

386

STOCHASTIC PROCESSES

Example 7
...
He assumed that the stochastic wear process develops
according to a Brownian motion with drift starting at u :
D u (t) = u + D(t), t ≥ 0
...
73) and (7
...
145 [μm],

μ = 0
...
137 [μm 2 /h]
...
145 + 0
...
137 S(t) ,

(7
...

Hint If the model (7
...
0029 t − 36
...
137 t

has a standard normal distribution for all t (according to property 3 of definition 7
...

In particular, this must hold for all measurement points t i
...
75) can
be supported or rejected by a chi-square goodness of fit test
...
Then the
lifetime of such a wear part is the first passage time L D u (w) of the stochastic process {D u (t), t ≥ 0} with regard to level w = 1000
...
41), estimates for mean value, variance and standard deviation of the first passage time L D u = L D u (1000) are
E(L D u ) ≈ 1000 − 36
...
0029
Var(L D u ) ≈

(1000 − 36
...
137
(0
...
41425 ⋅ 10 9 ⎡ h 2 ⎤ ,
⎣ ⎦

Var(L D u ) ≈ 73, 581 [h]
...
With the survival function
given by (7
...

Since
e −2(w−u ) μ ≈ e −5
...
76)

7 BROWNIAN MOTION

387

the second term in (7
...
Therefore, equation (7
...
77)

where z ε is the ε− percentile of the standard normal distribution
...
77) is
⎛ z σ ⎞ 2 z σ w − u ⎛ zε σ ⎞ 2
τε = w − u + 1 ⎜ ε ⎟ − ε
+⎜

...
95 , then z 0
...
65 so that τ 0
...
Thus, with
probability 0
...

The Brownian motion with drift was firstly investigated by Schrödinger [72] and
Smoluchowski [75]
...

Folks and Chhikara [18] give a survey of the theory and discuss numerous applications: distribution of the water level of dams, duration of strikes, length of employment times of people in a company, wind velocity, and cost caused by system breakdowns
...
As a distribution of first passage times, the inverse
Gaussian distribution naturally plays a significant role as a statistical model for lifetimes of systems which are subject to drift failures, see Kahle and Lehmann [42]
...


7
...
6

Integral Transformations

7
...
6
...

Hence, the integrals
t

b(t) = ∫ 0 b(y) dy
...
They are realizations of the random integral
t

U(t) = ∫ 0 B(y) dy
...
78)

The stochastic process {U(t), t ≥ 0} is called integrated Brownian motion
...
Analogously to the definition of
the Riemann integral, for any n-dimensional vector (t 1 , t 2 ,
...
< t n = t and Δt i = t i+1 − t i ; i = 0, 1, 2,
...

n→∞ ⎩ i=0


(7
...
) The random variable U(t), being the limit of a sum of independent, normally distributed random variables, is itself normally distributed
...
2, the integrated Brownian motion is a Gaussian process
...
In view of
t
t
t
E ⎛ ∫ 0 B(y) dy ⎞ = ∫ 0 E(B(y)) dy = ∫ 0 0 dy ≡ 0 ,


the trend function of the integrated Brownian motion {U(t), t ≥ 0} is 0:
m(t) = E(U(t)) ≡ 0
...

Since
E(B(y), B(z)) = Cov(B(y), B(z)) = σ 2 min (y, z) ,
it follows that
t s

C(s, t) = σ 2 ∫ 0 ∫ 0 min(y, z) dy dz
s s

t s

= σ 2 ∫ 0 ∫ 0 min(y, z) dy dz + σ 2 ∫ s ∫ 0 min(y, z) dy dz
t s
s z
s
= σ 2 ∫ 0 ⎡ ∫ 0 y dy + ∫ z z dy ⎤ dz + σ 2 ∫ s ∫ 0 y dy dz


3
2
= σ 2 s + σ 2 s (t − s)
...


7 BROWNIAN MOTION

389

Letting s = t yields
2
Var(U(t)) = σ t 3
...
But it can be shown that for any τ the process {V(t), t ≥ 0} with

V(t) = U(t + τ) − U(t)
is stationary
...
)
7
...
6
...
However, a definition via an integral is possible
...
, t n
any sequence of numbers satisfying
a = t 0 < t 1 <
...
, n − 1
...



lim

n→∞
⎩ i=0
max Δt i →0

(7
...
,n

The sum in (7
...

Δt i

Taking the limit on both sides as in (7
...


(7
...
80)
...
From (7
...




(7
...


Passing in this equation to the limit as in (7
...

(7
...
81) motivates the following definition
...
3 (White noise) Let {B(t), t ≥ 0} be the Brownian motion
...


(7
...
84) anyway
...
3 can be interpreted as a 'generalized derivative' of
B(t), because it exists although the differential quotient does not exist
...
To get
an idea of the nature of the white noise process {X(t), t ≥ 0}, a heuristic argument is
presented by 'deriving' the covariance function of {X(t), t ≥ 0} : Assuming that the order of 'generalized differentiation' and integration can be exchanged, one obtains for
all s and t with s ≠ t,
∂B(s) ∂B(t) ⎞
C(s, t) = Cov(X(s), X(t)) = Cov ⎛
,
⎝ ∂s
∂t ⎠
∂ ∂ Cov(B(s), B(t))
=
∂s ∂t
= ∂ ∂ min(s, t)
...

∂s ∂t
∂s

If s > t, then
C(s, t) = ∂ ∂ t = ∂ 1 = 0
...


(7
...
Thus, white noise can be interpreted as
the 'most random stochastic process', and this property explains its favourite role as a
process for modeling random noise, which is superimposed on a useful signal
...
85), white noise cannot exist in the real world
...
Its role can be compared with the concept of
the 'point mass' in mechanics, which also only exists in theory
...
The times in which the pulses rise and fall are so short
that they cannot be registered by measuring instruments
...

In practice, a weakly stationary stochastic process {X(t), t ≥ 0} can approximately be
considered white noise if the covariance between X(t) and X(t + τ) tends extremely
fast to 0 with increasing τ
...
The process {X(t), t ≥ 0} is known to be
weakly stationary with a covariance function of type
C(τ) = e −b τ ,
where
b ≥ 10 19 sec −1
...

A similar fast drop of the covariance function can be observed if {X(t), t ≥ 0} describes the electromotive force in a conductor, which is caused by the thermal movement of electrons
...
6 EXERCISES
Note In all exercises, {B(t), t ≥ 0} is the Brownian motion with Var(B(1)) = σ 2
...
1) Verify that the probability density f t (x) of B(t),
f t (x) =

1 e −x 2 /(2 σ 2 t) ,
2πt σ

t > 0,

satisfies the thermal conduction equation
∂ f t (x)
∂ 2 f t (x)
=c

...
2) Determine the conditional probability density of B(t) given B(s) = y, 0 ≤ s < t
...
3)* Prove that the stochastic process {B(t), 0 ≤ t ≤ 1} given by
B(t) = B(t) − t B(1)
is the Brownian bridge
...
4) Let {B(t), 0 ≤ t ≤ 1} be the Brownian bridge
...

7
...

7
...
Determine mean value and variance of
X(n) = B(1) + B(2) +
...

Hint Make use of formula (1
...

7
...

7
...
e show that
E(X(t) X(y), y ≤ s) = X(s), s < t
...
9) Prove that the increments of the Ornstein-Uhlenbeck process are not independent
...
10)* Starting from x = 0 , a particle makes independent jumps of length
Δx = σ Δt
to the right or to the left every Δt time units
...

μ
Show that as Δt → 0 the position of the particle at time t is governed by a Brownian
motion with drift with parameters µ and σ
...
11) Let {D(t), t ≥ 0} be a Brownian motion with drift with parameters μ and σ
...



7
...

Hint Make use of formula (7
...

7
...

7
...
The price X(t) of the underlying risky security at time t
is given by
X(t) = x 0 e B(t)
...

1) What is the speculator's mean discounted payoff G α (x) under a constant discount
rate α ?
2) What is the speculator's payoff G(x) without discounting?
In both cases, cost of acquiring the option is not included in the speculator's payoff
...
15) The price X(t) of a risky security at time t is
X(t) = x 0 e μt+B(t)+a B(t) , t ≥ 0, 0 < a ≤ 1,
with a negative drift parameter µ
...
The option has no finite expiration date
...

Otherwise, i
...
if the price of the risky security never reaches level x, the speculator
will never exercise
...

7
...
At time point t = 0 a speculator acquires an American
call option on this share with finite expiry date τ
...

(1) Why does the assumption make sense?
(2) When should the speculator exercise to make maximal mean undiscounted profit?
7
...
Thus, the option can only be exercised at time τ at price
x s , independently of its market value at time τ
...
If X(τ) > x s , the speculator will exercise the option
...
As in example 7
...

1) What will be the mean undiscounted payoff of the speculator (cost of acquiring
the option not included)?
2) Under otherwise the same assumptions, what is the investor's mean undiscounted
profit if
X(t) = x 0 + B(t) and x 0 = x s ?

© 2006 by Taylor & Francis Group, LLC

7 BROWNIAN MOTION

395

7
...
Assume
R(t) = r 0 B 2 (t) , r 0 > 0
...

(1) Given a constant replacement cost c, determine a level r = r ∗ which is optimal
with respect to the long-run total maintenance cost per unit time K(r)
...
)
(2) Compare K(r ∗ ) to the minimal long-run total maintenance cost per unit time
K(τ ∗ ) which arises by applying the corresponding economic lifetime τ ∗
...
19)* Let {S(t), t ≥ 0} be the standard Brownian motion and
t

X(t) = ∫ 0 S(s) ds
...

(2) Verify that
3
E(X(t) S(t) = x) = t x and Var(X(t) S(t) = x) = t
...

7
...
19
...


© 2006 by Taylor & Francis Group, LLC

397

ANSWERS TO SELECTED EXERCISES
Chapter 1
1.1) The sample space M consists of all 2³ = 8 vectors (z_1, z_2, z_3) with
z_i = 1 if person i has gene g and z_i = 0 otherwise; i = 1, 2, 3.
B ∪ C = M\A, (A ∪ B) ∩ C = A ∪ B
1.2) 0.6, 0.8
1.3) 0.925, 0.85
1.4) 0.965, 0.…
1.5) 0.61, 0.68, 0.881, 0.8205
1.7) 0.0902
1.8) (1) and (2): don't check, (3) check
1.9) (1) 0.6475 (2) 0.9875
1.10) (1) 0.2978
1.12) (1) 0.9744
1.13) Probability distribution of X: p_i = P(X = x_i) = n_i/n; i = 1, 2, … (2) 0.5
1.14) 45.3421
1.16) 15.0151
1.18) 0.1329
1.20) (1) 0.8701 (2) 0.191
1.22) 0.4493
1.24) (1) c = 3/64 (2) c = 1/6 (3) c = 1
1.26) 0.6931, 0.4


1.27) 1.54
1.28) (1) 1.6 (2) 11/9, 23/81 (3) 1, 1
1.29) a) 0.1009 b) 0.9963
1.30) (1) F(x) = (x − 2)³[10 − 15(x − 2) + 6(x − 2)²], 2 ≤ x ≤ 3 (2) 0.5
1.31) 1.56, 6.3
1.32) (1) 0.0475 (2) 0.…
1.33) (1) 0.68 %
1.34) (1) 0.1524 (2) 125.…
1.35) f(p) = …, 0 otherwise
1.37) 0.…
1.38) [0, ∞)
1.39) {p_0 = 0.2, p_1 = 0.3, …}, {q_0 = 0.6, q_2 = 0.…}
1.44) 0.792
1.45) 0.032, 0.…
1.46) n_min = 43
1.47) 0.…
1.49) (2) M_Z(z) = (…)
1.50) p_1 = p_2 = …
1.53) (1) n_0 = 11,280 (2) n_0 = 2167
Chapter 2
2.1) not stationary
2.3) (1) m(t) ≡ 0, C(τ) = (1/2) E(A²) cos ωτ, ρ(τ) = cos ωτ
2.5) C(τ) = (1/2) Σ_{i=1}^{n} a_i² cos ωτ, ρ(τ) = cos ωτ
2.6) … (3) no
2.7) …
2.8) …
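The covariance results for answers 2.3) and 2.5) can be spot-checked by Monte Carlo. A sketch for a single random-amplitude cosine X(t) = A cos(ωt + Φ) with Φ uniform on (0, 2π) independent of A; the choice A ~ N(0, 1) (so E(A²) = 1), ω = 2 and the lag grid are illustrative assumptions of this sketch, not data from the exercises:

```python
import numpy as np

# Monte Carlo sketch: X(t) = A*cos(omega*t + Phi), Phi ~ U(0, 2*pi) independent
# of A. Theory: E X(t) = 0 and C(tau) = 0.5 * E(A^2) * cos(omega*tau).
rng = np.random.default_rng(4)
omega, t, n = 2.0, 0.7, 400_000
A = rng.standard_normal(n)
Phi = rng.uniform(0.0, 2.0 * np.pi, n)
estimates = {}
for tau in (0.0, 0.5, 1.0):
    prod = A * np.cos(omega * t + Phi) * A * np.cos(omega * (t + tau) + Phi)
    estimates[tau] = prod.mean()          # estimate of C(tau)
for tau, c_hat in estimates.items():
    print(tau, c_hat, 0.5 * np.cos(omega * tau))
```

The estimates do not depend on the chosen t, which is exactly the (wide-sense) stationarity asserted in the answers.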
Chapter 3
3.1) (1) 0.4422
3.3) 0.…
3.4) …
3.5) (1) 0.9084 (2) E(Y) = 1/4 min, Var(Y) = (1/4)²
3.6) λ_2/λ_1 ;
C(τ) = (λ/2)[(π − τ) cos τ + sin(π − τ)] for 0 ≤ τ ≤ π, C(τ) = 0 elsewhere
3.7) λ/3
3.9) 0.89
3.13) … = [c_p/…]^(1/β)
3.14) (1) P(N_L(t) = n) = 1 − e^(−t) Σ_{k=0}^{n} t^k/k! ; n = 0, 1, …
3.17) (1) K(c) = …
3.18) (1) n∗ = 86 (2) n∗ = 88
3.22) 0.…
3.25) (2) H(n) = …/(1 − p)
3.27) μ ∫_0 F(x) dx
3.28) (1) … = F(t − x + y)/F(t − x) (2) P(A(t) ≤ y | B(t) = x) = [F(x + y) − F(x)]/F(x)
3.30) (1/3)(λx + 2) e^(−λx)
3.32) (1) 0.9841 (2) 0.…
3.33) (1) K(τ) = …/∫_0^τ F(t) dt
(2) λ(τ) ∫_0^τ F(t) dt − F(τ) = c/(1 − c) with 0 < c = c_p/c_e < 1 − …
(3) τ∗ = (z/(1 − c))[√(c(2 − c)) − c] with 0 < c = c_p/c_e < 1
3.35) (1) 0.…
3.36) (1) 0.1342
3.37) (1) ….5 [$/h] (2) ≈ 0.…, x/13,600

Chapter 4
4.1) 0.5, 0.25, 0.5
4.2) P =
⎛ 0.58 0.3 … ⎞
⎜ 0.32 0.4 … ⎟
⎝ 0.18 0.42 … ⎠
4.3) (2) 0.2864 (3) π_0 = 0.3
4.5) (2) π_i = 0.… ; see exercise 4.…
4.8) (2) P is the transition matrix with nonzero entries 0.6, 0.4, 0.2, 0.8 (remaining entries 0);
π_1 = 3/8, π_2 = π_3 = 1/8, π_4 = 3/8
4.13) (3) π_0 = 50/150, π_1 = 10/150, π_2 = 40/150, π_3 = 13/150, π_4 = 37/150
4.15) π_i = p(1 − p)^i ; i = 0, 1, …
4.18) (1) positive recurrent (2) transient
4.19) …
Chapter 5
5.3) π_0 = 2λ/(2λ + 3μ), π_1 = 2μ/(2λ + 3μ), π_2 = μ/(2λ + 3μ)
5.5) State (i, j): i, j are the respective states of unit 1 and unit 2; 0 down, 1 operating.
(Transition graph: (1,1) → (0,1) at rate λ_1, (1,1) → (1,0) at rate λ_2, (0,1) → (0,0) at rate λ_2, (1,0) → (0,0) at rate λ_1.)
5.9) p_0(t) = e^(−2t), p_1(t) = 2(e^(−2t) − e^(−3t)), p_2(t) = 3e^(−t)(1 − e^(−t))²
5.10) … (1/2 + … + 1/(n − 1))
5.12) … (1/(2n) + 1/(2n − 1) + … + 1/(n + 1)), n ≥ 1
5.…) (1) 0.56 (2) 50 weeks
(Hint: p_0(t) = P(cable completely broken at time t) is given by an Erlang distribution with parameters n = 5 and λ = 0.1.)
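The hint above can be made concrete in a few lines. In this sketch λ = 0.1 is an inferred value (the digits after "0." are lost in the source): it makes the Erlang mean n/λ equal to the 50 weeks of part (2), and it reproduces the 0.56 of part (1) as P(T ≤ 50):

```python
from math import exp, factorial

# Sketch of the hint: time to complete breakage T ~ Erlang(n = 5, lam = 0.1),
# where lam = 0.1 is inferred from the stated answers, not read from the book.
n, lam = 5, 0.1

def erlang_cdf(t: float, n: int, lam: float) -> float:
    """P(T <= t) for T ~ Erlang(n, lam): 1 - sum_{k=0}^{n-1} e^(-lam*t) (lam*t)^k / k!"""
    return 1.0 - sum(exp(-lam * t) * (lam * t) ** k / factorial(k) for k in range(n))

mean_weeks = n / lam              # Erlang mean, 50.0 weeks
p_broken = erlang_cdf(50.0, n, lam)
print(mean_weeks, p_broken)
```

The printed probability is ≈ 0.56, consistent with part (1).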
5.14) …
5.20) (1) (transition rate diagram over states 0, 1, … with rates λ, μ)
(2) π_loss = π_3 = 13.5ρ³/(1 + 3ρ + 4.5ρ² + 13.5ρ³), ρ = λ/μ
5.23) π_loss = 0.6149
5.26) (1) π_0 = π_1 = π_2 = π_3 = 1/4 (2) 1.…
5.28) see example 5.…
5.32) (1) (pure death scheme with rates nλ(1 − p), (n − 1)λ(1 − p), …)
(2) 1 − F_s(t) = P(X_s > t) = p_1(t) + p_2(t)
5.37) 0.4144, 0.…
5.38) (2) States: 0 working, 1 repair after a type 2 failure, 2 repair after a type 1 failure;
P(X = 0) = 360/372, P(X = 1) = 4/372, P(X = 2) = 8/372
5.40) The stationary state probabilities of the embedded Markov chain are
π_0 = (λ + λ_0)/(2λ_0 + λ(3 − e^(−λ_1 μ))),
π_1 = λ/(2λ_0 + λ(3 − e^(−λ_1 μ))),
π_2 = (λ_0 + λ(1 − e^(−λ_1 μ)))/(2λ_0 + λ(3 − e^(−λ_1 μ))).
… 0.26, whereas μ_1 = 1 − e^(−λ_1 μ).
Chapter 6
6.1) no, since E(Y_i²) > 0 (see example 6.…)
6.2) Theorem 6.1 is applicable with X_i = Y_i − E(Y_i), since E(X_i) = 0.
6.3) (1) T = 2, (2) T > 2, (3) T < 2.
6.4) σ² = −2μ (the condition μ < 0 is necessary)
6.5) E(N) = n/(2p − 1); (1) 0.8703, (2) p_−1000 = 0.…, (3) E(N) = 64
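The formula E(N) = n/(2p − 1) for the mean first passage time of a p-random walk to level n is easy to confirm by simulation. The values n = 8, p = 0.75 below (giving E(N) = 16) are illustrative choices for the sketch, not the parameters of the exercise:

```python
import random

# Simulation sketch of E(N) = n / (2p - 1): N is the number of +-1 steps
# (P(step = +1) = p > 1/2) until the walk first reaches level n.
random.seed(3)
n, p, trials = 8, 0.75, 20_000
total_steps = 0
for _ in range(trials):
    position = steps = 0
    while position < n:
        position += 1 if random.random() < p else -1
        steps += 1
    total_steps += steps
estimate = total_steps / trials
print(estimate, n / (2 * p - 1))   # estimate should be close to 16.0
```

This is Wald's identity in action: E(N)·E(single step) = n, and E(single step) = 2p − 1.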


6.6) …
6.7) see example 6.12 or the proof of theorem 7.…
Chapter 7
7.2) f_t(x | B(s) = y) = (1/(√(2π(t − s)) σ)) exp(−(x − y)²/(2(t − s)σ²)), 0 ≤ s < t
7.…) f(x) = (1/(√(2π(t + 3s)) σ)) exp(−x²/(2(t + 3s)σ²)), −∞ < x < +∞
7.6) E(X(n)) = 0, Var(X(n)) = (n(n + 1)(2n + 1)/6) σ²
7.13) (1) 12x² (see example 7.…)
7.14) 1) G_a(x) = (x − x_s)(x_0/x_s)^γ with γ = …/σ², 2) G(x) = x − x_s
7.15) (1) formula (7.42) with λ = …/((1 + a)²σ²), (2) optimal level x∗ given by formula (7.42) with γ = …
7.17) 1) G = σ√τ c Φ(c) + σ√τ (1/√(2π)) e^(−c²/2), where c = (x_0 + μτ − x_s)/(σ√τ), 2) G = σ√(τ/(2π))