Memorylessness
In probability and statistics, memorylessness is a property of certain probability distributions: the exponential distributions of non-negative real numbers and the geometric distributions of non-negative integers.
The property is most easily explained in terms of "waiting times." Suppose that a random variable, X, is defined to be the time elapsed in a shop from 9 am on a certain day until the arrival of the first customer: thus X is the time a server waits for the first customer. The "memoryless" property makes a comparison between the probability distributions of the time a server has to wait from 9 am onwards for his first customer, and the time that the server still has to wait for the first customer on those occasions when no customer has arrived by any given later time: the property of memorylessness is that these distributions of "time from now to the next customer" are exactly the same.
As another example, suppose X is the lifetime of a car engine given in terms of number of miles driven. If the engine has lasted 200,000 miles, then, based on our intuition, it is clear that the probability that the engine lasts another 100,000 miles is not the same as the engine lasting 100,000 miles from the first time it was built. However, memorylessness states that the two probabilities are the same (And if our intuition is right, the distribution that describes the lifetime of a large set of these engines does not have the memorylessness property). In essence, we 'forget' what state the car is in. In other words, the probabilities are not influenced by how much time has elapsed. [1]
In the context of Markov processes, memorylessness refers to the Markov property, a different concept which implies that the properties of random variables related to the future depend only on relevant information about the current time, not on information from further in the past. The present article describes the use outside the Markov property, limited to conditional probability distributions.
Discrete memorylessness
Suppose X is a discrete random variable whose values lie in the set { 0, 1, 2, ... }. The probability distribution of X is memoryless precisely if for any m, n in { 0, 1, 2, ... }, we have
Here, Pr(X > m + n | X > m) denotes the conditional probability that the value of X is larger than m + n, given that it is larger than or equal to m.
The only memoryless discrete probability distributions are the geometric distributions, which feature the number of independent Bernoulli trials needed to get one "success," with a fixed probability p of "success" on each trial. In other words those are the distributions of waiting time in a Bernoulli process.
A frequent misunderstanding
"Memorylessness" of the probability distribution of the number of trials X until the first success means that
It does not mean that
which would be true only if the events X > 40 and X > 30 were independent.
Continuous memorylessness
Suppose X is a continuous random variable whose values lie in the non-negative real numbers [0, ∞). The probability distribution of X is memoryless precisely if for any non-negative real numbers t and s, we have
This is similar to the discrete version except that s and t are constrained only to be non-negative real numbers instead of integers. Rather than counting trials until the first "success," for example, we may be marking time until the arrival of the first phone call at a switchboard.
The memoryless distribution is an exponential distribution
The only memoryless continuous probability distributions are the exponential distributions, so memorylessness completely characterizes the exponential distributions among all continuous ones. The property is derived through the following proof:
To see this, first define the survival function, G, as
Note that G(t) is then monotonically decreasing. From the relation
and the definition of conditional probability, it follows that
This gives the functional equation, which is, by definition a result of the memorylessness property.
From this, we must have:
In general:
The only continuous function that will satisfy this equation for any positive, rational, real is:
Where
Therefore, since is a probability and must have , then any memorylessness function must be an exponential.
Put a different way, is a monotone decreasing function (meaning that for times , then ).
The functional equation alone will imply that G restricted to rational multiples of any particular number is an exponential function. Combined with the fact that G is monotone, this implies that G over its whole domain is an exponential function.
Notes
References
- Feller, W. (1971) Introduction to Probability Theory and Its Applications, Vol II (2nd edition),Wiley. Section I.3 ISBN 0-471-25709-5