I hold both a bachelor's and a master's degree in applied mathematics.
The variance is the second most important measure of a probability distribution, after the mean. It quantifies the spread of the distribution's outcomes. If the variance is low, then the outcomes are close together, while distributions with a high variance have outcomes that can lie far apart from each other.
To understand the variance, you need to have some knowledge about the expectation and probability distributions. If you don't have this knowledge, I suggest reading my article about the mean of a probability distribution.
What Is the Variance of a Probability Distribution?
The variance of a probability distribution is the mean of the squared distance to the mean of the distribution. If you take multiple samples of a probability distribution, the expected value, also called the mean, is the value that you will get on average. The more samples you take, the closer the average of your sample outcomes will be to the mean. If you took infinitely many samples, the average of those outcomes would be exactly the mean. This is called the law of large numbers.
An example of a distribution with a low variance is the weight of chocolate bars of the same kind. Although the packaging will state the same weight for all of them, let's say 500 grams, in practice there will be slight variations. Some bars will weigh 498 or 499 grams, others maybe 501 or 502. The mean will be 500 grams, but there is some variance. In this case, the variance is very small.
However, if you look at any single outcome, it is very likely that this outcome is not exactly equal to the mean. The mean of the squared distance from a single outcome to the mean is called the variance.
An example of a distribution with a high variance is the amount of money spent by customers of a supermarket. The mean amount is maybe something like $25, but some might only buy one product for $1, while another customer organizes a huge party and spends $200. Since these amounts are both far away from the mean, the variance of this distribution is high.
This leads to something that might sound paradoxical: if you take a sample from a distribution in which the variance is high, you don't expect to see the expected value.
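A quick way to see the difference is to compute the variance of two small samples, one tightly clustered like the chocolate weights and one widely spread like the supermarket spending. The numbers below are made up purely for illustration:

```python
from statistics import mean, pvariance

# Hypothetical samples, invented only for illustration.
chocolate_weights = [498, 499, 500, 500, 501, 502]   # grams, clustered near 500
supermarket_spend = [1, 5, 20, 25, 30, 69]           # dollars, spread around 25

# Both samples have the means from the examples above...
print(mean(chocolate_weights), mean(supermarket_spend))
# ...but their (population) variances differ enormously.
print(pvariance(chocolate_weights), pvariance(supermarket_spend))
```

The tightly clustered sample has a variance below 2, while the spread-out sample's variance is in the hundreds, even though both averages match their means.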
Formal Definition of the Variance
The variance of a random variable X is usually denoted Var(X). It is defined as:
Var(X) = E[(X - E[X])²] = E[X²] - E[X]²
This last step can be explained as follows:
E[(X - E[X])²] = E[X² - 2XE[X] + E[X]²] = E[X²] - 2E[XE[X]] + E[E[X]²]
Since E[X] is a constant, and the expectation of a constant is that constant itself, we have E[XE[X]] = E[X]*E[X] = E[X]² and E[E[X]²] = E[X]². Plugging this in gives E[X²] - 2E[X]² + E[X]² = E[X²] - E[X]², which is the expression above.
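We can check this identity numerically. The sketch below uses exact fractions and computes both forms of the variance for a fair six-sided die, an example distribution chosen only for illustration:

```python
from fractions import Fraction

# A fair six-sided die: outcomes 1..6, each with probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

ex = sum(x * p for x, p in pmf.items())       # E[X] = 7/2
ex2 = sum(x**2 * p for x, p in pmf.items())   # E[X^2] = 91/6

var_def = sum((x - ex)**2 * p for x, p in pmf.items())  # E[(X - E[X])^2]
var_alt = ex2 - ex**2                                   # E[X^2] - E[X]^2
print(var_def, var_alt)  # both 35/12
```

Both formulas give exactly 35/12, as the derivation predicts.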
Calculating the Variance
If you want to calculate the variance of a probability distribution, you need to calculate E[X²] - E[X]². It is important to understand that these two quantities are not the same: the expectation of a function of a random variable is not, in general, equal to the function applied to the expectation of that random variable. To calculate the expectation of X², we need the law of the unconscious statistician. The reason for this strange name is that people tend to use it as if it were a definition, while in practice it is the result of a nontrivial proof.
The law states that the expectation of a function g(X) of a random variable X is equal to:
Σg(x)*P(X=x) for discrete random variables.
∫g(x)f(x) dx for continuous random variables, where f is the probability density function of X.
This helps us to find E[X²], as this is the expectation of g(X) where g(x) = x². E[X²] is also called the second moment of X, and in general, E[Xⁿ] is the nth moment of X.
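A small helper makes the law concrete in the discrete case. The two-outcome distribution below is made up for illustration:

```python
from fractions import Fraction

def expectation(g, pmf):
    """E[g(X)] for a discrete X given as {outcome: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

# Hypothetical distribution: X is 0 with probability 3/4, 10 with probability 1/4.
pmf = {0: Fraction(3, 4), 10: Fraction(1, 4)}

m1 = expectation(lambda x: x, pmf)      # first moment:  E[X]   = 5/2
m2 = expectation(lambda x: x**2, pmf)   # second moment: E[X^2] = 25
print(m1, m2, m2 - m1**2)               # variance = 25 - 25/4 = 75/4
```

Note that m2 is far from m1 squared, which is exactly why E[X²] and E[X]² must be computed separately.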
Some Examples of Calculations of the Variance
As an example, we will look at the Bernoulli distribution with success probability p. In this distribution, only two outcomes are possible, namely 1 if there is a success and 0 if there is no success. Therefore:
E[X] = Σx P(X=x) = 1*p + 0*(1-p) = p
E[X²] = Σx² P(X=x) = 1²*p + 0²*(1-p) = p
So the variance is p - p² = p(1-p). When we look at a coin flip where we win $1 if it comes up heads and $0 if it comes up tails, we have p = 1/2. Therefore the mean is 1/2 and the variance is 1/4.
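A short sketch confirms the Bernoulli calculation for any p:

```python
def bernoulli_variance(p):
    # For a Bernoulli(p) variable: E[X] = p and E[X^2] = p,
    # so Var(X) = E[X^2] - E[X]^2 = p - p^2 = p(1 - p).
    ex = 1 * p + 0 * (1 - p)
    ex2 = 1**2 * p + 0**2 * (1 - p)
    return ex2 - ex**2

print(bernoulli_variance(0.5))  # the fair coin flip: 0.25
```

The variance p(1 - p) is largest at p = 1/2, where the outcome is most uncertain, and shrinks to zero as p approaches 0 or 1.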
Another example is the Poisson distribution. Here we know that E[X] = λ. To find E[X²] we must calculate:
E[X²] = Σx² P(X=x) = Σx²*λ^x*e^(-λ)/x! = λe^(-λ)*Σx*λ^(x-1)/(x-1)! = λe^(-λ)*(λe^λ + e^λ) = λ² + λ
How exactly to solve this sum is pretty complicated and goes beyond the scope of this article. In general, calculating expectations of higher moments can involve quite complicated manipulations.
This allows us to calculate the variance: it is λ² + λ - λ² = λ. So for the Poisson distribution, the mean and the variance are equal.
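The result λ² + λ can be checked numerically by truncating the infinite sum, since the tail terms become vanishingly small. The choice λ = 3 below is arbitrary:

```python
from math import exp, factorial

def poisson_moments(lam, terms=100):
    # Truncate the infinite sums at `terms`; for modest lam the tail is negligible.
    ex = sum(x * lam**x * exp(-lam) / factorial(x) for x in range(terms))
    ex2 = sum(x**2 * lam**x * exp(-lam) / factorial(x) for x in range(terms))
    return ex, ex2 - ex**2

mean_, var_ = poisson_moments(3.0)
print(mean_, var_)  # both approximately lam = 3
```

Both the mean and the variance come out equal to λ, matching the derivation.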
An example of a continuous distribution is the exponential distribution. It has expectation 1/λ. Its second moment is:
E[X²] = ∫x² λe^(-λx) dx, integrated over x from 0 to ∞.
Again, solving this integral requires advanced calculations involving integration by parts. If you do this, you get 2/λ². Therefore, the variance is:
2/λ² - 1/λ² = 1/λ².
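The integral can also be approximated numerically, for instance with a simple midpoint rule. The cutoff at x = 50 and the choice λ = 2 in this sketch are arbitrary:

```python
from math import exp

def exp_second_moment(lam, upper=50.0, steps=100_000):
    # Midpoint rule for the integral of x^2 * lam * e^(-lam*x) over [0, upper];
    # the tail beyond `upper` is negligible for lam around 1 or larger.
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x**2 * lam * exp(-lam * x) * h
    return total

lam = 2.0
ex2 = exp_second_moment(lam)   # approximately 2 / lam^2 = 0.5
var = ex2 - (1 / lam)**2       # approximately 1 / lam^2 = 0.25
print(ex2, var)
```

The numerical second moment agrees with 2/λ², and subtracting the squared mean leaves 1/λ².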
Properties of the Variance
Since the variance is the mean of a squared quantity, it is nonnegative, so we have:
Var(X) ≥ 0 for all X.
If Var(X) = 0, then the probability that X is equal to some value a must be equal to one. Stated differently, if there is no variance, then there can be only one possible outcome. The converse is also true: when there is only one possible outcome, the variance is equal to zero.
Other properties regarding additions and scalar multiplication give:
Var(aX) = a²Var(X) for any scalar a.
Var(X + a) = Var(X) for any scalar a.
Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y).
Here Cov(X,Y) is the covariance of X and Y, a measure of the dependence between X and Y. If X and Y are independent, then this covariance is zero, and the variance of the sum is equal to the sum of the variances. But when X and Y are dependent, the covariance must be taken into account.
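The sum rule, including the factor of 2 on the covariance, can be verified on a small joint distribution. The probabilities below are made up so that X and Y are dependent:

```python
from fractions import Fraction

# A hypothetical joint distribution on pairs (X, Y).
joint = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8),
    (1, 1): Fraction(3, 8),
}

def e(f):
    """E[f(X, Y)] over the joint distribution."""
    return sum(f(x, y) * p for (x, y), p in joint.items())

var_x = e(lambda x, y: x**2) - e(lambda x, y: x)**2
var_y = e(lambda x, y: y**2) - e(lambda x, y: y)**2
cov = e(lambda x, y: x * y) - e(lambda x, y: x) * e(lambda x, y: y)
var_sum = e(lambda x, y: (x + y)**2) - e(lambda x, y: x + y)**2

print(var_sum == var_x + var_y + 2 * cov)  # True
```

Here the covariance is 1/16, so Var(X+Y) = 39/64 is strictly larger than Var(X) + Var(Y) = 31/64, which only happens because X and Y are dependent.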
This content is accurate and true to the best of the author’s knowledge and is not meant to substitute for formal and individualized advice from a qualified professional.