This lecture shows how to derive confidence intervals for the mean of a normal distribution.
We tackle two different cases:
when the variance of the distribution is known;
when the variance is unknown.
In each case we derive the level of confidence and we discuss how it is set.
We conclude with two solved exercises.
The theory needed to fully understand the derivations can be found in the lecture on interval estimation.
We start from the simpler case in which the variance is known.
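We observe the realizations of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with
unknown mean $\mu$;
known variance $\sigma^2$.
To construct a confidence interval for the mean $\mu$, we use the sample mean
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i .$$
The confidence interval is
$$T_n = \left[\bar{X}_n - z\sqrt{\frac{\sigma^2}{n}},\ \bar{X}_n + z\sqrt{\frac{\sigma^2}{n}}\right]$$
where $z$ is a strictly positive constant.
We explain below how $z$ is chosen.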
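The coverage probability is the probability that the confidence interval $T_n$ will include the true mean $\mu$.
The coverage probability of $T_n$ is
$$P(\mu \in T_n) = P(-z \le Z \le z)$$
where $Z$ is a standard normal random variable.
The coverage probability can be written as
$$P\left(\bar{X}_n - z\sqrt{\frac{\sigma^2}{n}} \le \mu \le \bar{X}_n + z\sqrt{\frac{\sigma^2}{n}}\right) = P\left(-z \le \frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}} \le z\right) = P(-z \le Z \le z)$$
where we have defined
$$Z = \frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}} .$$
Given the assumptions made above, the sample mean $\bar{X}_n$ has a normal distribution with mean $\mu$ and variance $\sigma^2/n$, as demonstrated in the lecture on Point estimation of the mean. If we de-mean a normal random variable and divide it by the square root of its variance, we obtain a standard normal random variable. Therefore, the variable $Z$ has a standard normal distribution.
The coverage probability does not depend on the unknown parameter $\mu$.
Therefore, the level of confidence, which we denote by $1-\alpha$, coincides with the coverage probability:
$$1 - \alpha = P(-z \le Z \le z)$$
where $Z$ is a standard normal random variable.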
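The claim that the coverage probability is the same for every value of the mean can be checked by simulation. The following is a minimal sketch using only the Python standard library; the function name `simulated_coverage`, the constant `z = 1.5`, the variance, the sample size and the three values of the mean are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def simulated_coverage(mu, sigma2, n, z, n_rep=20000, seed=0):
    """Fraction of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)
    sigma = math.sqrt(sigma2)
    half_width = z * math.sqrt(sigma2 / n)  # the variance is known, so this is fixed
    hits = 0
    for _ in range(n_rep):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / n_rep


z = 1.5  # hypothetical constant, not taken from the lecture
std_normal = NormalDist()
theoretical = std_normal.cdf(z) - std_normal.cdf(-z)  # P(-z <= Z <= z)

# Hypothetical values of the unknown mean; each run uses a different seed.
for seed, mu in enumerate((-3.0, 0.0, 10.0)):
    estimate = simulated_coverage(mu, sigma2=4.0, n=10, z=z, seed=seed)
    print(f"mu={mu:6.1f}  simulated={estimate:.4f}  theoretical={theoretical:.4f}")
```

Up to simulation noise, the three estimates should agree with each other and with the theoretical value, whatever mean is used to generate the data.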
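The level of confidence is chosen by the statistician, who adjusts the constant $z$ accordingly.
If the level of confidence is set equal to $1-\alpha$, then
$$z = F^{-1}\left(1 - \frac{\alpha}{2}\right)$$
where $F$ is the cumulative distribution function of a standard normal random variable.
The level of confidence can be written as
$$1 - \alpha = P(-z \le Z \le z) = F(z) - F(-z) = F(z) - \left[1 - F(z)\right] = 2F(z) - 1$$
where we have used the fact that
$$F(-z) = 1 - F(z)$$
by the symmetry of the standard normal distribution around $0$. Therefore,
$$F(z) = 1 - \frac{\alpha}{2}, \qquad \text{that is,} \qquad z = F^{-1}\left(1 - \frac{\alpha}{2}\right).$$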
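The rule for choosing the constant from the level of confidence can be sketched in a few lines of Python. This is only an illustration: the function name `normal_mean_ci` and the numerical inputs are hypothetical, and the inverse of the standard normal distribution function is taken from `statistics.NormalDist` in the standard library.

```python
import math
from statistics import NormalDist


def normal_mean_ci(xbar, sigma2, n, level):
    """Confidence interval for the mean of a normal sample with known variance.

    xbar   : observed sample mean
    sigma2 : known variance of the distribution
    n      : sample size
    level  : desired level of confidence, strictly between 0 and 1
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z = F^{-1}(1 - alpha/2)
    half_width = z * math.sqrt(sigma2 / n)
    return xbar - half_width, xbar + half_width


# Hypothetical inputs, chosen only to show the call.
lower, upper = normal_mean_ci(xbar=2.0, sigma2=4.0, n=25, level=0.95)
print(lower, upper)
```

Raising the level of confidence increases the constant and therefore widens the interval, while a larger sample size narrows it.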
We now relax the assumption that the variance of the distribution is known.
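We observe the realizations of $n$ independent random variables $X_1$, ..., $X_n$, all having a normal distribution with
unknown mean $\mu$;
unknown variance $\sigma^2$.
To construct a confidence interval for the mean $\mu$, we use the sample mean
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i$$
and the adjusted sample variance
$$s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}_n\right)^2 .$$
The confidence interval for the mean is
$$T_n = \left[\bar{X}_n - z\sqrt{\frac{s_n^2}{n}},\ \bar{X}_n + z\sqrt{\frac{s_n^2}{n}}\right]$$
where $z$ is a strictly positive constant.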
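The coverage probability of the confidence interval is
$$P(\mu \in T_n) = P(-z \le Z_{n-1} \le z)$$
where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom.
The coverage probability can be written as
$$P\left(\bar{X}_n - z\sqrt{\frac{s_n^2}{n}} \le \mu \le \bar{X}_n + z\sqrt{\frac{s_n^2}{n}}\right) = P\left(-z \le \frac{\bar{X}_n - \mu}{\sqrt{s_n^2/n}} \le z\right) = P(-z \le Z_{n-1} \le z)$$
where we have defined
$$Z_{n-1} = \frac{\bar{X}_n - \mu}{\sqrt{s_n^2/n}} .$$
Now, rewrite $Z_{n-1}$ as
$$Z_{n-1} = \frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}}\sqrt{\frac{\sigma^2}{s_n^2}} = \frac{Z}{\sqrt{W}}$$
where we have defined
$$Z = \frac{\bar{X}_n - \mu}{\sqrt{\sigma^2/n}}, \qquad W = \frac{s_n^2}{\sigma^2} .$$
Given the assumptions made above, the adjusted sample variance $s_n^2$ has a Gamma distribution with parameters $n-1$ and $\sigma^2$ (the first parameter is the number of degrees of freedom and the second is the mean), as demonstrated in the lecture on Point estimation of variance. Therefore, the random variable $W$ has a Gamma distribution with parameters $n-1$ and $1$. Moreover, the random variable $Z$ has a standard normal distribution (see the previous section), and it is independent of $W$ because the sample mean and the adjusted sample variance of a normal sample are independent. Hence, $Z_{n-1}$ is the ratio between a standard normal random variable and the square root of an independent Gamma random variable with parameters $n-1$ and $1$. As a consequence, $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom (see the lecture on the Student's t distribution for a proof of this fact).
The coverage probability does not depend on the unknown parameters $\mu$ and $\sigma^2$.
Therefore, the level of confidence, which we denote by $1-\alpha$, coincides with the coverage probability:
$$1 - \alpha = P(-z \le Z_{n-1} \le z)$$
where $Z_{n-1}$ has a standard Student's t distribution with $n-1$ degrees of freedom.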
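A small simulation can illustrate that, when the variance is estimated, the coverage probability depends neither on the mean nor on the variance that generated the data. The sketch below uses only the Python standard library; the function name `simulated_t_coverage`, the constant `z = 2.0`, the sample size `n = 5` and the two parameter pairs are hypothetical choices.

```python
import math
import random


def simulated_t_coverage(mu, sigma, n, z, n_rep=20000, seed=0):
    """Fraction of simulated samples whose interval, built with the
    adjusted sample variance, contains the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # adjusted sample variance
        half_width = z * math.sqrt(s2 / n)
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / n_rep


z = 2.0  # hypothetical constant, not taken from the lecture
cov_a = simulated_t_coverage(mu=0.0, sigma=1.0, n=5, z=z, seed=1)
cov_b = simulated_t_coverage(mu=50.0, sigma=7.0, n=5, z=z, seed=2)
print(cov_a, cov_b)
```

Both estimates should be close to the probability that a Student's t random variable with 4 degrees of freedom lies between -2 and 2 (about 0.884), which is noticeably lower than the value of about 0.954 that a standard normal random variable would give for the same constant.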
As before, the constant $z$ is adjusted so as to achieve the desired level of confidence.
If the latter is equal to $1-\alpha$, then
$$z = F_{n-1}^{-1}\left(1 - \frac{\alpha}{2}\right)$$
where $F_{n-1}$ is the cumulative distribution function of a standard Student's t random variable with $n-1$ degrees of freedom.
The proof is identical to the one shown above for the case of known variance, because the t distribution is also symmetric around $0$.
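The Python standard library has no inverse distribution function for the Student's t distribution, so the sketch below computes it numerically: the density is integrated with Simpson's rule and the distribution function is inverted by bisection. This is one possible approach under these assumptions, not the method used in the lecture; the function names and the sample values are hypothetical, and a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of a standard Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """Distribution function via Simpson's rule on [0, |x|] plus symmetry around 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # approximates P(0 <= T <= |x|)
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, df):
    """Inverse distribution function by bisection, for 0.5 <= p < 1."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # widen the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_mean_ci(sample, level):
    """Confidence interval for the mean of a normal sample with unknown variance."""
    n = len(sample)
    xbar = sum(sample) / n
    s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)  # adjusted sample variance
    z = t_quantile(1 - (1 - level) / 2, n - 1)           # z = F_{n-1}^{-1}(1 - alpha/2)
    half_width = z * math.sqrt(s2 / n)
    return xbar - half_width, xbar + half_width


# Hypothetical data, chosen only to show the call.
data = [4.1, 5.3, 4.8, 5.9, 5.0]
print(t_mean_ci(data, level=0.95))
print(t_quantile(0.995, 99))  # constant for a 99% level with n = 100
```

For the same level of confidence, the constant obtained from the t distribution is larger than the one obtained from the standard normal distribution, and the gap shrinks as the number of degrees of freedom grows.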
Below you can find some exercises with explained solutions.
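Suppose that you observe a sample of 100 independent draws from a normal distribution having unknown mean $\mu$ and known variance $\sigma^2$.
Denote the draws by $x_1$, ..., $x_{100}$.
Their sample mean is
$$\bar{x}_{100} = \frac{1}{100}\sum_{i=1}^{100} x_i .$$
Find a confidence interval for $\mu$ having coverage probability $1-\alpha$.
For a given sample size $n$, the interval estimator
$$T_n = \left[\bar{X}_n - z\sqrt{\frac{\sigma^2}{n}},\ \bar{X}_n + z\sqrt{\frac{\sigma^2}{n}}\right]$$
has coverage probability
$$P(-z \le Z \le z)$$
where $Z$ is a standard normal random variable and $z$ is a strictly positive constant. Thus, we need to find $z$ such that
$$P(-z \le Z \le z) = 1 - \alpha .$$
But
$$P(-z \le Z \le z) = P(Z \le z) - P(Z < -z) = P(Z \le z) - \left[1 - P(Z \le z)\right] = 2P(Z \le z) - 1$$
where the last equality stems from the fact that the standard normal distribution is symmetric around zero. Therefore $z$ must be such that
$$2P(Z \le z) - 1 = 1 - \alpha$$
or
$$P(Z \le z) = 1 - \frac{\alpha}{2} .$$
Using normal distribution tables or a computer program to find the value of $F^{-1}(1 - \alpha/2)$, where $F$ is the distribution function of $Z$ (see the lecture entitled Normal distribution - Values), we obtain $z$. Thus, the confidence interval for $\mu$ is
$$\left[\bar{x}_{100} - z\sqrt{\frac{\sigma^2}{100}},\ \bar{x}_{100} + z\sqrt{\frac{\sigma^2}{100}}\right].$$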
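Suppose you observe a sample of 100 independent draws from a normal distribution having unknown mean $\mu$ and unknown variance $\sigma^2$.
Denote the draws by $x_1$, ..., $x_{100}$.
The sample mean is
$$\bar{x}_{100} = \frac{1}{100}\sum_{i=1}^{100} x_i .$$
The adjusted sample variance is
$$s_{100}^2 = \frac{1}{99}\sum_{i=1}^{100}\left(x_i - \bar{x}_{100}\right)^2 .$$
Set the level of confidence at 99% and find a confidence interval for the mean $\mu$.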
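For a given sample size $n$, the interval estimator
$$T_n = \left[\bar{X}_n - z\sqrt{\frac{s_n^2}{n}},\ \bar{X}_n + z\sqrt{\frac{s_n^2}{n}}\right]$$
has coverage probability
$$P(-z \le Z_{n-1} \le z)$$
where $Z_{n-1}$ is a standard Student's t random variable with $n-1$ degrees of freedom and $z$ is a strictly positive constant. Thus, we need to find $z$ such that
$$P(-z \le Z_{99} \le z) = 0.99 .$$
But
$$P(-z \le Z_{99} \le z) = P(Z_{99} \le z) - P(Z_{99} < -z) = P(Z_{99} \le z) - \left[1 - P(Z_{99} \le z)\right] = 2P(Z_{99} \le z) - 1$$
where the last equality stems from the fact that the standard Student's t distribution is symmetric around zero. Therefore $z$ must be such that
$$2P(Z_{99} \le z) - 1 = 0.99$$
or
$$P(Z_{99} \le z) = 0.995 .$$
Using a computer program to find the value of $z = F_{99}^{-1}(0.995)$, where $F_{99}$ is the distribution function of $Z_{99}$ (for example, with the MATLAB command tinv(0.995,99)), we obtain $z \approx 2.626$. Thus, the confidence interval for $\mu$ is
$$\left[\bar{x}_{100} - 2.626\sqrt{\frac{s_{100}^2}{100}},\ \bar{x}_{100} + 2.626\sqrt{\frac{s_{100}^2}{100}}\right].$$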
Please cite as:
Taboga, Marco (2021). "Confidence interval for the mean", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/set-estimation-mean.