Covariance formula

by Marco Taboga, PhD

A covariance formula is an equation used to define or calculate the covariance between two variables.

There are several formulae that can be used, depending on the situation.

Table of contents

General formula
Formula for discrete variables
1. Example
Formula for continuous variables
1. How to compute the double integral
2. Example
Covariance formula based on moments
1. Example
2. Use with moment generating function
Formulae for the sample covariance
1. Unbiased sample covariance
2. Example
More details, proofs and exercises

General formula

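We begin with a general formula, used to define the covariance between two random variables $X$ and $Y$:

$$\operatorname{Cov}[X,Y] = \operatorname{E}\big[(X - \operatorname{E}[X])(Y - \operatorname{E}[Y])\big]$$

where:

$\operatorname{Cov}[X,Y]$ denotes the covariance;
$\operatorname{E}$ denotes the expected value operator.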

This is a definition, and it is useful because of its generality. To compute the covariance in practice, however, you need one of the equations below.

Formula for discrete variables

When the two random variables are discrete, the above formula can be written as

$$\operatorname{Cov}[X,Y] = \sum_{(x,y) \in R_{XY}} (x - \operatorname{E}[X])(y - \operatorname{E}[Y])\, p_{XY}(x,y)$$

where:

$R_{XY}$ is the set of all couples of values of $X$ and $Y$ that can possibly be observed;
$p_{XY}(x,y)$ is the joint probability mass function, which gives the probability of observing a specific couple $(x,y)$;
the summation symbol $\sum$ indicates that we need to perform a sum over all the values that $X$ and $Y$ can take jointly.

In other words, we sum the products of the deviations of the two random variables from their respective means. Each product is weighted by a probability.
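The weighted sum described above can be sketched in a few lines of Python. The joint probability mass function `joint_pmf` and the helper `discrete_covariance` are hypothetical, chosen only for illustration; exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Hypothetical joint pmf (an assumption for illustration only):
# keys are couples (x, y), values are their probabilities.
joint_pmf = {
    (0, 0): Fraction(1, 4),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 2),
}

def discrete_covariance(pmf):
    """Covariance of (X, Y) from a dict mapping (x, y) to its probability."""
    # The exact equality check is safe here because we use Fractions.
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    # Expected values of X and Y, computed from the joint pmf.
    mean_x = sum(x * p for (x, y), p in pmf.items())
    mean_y = sum(y * p for (x, y), p in pmf.items())
    # Probability-weighted sum of the products of deviations.
    return sum((x - mean_x) * (y - mean_y) * p for (x, y), p in pmf.items())

cov = discrete_covariance(joint_pmf)
print(cov)  # 1/8
```

Here the means are 3/4 and 1/2, and the three weighted products are 3/32, -1/32 and 2/32, which add up to 1/8.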

Example

Suppose that the probability mass function is [eq5]

The support $R_{XY}$ contains three possible couples: [eq6]

The calculations are performed as follows: [eq7]

Formula for continuous variables

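When the two random variables are continuous, the covariance formula involves a double integral:

$$\operatorname{Cov}[X,Y] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (x - \operatorname{E}[X])(y - \operatorname{E}[Y])\, f_{XY}(x,y)\, dy\, dx$$

where:

$f_{XY}(x,y)$ is the joint probability density function of $X$ and $Y$;
both the integrals are between $-\infty$ and $\infty$.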

How to compute the double integral

The double integral is computed in two steps:

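we calculate the inner integral $\int_{-\infty}^{\infty} (x - \operatorname{E}[X])(y - \operatorname{E}[Y])\, f_{XY}(x,y)\, dy$, which will be found to be a function of $x$ only because $y$ is "integrated out";
we compute the outer integral $\int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} (x - \operatorname{E}[X])(y - \operatorname{E}[Y])\, f_{XY}(x,y)\, dy \right] dx$.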
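The two steps can be mimicked numerically. The sketch below is only an illustration under stated assumptions: the density `density(x, y) = x + y` on the unit square is hypothetical (it is not the density of the example in this entry), the helpers `midpoint_integral` and `expectation` are made-up names, and a plain midpoint rule stands in for exact integration. Because the density is zero outside the unit square, the infinite integration limits reduce to 0 and 1.

```python
def midpoint_integral(g, a, b, n=200):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def expectation(g, f, x_range, y_range, n=200):
    """Approximate E[g(X, Y)] as a double integral of g * f, in two steps."""
    def inner(x):
        # Step 1: integrate y out; the result depends on x only.
        return midpoint_integral(lambda y: g(x, y) * f(x, y),
                                 y_range[0], y_range[1], n)
    # Step 2: integrate the inner result over x.
    return midpoint_integral(inner, x_range[0], x_range[1], n)

def density(x, y):
    """Hypothetical joint density on the unit square (zero elsewhere)."""
    return x + y

support = (0.0, 1.0)
mean_x = expectation(lambda x, y: x, density, support, support)
mean_y = expectation(lambda x, y: y, density, support, support)
cov = expectation(lambda x, y: (x - mean_x) * (y - mean_y),
                  density, support, support)
print(mean_x, mean_y, cov)
```

For this particular density the exact values are 7/12 for both means and -1/144 for the covariance, so the printed numbers should be close to 0.5833 and -0.0069.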

Example

Let the joint probability density function be [eq12]

In order to compute the expected values, we first need to find the marginal density functions: [eq13]

We can now work out the covariance: [eq14]

Covariance formula based on moments

Instead of using the formulae above to find the covariance, it is often easier to use the following equivalent equation based on moments and cross moments:

$$\operatorname{Cov}[X,Y] = \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y]$$
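A short derivation shows why the moment-based expression $\operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y]$ agrees with the general definition. It relies only on the linearity of the expected value and on the fact that $\operatorname{E}[X]$ and $\operatorname{E}[Y]$ are constants:

$$\begin{aligned}
\operatorname{Cov}[X,Y] &= \operatorname{E}\big[(X - \operatorname{E}[X])(Y - \operatorname{E}[Y])\big] \\
&= \operatorname{E}\big[XY - X\operatorname{E}[Y] - \operatorname{E}[X]\,Y + \operatorname{E}[X]\operatorname{E}[Y]\big] \\
&= \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y] - \operatorname{E}[X]\operatorname{E}[Y] + \operatorname{E}[X]\operatorname{E}[Y] \\
&= \operatorname{E}[XY] - \operatorname{E}[X]\operatorname{E}[Y]
\end{aligned}$$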

Example

In the previous example, after finding the expected values of $X$ and $Y$, we could have done: [eq16]

Use with moment generating function

When we know the joint moment generating function of $X$ and $Y$, we can use it to compute the moments $\operatorname{E}[X]$, $\operatorname{E}[Y]$ and $\operatorname{E}[XY]$, and then plug their values into the formula above.
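As a sketch of how this works, write the joint moment generating function as $M_{XY}(t_1,t_2) = \operatorname{E}[\exp(t_1 X + t_2 Y)]$ (the symbols $M_{XY}$, $t_1$ and $t_2$ are notation assumed here, not taken from the text above) and suppose it is finite in a neighbourhood of the origin. The three moments are then obtained by differentiating and evaluating at zero:

$$\operatorname{E}[X] = \left.\frac{\partial M_{XY}}{\partial t_1}\right|_{t_1=0,\,t_2=0}, \qquad
\operatorname{E}[Y] = \left.\frac{\partial M_{XY}}{\partial t_2}\right|_{t_1=0,\,t_2=0}, \qquad
\operatorname{E}[XY] = \left.\frac{\partial^2 M_{XY}}{\partial t_1\,\partial t_2}\right|_{t_1=0,\,t_2=0}$$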

Formulae for the sample covariance

Until now, we have discussed how to calculate the covariance between two random variables.

However, there is another concept, that of sample covariance, which is used to measure the degree of association between two observed variables in a sample of data.

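Given $n$ observed couples $(x_1,y_1),\ldots,(x_n,y_n)$, their sample covariance is calculated as

$$\frac{1}{n}\sum_{j=1}^{n} (x_j - \bar{x})(y_j - \bar{y})$$

where $\bar{x}$ and $\bar{y}$ are the sample means of the two variables:

$$\bar{x} = \frac{1}{n}\sum_{j=1}^{n} x_j, \qquad \bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j$$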

Unbiased sample covariance

An alternative to the formula above is the so-called unbiased sample covariance

$$s_{xy} = \frac{1}{n-1}\sum_{j=1}^{n} (x_j - \bar{x})(y_j - \bar{y})$$

The only difference is that we divide by $n-1$ instead of dividing by $n$.

If the observed couples are independent draws from the joint distribution of two random variables $X$ and $Y$, then $s_{xy}$ is an unbiased estimator of $\operatorname{Cov}[X,Y]$.

Example

In this example, there are four observed couples, whose values are reported in the columns of the table below.

The last two rows of the table are used to calculate the means and the sample covariance (biased and unbiased).

Observation number	x_j	Deviation of x_j from mean	y_j	Deviation of y_j from mean	Product of deviations
1	1	-1	5	2	-2
2	3	1	0	-3	-3
3	0	-2	-1	-4	8
4	4	2	8	5	10
Sum	8	0	12	0	13
Divide sum by n	2		3		13/4
Divide sum by n-1					13/3
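The calculations in the table can be reproduced with a short Python sketch. The function name `sample_covariance` and its `unbiased` flag are hypothetical choices made for this illustration; the data are the four couples from the table.

```python
def sample_covariance(xs, ys, unbiased=False):
    """Sample covariance of paired observations.

    Divides the sum of products of deviations by n, or by n - 1
    when unbiased=True.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are needed")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of the products of deviations from the sample means.
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (n - 1 if unbiased else n)

xs = [1, 3, 0, 4]
ys = [5, 0, -1, 8]
biased = sample_covariance(xs, ys)                     # 13/4
unbiased = sample_covariance(xs, ys, unbiased=True)    # 13/3
print(biased, unbiased)
```

The two results match the last two rows of the table: 13/4 = 3.25 when dividing by n, and 13/3 (about 4.33) when dividing by n-1.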

More details, proofs and exercises

More details about these formulae - including proofs and solved exercises - can be found in the lecture on Covariance.

How to cite

Please cite as:

Taboga, Marco (2021). "Covariance formula", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/glossary/covariance-formula.