Partitioned linear regression is a technique used to subdivide the independent variables into two groups and estimate their coefficients in two separate steps.
Partitioned regression is often used to solve problems in which estimating all the regression coefficients together would be too computationally intensive.
Consider the linear regression model in matrix form:
$$y = X \beta + \varepsilon$$
where:
$y$ is the $N \times 1$ vector of observations of the dependent variable;
$X$ is the $N \times K$ matrix of regressors ($N$ observations and $K$ regressors);
$\beta$ is the $K \times 1$ vector of regression coefficients;
$\varepsilon$ is the $N \times 1$ vector of error terms.
We divide the regressors into two groups:
the first group contains the first $K_1$ regressors;
the second group contains the remaining $K_2$ regressors.
Obviously, $K_1 + K_2 = K$.
We use the subdivision into two groups to partition the vectors and matrices that appear in the regression equation:
$$y = \begin{bmatrix} X_1 & X_2 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \varepsilon = X_1 \beta_1 + X_2 \beta_2 + \varepsilon$$
The dimensions of the blocks are as follows:
$X_1$ and $X_2$ are $N \times K_1$ and $N \times K_2$ respectively;
$\beta_1$ and $\beta_2$ are $K_1 \times 1$ and $K_2 \times 1$ respectively.
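To fix ideas, here is a minimal numerical sketch in Python with NumPy (the sample size and the split $K_1 = 2$, $K_2 = 3$ are invented for illustration), showing how a design matrix is partitioned into the two blocks:

```python
import numpy as np

rng = np.random.default_rng(42)
N, K1, K2 = 100, 2, 3            # illustrative sizes: K = K1 + K2
X = rng.normal(size=(N, K1 + K2))

# Partition the design matrix into the two groups of regressors
X1, X2 = X[:, :K1], X[:, K1:]
assert X1.shape == (N, K1) and X2.shape == (N, K2)
```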
Remember that, when $X$ is full-rank, the OLS estimator of the vector $\beta$ can be written as
$$\widehat{\beta} = (X^\top X)^{-1} X^\top y$$
The OLS estimator can also be partitioned, conformably with $\beta$, as
$$\widehat{\beta} = \begin{bmatrix} \widehat{\beta}_1 \\ \widehat{\beta}_2 \end{bmatrix}$$
If we multiply both sides of the OLS formula by $X^\top X$, we obtain the so-called normal equations:
$$X^\top X \widehat{\beta} = X^\top y$$
In partitioned form, the normal equations become
$$\begin{bmatrix} X_1^\top \\ X_2^\top \end{bmatrix} \begin{bmatrix} X_1 & X_2 \end{bmatrix} \begin{bmatrix} \widehat{\beta}_1 \\ \widehat{\beta}_2 \end{bmatrix} = \begin{bmatrix} X_1^\top \\ X_2^\top \end{bmatrix} y$$
By using the multiplication rule for partitioned matrices, we obtain
$$\begin{bmatrix} X_1^\top X_1 & X_1^\top X_2 \\ X_2^\top X_1 & X_2^\top X_2 \end{bmatrix} \begin{bmatrix} \widehat{\beta}_1 \\ \widehat{\beta}_2 \end{bmatrix} = \begin{bmatrix} X_1^\top y \\ X_2^\top y \end{bmatrix}$$
which can be written as two separate equations:
$$X_1^\top X_1 \widehat{\beta}_1 + X_1^\top X_2 \widehat{\beta}_2 = X_1^\top y$$
$$X_2^\top X_1 \widehat{\beta}_1 + X_2^\top X_2 \widehat{\beta}_2 = X_2^\top y$$
These two equations are used to derive most of the results about partitioned regressions.
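As a quick numerical check (a sketch in Python with NumPy; the data-generating process is made up for illustration), we can verify that the two partitioned normal equations hold at the OLS solution:

```python
import numpy as np

rng = np.random.default_rng(42)
N, K1, K2 = 100, 2, 3
X = rng.normal(size=(N, K1 + K2))
y = X @ np.array([1.0, -0.5, 2.0, 0.0, 1.5]) + rng.normal(size=N)
X1, X2 = X[:, :K1], X[:, K1:]

# Full OLS estimate, then split it into the two sub-vectors
b = np.linalg.solve(X.T @ X, X.T @ y)
b1, b2 = b[:K1], b[K1:]

# The two partitioned normal equations hold at the OLS solution
assert np.allclose(X1.T @ X1 @ b1 + X1.T @ X2 @ b2, X1.T @ y)
assert np.allclose(X2.T @ X1 @ b1 + X2.T @ X2 @ b2, X2.T @ y)
```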
Here is the main result about partitioned regressions, proved in this section and explained in the next one:
$$\widehat{\beta}_2 = (X_2^{*\top} X_2^*)^{-1} X_2^{*\top} y^*, \qquad \widehat{\beta}_1 = (X_1^\top X_1)^{-1} X_1^\top (y - X_2 \widehat{\beta}_2)$$
where $y^* = M_1 y$, $X_2^* = M_1 X_2$ and $M_1 = I - X_1 (X_1^\top X_1)^{-1} X_1^\top$.
The first normal equation, derived previously, is
$$X_1^\top X_1 \widehat{\beta}_1 + X_1^\top X_2 \widehat{\beta}_2 = X_1^\top y$$
We write it as
$$X_1^\top X_1 \widehat{\beta}_1 = X_1^\top (y - X_2 \widehat{\beta}_2)$$
or
$$\widehat{\beta}_1 = (X_1^\top X_1)^{-1} X_1^\top (y - X_2 \widehat{\beta}_2)$$
The second normal equation is
$$X_2^\top X_1 \widehat{\beta}_1 + X_2^\top X_2 \widehat{\beta}_2 = X_2^\top y$$
We substitute the expression for $\widehat{\beta}_1$:
$$X_2^\top X_1 (X_1^\top X_1)^{-1} X_1^\top (y - X_2 \widehat{\beta}_2) + X_2^\top X_2 \widehat{\beta}_2 = X_2^\top y$$
which is equivalent to
$$X_2^\top X_2 \widehat{\beta}_2 - X_2^\top X_1 (X_1^\top X_1)^{-1} X_1^\top X_2 \widehat{\beta}_2 = X_2^\top y - X_2^\top X_1 (X_1^\top X_1)^{-1} X_1^\top y$$
or
$$X_2^\top \left( I - X_1 (X_1^\top X_1)^{-1} X_1^\top \right) X_2 \widehat{\beta}_2 = X_2^\top \left( I - X_1 (X_1^\top X_1)^{-1} X_1^\top \right) y$$
We define
$$M_1 = I - X_1 (X_1^\top X_1)^{-1} X_1^\top$$
so that
$$X_2^\top M_1 X_2 \widehat{\beta}_2 = X_2^\top M_1 y$$
or
$$\widehat{\beta}_2 = (X_2^\top M_1 X_2)^{-1} X_2^\top M_1 y$$
The matrix $M_1$ is idempotent:
$$M_1 M_1 = I - 2 X_1 (X_1^\top X_1)^{-1} X_1^\top + X_1 (X_1^\top X_1)^{-1} X_1^\top X_1 (X_1^\top X_1)^{-1} X_1^\top = I - X_1 (X_1^\top X_1)^{-1} X_1^\top = M_1$$
It is also symmetric, as can be easily verified:
$$M_1^\top = I - X_1 (X_1^\top X_1)^{-1} X_1^\top = M_1$$
Therefore, we have
$$\widehat{\beta}_2 = (X_2^\top M_1^\top M_1 X_2)^{-1} X_2^\top M_1^\top M_1 y = (X_2^{*\top} X_2^*)^{-1} X_2^{*\top} y^*$$
where we have defined
$$X_2^* = M_1 X_2, \qquad y^* = M_1 y$$
The calculations need to be performed in reverse order, starting from the last equation: we first compute $\widehat{\beta}_2$ from $y^*$ and $X_2^*$, and then plug it into the formula for $\widehat{\beta}_1$.
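The following sketch (Python with NumPy, simulated data for illustration) follows exactly this order: it builds $M_1$, computes $\widehat{\beta}_2$, recovers $\widehat{\beta}_1$ by substitution, and checks the result against the one-shot OLS estimate:

```python
import numpy as np

rng = np.random.default_rng(42)
N, K1, K2 = 100, 2, 3
X = rng.normal(size=(N, K1 + K2))
y = X @ np.array([1.0, -0.5, 2.0, 0.0, 1.5]) + rng.normal(size=N)
X1, X2 = X[:, :K1], X[:, K1:]

# M1 = I - X1 (X1'X1)^{-1} X1' is symmetric and idempotent
M1 = np.eye(N) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
assert np.allclose(M1, M1.T) and np.allclose(M1 @ M1, M1)

# Reverse order: first beta2_hat from the partitioned formula ...
b2 = np.linalg.solve(X2.T @ M1 @ X2, X2.T @ M1 @ y)
# ... then beta1_hat by substitution into the first normal equation
b1 = np.linalg.solve(X1.T @ X1, X1.T @ (y - X2 @ b2))

# The result matches the one-shot OLS regression of y on the full X
assert np.allclose(np.concatenate([b1, b2]),
                   np.linalg.solve(X.T @ X, X.T @ y))
```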
Working out the formulae above is equivalent to deriving the OLS estimators in three steps:
we regress $y$ and the columns of $X_2$ on $X_1$; the residuals from these regressions are $y^*$ and $X_2^*$;
we find $\widehat{\beta}_2$ by regressing $y^*$ on $X_2^*$;
we calculate $\widehat{\beta}_1$ by regressing the residuals $y - X_2 \widehat{\beta}_2$ on $X_1$.
Let us start from the first step. When we regress $y$ on $X_1$, the OLS estimator of the regression coefficients is
$$(X_1^\top X_1)^{-1} X_1^\top y$$
and the residuals are
$$y - X_1 (X_1^\top X_1)^{-1} X_1^\top y = M_1 y = y^*$$
Similarly, when we regress the columns of $X_2$ on $X_1$, the OLS estimators of the regression coefficients are the columns of the matrix
$$(X_1^\top X_1)^{-1} X_1^\top X_2$$
and the vectors of residuals are the columns of the matrix
$$X_2 - X_1 (X_1^\top X_1)^{-1} X_1^\top X_2 = M_1 X_2 = X_2^*$$
In the second step, we regress $y^*$ on $X_2^*$. The OLS estimator of the regression coefficients is
$$(X_2^{*\top} X_2^*)^{-1} X_2^{*\top} y^*$$
But we have proved above that this is also the OLS estimator $\widehat{\beta}_2$ of $\beta_2$ in our partitioned regression. In the third step, we regress $y - X_2 \widehat{\beta}_2$ on $X_1$. The OLS estimator of the regression coefficients is
$$(X_1^\top X_1)^{-1} X_1^\top (y - X_2 \widehat{\beta}_2)$$
But we have proved above that this is also the OLS estimator $\widehat{\beta}_1$ of $\beta_1$ in our partitioned regression.
The fact that $\widehat{\beta}_2$ can be calculated by regressing $y^*$ on $X_2^*$ is often called the Frisch-Waugh-Lovell theorem.
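Here is a minimal numerical illustration of the theorem (Python with NumPy; the data are simulated and the helper ols_residuals is a name introduced only for this sketch), carrying out the three steps and comparing the result with the one-shot regression:

```python
import numpy as np

rng = np.random.default_rng(42)
N, K1, K2 = 100, 2, 3
X = rng.normal(size=(N, K1 + K2))
y = X @ np.array([1.0, -0.5, 2.0, 0.0, 1.5]) + rng.normal(size=N)
X1, X2 = X[:, :K1], X[:, K1:]

def ols_residuals(A, Z):
    """Residuals from regressing each column of Z on A."""
    return Z - A @ np.linalg.solve(A.T @ A, A.T @ Z)

# Step 1: regress y and the columns of X2 on X1; keep the residuals
y_star = ols_residuals(X1, y)
X2_star = ols_residuals(X1, X2)

# Step 2: regress y_star on X2_star to obtain beta2_hat
b2 = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ y_star)

# Step 3: regress y - X2 beta2_hat on X1 to obtain beta1_hat
b1 = np.linalg.solve(X1.T @ X1, X1.T @ (y - X2 @ b2))

# Same coefficients as regressing y on [X1, X2] in one step
assert np.allclose(np.concatenate([b1, b2]),
                   np.linalg.solve(X.T @ X, X.T @ y))
```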
As an example, we discuss the so-called demeaned regression.
Suppose that the first column of $X$ is a vector of ones (corresponding to the so-called intercept).
We partition the design matrix as
$$X = \begin{bmatrix} \iota & X_2 \end{bmatrix}$$
where $\iota$ is the $N \times 1$ vector of ones and $X_2$ contains all the other regressors.
Let us see what happens in the three steps explained above.
In the first step, we regress $y$ on $X_1 = \iota$.
The OLS estimate of the regression coefficient is
$$(\iota^\top \iota)^{-1} \iota^\top y = \frac{1}{N} \sum_{i=1}^{N} y_i = \bar{y}$$
where $\bar{y}$ is the sample mean of $y$.
Therefore, the residuals are
$$y^* = y - \iota \bar{y}$$
In other words, $y^*$ is the demeaned version of $y$.
Similarly, $X_2^*$ is the demeaned version of $X_2$:
$$X_2^* = X_2 - \iota \bar{X}_2$$
where $\bar{X}_2$ is a $1 \times K_2$ row vector that contains the sample means of the columns of $X_2$.
In the second step, we regress $y^*$ on $X_2^*$ and we obtain as a result $\widehat{\beta}_2$.
The vector $\widehat{\beta}_2$ is equal to the OLS estimator of the coefficients of $X_2$ in the original regression of $y$ on $\iota$ and $X_2$.
Thus, we have an important rule: running a regression that includes an intercept is equivalent to demeaning all the variables and running the same regression without the intercept.
In the third step, we calculate $\widehat{\beta}_1$ (the intercept) by regressing the residuals $y - X_2 \widehat{\beta}_2$ on $\iota$.
We already know that regressing a variable on $\iota$ is the same as calculating its sample mean.
Therefore, the intercept $\widehat{\beta}_1$ is equal to the sample mean of the residuals $y - X_2 \widehat{\beta}_2$.
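The following sketch (Python with NumPy, with an invented data-generating process) verifies both facts: the demeaned regression without an intercept reproduces the slope coefficients, and the intercept equals the sample mean of the residuals:

```python
import numpy as np

rng = np.random.default_rng(42)
N, K2 = 100, 3
X2 = rng.normal(size=(N, K2))
y = 2.0 + X2 @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=N)

# Regression with an intercept: first column of ones
X = np.column_stack([np.ones(N), X2])
b = np.linalg.solve(X.T @ X, X.T @ y)
intercept, slopes = b[0], b[1:]

# Demeaned regression without an intercept gives the same slopes ...
y_star = y - y.mean()
X2_star = X2 - X2.mean(axis=0)
slopes_dm = np.linalg.solve(X2_star.T @ X2_star, X2_star.T @ y_star)
assert np.allclose(slopes, slopes_dm)

# ... and the intercept is the sample mean of the residuals y - X2 b2
assert np.isclose(intercept, (y - X2 @ slopes_dm).mean())
```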
Please cite as:
Taboga, Marco (2021). "Partitioned regression", Lectures on probability theory and mathematical statistics. Kindle Direct Publishing. Online appendix. https://www.statlect.com/fundamentals-of-statistics/partitioned-regression.