The QR decomposition (or QR factorization) allows us to express a matrix having linearly independent columns as the product of 1) a matrix $Q$ having orthonormal columns and 2) an upper triangular matrix $R$.
In order to fully understand how the QR decomposition is obtained, we should be familiar with the Gram-Schmidt process.
Remember that the Gram-Schmidt process is a procedure used to transform a set of linearly independent vectors into a set of orthonormal vectors (i.e., a set of vectors that have unit norm and are orthogonal to each other).
In the case of a $K\times L$ matrix $A$, denote its columns by $A_1, A_2, \dots, A_L$. If these columns are linearly independent, they can be transformed into a set of orthonormal column vectors $u_1, u_2, \dots, u_L$ by using the Gram-Schmidt process, which alternates normalization steps and projection steps:
we start with the normalization
$$u_1 = \frac{1}{\|A_1\|} A_1,$$
where $\|A_1\|$ denotes the norm of $A_1$;

we project $A_2$ on $u_1$:
$$A_2 = \langle A_2, u_1 \rangle u_1 + \varepsilon_2,$$
where $\langle A_2, u_1 \rangle$ is the inner product between $A_2$ and $u_1$, and $\varepsilon_2$ is the residual of the projection, orthogonal to $u_1$;

we normalize the residual:
$$u_2 = \frac{1}{\|\varepsilon_2\|} \varepsilon_2;$$

we project $A_3$ on $u_1$ and $u_2$:
$$A_3 = \langle A_3, u_1 \rangle u_1 + \langle A_3, u_2 \rangle u_2 + \varepsilon_3,$$
where the residual $\varepsilon_3$ is orthogonal to $u_1$ and $u_2$;

we keep on alternating normalization steps (where projection residuals are divided by their norms) and projection steps (where $A_k$ is projected on $u_1, \dots, u_{k-1}$) until we have produced a set of orthonormal vectors $u_1, \dots, u_L$.
Note that the residuals can be expressed in terms of the normalized vectors as
$$\varepsilon_k = \|\varepsilon_k\| u_k \quad \text{for } k = 1, \dots, L,$$
where we have defined $\varepsilon_1 = A_1$. Thus, the projections can be written as
$$A_k = \sum_{j=1}^{k-1} \langle A_k, u_j \rangle u_j + \|\varepsilon_k\| u_k, \quad k = 1, \dots, L. \tag{1}$$
The orthonormal vectors can be adjoined to form a $K\times L$ matrix
$$Q = \begin{bmatrix} u_1 & u_2 & \cdots & u_L \end{bmatrix}$$
whose columns are orthonormal. The coefficients of the projections can be collected in an $L\times L$ upper triangular matrix
$$R = \begin{bmatrix} \|\varepsilon_1\| & \langle A_2, u_1 \rangle & \cdots & \langle A_L, u_1 \rangle \\ 0 & \|\varepsilon_2\| & \cdots & \langle A_L, u_2 \rangle \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \|\varepsilon_L\| \end{bmatrix}.$$
By computing the matrix product between $Q$ and $R$, we recover the projections in equation (1). As a matter of fact, each column of the product $QR$ is a linear combination of the columns of $Q$, with coefficients taken from the corresponding column of $R$ (see the lecture on matrix products and linear combinations). Therefore, we have that
$$A = QR.$$
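The construction above is easy to translate into code. What follows is a minimal sketch in NumPy of QR via the classical Gram-Schmidt process just described; the function name gram_schmidt_qr is ours, chosen for the illustration, and the routine assumes that the columns of the input matrix are linearly independent.

import numpy as np

def gram_schmidt_qr(A):
    """Factor A (K x L, with linearly independent columns) as A = QR."""
    A = np.asarray(A, dtype=float)
    K, L = A.shape
    Q = np.zeros((K, L))
    R = np.zeros((L, L))
    for k in range(L):
        residual = A[:, k].copy()
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]    # inner product <A_k, u_j>
            residual -= R[j, k] * Q[:, j]  # subtract the projection on u_j
        R[k, k] = np.linalg.norm(residual)  # ||eps_k||, strictly positive
        Q[:, k] = residual / R[k, k]        # normalization step: u_k
    return Q, R

# Sanity check on a random matrix (its columns are almost surely independent).
A = np.random.default_rng(0).normal(size=(5, 3))
Q, R = gram_schmidt_qr(A)
assert np.allclose(Q.T @ Q, np.eye(3))  # orthonormal columns
assert np.allclose(Q @ R, A)            # A = QR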
We now provide a formal statement of the QR decomposition.
Proposition
Let $A$ be a $K\times L$ matrix. If the columns of $A$ are linearly independent, then $A$ can be factorized as
$$A = QR,$$
where $Q$ is a $K\times L$ matrix whose columns form an orthonormal set, and $R$ is an $L\times L$ upper triangular matrix whose diagonal entries are strictly positive.
In the previous section we have already shown a constructive proof of how the QR decomposition is obtained. The only important detail we have not mentioned is that the linear independence of the columns of $A$ guarantees that the residuals of the projections performed in the Gram-Schmidt process are different from zero. As a consequence, the normalized vectors $u_1, \dots, u_L$ are well-defined because the norms $\|\varepsilon_1\|, \dots, \|\varepsilon_L\|$ are strictly positive. Moreover, these norms are the entries on the main diagonal of $R$, which are therefore strictly positive. Note that $R$ is invertible, because a triangular matrix is invertible whenever its diagonal entries are different from zero (in particular, when they are strictly positive).
Let us now work through an example.
Example
Take a matrix $A$ with two linearly independent columns $A_1$ and $A_2$. The factorization is obtained in five steps, carried out numerically in the sketch below: 1) compute the norm $\|A_1\|$ of the first column, so that the first normalized vector is
$$u_1 = \frac{1}{\|A_1\|} A_1;$$
2) compute the inner product $\langle A_2, u_1 \rangle$ between $A_2$ and $u_1$, so that the projection of the second column on $u_1$ is $\langle A_2, u_1 \rangle u_1$ and the residual of the projection is
$$\varepsilon_2 = A_2 - \langle A_2, u_1 \rangle u_1;$$
3) verify that $u_1$ and $\varepsilon_2$ are orthogonal, that is, $\langle \varepsilon_2, u_1 \rangle = 0$; 4) compute the norm $\|\varepsilon_2\|$ of the residual and normalize:
$$u_2 = \frac{1}{\|\varepsilon_2\|} \varepsilon_2;$$
5) assemble the matrix with orthonormal columns
$$Q = \begin{bmatrix} u_1 & u_2 \end{bmatrix}$$
and the upper triangular matrix
$$R = \begin{bmatrix} \|A_1\| & \langle A_2, u_1 \rangle \\ 0 & \|\varepsilon_2\| \end{bmatrix},$$
and check that indeed the product of $Q$ and $R$ equals $A$.
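To make the five steps concrete, here is a short NumPy sketch that carries them out on a small matrix chosen here so that the numbers come out clean (the matrix is ours, for illustration only):

import numpy as np

A = np.array([[3., 1.],
              [4., 2.]])           # illustrative matrix, chosen for clean numbers

A1, A2 = A[:, 0], A[:, 1]
r11 = np.linalg.norm(A1)           # ||A_1|| = 5
u1 = A1 / r11                      # u_1 = (3/5, 4/5)
r12 = u1 @ A2                      # <A_2, u_1> = 11/5
eps2 = A2 - r12 * u1               # residual = (-8/25, 6/25)
r22 = np.linalg.norm(eps2)         # ||eps_2|| = 2/5
u2 = eps2 / r22                    # u_2 = (-4/5, 3/5)

print(np.isclose(u1 @ u2, 0.0))    # True: u_1 and u_2 are orthogonal

Q = np.column_stack([u1, u2])
R = np.array([[r11, r12],
              [0.0, r22]])
print(np.allclose(Q @ R, A))       # True: A = QR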
The QR decomposition is unique.
Proposition
Under the assumptions of the previous proposition, the QR decomposition is unique, that is, the matrices $Q$ and $R$ satisfying the stated properties are unique.
Proof. Suppose that
$$A = \widehat{Q}\widehat{R}$$
is a second decomposition, where $\widehat{Q}$ is a matrix having orthonormal columns and $\widehat{R}$ is an upper triangular matrix having strictly positive diagonal elements. Since the columns of $Q$ are orthonormal, we have that
$$Q^{*}Q = I,$$
where $Q^{*}$ is the conjugate transpose of $Q$, and $I$ is the $L\times L$ identity matrix (see Non-square matrices with orthonormal columns). By the same token,
$$\widehat{Q}^{*}\widehat{Q} = I.$$
If we pre-multiply both sides of the equality
$$QR = \widehat{Q}\widehat{R}$$
by $Q^{*}$, we get
$$Q^{*}QR = Q^{*}\widehat{Q}\widehat{R}$$
or
$$R = Q^{*}\widehat{Q}\widehat{R}. \tag{2}$$
If we instead pre-multiply the equality by $\widehat{Q}^{*}$, we obtain
$$\widehat{Q}^{*}QR = \widehat{Q}^{*}\widehat{Q}\widehat{R}$$
or
$$\widehat{R} = \widehat{Q}^{*}QR. \tag{3}$$
By plugging equation (3) into equation (2), we obtain
$$R = Q^{*}\widehat{Q}\widehat{Q}^{*}QR.$$
The latter equation implies that, for $k = 1, \dots, L$, the $k$-th row of $R$ can be written as a linear combination of all the rows of $R$, with coefficients taken from the $k$-th row of the matrix $Q^{*}\widehat{Q}\widehat{Q}^{*}Q$ (see Matrix multiplication and linear combinations). But $R$ is triangular with strictly positive diagonal entries, so its rows are linearly independent and they form a basis for the space of $1\times L$ vectors. As a consequence, the only way to represent the $k$-th row of $R$ as a linear combination of all the rows of $R$ is to put a unitary coefficient on the $k$-th row itself and a zero coefficient on all the other rows (by the uniqueness of the representation in terms of a basis). In other words, the $k$-th row of $Q^{*}\widehat{Q}\widehat{Q}^{*}Q$ is the $k$-th vector of the canonical basis. Since this is true for $k = 1, \dots, L$, we have that
$$Q^{*}\widehat{Q}\widehat{Q}^{*}Q = I$$
or
$$\left(\widehat{Q}^{*}Q\right)^{*}\left(\widehat{Q}^{*}Q\right) = I.$$
Thus, $\widehat{Q}^{*}Q$ is a unitary matrix (its conjugate transpose is equal to its inverse). Moreover, equation (3) implies that
$$\widehat{Q}^{*}Q = \widehat{R}R^{-1}.$$
Since the inverse of an upper triangular (UT) matrix is UT and the product of two UT matrices is UT, $\widehat{Q}^{*}Q$ is UT. Its diagonal entries are strictly positive, because each of them is the product of a diagonal entry of $\widehat{R}$ and a diagonal entry of $R^{-1}$, both of which are strictly positive. To sum up, $\widehat{Q}^{*}Q$ is both unitary and UT with strictly positive diagonal entries. Therefore, by a result on unitary and triangular matrices, we have that
$$\widehat{Q}^{*}Q = I.$$
As a consequence, equation (3) gives
$$\widehat{R} = \widehat{Q}^{*}QR = R,$$
and by post-multiplying the equality $QR = \widehat{Q}\widehat{R} = \widehat{Q}R$ by $R^{-1}$ we get
$$Q = \widehat{Q}.$$
Thus, the two matrices involved in the QR decomposition are unique.
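A practical remark on uniqueness: numerical libraries do not always return the positive-diagonal factorization directly. For instance, numpy.linalg.qr makes no sign guarantee for the diagonal of $R$, but flipping the signs of the offending column/row pairs recovers the unique decomposition characterized above. A small sketch (the matrix is ours, for illustration):

import numpy as np

A = np.array([[1., 2.],
              [3., 4.],
              [5., 6.]])

Q, R = np.linalg.qr(A)             # reduced QR: Q is 3 x 2, R is 2 x 2
s = np.sign(np.diag(R))            # -1 wherever a diagonal entry is negative
Q, R = Q * s, s[:, None] * R       # flip a column of Q and a row of R in pairs

print(np.all(np.diag(R) > 0))      # True: diagonal now strictly positive
print(np.allclose(Q @ R, A))       # True: the product is unchanged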
An important fact that we have discussed in the previous proof, but have not separately stated until now, is that the matrix $Q$ in the decomposition is such that
$$Q^{*}Q = I,$$
where $Q^{*}$ is the conjugate transpose of $Q$. As a consequence, pre-multiplying $A = QR$ by $Q^{*}$ gives
$$R = Q^{*}A.$$
If $Q$ has only real entries, then the conjugate transpose coincides with the transpose and the two equations above become
$$Q^{\top}Q = I$$
and
$$R = Q^{\top}A.$$
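For a real matrix these identities are straightforward to check numerically; a quick sketch with NumPy (the matrix is chosen arbitrarily for the illustration):

import numpy as np

A = np.array([[1., 0.],
              [1., 1.],
              [0., 1.]])
Q, R = np.linalg.qr(A)

print(np.allclose(Q.T @ Q, np.eye(2)))  # True: Q'Q = I
print(np.allclose(Q.T @ A, R))          # True: R = Q'A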
When the matrix $A$ being decomposed is a square $K\times K$ matrix, then
$$A = QR,$$
where $Q$ and $R$ are both square $K\times K$ matrices. But a square matrix having orthonormal columns is a unitary matrix. Therefore, the QR decomposition of a square matrix having linearly independent columns is the product of a unitary matrix and an upper triangular matrix with strictly positive diagonal entries.
The QR method is often used to estimate linear regressions. In a linear regression we have an $N\times 1$ vector $y$ of outputs and an $N\times K$ matrix $X$ of inputs whose columns are assumed to be linearly independent. We need to find the $K\times 1$ coefficient vector $b$ that minimizes the mean squared error made by using the fitted values $Xb$ to predict the actual values $y$. The well-known solution to this problem is the so-called ordinary least squares (OLS) estimator
$$b = \left(X^{\top}X\right)^{-1}X^{\top}y.$$
We can simplify the formula for the OLS estimator, avoid inverting a matrix, and thus reduce the computational burden (and the possible numerical instabilities) by computing the QR decomposition of $X$:
$$X = QR,$$
where $Q$ is $N\times K$ and $R$ is $K\times K$. Then, the OLS estimator becomes
$$b = \left(R^{\top}Q^{\top}QR\right)^{-1}R^{\top}Q^{\top}y = \left(R^{\top}R\right)^{-1}R^{\top}Q^{\top}y = R^{-1}Q^{\top}y,$$
or
$$Rb = Q^{\top}y.$$
The latter way of writing the solution is more convenient: since $R$ is upper triangular, we do not need to invert it, but we can use the back-substitution algorithm to find the solution $b$.
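Below is a sketch of this procedure in NumPy, on simulated data (the design matrix, coefficients, and noise level are made up for the illustration). Back-substitution is written out explicitly to mirror the discussion above; in practice one would call a library routine such as scipy.linalg.solve_triangular.

import numpy as np

def back_substitution(R, c):
    """Solve R b = c for b, where R is upper triangular with nonzero diagonal."""
    K = R.shape[0]
    b = np.zeros(K)
    for i in range(K - 1, -1, -1):
        b[i] = (c[i] - R[i, i + 1:] @ b[i + 1:]) / R[i, i]
    return b

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))                       # N x K inputs
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

Q, R = np.linalg.qr(X)                              # X = QR, Q is N x K
b = back_substitution(R, Q.T @ y)                   # solve R b = Q'y

# Same result as the textbook OLS formula, without inverting X'X.
print(np.allclose(b, np.linalg.solve(X.T @ X, X.T @ y)))  # True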
Below you can find some exercises with explained solutions.
Exercise
Compute the QR decomposition of a matrix $A$ with two linearly independent columns $A_1$ and $A_2$.

Solution
The norm of the first column of $A$ is $\|A_1\|$. Thus, the first orthonormal vector is
$$u_1 = \frac{1}{\|A_1\|}A_1.$$
The inner product between the second column of $A$ and $u_1$ is $\langle A_2, u_1 \rangle$. The projection of $A_2$ on $u_1$ is $\langle A_2, u_1 \rangle u_1$, and the residual of the projection is
$$\varepsilon_2 = A_2 - \langle A_2, u_1 \rangle u_1.$$
The norm of the residual is $\|\varepsilon_2\|$, and the second orthonormal vector is
$$u_2 = \frac{1}{\|\varepsilon_2\|}\varepsilon_2.$$
Thus, the QR decomposition is given by
$$Q = \begin{bmatrix} u_1 & u_2 \end{bmatrix}
\qquad\text{and}\qquad
R = \begin{bmatrix} \|A_1\| & \langle A_2, u_1 \rangle \\ 0 & \|\varepsilon_2\| \end{bmatrix}.$$
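As a concrete check, here is the same computation in exact arithmetic with SymPy, on the illustrative matrix used in the example above (again, the matrix is ours, not taken from the original exercise); SymPy's QRdecomposition keeps the entries as exact fractions:

from sympy import Matrix

A = Matrix([[3, 1],
            [4, 2]])
Q, R = A.QRdecomposition()

print(Q)           # expected: Matrix([[3/5, -4/5], [4/5, 3/5]])
print(R)           # expected: Matrix([[5, 11/5], [0, 2/5]])
print(Q * R == A)  # True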
Please cite as:
Taboga, Marco (2021). "QR decomposition", Lectures on matrix algebra. https://www.statlect.com/matrix-algebra/QR-decomposition.