Gram-Schmidt Orthogonalisation Process

Let $ V$ be a finite dimensional inner product space. Suppose $ {\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n$ is a linearly independent subset of $ V.$ Then the Gram-Schmidt orthogonalisation process uses the vectors $ {\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n$ to construct new vectors $ {\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_n$ such that $ \langle {\mathbf v}_i, {\mathbf v}_j
\rangle = 0$ for $ i \neq j,$ $ \Vert {\mathbf v}_i \Vert = 1$ and $ {\mbox{Span }}
\{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_i \}
= {\mbox{Span }} \{{\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_i \}$ for $ i=1,2,\ldots,n.$ This process proceeds with the following idea.

Figure 5.1: Gram-Schmidt Process
\includegraphics[scale=1]{gramschmidt.eps}

Suppose we are given two vectors $ {\mathbf u}$ and $ {\mathbf v}$ in a plane. If we want to get vectors $ {\mathbf z}$ and $ {\mathbf y}$ such that $ {\mathbf z}$ is a unit vector in the direction of $ {\mathbf u}$ and $ {\mathbf y}$ is a unit vector perpendicular to $ {\mathbf z},$ then they can be obtained in the following way:
Take the first vector $ {\mathbf z}= \displaystyle \frac{{\mathbf u}}{ \Vert {\mathbf u}\Vert}.$ Let $ \theta$ be the angle between the vectors $ {\mathbf u}$ and $ {\mathbf v}.$ Then $ \cos(\theta) = \displaystyle \frac{\langle {\mathbf u}, {\mathbf v}\rangle}{\Vert {\mathbf u}\Vert \; \Vert {\mathbf v}\Vert }.$ Define $ \alpha = \Vert {\mathbf v}\Vert \; \cos(\theta) = \displaystyle\frac{\langle {\mathbf u}, {\mathbf v}\rangle}{ \Vert {\mathbf u}\Vert} = \langle {\mathbf z}, {\mathbf v}\rangle .$ Then $ {\mathbf w}= {\mathbf v}- \alpha \; {\mathbf z}$ is a vector perpendicular to the unit vector $ {\mathbf z},$ as we have removed the component of $ {\mathbf z}$ from $ {\mathbf v}$ . So, the vectors that we are interested in are $ {\mathbf z}$ and $ {\mathbf y}= \displaystyle \frac{{\mathbf w}}{\Vert {\mathbf w}\Vert}.$
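
For readers who wish to experiment, the following minimal sketch carries out exactly this computation for the standard inner product on the plane, assuming Python with the numpy library; the variable names mirror the text and nothing here is part of the formal development.

# Sketch of the two-vector idea above for the standard inner product on the
# plane (Python with numpy assumed; variable names mirror the text).
import numpy as np

u = np.array([3.0, 1.0])
v = np.array([1.0, 2.0])

z = u / np.linalg.norm(u)            # unit vector in the direction of u
alpha = np.dot(z, v)                 # alpha = <z, v> = ||v|| cos(theta)
w = v - alpha * z                    # remove the component of v along z
y = w / np.linalg.norm(w)            # unit vector perpendicular to z

print(round(np.dot(z, y), 10))                # 0.0 (z and y are orthogonal)
print(np.linalg.norm(z), np.linalg.norm(y))   # both equal 1.0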

This idea is used to give the Gram-Schmidt Orthogonalisation process which we now describe.

THEOREM 5.2.1 (Gram-Schmidt Orthogonalisation Process)   Let $ V$ be an inner product space. Suppose $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n\} $ is a set of linearly independent vectors of $ V.$ Then there exists a set $ \{{\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_n \}$ of vectors of $ V$ satisfying the following:
  1. $ \Vert {\mathbf v}_i \Vert = 1$ for $ 1 \leq i \leq n,$
  2. $ \langle {\mathbf v}_i, {\mathbf v}_j
\rangle = 0$ for $ 1 \leq i, j \leq n, \; i \ne j$ and
  3. $ L ({\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_i) = L ( {\mathbf u}_1, {\mathbf u}_2,
\ldots, {\mathbf u}_i)$ for $ 1 \leq i \leq n.$

Proof. We successively define the vectors $ {\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_n$ as follows.
$ {\mathbf v}_1 = \displaystyle\frac{ {\mathbf u}_1}{\Vert {\mathbf u}_1 \Vert }.$
Calculate $ {\mathbf w}_2 = {\mathbf u}_2 - \langle {\mathbf u}_2, {\mathbf v}_1 \rangle {\mathbf v}_1,$ and let $ {\mathbf v}_2 = \displaystyle\frac{{\mathbf w}_2}{ \Vert {\mathbf w}_2 \Vert}.$
Obtain $ {\mathbf w}_3 = {\mathbf u}_3 - \langle {\mathbf u}_3, {\mathbf v}_1 \rangle {\mathbf v}_1 - \langle {\mathbf u}_3,
{\mathbf v}_2 \rangle {\mathbf v}_2,$ and let $ {\mathbf v}_3 = \displaystyle\frac{{\mathbf w}_3}{ \Vert
{\mathbf w}_3 \Vert}.$
In general, if $ {\mathbf v}_1, {\mathbf v}_2, {\mathbf v}_3, {\mathbf v}_4, \ldots, {\mathbf v}_{i-1}$ are already obtained, we compute

$\displaystyle {\mathbf w}_i = {\mathbf u}_i - \langle {\mathbf u}_i, {\mathbf v}_1 \rangle {\mathbf v}_1 - \langle {\mathbf u}_i, {\mathbf v}_2 \rangle {\mathbf v}_2 - \cdots - \langle {\mathbf u}_i, {\mathbf v}_{i-1} \rangle {\mathbf v}_{i-1},$ (5.2.1)

and define

$\displaystyle {\mathbf v}_i= \frac{{\mathbf w}_i}{ \Vert {\mathbf w}_i \Vert}. $

We prove the theorem by induction on $ n,$ the number of linearly independent vectors.

For $ n = 1,$ we have $ {\mathbf v}_1 = \displaystyle\frac{ {\mathbf u}_1}{\Vert {\mathbf u}_1 \Vert }.$ Since $ {\mathbf u}_1 \neq {\mathbf 0}, \; {\mathbf v}_1 \neq {\mathbf 0}$ and

$\displaystyle \Vert{\mathbf v}_1\Vert^2 = \langle {\mathbf v}_1, {\mathbf v}_1 \rangle = \left\langle \frac{{\mathbf u}_1}{\Vert{\mathbf u}_1\Vert}, \frac{{\mathbf u}_1}{\Vert{\mathbf u}_1\Vert} \right\rangle = \frac{ \langle {\mathbf u}_1, {\mathbf u}_1 \rangle}{\Vert{\mathbf u}_1\Vert^2} = 1.$

Hence, the result holds for $ n=1.$

Let the result hold for all $ k \leq n-1.$ That is, suppose we are given any set of $ k, \; 1 \leq k \leq n-1$ linearly independent vectors $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_k\}$ of $ V.$ Then by the inductive assumption, there exists a set $ \{{\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_k\}$ of vectors satisfying the following:

  1. $ \Vert {\mathbf v}_i \Vert = 1$ for $ 1 \leq i \leq k,$
  2. $ \langle {\mathbf v}_i, {\mathbf v}_j
\rangle = 0$ for $ 1 \leq i \neq j \leq k,$ and
  3. $ L ({\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_i) = L ( {\mathbf u}_1, {\mathbf u}_2,
\ldots, {\mathbf u}_i)$ for $ 1 \leq i \leq k.$

Now, let us assume that we are given a set of $ n$ linearly independent vectors $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n\} $ of $ V.$ Then by the inductive assumption, we already have vectors $ {\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_{n-1}$ satisfying

  1. $ \Vert {\mathbf v}_i \Vert = 1$ for $ 1 \leq i \leq n-1,$
  2. $ \langle {\mathbf v}_i, {\mathbf v}_j
\rangle = 0$ for $ 1 \leq i \neq j \leq n-1,$ and
  3. $ L ({\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_i) = L ( {\mathbf u}_1, {\mathbf u}_2,
\ldots, {\mathbf u}_i)$ for $ 1 \leq i \leq n-1.$
Using (5.2.1), we define

$\displaystyle {\mathbf w}_n = {\mathbf u}_n - \langle {\mathbf u}_n, {\mathbf v}_1 \rangle {\mathbf v}_1 - \langle {\mathbf u}_n, {\mathbf v}_2 \rangle {\mathbf v}_2 - \cdots - \langle {\mathbf u}_n, {\mathbf v}_{n-1} \rangle {\mathbf v}_{n-1}.$ (5.2.2)

We first show that $ {\mathbf w}_n \not\in L({\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_{n-1})$ . This will also imply that $ {\mathbf w}_n \neq {\mathbf 0}$ and hence $ {\mathbf v}_n = \displaystyle\frac{{\mathbf w}_n}{\Vert {\mathbf w}_n \Vert}$ is well defined.

On the contrary, assume that $ {\mathbf w}_n \in L({\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_{n-1}).$ Then there exist scalars $ {\alpha}_1, {\alpha}_2, \ldots, {\alpha}_{n-1}$ such that

$\displaystyle {\mathbf w}_n = {\alpha}_1{\mathbf v}_1 + {\alpha}_2 {\mathbf v}_2 + \cdots + {\alpha}_{n-1} {\mathbf v}_{n-1}.$

So, by (5.2.2)

$\displaystyle {\mathbf u}_n = \bigl({\alpha}_1 + \langle {\mathbf u}_n, {\mathbf v}_1 \rangle\bigr) {\mathbf v}_1 + \bigl({\alpha}_2 + \langle {\mathbf u}_n, {\mathbf v}_2 \rangle\bigr) {\mathbf v}_2 + \cdots + \bigl({\alpha}_{n-1} + \langle {\mathbf u}_n, {\mathbf v}_{n-1} \rangle\bigr) {\mathbf v}_{n-1}.$

Thus, by the third induction assumption,

$\displaystyle {\mathbf u}_n \in L({\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_{n-1}) = L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_{n-1}).$

This gives a contradiction to the given assumption that the set of vectors $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n\} $ is linearly independent.

So, $ {\mathbf w}_n \neq {\mathbf 0}$ . Define $ {\mathbf v}_n = \displaystyle\frac{{\mathbf w}_n}{\Vert {\mathbf w}_n \Vert}$ . Then $ \Vert {\mathbf v}_n \Vert = 1$ . Also, for $ 1 \le i \le n-1$ , using (5.2.2) and the orthonormality of $ {\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_{n-1},$ we get $ \langle {\mathbf w}_n, {\mathbf v}_i \rangle = \langle {\mathbf u}_n, {\mathbf v}_i \rangle - \langle {\mathbf u}_n, {\mathbf v}_i \rangle \Vert {\mathbf v}_i \Vert^2 = 0,$ so that $ \langle {\mathbf v}_n, {\mathbf v}_i \rangle = 0$ . Hence, by the principle of mathematical induction, the proof of the theorem is complete. $ \Box$
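
The construction (5.2.1) used in the proof translates directly into an algorithm. The following is a minimal sketch for the standard inner product on $ {\mathbb{R}}^n,$ assuming Python with the numpy library; the function name gram_schmidt is ours and not part of the text.

# Sketch of the construction (5.2.1) for the standard inner product on R^n
# (Python with numpy assumed; the function name gram_schmidt is ours).
import numpy as np

def gram_schmidt(us):
    """Given linearly independent vectors u_1, ..., u_n (the rows of `us`),
    return orthonormal vectors v_1, ..., v_n with the same successive spans."""
    vs = []
    for u in us:
        w = np.array(u, dtype=float)
        for v in vs:                       # w_i = u_i - sum_j <u_i, v_j> v_j
            w -= np.dot(u, v) * v
        vs.append(w / np.linalg.norm(w))   # v_i = w_i / ||w_i||
    return np.array(vs)

# A quick check: the rows of the output are orthonormal.
us = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
vs = gram_schmidt(us)
print(np.round(vs @ vs.T, 10))             # the 3 x 3 identity matrix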

We illustrate the Gram-Schmidt process by the following example.

EXAMPLE 5.2.2   Let $ \{(1,-1,1,1), (1,0,1,0), (0,1,0,1) \}$ be a linearly independent set in $ {\mathbb{R}}^4({\mathbb{R}}).$ Find an orthonormal set $ \{{\mathbf v}_1, {\mathbf v}_2, {\mathbf v}_3\}$ such that $ L(\; (1,-1,1,1), (1,0,1,0), (0,1,0,1)\; ) = L( {\mathbf v}_1, {\mathbf v}_2, {\mathbf v}_3 ).$
Solution: Let $ {\mathbf u}_1 = (1,0,1,0).$ Define $ {\mathbf v}_1 = \displaystyle
\frac{(1,0,1,0)}{\sqrt{2}}.$ Let $ {\mathbf u}_2 = (0,1,0,1).$ Then

$\displaystyle {\mathbf w}_2 = (0,1,0,1) - \langle (0,1,0,1),
\displaystyle\frac{(1,0,1,0)}{\sqrt{2}} \rangle {\mathbf v}_1 = (0,1,0,1).$

Hence, $ {\mathbf v}_2 = \displaystyle\frac{ (0,1,0,1)}{\sqrt{2}}.$ Let $ {\mathbf u}_3 = (1,-1,1,1).$ Then
$\displaystyle {\mathbf w}_3 = (1,-1,1,1) - \langle (1,-1,1,1), \frac{(1,0,1,0)}{\sqrt{2}} \rangle {\mathbf v}_1 - \langle (1,-1,1,1), \frac{(0,1,0,1)}{\sqrt{2}} \rangle {\mathbf v}_2 = (0,-1,0,1)$

and $ {\mathbf v}_3 = \displaystyle\frac{(0,-1,0,1)}{ \sqrt{2}}.$
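
A quick numerical check of this example (a sketch assuming Python with numpy and the standard inner product on $ {\mathbb{R}}^4$ ):

# Numerical check of Example 5.2.2 (Python with numpy assumed).
import numpy as np

v1 = np.array([1.0, 0.0, 1.0, 0.0]) / np.sqrt(2)
v2 = np.array([0.0, 1.0, 0.0, 1.0]) / np.sqrt(2)
v3 = np.array([0.0, -1.0, 0.0, 1.0]) / np.sqrt(2)
V = np.array([v1, v2, v3])

print(np.round(V @ V.T, 10))    # the 3 x 3 identity: the set is orthonormal

# The spans agree: each original vector is recovered from its projections.
u3 = np.array([1.0, -1.0, 1.0, 1.0])
print(np.allclose(u3, np.dot(u3, v1) * v1 + np.dot(u3, v2) * v2 + np.dot(u3, v3) * v3))  # True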

Remark 5.2.3  
  1. Let $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_k\}$ be any basis of a $ k$ -dimensional subspace $ W$ of $ {\mathbb{R}}^n.$ Then by Gram-Schmidt orthogonalisation process, we get an orthonormal set $ \{{\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_k \} \subset {\mathbb{R}}^n$ with $ W = L ( {\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_k),$ and for $ 1 \leq i \leq k,$

    $\displaystyle L ( {\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_i)= L ( {\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_i).$

  2. Suppose we are given a set of $ n$ vectors, $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n\} $ of $ V$ that are linearly dependent. Then by Corollary 3.2.5, there exists a smallest $ k, \; 2 \leq k \leq n$ such that

    $\displaystyle L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_k) = L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_{k-1}).$

    We claim that in this case, $ {\mathbf w}_k = {\mathbf 0}.$

    Since $ k$ is the smallest index satisfying

    $\displaystyle L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_i) = L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_{i-1}),$

    this equality fails for $ 2 \leq i \leq k-1,$ and so the set $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_{k-1}\}$ is linearly independent (use Corollary 3.2.5). So, by Theorem 5.2.1, there exists an orthonormal set $ \{{\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_{k-1}\}$ such that

    $\displaystyle L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_{k-1}) = L({\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_{k-1}).$

    As $ {\mathbf u}_k \in L({\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_{k-1}),$ by Remark 5.1.15

    $\displaystyle {\mathbf u}_k = \langle {\mathbf u}_k, {\mathbf v}_1 \rangle {\mathbf v}_1 + \langle {\mathbf u}_k, {\mathbf v}_2 \rangle {\mathbf v}_2 + \cdots + \langle {\mathbf u}_k, {\mathbf v}_{k-1} \rangle {\mathbf v}_{k-1}.$

    So, by definition of $ {\mathbf w}_k, \; {\mathbf w}_k = {\mathbf 0}.$

    Therefore, in this case, we can continue with the Gram-Schmidt process by replacing $ {\mathbf u}_k$ by $ {\mathbf u}_{k+1}.$

  3. Let $ S$ be a countably infinite set of linearly independent vectors. Then one can apply the Gram-Schmidt process to get a countably infinite orthonormal set.
  4. Let $ \{{\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_k\}$ be an orthonormal subset of $ {\mathbb{R}}^n.$ Let $ {\cal B}= ({\mathbf e}_1, {\mathbf e}_2, \ldots, {\mathbf e}_n)$ be the standard ordered basis of $ {\mathbb{R}}^n.$ Then there exist real numbers $ {\alpha}_{ji}, \; 1 \leq j \leq n, \; 1 \leq i \leq k$ such that

    $\displaystyle [{\mathbf v}_i]_{{\cal B}} = ({\alpha}_{1i}, {\alpha}_{2i}, \ldots, {\alpha}_{ni})^t.$

    Let $ A = [ {\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_k].$ Then in the ordered basis $ {\cal B},$ we have

    $\displaystyle A = \begin{bmatrix}{\alpha}_{11} & {\alpha}_{12} & \cdots & {\alpha}_{1k} \\ {\alpha}_{21} & {\alpha}_{22} & \cdots & {\alpha}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ {\alpha}_{n1} & {\alpha}_{n2} & \cdots & {\alpha}_{nk} \end{bmatrix}$

    is an $ n \times k$ matrix.

    Also, observe that the conditions $ \Vert {\mathbf v}_i \Vert = 1$ and $ \langle {\mathbf v}_i, {\mathbf v}_j \rangle = 0$ for $ 1 \leq i \neq j \leq k$ imply that

    $\displaystyle \left. \begin{array}{ll} & 1 = \Vert{\mathbf v}_i\Vert^2 = \langle {\mathbf v}_i, {\mathbf v}_i \rangle = \sum\limits_{s=1}^n {\alpha}_{s i}^2, \\ & 0 = \langle {\mathbf v}_i, {\mathbf v}_j \rangle = \sum\limits_{s=1}^n {\alpha}_{s i}{\alpha}_{s j}. \end{array}\right\}$ (5.2.3)

    Note that,

    $\displaystyle A^t A = \begin{bmatrix}{\mathbf v}_1^t \\ {\mathbf v}_2^t \\ \vdots \\ {\mathbf v}_k^t \end{bmatrix} \bigl[{\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_k \bigr] = \begin{bmatrix}\Vert{\mathbf v}_1\Vert^2 & \langle {\mathbf v}_1, {\mathbf v}_2\rangle & \cdots & \langle {\mathbf v}_1, {\mathbf v}_k\rangle \\ \langle {\mathbf v}_2, {\mathbf v}_1\rangle & \Vert{\mathbf v}_2\Vert^2 & \cdots & \langle {\mathbf v}_2, {\mathbf v}_k\rangle \\ \vdots & \vdots & \ddots & \vdots \\ \langle {\mathbf v}_k, {\mathbf v}_1\rangle & \langle {\mathbf v}_k, {\mathbf v}_2\rangle & \cdots & \Vert{\mathbf v}_k\Vert^2 \end{bmatrix} = \begin{bmatrix}1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} = I_k.$

    Or using (5.2.3), in the language of matrices, we get

    $\displaystyle A^t A = \begin{bmatrix}{\alpha}_{11} & {\alpha}_{21} & \cdots & {\alpha}_{n1} \\ {\alpha}_{12} & {\alpha}_{22} & \cdots & {\alpha}_{n2} \\ \vdots & \vdots & \ddots & \vdots \\ {\alpha}_{1k} & {\alpha}_{2k} & \cdots & {\alpha}_{nk} \end{bmatrix} \begin{bmatrix}{\alpha}_{11} & {\alpha}_{12} & \cdots & {\alpha}_{1k} \\ {\alpha}_{21} & {\alpha}_{22} & \cdots & {\alpha}_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ {\alpha}_{n1} & {\alpha}_{n2} & \cdots & {\alpha}_{nk} \end{bmatrix} = I_k.$
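
As a numerical illustration of the identity $ A^t A = I_k$ (a sketch assuming Python with numpy, with the orthonormal vectors of Example 5.2.2 as the columns of $ A,$ so that $ n = 4$ and $ k = 3$ ; note that $ A A^t \neq I_n$ when $ k < n$ ):

# Illustration of A^t A = I_k (Python with numpy assumed). The columns of A
# are the orthonormal vectors v_1, v_2, v_3 of Example 5.2.2, so n = 4, k = 3.
import numpy as np

s = 1.0 / np.sqrt(2)
A = np.array([[s, 0.0, 0.0],
              [0.0, s, -s],
              [s, 0.0, 0.0],
              [0.0, s, s]])

print(np.round(A.T @ A, 10))    # I_3, in agreement with (5.2.3)
print(np.round(A @ A.T, 10))    # not I_4: A is n x k with k < n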

The readers may have noticed that when $ k = n,$ the matrix $ A$ is square and its inverse is its transpose. Such matrices are called orthogonal matrices and they have a special role to play.

DEFINITION 5.2.4 (Orthogonal Matrix)   An $ n \times n$ real matrix $ A$ is said to be an orthogonal matrix if $ A \;A^{t} = A^{t} A = I_n.$

It is worthwhile to solve the following exercises.

EXERCISE 5.2.5  
  1. Let $ A$ and $ B$ be two $ n \times n$ orthogonal matrices. Then prove that $ A B$ and $ B A$ are both orthogonal matrices.
  2. Let $ A$ be an $ n \times n$ orthogonal matrix. Then prove that
    1. the rows of $ A$ form an orthonormal basis of $ {\mathbb{R}}^n.$
    2. the columns of $ A$ form an orthonormal basis of $ {\mathbb{R}}^n.$
    3. for any two vectors $ {\mathbf x}, {\mathbf y}\in {\mathbb{R}}^{n \times 1},\;$ $ \langle A {\mathbf x}, A {\mathbf y}\rangle = \langle {\mathbf x}, {\mathbf y}\rangle.$
    4. for any vector $ {\mathbf x}\in {\mathbb{R}}^{n \times 1},\; $ $ \Vert A {\mathbf x}\Vert = \Vert {\mathbf x}\Vert.$
  3. Let $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n\} $ be an orthonormal basis of $ {\mathbb{R}}^n.$ Let $ {\cal B}= ({\mathbf e}_1, {\mathbf e}_2, \ldots, {\mathbf e}_n)$ be the standard basis of $ {\mathbb{R}}^n.$ Construct an $ n \times n$ matrix $ A$ by

    $\displaystyle A = [{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n] = \begin{bmatrix}a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$

    where

    $\displaystyle {\mathbf u}_i = \sum\limits_{j=1}^n a_{ji} {\mathbf e}_j, \; {\mbox{ for }} \;
1 \leq i \leq n.$

    Prove that $ A^t A = I_n.$ Hence deduce that $ A$ is an orthogonal matrix.
  4. Let $ A$ be an $ n \times n$ upper triangular matrix with positive diagonal entries. If $ A$ is also an orthogonal matrix, then prove that $ A = I_n.$

THEOREM 5.2.6 (QR Decomposition)   Let $ A$ be a square matrix of order $ n.$ Then there exist matrices $ Q$ and $ R$ such that $ Q$ is orthogonal and $ R$ is upper triangular with $ A = Q R.$

In case $ A$ is non-singular, the diagonal entries of $ R$ can be chosen to be positive. Also, in this case, the decomposition is unique.

Proof. We prove the theorem when $ A$ is non-singular. The proof for the singular case is left as an exercise.

Let the columns of $ A$ be $ {\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_n.$ The Gram-Schmidt orthogonalisation process applied to the vectors $ {\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_n$ gives the vectors $ {\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n$ satisfying

$\displaystyle \left.\begin{array}{cc} & L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_i) = L({\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_i), \\ & \Vert {\mathbf u}_i \Vert = 1, \;\; \langle {\mathbf u}_i, {\mathbf u}_j \rangle = 0, \end{array} \right\} \; {\mbox{ for }} 1 \leq i \neq j \leq n.$ (5.2.4)

Now, consider the ordered basis $ {\cal B}= ( {\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n).$ From (5.2.4), for $ 1 \leq i \leq n,$ we have $ L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_i) = L({\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_i).$ So, we can find scalars $ {\alpha}_{ji}, 1 \leq j \leq i$ such that

$\displaystyle {\mathbf x}_i = {\alpha}_{1i} {\mathbf u}_1 + {\alpha}_{2i} {\mathbf u}_2 + \cdots + {\alpha}_{ii} {\mathbf u}_i = \bigl[({\alpha}_{1i}, \ldots, {\alpha}_{ii}, 0, \ldots, 0)^t \bigr]_{{\cal B}}.$ (5.2.5)

Let $ Q = [{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n].$ Then by Exercise 5.2.5.3, $ Q$ is an orthogonal matrix. We now define an $ n \times n$ upper triangular matrix $ R$ by

$\displaystyle R = \begin{bmatrix}{\alpha}_{11} & {\alpha}_{12} & \cdots & {\alpha}_{1n} \\ 0 & {\alpha}_{22} & \cdots & {\alpha}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & {\alpha}_{nn} \end{bmatrix}.$

By using (5.2.5), we get
$\displaystyle Q R = [{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n] \begin{bmatrix}{\alpha}_{11} & {\alpha}_{12} & \cdots & {\alpha}_{1n} \\ 0 & {\alpha}_{22} & \cdots & {\alpha}_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & {\alpha}_{nn} \end{bmatrix} = \biggl[ {\alpha}_{11} {\mathbf u}_1, \; {\alpha}_{12} {\mathbf u}_1 + {\alpha}_{22}{\mathbf u}_2, \; \ldots, \; \sum_{i=1}^n {\alpha}_{in}{\mathbf u}_i \biggr] = [{\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_n] = A.$

Thus, we see that $ A = QR,$ where $ Q$ is an orthogonal matrix (see Remark 5.2.3.4) and $ R$ is an upper triangular matrix.

The proof does not guarantee that $ {\alpha}_{ii}$ is positive for $ 1 \leq i \leq n.$ But this can be achieved by replacing the vector $ {\mathbf u}_i$ by $ -{\mathbf u}_i$ (and, correspondingly, the $ i$ -th row of $ R$ by its negative) whenever $ {\alpha}_{ii}$ is negative.

Uniqueness: suppose $ A = Q_1 R_1 = Q_2 R_2,$ where the diagonal entries of $ R_1$ and $ R_2$ are positive. Then $ Q_2^{-1} Q_1 = R_2 R_1^{-1}.$ Observe the following properties of upper triangular matrices.

  1. The inverse of an upper triangular matrix is also an upper triangular matrix, and
  2. the product of two upper triangular matrices is also upper triangular.
Thus the matrix $ R_2 R_1^{-1}$ is an upper triangular matrix with positive diagonal entries. Also, by Exercise 5.2.5.1, the matrix $ Q_2^{-1} Q_1$ is an orthogonal matrix. Hence, by Exercise 5.2.5.4, $ R_2 R_1^{-1} = I_n.$ So, $ R_2 = R_1$ and therefore $ Q_2 = Q_1.$ $ \Box$
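
The proof above is constructive: the Gram-Schmidt process applied to the columns of $ A$ produces the columns of $ Q,$ and the coefficients $ {\alpha}_{ji}$ fill the upper triangular matrix $ R.$ The following is a minimal sketch of this construction for a non-singular real matrix, assuming Python with numpy; the function name qr_by_gram_schmidt is ours.

# Sketch of the QR decomposition via Gram-Schmidt for a non-singular real
# matrix (Python with numpy assumed; the function name is ours).
import numpy as np

def qr_by_gram_schmidt(A):
    n = A.shape[0]
    Q = np.zeros((n, n))
    R = np.zeros((n, n))
    for i in range(n):
        w = np.array(A[:, i], dtype=float)
        for j in range(i):
            R[j, i] = np.dot(A[:, i], Q[:, j])   # alpha_{ji} = <x_i, u_j>
            w -= R[j, i] * Q[:, j]
        R[i, i] = np.linalg.norm(w)              # a positive diagonal entry
        Q[:, i] = w / R[i, i]
    return Q, R

A = np.array([[2.0, 1.0], [1.0, 3.0]])
Q, R = qr_by_gram_schmidt(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(2)))   # True True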





Suppose we have a matrix $ A = [{\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_k]$ of dimension $ n \times k$ with $ {\mbox{rank }} (A) = r.$ Then by Remark 5.2.3.2, the application of the Gram-Schmidt orthogonalisation process yields a set $ \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_r\}$ of orthonormal vectors of $ {\mathbb{R}}^n.$ In this case, for each $ i, \; 1 \leq i \leq r,$ we have

$\displaystyle L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_i) = L({\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_j), \; {\mbox{ for some }} \; j, \;\; i \leq j \leq k.$

Hence, proceeding on the lines of the above theorem, we have the following result.

THEOREM 5.2.7 (Generalised QR Decomposition)   Let $ A$ be an $ n \times k$ matrix of rank $ r.$ Then $ A = QR,$ where
  1. $ Q$ is an $ n \times r$ matrix with $ Q^t Q = I_r.$ That is, the columns of $ Q$ form an orthonormal set,
  2. If $ Q = [{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_r],$ then $ L({\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_r) = L({\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_k),$ and
  3. $ R$ is an $ r \times k$ matrix with $ {\mbox{ rank }}(R) = r.$
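
Along the lines of Remark 5.2.3.2, a column whose Gram-Schmidt residual $ {\mathbf w}$ is zero contributes no new column to $ Q.$ The following is a minimal sketch of such a generalised decomposition, assuming Python with numpy; the function name thin_qr and the tolerance are ours, and $ R$ is obtained here simply as $ Q^t A.$

# Sketch of the generalised QR decomposition of an n x k real matrix of rank r
# (Python with numpy assumed; the function name and the tolerance are ours).
import numpy as np

def thin_qr(A, tol=1e-12):
    q_cols = []
    for i in range(A.shape[1]):
        w = np.array(A[:, i], dtype=float)
        for q in q_cols:                    # remove the directions already found
            w -= np.dot(A[:, i], q) * q
        if np.linalg.norm(w) > tol:         # w != 0: a genuinely new direction
            q_cols.append(w / np.linalg.norm(w))
    Q = np.array(q_cols).T                  # n x r, with Q^t Q = I_r
    R = Q.T @ A                             # r x k, and A = Q R since Q Q^t A = A
    return Q, R

# A 3 x 3 matrix of rank 2 (the third column is the sum of the first two).
A = np.array([[1.0, 1.0, 2.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])
Q, R = thin_qr(A)
print(Q.shape, R.shape)                                          # (3, 2) (2, 3)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(2)))    # True True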

EXAMPLE 5.2.8  
  1. Let $ A = \begin{bmatrix}1 & 0 & 1 & 2 \\ 0 & 1 & -1 & 1 \\
1 & 0 & 1 & 1 \\ 0 & 1 & 1 &1 \end{bmatrix}.$ Find an orthogonal matrix $ Q$ and an upper triangular matrix $ R$ such that $ A = Q R.$
    Solution: From Example 5.2.2, we know that

    $\displaystyle {\mathbf v}_1 = \frac{1}{\sqrt{2}}(1,0,1,0), \; {\mathbf v}_2 = \frac{1}{\sqrt{2}}(0,1,0,1), \; {\mathbf v}_3 = \frac{1}{\sqrt{2}}(0,-1,0,1).$ (5.2.6)

    We now compute $ {\mathbf w}_4.$ If we denote $ {\mathbf u}_4 = (2,1,1,1)^t$ then by the Gram-Schmidt process,

    $\displaystyle {\mathbf w}_4 = {\mathbf u}_4 - \langle {\mathbf u}_4, {\mathbf v}_1\rangle {\mathbf v}_1 - \langle {\mathbf u}_4, {\mathbf v}_2 \rangle {\mathbf v}_2 - \langle {\mathbf u}_4, {\mathbf v}_3 \rangle {\mathbf v}_3 = \frac{1}{2}(1,0,-1,0)^t.$ (5.2.7)

    Thus, $ {\mathbf v}_4 = \displaystyle\frac{{\mathbf w}_4}{\Vert {\mathbf w}_4 \Vert} = \frac{1}{\sqrt{2}}(1,0,-1,0)^t,$ and using (5.2.6) and (5.2.7), we get

    $\displaystyle Q = \bigl[{\mathbf v}_1, {\mathbf v}_2, {\mathbf v}_3, {\mathbf v}_4 \bigr] = \begin{bmatrix}\frac{1}{\sqrt{2}} & 0 & 0 & \frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{-1}{\sqrt{2}} & 0 \\ \frac{1}{\sqrt{2}} & 0 & 0 & \frac{-1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} & 0 \end{bmatrix}$

    and

    $\displaystyle R = \begin{bmatrix}\sqrt{2} & 0 & \sqrt{2} & \frac{3}{\sqrt{2}} \\ 0 & \sqrt{2} & 0 & \sqrt{2} \\ 0 & 0 & \sqrt{2} & 0 \\ 0 & 0 & 0 & \frac{1}{\sqrt{2}} \end{bmatrix}.$

    The readers are advised to check that $ A = QR$ is indeed correct.
  2. Let $ A = \begin{bmatrix}1 & 1 & 1 & 0 \\ -1 & 0 & -2 & 1 \\
1 & 1 & 1 & 0 \\ 1 & 0 & 2 & 1 \end{bmatrix}.$ Find a $ 4 \times 3$ matrix $ Q$ satisfying $ Q^t Q = I_3$ and an upper triangular matrix $ R$ such that $ A = Q R.$
    Solution: Let us apply the Gram-Schmidt orthogonalisation process to the columns of $ A$ , or equivalently, to the rows of $ A^t$ . So, we need to apply the process to the subset $ \{(1,-1,1,1), (1,0,1,0), (1,-2,1,2), (0,1,0,1) \}$ of $ {\mathbb{R}}^4.$

    Let $ {\mathbf u}_1 = (1,-1,1,1).$ Define $ {\mathbf v}_1 = \displaystyle \frac{{\mathbf u}_1}{2}.$ Let $ {\mathbf u}_2 = (1,0,1,0).$ Then

    $\displaystyle {\mathbf w}_2 = (1,0,1,0) - \langle {\mathbf u}_2, {\mathbf v}_1 \rangle {\mathbf v}_1 =
(1,0,1,0) - {\mathbf v}_1 = \frac{1}{2}(1,1,1,-1).$

    Hence, $ {\mathbf v}_2 = \displaystyle\frac{ (1,1,1,-1)}{2}.$ Let $ {\mathbf u}_3 = (1, -2, 1, 2).$ Then

    $\displaystyle {\mathbf w}_3 = {\mathbf u}_3 - \langle {\mathbf u}_3, {\mathbf v}_1 \rangle {\mathbf v}_1 - \langle {\mathbf u}_3, {\mathbf v}_2 \rangle {\mathbf v}_2 = {\mathbf u}_3 - 3 {\mathbf v}_1 + {\mathbf v}_2 = {\mathbf 0}.$

    So, following Remark 5.2.3.2, we now take $ {\mathbf u}_3 = (0,1,0,1).$ Then

    $\displaystyle {\mathbf w}_3 = {\mathbf u}_3 - \langle {\mathbf u}_3, {\mathbf v}_1 \rangle {\mathbf v}_1 - \langle {\mathbf u}_3, {\mathbf v}_2 \rangle {\mathbf v}_2 = {\mathbf u}_3 - 0 \; {\mathbf v}_1 - 0 \; {\mathbf v}_2 = {\mathbf u}_3. $

    So, $ {\mathbf v}_3 = \displaystyle\frac{(0,1,0,1)}{ \sqrt{2}}.$ Hence,

    $\displaystyle Q = [{\mathbf v}_1, {\mathbf v}_2, {\mathbf v}_3] = \begin{bmatrix}\frac{1}{2} & \frac{1}{2} & 0 \\ \frac{-1}{2} & \frac{1}{2} & \frac{1}{\sqrt{2}} \\ \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & \frac{-1}{2} & \frac{1}{\sqrt{2}} \end{bmatrix} \;\; {\mbox{ and }} \;\; R = \begin{bmatrix}2 & 1 & 3 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & \sqrt{2} \end{bmatrix}.$

    The readers are advised to check the following:
    1. $ {\mbox{rank }}(A) = 3,$
    2. $ A = QR$ with $ Q^t Q = I_3$ , and
    3. $ R$ a $ 3 \times 4$ upper triangular matrix with $ {\mbox{rank }}(R) = 3.$
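
The checks suggested in both parts of this example can also be carried out numerically (a sketch assuming Python with numpy, using the matrices $ Q$ and $ R$ displayed above):

# Numerical check of Example 5.2.8 (Python with numpy assumed).
import numpy as np
s = np.sqrt(2)

# Part 1: A = Q R with Q orthogonal and R upper triangular.
A1 = np.array([[1.0, 0, 1, 2], [0, 1, -1, 1], [1, 0, 1, 1], [0, 1, 1, 1]])
Q1 = np.array([[1/s, 0, 0, 1/s], [0, 1/s, -1/s, 0],
               [1/s, 0, 0, -1/s], [0, 1/s, 1/s, 0]])
R1 = np.array([[s, 0, s, 3/s], [0, s, 0, s], [0, 0, s, 0], [0, 0, 0, 1/s]])
print(np.allclose(Q1 @ R1, A1), np.allclose(Q1.T @ Q1, np.eye(4)))  # True True

# Part 2: Q^t Q = I_3, A = Q R, and rank(A) = rank(R) = 3.
A2 = np.array([[1.0, 1, 1, 0], [-1, 0, -2, 1], [1, 1, 1, 0], [1, 0, 2, 1]])
Q2 = np.array([[0.5, 0.5, 0], [-0.5, 0.5, 1/s], [0.5, 0.5, 0], [0.5, -0.5, 1/s]])
R2 = np.array([[2.0, 1, 3, 0], [0, 1, -1, 0], [0, 0, 0, s]])
print(np.allclose(Q2 @ R2, A2), np.allclose(Q2.T @ Q2, np.eye(3)))  # True True
print(np.linalg.matrix_rank(A2), np.linalg.matrix_rank(R2))         # 3 3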

EXERCISE 5.2.9  
  1. Determine an orthonormal basis of $ {\mathbb{R}}^4$ containing the vectors $ (1,-2,1,3)$ and $ (2,1,-3,1).$
  2. Prove that the polynomials $ 1, x, \frac{3}{2}x^2 - \frac{1}{2}, \frac{5}{2} x^3 - \frac{3}{2} x$ form an orthogonal set of functions in the inner product space $ C[-1, 1]$ with the inner product $ \langle f, g \rangle = \int_{-1}^1 f(t) \overline{g(t)} dt.$ Find the corresponding functions $ f(x)$ with $ \Vert f(x)\Vert = 1.$
  3. Consider the vector space $ C[-\pi, \pi]$ with the standard inner product defined in the above exercise. Find an orthonormal basis for the subspace spanned by $ x, \; \sin x$ and $ \sin(x + 1).$
  4. Let $ M$ be a subspace of $ {\mathbb{R}}^n$ and $ \dim M = m.$ A vector $ x \in
{\mathbb{R}}^n$ is said to be orthogonal to $ M$ if $ \langle x, y \rangle
= 0 $ for every $ y \in M.$
    1. How many linearly independent vectors can be orthogonal to $ M ?$
    2. If $ M = \{ (x_1, x_2, x_3) \in {\mathbb{R}}^3 : x_1 + x_2 + x_3 = 0 \},$ determine a maximal set of linearly independent vectors orthogonal to $ M$ in $ {\mathbb{R}}^3.$
  5. Determine an orthogonal basis of the vector subspace spanned by
    $ \{(1,1,0,1), (-1,1,1,-1), (0,2,1,0), (1,0,0,0) \}$ in $ {\mathbb{R}}^4.$
  6. Let $ S = \{(1,1,1,1), (1,2,0,1), (2,2,4,0) \}.$ Find an orthonormal basis of $ L(S)$ in $ {\mathbb{R}}^4.$
  7. Let $ {\mathbb{R}}^n$ be endowed with the standard inner product. Suppose we have a vector $ {\mathbf x}^t = (x_1, x_2, \ldots, x_n) \in {\mathbb{R}}^n,$ with $ \Vert {\mathbf x}\Vert = 1.$ Then prove the following:
    1. the set $ \{ {\mathbf x}\}$ can always be extended to form an orthonormal basis of $ {\mathbb{R}}^n.$
    2. Let this basis be $ \{{\mathbf x}, {\mathbf x}_2, \ldots, {\mathbf x}_n\}.$ Suppose $ {\cal B}= ({\mathbf e}_1, {\mathbf e}_2, \ldots, {\mathbf e}_n)$ is the standard basis of $ {\mathbb{R}}^n.$ Let $ A = \biggl[ [{\mathbf x}]_{{\cal B}}, \; [{\mathbf x}_2]_{{\cal B}}, \; \ldots, \; [{\mathbf x}_n]_{{\cal B}}
\biggr].$ Then prove that $ A$ is an orthogonal matrix.
  8. Let $ {\mathbf v}, {\mathbf w}\in {\mathbb{R}}^n, n \ge 1$ with $ \Vert {\mathbf v}\Vert = \Vert {\mathbf w}\Vert = 1.$ Prove that there exists an orthogonal matrix $ A$ such that $ A {\mathbf v}= {\mathbf w}$ . Prove also that $ A$ can be chosen such that $ \det(A) = 1.$

A K Lal 2007-09-12