Orthogonal Projections and Applications

Recall that given a $ k$-dimensional vector subspace $ W$ of a vector space $ V$ of dimension $ n,$ one can always find an $ (n-k)$-dimensional vector subspace $ W_0$ of $ V$ (see Exercise 3.3.20.9) satisfying

$\displaystyle W + W_0 = V \;\; {\mbox{ and }} \;\; W \cap W_0 = \{ {\mathbf 0}\}.$

The subspace $ W_0$ is called a complementary subspace of $ W$ in $ V.$ Using such a complement, we first define projection operators on an arbitrary vector space; on an inner product space this will lead to an important class of linear transformations called orthogonal projections.

DEFINITION 5.3.1 (Projection Operator)   Let $ V$ be an $ n$ -dimensional vector space and let $ W$ be a $ k$ -dimensional subspace of $ V.$ Let $ W_0$ be a complement of $ W$ in $ V.$ Then we define a map $ P_{W} : V \longrightarrow V$ by

$\displaystyle P_{W}({\mathbf v}) = {\mathbf w}, \; {\mbox{ whenever }} \; {\mathbf v}= {\mathbf w}+ {\mathbf w}_0, \; {\mathbf w}\in W, \;
{\mathbf w}_0 \in W_0.$

The map $ P_W$ is called the projection of $ V$ onto $ W$ along $ W_0.$

Remark 5.3.2   The map $ P_W$ is well defined due to the following reasons:
  1. $ W + W_0 = V$ implies that for every $ {\mathbf v}\in V,$ we can find $ {\mathbf w}\in W$ and $ {\mathbf w}_0 \in W_0$ such that $ {\mathbf v}= {\mathbf w}+ {\mathbf w}_0.$
  2. $ W \cap W_0 = \{{\mathbf 0}\}$ implies that the decomposition $ {\mathbf v}= {\mathbf w}+ {\mathbf w}_0$ is unique for every $ {\mathbf v}\in V.$ (A computational sketch of this decomposition follows this remark.)
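
The decomposition $ {\mathbf v}= {\mathbf w}+ {\mathbf w}_0$ can also be computed numerically: stack basis vectors of $ W$ and $ W_0$ as the columns of an invertible matrix, solve for the coordinates of $ {\mathbf v},$ and keep only the part contributed by $ W.$ The following Python/NumPy sketch illustrates this; the function name project_along and the chosen bases are illustrative, not part of the text.

\begin{verbatim}
import numpy as np

def project_along(v, basis_W, basis_W0):
    """Return P_W(v), the projection of v onto W along W_0.

    Since W + W_0 = V and W intersect W_0 = {0}, the stacked basis
    columns form an invertible matrix, so the coordinates are unique.
    """
    B = np.column_stack(basis_W + basis_W0)       # basis of V adapted to (W, W_0)
    coords = np.linalg.solve(B, v)                # v = B @ coords, uniquely
    k = len(basis_W)
    return np.column_stack(basis_W) @ coords[:k]  # keep only the W-part

# Example 5.3.4.1 below: W = {x + y - z = 0}, W_0 = span{(1, 2, 2)}
basis_W  = [np.array([1., 0., 1.]), np.array([0., 1., 1.])]
basis_W0 = [np.array([1., 2., 2.])]
v = np.array([3., 1., 2.])
print(project_along(v, basis_W, basis_W0))        # (z-y, 2z-2x-y, 3z-2x-2y) = (1, -3, -2)
\end{verbatim}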

The next proposition states that the map defined above is a linear transformation from $ V$ to $ V.$ We omit the proof, as it follows directly from the above remarks.

PROPOSITION 5.3.3   The map $ P_W: V \longrightarrow V$ defined above is a linear transformation.

EXAMPLE 5.3.4   Let $ V = {\mathbb{R}}^3$ and $ W = \{ (x,y,z) \in {\mathbb{R}}^3 : x + y - z = 0 \}.$
  1. Let $ W_0 = L( \; (1,2,2) \;).$ Then $ W \cap W_0 = \{{\mathbf 0}\}$ and $ W + W_0 = {\mathbb{R}}^3.$ Also, for any vector $ (x,y,z)\in {\mathbb{R}}^3,$ note that $ (x,y,z) = {\mathbf w}+ {\mathbf w}_0, $ where

    $\displaystyle {\mathbf w}= (z-y, 2z - 2x - y, 3z - 2x - 2y), \;{\mbox{ and }} \;
{\mathbf w}_0 = (x+y-z)(1,2,2).$

    So, by definition,

    $\displaystyle P_W ( (x,y,z) ) = (z-y, 2z - 2x - y, 3z - 2x - 2y) =
\begin{bmatrix} 0 & -1 & 1 \\ -2 & -1 & 2 \\ -2 & -2 & 3 \end{bmatrix}\begin{bmatrix}x \\ y \\ z \end{bmatrix}.$

  2. Let $ W_0 = L( \; (1,1,1) \;).$ Then $ W \cap W_0 = \{{\mathbf 0}\}$ and $ W + W_0 = {\mathbb{R}}^3.$ Also, for any vector $ (x,y,z)\in {\mathbb{R}}^3,$ note that $ (x,y,z) = {\mathbf w}+ {\mathbf w}_0, $ where

    $\displaystyle {\mathbf w}= (z-y, z - x, 2z - x - y), \;{\mbox{ and }} \;
{\mathbf w}_0 = (x+y-z)(1,1,1).$

    So, by definition,

    $\displaystyle P_W (\; (x,y,z) \;) = (z-y, z - x, 2z - x - y) =
\begin{bmatrix} 0 & -1 & 1 \\ -1 & 0 & 1 \\ -1 & -1 & 2 \end{bmatrix}\begin{bmatrix}x \\ y \\ z \end{bmatrix}.$
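
The two matrices above are easy to check numerically. The short Python/NumPy sketch below (illustrative only, not part of the text) verifies that each matrix is idempotent, vanishes on its complementary subspace $ W_0,$ and fixes every vector of $ W.$

\begin{verbatim}
import numpy as np

# Projection matrices of Example 5.3.4: same W, two different complements W_0.
P1 = np.array([[0., -1., 1.], [-2., -1., 2.], [-2., -2., 3.]])   # W_0 = span{(1,2,2)}
P2 = np.array([[0., -1., 1.], [-1.,  0., 1.], [-1., -1., 2.]])   # W_0 = span{(1,1,1)}

for P, w0 in [(P1, np.array([1., 2., 2.])), (P2, np.array([1., 1., 1.]))]:
    assert np.allclose(P @ P, P)          # P_W^2 = P_W
    assert np.allclose(P @ w0, 0)         # P_W vanishes on W_0
    w = np.array([1., 0., 1.])            # a vector of W, since 1 + 0 - 1 = 0
    assert np.allclose(P @ w, w)          # P_W fixes every vector of W
print("both matrices project onto W")
\end{verbatim}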

Remark 5.3.5  
  1. The projection map $ P_W$ depends on the complementary subspace $ W_0.$
  2. Observe that for a fixed subspace $ W,$ there are infinitely many choices for the complementary subspace $ W_0.$
  3. It will be shown later that if $ V$ is an inner product space with inner product $ \langle \;, \; \rangle,$ then the subspace $ W_0$ is uniquely determined once we impose the additional condition that $ W_0 = \{ {\mathbf v}\in V \; : \langle {\mathbf v}, {\mathbf w}\rangle = 0\; {\mbox{ for all }} \; {\mathbf w}\in W \}.$

We now prove some basic properties about projection maps.

THEOREM 5.3.6   Let $ W$ and $ W_0$ be complementary subspaces of a vector space $ V.$ Let $ P_W: V \longrightarrow V$ be a projection operator of $ V$ onto $ W$ along $ W_0.$ Then
  1. the null space of $ P_W,$ $ \; {\cal N}(P_W) = \{ {\mathbf v}\in V : P_W({\mathbf v}) = {\mathbf 0}\} =
W_0.$
  2. the range space of $ P_W, \;$ $ {\cal R}(P_W) = \{ P_W({\mathbf v}) : {\mathbf v}\in V \} = W.$
  3. $ P_W^2 = P_W.$ The condition $ P_W^2 = P_W$ is equivalent to $ P_W(I-P_W) = {\mathbf 0}= (I-P_W)P_W.$

Proof. We only prove the first part of the theorem.
Let $ {\mathbf w}_0 \in W_0.$ Then $ \; {\mathbf w}_0 = {\mathbf 0}+ {\mathbf w}_0$ for $ {\mathbf 0}\in W.$ So, by definition, $ P_W({\mathbf w}_0) = {\mathbf 0}.$ Hence, $ W_0 \subset {\cal N}(P_W).$

Conversely, let $ {\mathbf v}\in V$ with $ P_W({\mathbf v}) = {\mathbf 0},$ and write $ {\mathbf v}= {\mathbf w}+ {\mathbf w}_0$ for some $ {\mathbf w}\in W$ and $ {\mathbf w}_0 \in W_0.$ Then by definition $ {\mathbf 0}= P_W({\mathbf v}) = {\mathbf w}.$ That is, $ {\mathbf w}= {\mathbf 0}$ and $ {\mathbf v}= {\mathbf w}_0.$ Thus, $ {\mathbf v}\in W_0.$ Hence $ {\cal N}(P_W) = W_0.$ $ \blacksquare$
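
For the matrix of Example 5.3.4.1, the null space and range space of the theorem can also be computed directly. The sketch below uses SciPy's null_space and orth helpers (an illustrative check, assuming SciPy is available).

\begin{verbatim}
import numpy as np
from scipy.linalg import null_space, orth

# Matrix of Example 5.3.4.1: projection onto W = {x+y-z=0} along W_0 = span{(1,2,2)}.
P = np.array([[0., -1., 1.], [-2., -1., 2.], [-2., -2., 3.]])

N = null_space(P)       # columns span N(P_W); here one column, so dim N(P_W) = 1
R = orth(P)             # columns span R(P_W); here two columns, so dim R(P_W) = 2

# N(P_W) = W_0: the null-space column is a scalar multiple of (1, 2, 2).
assert np.allclose(np.cross(N[:, 0], [1., 2., 2.]), 0)
# R(P_W) = W: every column of R satisfies x + y - z = 0.
assert np.allclose(R[0] + R[1] - R[2], 0)
\end{verbatim}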

EXERCISE 5.3.7  
  1. Let $ A$ be an $ n \times n$ real matrix with $ A^2 = A.$ Consider the linear transformation $ T_A: {\mathbb{R}}^n \longrightarrow {\mathbb{R}}^n,$ defined by $ T_A({\mathbf v}) = A {\mathbf v}$ for all $ {\mathbf v}\in {\mathbb{R}}^n.$ Prove that
    1. $ T_A \circ T_A = T_A$ (use the condition $ A^2 = A$ ).
    2. $ {\cal N}(T_A) \cap {\cal R}(T_A) = \{{\mathbf 0}\}.$
      Hint: Let $ {\mathbf x}\in {\cal N}(T_A) \cap {\cal R}(T_A). $ This implies $ T_A({\mathbf x}) = {\mathbf 0}$ and $ {\mathbf x}= T_A({\mathbf y})$ for some $ {\mathbf y}\in {\mathbb{R}}^n.$ So,

      $\displaystyle {\mathbf x}= T_A({\mathbf y}) = (T_A \circ T_A)({\mathbf y}) = T_A \bigl( T_A({\mathbf y}) \bigr) =
T_A ( {\mathbf x}) = {\mathbf 0}.$

    3. $ {\mathbb{R}}^n = {\cal N}(T_A) + {\cal R}(T_A).$
      Hint: Let $ \{{\mathbf v}_1, \ldots, {\mathbf v}_k\}$ be a basis of $ {\cal N}(T_A).$ Extend it to get a basis $ \{{\mathbf v}_1, \ldots, {\mathbf v}_k, {\mathbf v}_{k+1}, \ldots, {\mathbf v}_n\}$ of $ {\mathbb{R}}^n.$ Then by Rank-nullity Theorem 4.3.6, $ \{T_A({\mathbf v}_{k+1}), \ldots, T_A({\mathbf v}_n)\}$ is a basis of $ {\cal R}(T_A).$
    4. Define $ W = {\cal R}(T_A)$ and $ W_0 = {\cal N}(T_A).$ Then $ T_A$ is a projection operator of $ {\mathbb{R}}^n$ onto $ W$ along $ W_0.$

      Recall that the first three parts of this exercise were also given in Exercise 4.3.10.8. A numerical illustration of these parts appears after this exercise.

  2. Find all $ 2 \times 2$ real matrices $ A$ such that $ A^2 = A.$ Hence or otherwise, determine all projection operators of $ {\mathbb{R}}^2.$
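
A small numerical illustration of the first exercise (with one concrete idempotent matrix chosen purely for illustration; this is not a solution of the exercise):

\begin{verbatim}
import numpy as np

# One concrete idempotent matrix: A^2 = A.
A = np.array([[1., 1.],
              [0., 0.]])
assert np.allclose(A @ A, A)                  # part 1: T_A o T_A = T_A

# part 2: N(T_A) = span{(-1,1)} and R(T_A) = span{(1,0)} meet only in 0.
n, r = np.array([-1., 1.]), np.array([1., 0.])
assert np.allclose(A @ n, 0) and np.allclose(A @ r, r)

# part 3: R^2 = N(T_A) + R(T_A), since the two spanning vectors are independent.
assert abs(np.linalg.det(np.column_stack([n, r]))) > 1e-12
\end{verbatim}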

The next result uses the Gram-Schmidt orthogonalisation process to obtain a complementary subspace in which every vector is orthogonal to every vector of $ W.$

DEFINITION 5.3.8 (Orthogonal Subspace of a Set)   Let $ V$ be an inner product space. Let $ S$ be a non-empty subset of $ V$ . We define

$\displaystyle S^{\perp} = \{ {\mathbf v}\in V \; : \langle {\mathbf v}, {\mathbf s}\rangle = 0 {\mbox{ for all }}
{\mathbf s}\in S \}.$

EXAMPLE 5.3.9   Let $ V = {\mathbb{R}}$ .
  1. $ S = \{0\}$ . Then $ S^{\perp} = {\mathbb{R}}$ .
  2. $ S = {\mathbb{R}}.$ Then $ S^{\perp} = \{0\}$ .
  3. Let $ S$ be any subset of $ {\mathbb{R}}$ containing a non-zero real number. Then $ S^{\perp} = \{0\}$ .
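
In $ {\mathbb{R}}^n$ with the standard inner product, $ S^{\perp}$ for a finite set $ S$ is the null space of the matrix whose rows are the elements of $ S.$ A short illustrative sketch (assuming SciPy is available):

\begin{verbatim}
import numpy as np
from scipy.linalg import null_space

# S = {(1, 1, -1)} in R^3; S-perp is the plane W = {x + y - z = 0} of Example 5.3.4.
S = np.array([[1., 1., -1.]])
perp = null_space(S)             # columns: an orthonormal basis of S-perp
print(perp.shape)                # (3, 2), so S-perp is 2-dimensional
assert np.allclose(S @ perp, 0)  # each basis vector is orthogonal to every s in S
\end{verbatim}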

THEOREM 5.3.10   Let $ S$ be a subset of a finite dimensional inner product space $ V,$ with inner product $ \langle\; , \;\rangle.$ Then
  1. $ S^{\perp}$ is a subspace of $ V.$
  2. Let $ S$ be equal to a subspace $ W$ . Then $ W$ and $ W^{\perp}$ are complementary subspaces of $ V,$ that is, $ V = W + W^{\perp}$ and $ W \cap W^{\perp} = \{{\mathbf 0}\}.$ Moreover, $ \langle {\mathbf u}, {\mathbf w}\rangle = 0$ for every $ {\mathbf w}\in W$ and $ {\mathbf u}\in W^{\perp}.$

Proof. We leave the proof of the first part to the reader. The proof of the second part is as follows:
Let $ \dim(V) = n$ and $ \dim (W) = k.$ Let $ \{{\mathbf w}_1, {\mathbf w}_2, \ldots, {\mathbf w}_k\}$ be a basis of $ W.$ By Gram-Schmidt orthogonalisation process, we get an orthonormal basis, say, $ \{{\mathbf v}_1, {\mathbf v}_2, \ldots, {\mathbf v}_k\}$ of $ W.$ Then, for any $ {\mathbf v}\in V,$

$\displaystyle {\mathbf v}- \sum_{i=1}^k \langle {\mathbf v}, {\mathbf v}_i \rangle {\mathbf v}_i \in W^{\perp}.$

Indeed, for each $ 1 \leq j \leq k,$ $ \; \langle {\mathbf v}- \sum_{i=1}^k \langle {\mathbf v}, {\mathbf v}_i \rangle {\mathbf v}_i, \; {\mathbf v}_j \rangle = \langle {\mathbf v}, {\mathbf v}_j \rangle - \langle {\mathbf v}, {\mathbf v}_j \rangle = 0,$ and since the $ {\mathbf v}_j$'s span $ W,$ the vector on the left lies in $ W^{\perp}.$ So, $ V = W + W^{\perp}.$ Also, for any $ {\mathbf v}\in W \cap W^{\perp},$ by definition of $ W^{\perp}, \;\; 0 = \langle {\mathbf v}, {\mathbf v}\rangle = \Vert {\mathbf v}\Vert^2.$ So, $ {\mathbf v}= {\mathbf 0}.$ That is, $ W \cap W^{\perp} = \{ {\mathbf 0}\}.$ $ \blacksquare$
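
The key step of the proof, that $ {\mathbf v}- \sum_{i=1}^k \langle {\mathbf v}, {\mathbf v}_i \rangle {\mathbf v}_i$ lies in $ W^{\perp},$ is easy to check numerically. The sketch below (illustrative only) uses the QR factorisation, which carries out Gram-Schmidt numerically, to get an orthonormal basis of $ W.$

\begin{verbatim}
import numpy as np

# W = {x + y - z = 0} in R^3, spanned by the columns of B.
B = np.column_stack([[1., 0., 1.], [0., 1., 1.]])
Q, _ = np.linalg.qr(B)           # columns of Q: orthonormal basis v_1, v_2 of W

v = np.array([3., 1., 2.])
w = sum((v @ Q[:, i]) * Q[:, i] for i in range(Q.shape[1]))  # sum <v, v_i> v_i
u = v - w                        # the vector claimed to lie in W-perp

assert np.allclose(B.T @ u, 0)   # u is orthogonal to a basis of W, hence to all of W
print(w, u)                      # v = w + u with w in W and u in W-perp
\end{verbatim}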

DEFINITION 5.3.11 (Orthogonal Projection)   Let $ W$ be a subspace of a finite dimensional inner product space $ V,$ with inner product $ \langle\; , \;\rangle.$ Let $ W^{\perp}$ be the orthogonal complement of $ W$ in $ V.$ Define $ P_W: V \longrightarrow V$ by

$\displaystyle P_W({\mathbf v}) = {\mathbf w}\; {\mbox{ where }} \; {\mathbf v}= {\mathbf w}+ {\mathbf u}, \; {\mbox{ with }} \; {\mathbf w}\in W \; {\mbox{ and }} \; {\mathbf u}\in W^{\perp}.$

Then $ P_W$ is called the orthogonal projection of $ V$ onto $ W$ along $ W^{\perp}.$
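
With respect to the standard inner product on $ {\mathbb{R}}^n,$ the orthogonal projection has a convenient matrix: if the columns of $ Q$ form an orthonormal basis of $ W,$ then $ P_W$ is multiplication by $ Q Q^t.$ A brief illustrative sketch:

\begin{verbatim}
import numpy as np

# Orthonormal basis of W = {x + y - z = 0} via QR; the matrix of P_W is Q Q^t.
B = np.column_stack([[1., 0., 1.], [0., 1., 1.]])
Q, _ = np.linalg.qr(B)
P = Q @ Q.T

v = np.array([3., 1., 2.])
assert np.allclose(P @ P, P)                # P_W^2 = P_W
assert np.allclose(P, P.T)                  # symmetric (see Remark 5.3.14.3)
assert np.allclose(B.T @ (v - P @ v), 0)    # v - P_W(v) lies in W-perp
\end{verbatim}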

DEFINITION 5.3.12 (Self-Adjoint Transformation/Operator)   Let $ V$ be an inner product space with inner product $ \langle\; , \;\rangle.$ A linear transformation $ T: V \longrightarrow V$ is called a self-adjoint operator if $ \langle T({\mathbf v}), {\mathbf u}\rangle = \langle {\mathbf v},
T({\mathbf u}) \rangle$ for every $ {\mathbf u}, {\mathbf v}\in V.$

EXAMPLE 5.3.13  
  1. Let $ A$ be an $ n \times n$ real symmetric matrix. That is, $ A^t = A.$ Then show that the linear transformation $ T_A : {\mathbb{R}}^n \longrightarrow {\mathbb{R}}^n$ defined by $ T_A({\mathbf x}) = A {\mathbf x}$ for every $ {\mathbf x}^t \in {\mathbb{R}}^n$ is a self-adjoint operator.
    Solution: By definition, for every $ {\mathbf x}^t, {\mathbf y}^t \in {\mathbb{R}}^n,$

    $\displaystyle \langle T_A({\mathbf x}), {\mathbf y}\rangle = {\mathbf y}^t A {\mathbf x}= {\mathbf y}^t A^t {\mathbf x}= (A {\mathbf y})^t {\mathbf x}= \langle {\mathbf x}, T_A({\mathbf y}) \rangle.$

    Hence, the result follows.
  2. Let $ A$ be an $ n \times n$ Hermitian matrix, that is, $ A^*
= A.$ Then the linear transformation $ T_A : {\mathbb{C}}^n \longrightarrow {\mathbb{C}}^n$ defined by $ T_A({\mathbf z}) = A {\mathbf z}$ for every $ {\mathbf z}^t \in {\mathbb{C}}^n$ is a self-adjoint operator.
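
Both parts can be verified numerically for a randomly generated matrix; the sketch below (illustrative only) uses the standard inner products $ \langle {\mathbf x}, {\mathbf y}\rangle = {\mathbf y}^t {\mathbf x}$ on $ {\mathbb{R}}^n$ and $ \langle {\mathbf z}, {\mathbf w}\rangle = {\mathbf w}^* {\mathbf z}$ on $ {\mathbb{C}}^n.$

\begin{verbatim}
import numpy as np
rng = np.random.default_rng(0)

# Real symmetric case: <Ax, y> = <x, Ay> with <x, y> = y^t x.
M = rng.standard_normal((4, 4)); A = M + M.T        # A^t = A
x, y = rng.standard_normal(4), rng.standard_normal(4)
assert np.isclose(y @ (A @ x), (A @ y) @ x)

# Complex Hermitian case: <Hz, w> = <z, Hw> with <z, w> = w^* z.
N = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = N + N.conj().T                                  # H^* = H
z = rng.standard_normal(4) + 1j * rng.standard_normal(4)
w = rng.standard_normal(4) + 1j * rng.standard_normal(4)
assert np.isclose(np.vdot(w, H @ z), np.vdot(H @ w, z))
\end{verbatim}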

Remark 5.3.14  
  1. By Proposition 5.3.3, the map $ P_W$ defined above is a linear transformation.
  2. $ P_W^2 = P_W, \; (I - P_W) P_W = {\mathbf 0}= P_W(I- P_W).$
  3. Let $ {\mathbf u}, {\mathbf v}\in V$ with $ {\mathbf u}= {\mathbf u}_1 + {\mathbf u}_2 $ and $ {\mathbf v}= {\mathbf v}_1 + {\mathbf v}_2$ for some $ {\mathbf u}_1, {\mathbf v}_1 \in W$ and $ {\mathbf u}_2, {\mathbf v}_2 \in W^{\perp}.$ Then we know that $ \langle {\mathbf u}_i, {\mathbf v}_j \rangle = 0$ whenever $ 1 \leq i \neq j \leq 2.$ Therefore, for every $ {\mathbf u}, {\mathbf v}\in V,$
    $\displaystyle \langle P_W({\mathbf u}), {\mathbf v}\rangle = \langle {\mathbf u}_1, {\mathbf v}\rangle = \langle {\mathbf u}_1, {\mathbf v}_1 + {\mathbf v}_2 \rangle = \langle {\mathbf u}_1, {\mathbf v}_1 \rangle = \langle {\mathbf u}_1 + {\mathbf u}_2, {\mathbf v}_1 \rangle = \langle {\mathbf u}, {\mathbf v}_1 \rangle = \langle {\mathbf u}, P_W({\mathbf v}) \rangle.$

    Thus, the orthogonal projection operator is a self-adjoint operator.
  4. Let $ {\mathbf v}\in V$ and $ {\mathbf w}\in W.$ Then $ P_W({\mathbf w}) = {\mathbf w}.$ Therefore, using Remarks 5.3.14.2 and 5.3.14.3, we get
    $\displaystyle \langle {\mathbf v}- P_W({\mathbf v}), {\mathbf w}\rangle = \langle \bigl(I-P_W \bigr)({\mathbf v}), P_W({\mathbf w}) \rangle = \langle P_W\bigl(I-P_W \bigr)({\mathbf v}), {\mathbf w}\rangle = \langle {\mathbf 0}({\mathbf v}), {\mathbf w}\rangle = \langle {\mathbf 0}, {\mathbf w}\rangle = 0$

    for every $ {\mathbf w}\in W.$
  5. In particular, since $ P_W({\mathbf v}) \in W,$ the vector $ P_W({\mathbf v}) - {\mathbf w}$ also lies in $ W$ for every $ {\mathbf w}\in W,$ and hence $ \langle {\mathbf v}- P_W({\mathbf v}), P_W({\mathbf v}) - {\mathbf w}\rangle = 0.$ Therefore, for any $ {\mathbf v}\in V$ and $ {\mathbf w}\in W,$ we have
    $\displaystyle \Vert {\mathbf v}- {\mathbf w}\Vert^2 = \Vert {\mathbf v}- P_W({\mathbf v}) + P_W({\mathbf v}) - {\mathbf w}\Vert^2 = \Vert {\mathbf v}- P_W({\mathbf v})\Vert^2 + \Vert P_W({\mathbf v}) - {\mathbf w}\Vert^2 + 2 \langle {\mathbf v}- P_W({\mathbf v}), P_W({\mathbf v}) - {\mathbf w}\rangle = \Vert {\mathbf v}- P_W({\mathbf v}) \Vert^2 + \Vert P_W({\mathbf v}) - {\mathbf w}\Vert^2.$

    Therefore,

    $\displaystyle \Vert {\mathbf v}- {\mathbf w}\Vert \geq \Vert {\mathbf v}- P_W({\mathbf v}) \Vert$

    and the equality holds if and only if $ {\mathbf w}= P_W({\mathbf v}).$ Since $ P_W({\mathbf v}) \in W,$ we see that

    $\displaystyle d({\mathbf v}, W) = \inf \; \{ \Vert{\mathbf v}- {\mathbf w}\Vert\; : {\mathbf w}\in W \} = \Vert {\mathbf v}- P_W({\mathbf v}) \Vert.$

    That is, $ P_W({\mathbf v})$ is the vector in $ W$ nearest to $ {\mathbf v}.$ This can also be stated as: the vector $ P_W({\mathbf v})$ solves the following minimisation problem:

    $\displaystyle \inf_{{\mathbf w}\in W} \Vert {\mathbf v}- {\mathbf w}\Vert = \Vert {\mathbf v}- P_W({\mathbf v}) \Vert.$
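
This minimisation property is what makes the orthogonal projection the basic tool in least-squares problems: $ P_W({\mathbf v})$ is the best approximation to $ {\mathbf v}$ from $ W.$ The sketch below (illustrative only) compares $ \Vert {\mathbf v}- P_W({\mathbf v}) \Vert$ with the distance to other points of $ W$ and with NumPy's least-squares solver.

\begin{verbatim}
import numpy as np
rng = np.random.default_rng(1)

# W = column space of B in R^3; orthogonal projection matrix P_W = Q Q^t.
B = np.column_stack([[1., 0., 1.], [0., 1., 1.]])
Q, _ = np.linalg.qr(B)
v = np.array([3., 1., 2.])
Pv = Q @ Q.T @ v

# No point of W is closer to v than P_W(v).
d_star = np.linalg.norm(v - Pv)
for _ in range(1000):
    w = B @ rng.standard_normal(2)              # a random vector of W
    assert np.linalg.norm(v - w) >= d_star - 1e-12

# The same point solves the least-squares problem min_c ||B c - v||.
c, *rest = np.linalg.lstsq(B, v, rcond=None)
assert np.allclose(B @ c, Pv)
print("d(v, W) =", d_star)
\end{verbatim}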


