Diagonalisable matrices

In this section, we will look at some special classes of square matrices which are diagonalisable. We will also be dealing with matrices having complex entries and hence for a matrix $ A=[a_{ij}],$ recall the following definitions.

DEFINITION 6.3.1 (Special Matrices)  

  1. The matrix $ A^* = ( {\overline{a_{ji}}} )$ is called the conjugate transpose of the matrix $ A.$

    Note that $ A^* = {\overline{ A^{t}}} = {\overline {A}}^{t}.$

  2. A square matrix $ A$ with complex entries is called
    1. a Hermitian matrix if $ A^* = A.$
    2. a unitary matrix if $ A \; A^* = A^* A = I_n.$
    3. a skew-Hermitian matrix if $ A^* = - A.$
    4. a normal matrix if $ A^* A = A A^*.$
  3. A square matrix $ A$ with real entries is called
    1. a symmetric matrix if $ A^{t} = A.$
    2. an orthogonal matrix if $ A \;A^{t} = A^{t} A = I_n.$
    3. a skew-symmetric matrix if $ A^{t} = -A.$

Note that a symmetric matrix is always Hermitian, a skew-symmetric matrix is always skew-Hermitian and an orthogonal matrix is always unitary. Each of these matrices is normal. If $ A$ is a unitary matrix then $ A^* = A^{-1}.$

EXAMPLE 6.3.2  
  1. Let $ B= \begin{bmatrix}i & 1 \\ -1 & i \end{bmatrix}.$ Then $ B$ is skew-Hermitian.
  2. Let $ A = \frac{1}{\sqrt{2}}\begin{bmatrix}1 & i \\ i & 1 \end{bmatrix}$ and $ B = \begin{bmatrix}1 & 1 \\ -1 & 1 \end{bmatrix}.$ Then $ A$ is a unitary matrix and $ B$ is a normal matrix. Note that $ \sqrt{2} A$ is also a normal matrix.
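
The defining identities above are easy to test numerically. The following sketch (assuming Python with NumPy is available; the predicate names are ours, not library functions) checks the matrices of Example 6.3.2 against Definition 6.3.1.

```python
import numpy as np

def is_hermitian(A, tol=1e-12):
    # A equals its conjugate transpose: A* = A
    return np.allclose(A, A.conj().T, atol=tol)

def is_skew_hermitian(A, tol=1e-12):
    # A* = -A
    return np.allclose(A.conj().T, -A, atol=tol)

def is_unitary(A, tol=1e-12):
    # A A* = I (equivalent to A* A = I for square A)
    return np.allclose(A @ A.conj().T, np.eye(A.shape[0]), atol=tol)

def is_normal(A, tol=1e-12):
    # A* A = A A*
    return np.allclose(A.conj().T @ A, A @ A.conj().T, atol=tol)

# The matrices of Example 6.3.2
B = np.array([[1j, 1], [-1, 1j]])
A = (1 / np.sqrt(2)) * np.array([[1, 1j], [1j, 1]])

print(is_skew_hermitian(B))                 # True
print(is_unitary(A), is_normal(A))          # True True
print(is_normal(np.sqrt(2) * A),            # True: still normal,
      is_unitary(np.sqrt(2) * A))           # False: no longer unitary
```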

DEFINITION 6.3.3 (Unitary Equivalence)   Let $ A$ and $ B$ be two $ n \times n$ matrices. They are called unitarily equivalent if there exists a unitary matrix $ U$ such that $ A = U^* B U.$

Note that $ U^* = U^{-1}$ as $ U$ is a unitary matrix. So, $ A$ is unitarily similar to the matrix $ B.$

EXERCISE 6.3.4  
  1. Let $ A$ be a square matrix such that $ U A U^*$ is a diagonal matrix for some unitary matrix $ U$ . Prove that $ A$ is a normal matrix.
  2. Let $ A$ be any square matrix. Then $ A = \frac{1}{2}(A + A^*) + \frac{1}{2}(A - A^*),$ where $ \frac{1}{2}(A + A^*)$ is called the Hermitian part of $ A$ and $ \frac{1}{2}(A - A^*)$ is called the skew-Hermitian part of $ A.$
  3. Show that every square matrix can be uniquely expressed as $ A = S + i T,$ where both $ S$ and $ T$ are Hermitian matrices.
  4. Show that $ A - A^*$ is always skew-Hermitian.
  5. Does there exist a unitary matrix $ U$ such that $ U^{-1} A U = B,$ where
    $ A = \begin{bmatrix}1 & 1 & 4\\ 0 &2 & 2\\ 0&0&3 \end{bmatrix}$ and $ B = \begin{bmatrix}2 & -1 & 3 \sqrt{2}\\ 0 &1 & \sqrt{2}\\ 0&0&3 \end{bmatrix}?$ (A numerical screen for the necessary conditions is sketched after this list.)
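
For the last item, note that unitary equivalence preserves both the set of eigenvalues (being a similarity) and the sum of squares of the absolute values of the entries, i.e. the Frobenius norm (compare Remark 6.3.9 below). A minimal sketch of this necessary-condition screen, assuming NumPy:

```python
import numpy as np

A = np.array([[1, 1, 4],
              [0, 2, 2],
              [0, 0, 3]], dtype=float)
B = np.array([[2, -1, 3 * np.sqrt(2)],
              [0,  1, np.sqrt(2)],
              [0,  0, 3]])

# Unitary equivalence preserves eigenvalues and the Frobenius norm,
# so a mismatch in either would rule out such a U.
print(np.sort(np.linalg.eigvals(A)))   # [1. 2. 3.]
print(np.sort(np.linalg.eigvals(B)))   # [1. 2. 3.]
print(np.linalg.norm(A, 'fro')**2)     # 35.0
print(np.linalg.norm(B, 'fro')**2)     # 35.0
```

Both invariants agree here, so this screen does not rule out such a $ U$; it remains to construct one, or refute its existence, by hand.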

PROPOSITION 6.3.5   Let $ A$ be an $ n \times n$ Hermitian matrix. Then all the eigenvalues of $ A$ are real.

Proof. Let $ (\lambda, {\mathbf x})$ be an eigenpair. Then $ A {\mathbf x}= \lambda {\mathbf x}$ and $ A = A^*$ together imply

$\displaystyle {\mathbf x}^* A = {\mathbf x}^* A^* = (A {\mathbf x})^* = ({\lambda}{\mathbf x})^* = \overline{{\lambda}} {\mathbf x}^*.$

Hence

$\displaystyle \lambda {\mathbf x}^*{\mathbf x} = {\mathbf x}^* ({\lambda}{\mathbf x}) = {\mathbf x}^* (A {\mathbf x}) = ({\mathbf x}^* A) {\mathbf x} = ({\overline{\lambda}} {\mathbf x}^*) {\mathbf x} = {\overline{\lambda}} {\mathbf x}^* {\mathbf x}.$

But $ {\mathbf x}$ is an eigenvector and hence $ {\mathbf x}\neq {\mathbf 0},$ and so the real number $ \Vert{\mathbf x}\Vert^2 = {\mathbf x}^* {\mathbf x}$ is non-zero as well. Thus $ \lambda = {\overline{\lambda}}.$ That is, $ {\lambda}$ is a real number. $ \blacksquare$
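
A quick numerical illustration of Proposition 6.3.5 (a sketch, assuming NumPy): build a Hermitian matrix as $ M + M^*$ and observe that the computed eigenvalues are real up to round-off.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = M + M.conj().T                   # A* = A, so A is Hermitian

eigvals = np.linalg.eigvals(A)
print(np.max(np.abs(eigvals.imag)))  # ~1e-15: real up to round-off
```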

THEOREM 6.3.6   Let $ A$ be an $ n \times n$ Hermitian matrix. Then $ A$ is unitarily diagonalisable. That is, there exists a unitary matrix $ U$ such that $ U^* A U = D;$ where $ D$ is a diagonal matrix with the eigenvalues of $ A$ as the diagonal entries.

In other words, the eigenvectors of $ A$ form an orthonormal basis of $ {\mathbb{C}}^n.$

Proof. We will prove the result by induction on the size of the matrix. The result is clearly true if $ n=1.$ Let the result be true for $ n = k-1.$ We will prove the result for $ n = k.$ So, let $ A$ be a $ k \times k$ matrix and let $ (\lambda_1, {\mathbf x})$ be an eigenpair of $ A$ with $ \Vert {\mathbf x}\Vert = 1.$ We now extend the linearly independent set $ \{ {\mathbf x}\}$ to an orthonormal basis $ \{{\mathbf x}, {\mathbf u}_2, {\mathbf u}_3, \ldots, {\mathbf u}_k \}$ of $ {\mathbb{C}}^k$ (using Gram-Schmidt orthogonalisation).

As $ \{{\mathbf x}, {\mathbf u}_2, {\mathbf u}_3, \ldots, {\mathbf u}_k \}$ is an orthonormal set,

$\displaystyle {\mathbf u}_i^* {\mathbf x}= 0 \;\; {\mbox{ for all }} \; i = 2, 3, \ldots, k.$

Therefore, observe that for all $ i, \; 2 \leq i \leq k,$

$\displaystyle (A {\mathbf u}_i)^* {\mathbf x} = ({\mathbf u}_i^* A^*) {\mathbf x} = {\mathbf u}_i^* (A^* {\mathbf x}) = {\mathbf u}_i^* (A {\mathbf x}) = {\mathbf u}_i^* ({\lambda}_1 {\mathbf x}) = {\lambda}_1 ({\mathbf u}_i^* {\mathbf x}) = 0.$

Hence, we also have $ {\mathbf x}^* (A {\mathbf u}_i) = 0$ for $ 2 \leq i \leq k.$ Now, define $ U_1 = [ {\mathbf x}, \; {\mathbf u}_2, \; \ldots, \; {\mathbf u}_k ]$ (with $ {\mathbf x}, {\mathbf u}_2, \ldots, {\mathbf u}_k$ as the columns of $ U_1$). Then the matrix $ U_1$ is a unitary matrix and

$\displaystyle U_1^{*} A U_1 = U_1^* [ A {\mathbf x}\;\; A {\mathbf u}_2 \; \cdots \; A {\mathbf u}_k ] = \begin{bmatrix}{\mathbf x}^* \\ {\mathbf u}_2^* \\ \vdots \\ {\mathbf u}_k^* \end{bmatrix} [ {\lambda}_1 {\mathbf x}\;\; A {\mathbf u}_2 \; \cdots \; A {\mathbf u}_k ] = \left[\begin{array}{c|c} \lambda_1 & {\mathbf 0} \\ \hline {\mathbf 0} & B \end{array} \right],$
where $ B$ is a $ (k-1) \times (k-1)$ matrix. As $ A^* = A$, we get $ (U_1^{*} A U_1)^* = U_1^{*} A U_1$. This condition, together with the fact that $ {\lambda}_1$ is a real number (use Proposition 6.3.5), implies that $ B^* = B$. That is, $ B$ is also a Hermitian matrix. Therefore, by the induction hypothesis there exists a $ (k-1) \times (k-1)$ unitary matrix $ U_2$ such that

$\displaystyle U_2^{*} B U_2 = D_2 = {\mbox{diag}}(\lambda_2, \ldots, \lambda_k).$

Recall that the entries $ {\lambda}_i,$ for $ 2 \leq i \leq k,$ are the eigenvalues of the matrix $ B.$ We also know that two similar matrices have the same set of eigenvalues. Hence, the eigenvalues of $ A$ are $ \lambda_1, \lambda_2, \ldots, \lambda_k.$ Define $ U= U_1 \begin{bmatrix}1 & {\mathbf 0}\\ {\mathbf 0}& U_2 \end{bmatrix}.$ Then $ U$ is a unitary matrix and
$\displaystyle U^{*} A U = \left( U_1 \begin{bmatrix}1 & {\mathbf 0}\\ {\mathbf 0}& U_2 \end{bmatrix}\right)^{*} A \left(U_1 \begin{bmatrix}1 & {\mathbf 0}\\ {\mathbf 0}& U_2 \end{bmatrix}\right) = \begin{bmatrix}1 & {\mathbf 0}\\ {\mathbf 0}& U_2^{*} \end{bmatrix} \bigl( U_1^{*} A U_1 \bigr) \begin{bmatrix}1 & {\mathbf 0}\\ {\mathbf 0}& U_2 \end{bmatrix} = \begin{bmatrix}{\lambda}_1 & {\mathbf 0}\\ {\mathbf 0}& U_2^{*} B U_2 \end{bmatrix} = \begin{bmatrix}{\lambda}_1 & {\mathbf 0}\\ {\mathbf 0}& D_2 \end{bmatrix}.$

Thus, $ U^{*} A U$ is a diagonal matrix with diagonal entries $ \lambda_1, \lambda_2, \ldots, \lambda_k,$ the eigenvalues of $ A.$ Hence, the result follows. $ \blacksquare$
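
In floating-point practice, Theorem 6.3.6 is what `numpy.linalg.eigh` computes for a complex Hermitian matrix: real eigenvalues in ascending order together with a unitary matrix of eigenvectors. A minimal check, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = M + M.conj().T                      # Hermitian

w, U = np.linalg.eigh(A)                # real eigenvalues w, unitary U
print(np.allclose(U.conj().T @ U, np.eye(5)))       # U is unitary
print(np.allclose(U.conj().T @ A @ U, np.diag(w)))  # U* A U = D
```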

COROLLARY 6.3.7   Let $ A$ be an $ n \times n$ real symmetric matrix. Then
  1. the eigenvalues of $ A$ are all real,
  2. the corresponding eigenvectors can be chosen to have real entries, and
  3. the eigenvectors also form an orthonormal basis of $ {\mathbb{R}}^n.$

Proof. As $ A$ is symmetric, $ A$ is also a Hermitian matrix. Hence, by Proposition 6.3.5, the eigenvalues of $ A$ are all real. Let $ ({\lambda}, \; {\mathbf x})$ be an eigenpair of $ A$ with $ {\mathbf x}\in {\mathbb{C}}^n.$ Then there exist $ {\mathbf y}, {\mathbf z}\in {\mathbb{R}}^n$ such that $ {\mathbf x}= {\mathbf y}+ i {\mathbf z}.$ So,

$\displaystyle A {\mathbf x}= {\lambda}{\mathbf x}\Longrightarrow A ({\mathbf y}+ i {\mathbf z}) = {\lambda}( {\mathbf y}+ i {\mathbf z}).$

Comparing the real and imaginary parts, we get $ A {\mathbf y}= {\lambda}{\mathbf y}$ and $ A {\mathbf z}= {\lambda}{\mathbf z}.$ Since $ {\mathbf x}\neq {\mathbf 0},$ at least one of $ {\mathbf y}$ and $ {\mathbf z}$ is non-zero. Thus, we can choose the eigenvectors to have real entries.

To prove the orthonormality of the eigenvectors, we proceed along the lines of the proof of Theorem 6.3.6. Hence, the readers are advised to complete the proof. $ \blacksquare$
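
The real case of Corollary 6.3.7 can be illustrated the same way: for a real symmetric input, `numpy.linalg.eigh` stays entirely over the reals, and the columns of the returned matrix form an orthonormal basis of $ {\mathbb{R}}^n.$ A sketch, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = M + M.T                              # real symmetric

w, P = np.linalg.eigh(A)
print(w.dtype, P.dtype)                  # float64 float64: real throughout
print(np.allclose(P.T @ P, np.eye(4)))   # columns are orthonormal
print(np.allclose(P.T @ A @ P, np.diag(w)))
```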

EXERCISE 6.3.8  
  1. Let $ A$ be a skew-Hermitian matrix. Then all the eigenvalues of $ A$ are either zero or purely imaginary. Also, the eigenvectors corresponding to distinct eigenvalues are mutually orthogonal.
    [Hint: Carefully study the proof of Theorem 6.3.6.]
  2. Let $ A$ be an $ n \times n$ unitary matrix. Then
    1. the rows of $ A$ form an orthonormal basis of $ {\mathbb{C}}^n.$
    2. the columns of $ A$ form an orthonormal basis of $ {\mathbb{C}}^n.$
    3. for any two vectors $ {\mathbf x}, {\mathbf y}\in {\mathbb{C}}^{n \times 1},\;$ $ \langle A {\mathbf x}, A {\mathbf y}\rangle = \langle {\mathbf x}, {\mathbf y}\rangle.$
    4. for any vector $ {\mathbf x}\in {\mathbb{C}}^{n \times 1},\; $ $ \Vert A {\mathbf x}\Vert = \Vert {\mathbf x}\Vert.$
    5. for any eigenvalue $ \lambda$ of $ A,\;$ $ \vert \lambda\vert = 1.$
    6. the eigenvectors $ {\mathbf x}, {\mathbf y}$ corresponding to distinct eigenvalues $ {\lambda}$ and $ \mu$ satisfy $ \langle {\mathbf x}, {\mathbf y}\rangle = 0.$ That is, if $ ({\lambda}, {\mathbf x})$ and $ (\mu, {\mathbf y})$ are eigenpairs, with $ {\lambda}\neq \mu,$ then $ {\mathbf x}$ and $ {\mathbf y}$ are mutually orthogonal.
  3. Let $ A$ be a normal matrix. Then, show that if $ (\lambda, {\mathbf x})$ is an eigenpair for $ A$ then $ ({\overline{\lambda}}, {\mathbf x})$ is an eigenpair for $ A^*.$
  4. Show that the matrices $ A = \begin{bmatrix}4&4\\ 0&4 \end{bmatrix}$ and $ B = \begin{bmatrix}10&9 \\ -4&-2 \end{bmatrix}$ are similar. Is it possible to find a unitary matrix $ U$ such that $ A = U^* B U?$
  5. Let $ A$ be a $ 2 \times 2$ orthogonal matrix. Then prove the following:
    1. if $ \det (A) = 1,$ then $ A = \begin{bmatrix}\cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}$ for some $ \theta, \;\; 0 \leq \theta < 2 \pi.$
    2. if $ \det A = -1,$ then there exists a basis of $ {\mathbb{R}}^2$ in which the matrix of $ A$ looks like $ \begin{bmatrix}1 & 0 \\ 0 & -1 \end{bmatrix}.$

      Or equivalently, $ A = \begin{bmatrix}\cos \theta & \sin \theta \\ \sin \theta & - \cos \theta \end{bmatrix}$ for some $ \theta, \;\; 0 \leq \theta < 2 \pi.$ In this case, prove that $ A$ reflects the vectors in $ {\mathbb{R}}^2$ about a line passing through the origin. Also, determine this line.

  6. Let $ A = \begin{bmatrix}2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}.$ Determine $ A^{301}$ .
  7. Let $ A$ be a $ 3 \times 3$ orthogonal matrix. Then prove the following:
    1. if $ \det (A) = 1,$ then $ A$ is a rotation about a fixed axis, in the sense that $ A$ has an eigenpair $ (1, {\mathbf x})$ such that the restriction of $ A$ to the plane $ {\mathbf x}^{\perp}$ is a two-dimensional rotation of $ {\mathbf x}^{\perp}.$
    2. if $ \det A = -1,$ then the action of $ A$ corresponds to a reflection through a plane $ P,$ followed by a rotation about the line through the origin that is perpendicular to $ P.$

Remark 6.3.9   In the previous exercise, we saw that the matrices $ A = \begin{bmatrix}4&4\\ 0&4 \end{bmatrix}$ and $ B = \begin{bmatrix}10&9 \\ -4&-2 \end{bmatrix}$ are similar but not unitarily equivalent, whereas unitary equivalence always implies similarity as $ U^* = U^{-1}.$ In numerical calculations, however, unitary transformations are preferred to general similarity transformations. The main reasons are:
  1. Exercise 6.3.8.2 implies that an orthonormal change of basis leaves unchanged the sum of squares of the absolute values of the entries, which need not be true under a non-orthonormal change of basis (see the sketch after this list).
  2. As $ U^* = U^{-1}$ for a unitary matrix $ U,$ unitary equivalence is computationally simpler: the inverse comes for free as the conjugate transpose.
  3. Also, in forming the conjugate transpose, no loss of accuracy due to round-off errors occurs.
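
The first point is easy to observe numerically: conjugating by an orthogonal matrix leaves the Frobenius norm unchanged, while a generic similarity does not. A small sketch, assuming NumPy (the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))

# Orthonormal change of basis: Q from the QR factorisation is orthogonal.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
S = rng.standard_normal((4, 4))                  # generic invertible matrix

fro = lambda X: np.linalg.norm(X, 'fro')
print(fro(A), fro(Q.T @ A @ Q))                  # equal: norm is preserved
print(fro(A), fro(np.linalg.inv(S) @ A @ S))     # generally different
```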

We next prove Schur's Lemma and use it to show that normal matrices are unitarily diagonalisable.

LEMMA 6.3.10 (Schur's Lemma)   Every $ n \times n$ complex matrix is unitarily similar to an upper triangular matrix.

Proof. We will prove the result by induction on the size of the matrix. The result is clearly true if $ n=1.$ Let the result be true for $ n = k-1.$ We will prove the result for $ n = k.$ So, let $ A$ be a $ k \times k$ matrix and let $ (\lambda_1, {\mathbf x})$ be an eigenpair for $ A$ with $ \Vert {\mathbf x}\Vert = 1.$ Then the linearly independent set $ \{ {\mathbf x}\}$ can be extended, using the Gram-Schmidt orthogonalisation process, to an orthonormal basis $ \{{\mathbf x}, {\mathbf u}_2, {\mathbf u}_3, \ldots, {\mathbf u}_k \}$ of $ {\mathbb{C}}^k.$ Then $ U_1 = [ {\mathbf x}\; {\mathbf u}_2 \; \cdots \; {\mathbf u}_k ]$ (with $ {\mathbf x}, {\mathbf u}_2, \ldots, {\mathbf u}_k$ as the columns of the matrix $ U_1$) is a unitary matrix and
$\displaystyle U_1^{*} A U_1 = U_1^* [ A {\mathbf x}\;\; A {\mathbf u}_2 \; \cdots \; A {\mathbf u}_k ] = \begin{bmatrix}{\mathbf x}^* \\ {\mathbf u}_2^* \\ \vdots \\ {\mathbf u}_k^* \end{bmatrix} [ {\lambda}_1 {\mathbf x}\;\; A {\mathbf u}_2 \; \cdots \; A {\mathbf u}_k ] = \left[\begin{array}{c|c} \lambda_1 & * \\ \hline {\mathbf 0} & B \end{array} \right],$

where $ B$ is a $ (k-1) \times (k-1)$ matrix. By the induction hypothesis there exists a $ (k-1) \times (k-1)$ unitary matrix $ U_2$ such that $ U_2^{*} B U_2$ is an upper triangular matrix with diagonal entries $ \lambda_2, \ldots, \lambda_k,$ the eigenvalues of the matrix $ B.$ Observe that since the eigenvalues of $ B$ are $ \lambda_2, \ldots, \lambda_k,$ the eigenvalues of $ A$ are $ \lambda_1, \lambda_2, \ldots, \lambda_k.$ Define $ U= U_1 \begin{bmatrix}1 & {\mathbf 0}\\ {\mathbf 0}& U_2 \end{bmatrix}.$ Then check that $ U$ is a unitary matrix and $ U^{*} A U$ is an upper triangular matrix with diagonal entries $ \lambda_1, \lambda_2, \ldots, \lambda_k,$ the eigenvalues of the matrix $ A.$ Hence, the result follows. $ \blacksquare$
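
Schur's Lemma is available in floating point as the complex Schur decomposition, `scipy.linalg.schur` with `output='complex'`. A minimal check, assuming NumPy and SciPy:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Complex Schur form: A = U T U* with U unitary, T upper triangular.
T, U = schur(A, output='complex')
print(np.allclose(U @ T @ U.conj().T, A))   # A = U T U*
print(np.allclose(np.tril(T, -1), 0))       # T is upper triangular
print(np.sort_complex(np.diag(T)))          # the eigenvalues of A
```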

EXERCISE 6.3.11  
  1. Let $ A$ be an $ n \times n$ real invertible matrix. Prove that there exists an orthogonal matrix $ P$ and a diagonal matrix $ D$ with positive diagonal entries such that $ A A^t = P D P^{-1}$ .
  2. Show that the matrices $ A = \begin{bmatrix}1 & 1 & 1\\ 0 & 2 & 1\\ 0 & 0 & 3 \end{bmatrix}$ and $ B = \begin{bmatrix}2 & -1 & \sqrt{2}\\ 0 & 1 & 0\\ 0 & 0 & 3 \end{bmatrix}$ are unitarily equivalent via the unitary matrix $ U = \frac{1}{\sqrt{2}} \begin{bmatrix}1 & 1 & 0\\ 1 & -1 & 0\\ 0 & 0 & \sqrt{2} \end{bmatrix}.$ Hence, conclude that the upper triangular matrix obtained in Schur's Lemma need not be unique.
  3. Show that normal matrices are unitarily diagonalisable.
    [Hint: Show that the matrix $ B$ in the proof of Schur's Lemma (Lemma 6.3.10) is also a normal matrix, and that an upper triangular matrix $ T$ with $ T^* T = T T^*$ has to be a diagonal matrix.]

    Remark 6.3.12 (The Spectral Theorem for Normal Matrices)   Let $ A$ be an $ n \times n$ normal matrix. Then the above exercise shows that there exists an orthonormal basis $ \{{\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_n \}$ of $ {\mathbb{C}}^n$ such that $ A {\mathbf x}_i = \lambda_i {\mathbf x}_i$ for $ 1 \leq i \leq n.$ (A numerical illustration is sketched after this exercise list.)

  4. Let $ A$ be a normal matrix. Prove the following:
    1. if all the eigenvalues of $ A$ are $ 0,$ then $ A = {\mathbf 0},$
    2. if all the eigenvalues of $ A$ are $ 1,$ then $ A = I.$
  5. Let $ A$ be an $ n \times n$ matrix. Prove that
    1. if $ A$ is Hermitian and $ {\mathbf x}^* A {\mathbf x}= 0$ for all $ {\mathbf x}\in {\mathbb{C}}^n$ then $ A = {\mathbf 0}.$
    2. if $ A$ is a real, symmetric matrix and $ {\mathbf x}^t A {\mathbf x}= 0$ for all $ {\mathbf x}\in {\mathbb{R}}^n$ then $ A = {\mathbf 0}.$

      Do these results hold for arbitrary matrices?
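
As the numerical illustration promised in Remark 6.3.12 (a sketch, assuming NumPy and SciPy): for a normal matrix the triangular factor in the Schur form is diagonal, so the columns of the unitary factor form an orthonormal eigenbasis of $ {\mathbb{C}}^n.$ The cyclic permutation matrix below is a convenient normal (indeed orthogonal) example.

```python
import numpy as np
from scipy.linalg import schur

# A cyclic permutation matrix: orthogonal, hence normal.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 0, 0]], dtype=complex)
print(np.allclose(A @ A.conj().T, A.conj().T @ A))   # normal

# For a normal matrix the Schur triangular factor is diagonal.
T, U = schur(A, output='complex')
print(np.allclose(T, np.diag(np.diag(T))))           # T is diagonal
for i in range(3):
    # Each column of U is an eigenvector for the diagonal entry of T.
    print(np.allclose(A @ U[:, i], T[i, i] * U[:, i]))
```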

We end this chapter with an application of the theory of diagonalisation to the study of conic sections in analytic geometry and the study of maxima and minima in analysis.
