Diagonalisation

Let $A$ be a square matrix of order $n$ and let $T_A: {\mathbb{F}}^n \longrightarrow {\mathbb{F}}^n$ be the corresponding linear transformation. In this section, we ask the following question: does there exist a basis ${\cal B}$ of ${\mathbb{F}}^n$ such that $T_A[{\cal B},{\cal B}],$ the matrix of the linear transformation $T_A,$ is in the simplest possible form?

The simplest forms for a matrix are the identity matrix and a diagonal matrix. In this section, we show that for a certain class of matrices $A,$ we can find a basis ${\cal B}$ such that $T_A[{\cal B},{\cal B}]$ is a diagonal matrix whose diagonal entries are the eigenvalues of $A.$ This is equivalent to saying that $A$ is similar to a diagonal matrix. To make this precise, we need the following definition.

DEFINITION 6.2.1 (Matrix Diagonalisation)   A matrix $ A$ is said to be diagonalisable if there exists a non-singular matrix $ P$ such that $ P^{-1} A P$ is a diagonal matrix.

Remark 6.2.2   Let $ A$ be an $ n \times n$ diagonalisable matrix with eigenvalues $ {\lambda}_1, {\lambda}_2, \ldots, {\lambda}_n.$ By definition, $ A$ is similar to a diagonal matrix $ D.$ Observe that $ D = {\mbox{diag}}({\lambda}_1, {\lambda}_2, \ldots, {\lambda}_n)$ as similar matrices have the same set of eigenvalues and the eigenvalues of a diagonal matrix are its diagonal entries.
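
The remark can be checked numerically. The following minimal sketch in Python/numpy (an illustration, not part of the original text; the particular $P$ and $D$ are arbitrary choices) builds a matrix $A$ similar to a chosen diagonal matrix $D$ and confirms that the eigenvalues of $A$ are exactly the diagonal entries of $D.$

```python
import numpy as np

# Sketch of Remark 6.2.2 (illustrative; P and D are arbitrary choices):
# build A similar to a diagonal matrix D and confirm that the
# eigenvalues of A are exactly the diagonal entries of D.
D = np.diag([3.0, -1.0, 2.0])          # chosen diagonal matrix
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])        # any non-singular matrix works
A = P @ D @ np.linalg.inv(P)           # A is similar to D by construction

print(np.sort(np.linalg.eigvals(A).real))   # [-1.  2.  3.], the diag of D
```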

EXAMPLE 6.2.3   Let $ A= \left[\begin{array}{cc} 0 & 1 \\ -1 & 0
\end{array}\right].$ Then we have the following:
  1. Let $V = {\mathbb{R}}^2.$ Then $A$ has no real eigenvalue (see Example 6.1.8) and hence $A$ has no eigenvectors in ${\mathbb{R}}^2.$ Therefore, there does not exist any non-singular $2 \times 2$ real matrix $P$ such that $P^{-1} A P$ is a diagonal matrix.
  2. In case $V = {\mathbb{C}}^2 ({\mathbb{C}}),$ the two complex eigenvalues of $A$ are $-i, i$ and the corresponding eigenvectors are $(i, 1)^t$ and $(-i, 1)^t,$ respectively. Also, $(i, 1)^t$ and $(-i, 1)^t$ can be taken as a basis of ${\mathbb{C}}^2 ({\mathbb{C}}).$ Define a $2 \times 2$ complex matrix by $U = \frac{1}{\sqrt{2}}\left[\begin{array}{cc} i & -i \\ 1 & 1 \end{array}\right].$ Then

    $\displaystyle U^* A U = \left[\begin{array}{cc} -i & 0 \\ 0 & i \end{array}\right].$
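
For readers who wish to verify Example 6.2.3 numerically, here is a minimal sketch in Python/numpy (the matrices are those of the example; the code itself is an illustration, not part of the text). It checks that $U^* A U = {\mbox{diag}}(-i, i)$ and that the eigenvalues of $A$ are purely imaginary, so $A$ has no real eigenvalue.

```python
import numpy as np

# Numerical check of Example 6.2.3 (the matrices are from the example).
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
U = (1 / np.sqrt(2)) * np.array([[1j, -1j],
                                 [1.0, 1.0]])   # columns: eigenvectors

# U is unitary, so U^* = U^{-1} and U^* A U is the diagonal of eigenvalues.
print(np.round(U.conj().T @ A @ U, 10))         # diag(-i, i), as claimed

# Over R there is no eigenvalue: both roots of lambda^2 + 1 are imaginary.
print(np.linalg.eigvals(A))                     # [0.+1.j  0.-1.j]
```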

THEOREM 6.2.4   Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalisable if and only if $A$ has $n$ linearly independent eigenvectors.

Proof. Let $A$ be diagonalisable. Then there exist a non-singular matrix $P$ and a diagonal matrix $D$ such that

$\displaystyle P^{-1} A P = D = {\mbox{diag}}({\lambda}_1, {\lambda}_2, \ldots, {\lambda}_n). $

Or equivalently, $ A P = P D.$ Let $ P = [{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n].$ Then $ A P = P D$ implies that

$\displaystyle A {\mathbf u}_i = {\lambda}_i {\mathbf u}_i \;\; {\mbox{ for }} \;\; 1 \leq i \leq n.$

Since the ${\mathbf u}_i$'s are the columns of the non-singular matrix $P,$ they are non-zero, and so for $1 \leq i \leq n$ we get the eigenpairs $({\lambda}_i, {\mathbf u}_i)$ of $A.$ Moreover, by Corollary 4.3.9, the columns ${\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n$ of the non-singular matrix $P$ are linearly independent.

Thus we have shown that if $ A$ is diagonalisable then $ A$ has $ n$ linearly independent eigenvectors.

Conversely, suppose $ A$ has $ n$ linearly independent eigenvectors $ {\mathbf u}_i,
\; 1 \leq i \leq n$ with eigenvalues $ \lambda_i.$ Then $ A {\mathbf u}_i = \lambda_i {\mathbf u}_i.$ Let $ P = [{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n].$ Since $ {\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n$ are linearly independent, by Corollary 4.3.9, $ P$ is non-singular. Also,

$\displaystyle A P = [A {\mathbf u}_1, A {\mathbf u}_2, \ldots, A {\mathbf u}_n] = [{\lambda}_1 {\mathbf u}_1, {\lambda}_2 {\mathbf u}_2, \ldots, {\lambda}_n {\mathbf u}_n] = [{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n] \begin{bmatrix} {\lambda}_1 & 0 & \cdots & 0 \\ 0 & {\lambda}_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & {\lambda}_n \end{bmatrix} = P D.$

Therefore, the matrix $A$ is diagonalisable. $\blacksquare$
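
The construction used in the proof can be mirrored numerically: stack $n$ linearly independent eigenvectors as the columns of $P$ and check that $P^{-1} A P$ is diagonal. A minimal Python/numpy sketch follows, with a $2 \times 2$ matrix chosen here purely for illustration.

```python
import numpy as np

# Sketch of the construction in Theorem 6.2.4 with a 2 x 2 matrix chosen
# here purely for illustration: stack n linearly independent eigenvectors
# as the columns of P; then P^{-1} A P is diagonal.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])             # eigenvalues 5 and 2 (distinct)

eigvals, P = np.linalg.eig(A)          # columns of P are eigenvectors u_i
D = np.linalg.inv(P) @ A @ P           # should equal diag(lambda_i)

print(np.round(D, 10))
print(np.round(np.diag(eigvals), 10))  # the two agree
```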

COROLLARY 6.2.5   Let $A$ be an $n \times n$ matrix. Suppose that all the eigenvalues of $A$ are distinct. Then $A$ is diagonalisable.

Proof. As $A$ is an $n \times n$ matrix, it has $n$ eigenvalues. Since all the eigenvalues of $A$ are distinct, by Corollary 6.1.17 the corresponding $n$ eigenvectors are linearly independent. Hence, by Theorem 6.2.4, $A$ is diagonalisable. $\blacksquare$

COROLLARY 6.2.6   Let $A$ be an $n \times n$ matrix with ${\lambda}_1, {\lambda}_2, \ldots, {\lambda}_k$ as its distinct eigenvalues and $p({\lambda})$ as its characteristic polynomial. Suppose that for each $i, \; 1 \le i \le k,$ $({\lambda} - {\lambda}_i)^{m_i}$ divides $p({\lambda})$ but $({\lambda} - {\lambda}_i)^{m_i+1}$ does not divide $p({\lambda}),$ for some positive integers $m_i$. Then

$\displaystyle A \;{\mbox{ is diagonalisable if and only if }} \; \dim\bigl(\ker(A - {\lambda}_i I)\bigr) = m_i \;{\mbox{ for each }} \; i, \; 1 \le i \le k.$

Or equivalently, $A$ is diagonalisable if and only if ${\mbox{rank}}(A - {\lambda}_i I) = n - m_i$ for each $i, \; 1 \le i \le k.$

Proof. Suppose $A$ is diagonalisable. Then by Theorem 6.2.4, $A$ has $n$ linearly independent eigenvectors. Also, $\sum\limits_{i=1}^k m_i = n$ as $\deg( p({\lambda})) = n$. Hence, for each eigenvalue ${\lambda}_i, \; 1 \le i \le k$, $A$ has exactly $m_i$ linearly independent eigenvectors. Thus, for each $i, \; 1 \le i \le k$, the homogeneous linear system $(A - {\lambda}_i I) {\mathbf x}= {\mathbf 0}$ has exactly $m_i$ linearly independent vectors in its solution set. Therefore, $\dim\bigl(\ker(A - {\lambda}_i I)\bigr) \ge m_i$. Indeed, $\dim\bigl(\ker(A - {\lambda}_i I)\bigr) = m_i$ for $1 \le i \le k$: if some eigenvalue had more than $m_i$ linearly independent eigenvectors, then together with the eigenvectors of the remaining eigenvalues we would obtain more than $n = \sum\limits_{i=1}^k m_i$ linearly independent vectors in ${\mathbb{F}}^n,$ which is impossible.

Now suppose that for each $i, \; 1 \le i \le k, \;\dim\bigl(\ker(A - {\lambda}_i I)\bigr) = m_i$. Then for each $i, \; 1 \le i \le k$, we can choose $m_i$ linearly independent eigenvectors. Also, by Corollary 6.1.17, the eigenvectors corresponding to distinct eigenvalues are linearly independent. Hence $A$ has $n = \sum\limits_{i=1}^k m_i$ linearly independent eigenvectors, and so by Theorem 6.2.4, $A$ is diagonalisable. $\blacksquare$
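
The rank form of Corollary 6.2.6 gives a mechanical test for diagonalisability. Below is a small Python/numpy sketch of this test; the helper function and the two $2 \times 2$ matrices are hypothetical choices for illustration, not from the text.

```python
import numpy as np

# Sketch of the rank test from Corollary 6.2.6; the helper and the two
# 2 x 2 matrices are hypothetical choices for illustration.
def is_diagonalisable(A, eigen_data):
    """eigen_data: list of (eigenvalue, algebraic multiplicity m_i)."""
    n = A.shape[0]
    return all(np.linalg.matrix_rank(A - lam * np.eye(n)) == n - m
               for lam, m in eigen_data)

# diag(2, 2): eigenvalue 2 with m = 2, and rank(A - 2I) = 0 = n - m.
print(is_diagonalisable(np.diag([2.0, 2.0]), [(2.0, 2)]))        # True

# Jordan block [[2, 1], [0, 2]]: rank(A - 2I) = 1, not n - m = 0.
print(is_diagonalisable(np.array([[2.0, 1.0],
                                  [0.0, 2.0]]), [(2.0, 2)]))     # False
```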

EXAMPLE 6.2.7  
  1. Let $A=\left[\begin{array}{ccc}2 & 1 & 1\\ 1 & 2 & 1\\ 0 & -1 & 1 \end{array}\right].$ Then $\det ( A - {\lambda}I) = (2 - {\lambda})^2 (1 - {\lambda}).$ Hence, $A$ has eigenvalues $1, 2, 2.$ It is easily seen that $\bigl(1, (1,0, -1)^t \bigr)$ and $\bigl( 2, (1,1,-1)^t \bigr)$ are the only eigenpairs (up to scalar multiples). That is, the matrix $A$ has only one linearly independent eigenvector corresponding to the repeated eigenvalue $2.$ Hence, by Theorem 6.2.4, the matrix $A$ is not diagonalisable.
  2. Let $A=\left[\begin{array}{ccc}2 & 1 & 1\\ 1 & 2 & 1\\ 1 & 1 & 2 \end{array}\right].$ Then $\det ( A - {\lambda}I) = (4 - {\lambda})(1 - {\lambda})^2.$ Hence, $A$ has eigenvalues $1, 1, 4.$ It can be easily verified that $(1,-1,0)^t$ and $(1,0,-1)^t$ correspond to the eigenvalue $1$ and $(1,1,1)^t$ corresponds to the eigenvalue $4.$ Note that the set $\{ (1, -1, 0)^t, (1, 0, -1)^t \}$ consisting of eigenvectors corresponding to the eigenvalue $1$ is not orthogonal. It can be replaced by the orthogonal set $\{(1,0,-1)^t, (1,-2,1)^t\},$ which still consists of eigenvectors corresponding to the eigenvalue $1,$ as $(1, -2, 1)^t = 2 (1,-1,0)^t - (1,0,-1)^t$. Also, the set $\{(1,1,1)^t, (1,0,-1)^t, (1,-2,1)^t\}$ forms a basis of ${\mathbb{R}}^3.$ So, by Theorem 6.2.4, the matrix $A$ is diagonalisable. Also, if $U = \left[\begin{array}{ccc} \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & -\frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \end{array}\right]$ is the corresponding unitary matrix, then $U^* A U = {\mbox{diag}}(4,1,1).$

    Observe that the matrix $A$ is a symmetric matrix. In this case, the eigenvectors are mutually orthogonal. In general, for any $n \times n$ real symmetric matrix $A,$ there always exist $n$ mutually orthogonal eigenvectors. This result will be proved later. A numerical check of this example is sketched below.
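
Here is a minimal Python/numpy check of this example (an illustration, not part of the text): for a real symmetric matrix, numpy.linalg.eigh returns an orthonormal set of eigenvectors, so the resulting $U$ satisfies $U^t A U = {\mbox{diag}}(1,1,4),$ matching the example up to the ordering of the eigenvalues.

```python
import numpy as np

# Numerical check of Example 6.2.7.2: for a real symmetric matrix,
# numpy.linalg.eigh returns an orthonormal set of eigenvectors, so
# U^t A U is diagonal (eigenvalues come back in ascending order here).
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])

eigvals, U = np.linalg.eigh(A)         # U has orthonormal columns
print(np.round(eigvals, 10))           # [1. 1. 4.]
print(np.round(U.T @ A @ U, 10))       # diag(1, 1, 4)
print(np.round(U.T @ U, 10))           # identity: U is orthogonal
```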

EXERCISE 6.2.8  
  1. By finding the eigenvalues of the following matrices, justify whether or not $ A = P D P^{-1}$ for some real non-singular matrix $ P$ and a real diagonal matrix $ D.$
    $i) \;\; \begin{bmatrix}\cos \theta & \sin \theta \\ - \sin \theta & \cos \theta \end{bmatrix} \;\;\;\; ii) \;\; \begin{bmatrix}\cos \theta & \sin \theta \\ \sin \theta & - \cos \theta \end{bmatrix}$ for any $\theta$ with $0 \leq \theta \leq 2 \pi.$
  2. Are the two matrices $ \begin{bmatrix}2 & 1 \\ -1 & 0 \end{bmatrix}$ and $ \begin{bmatrix}2 & i \\ i & 0 \end{bmatrix}$ diagonalisable?
  3. Find the eigenvalues and eigenvectors of $ A = [a_{ij}]_{n \times n}$ , where $ a_{ij} = a$ if $ i=j$ and $ b$ otherwise.
  4. Let $ A$ be an $ n \times n$ matrix and $ B$ an $ m \times m$ matrix. Suppose $ C = \begin{bmatrix}A & {\mathbf 0}\\ {\mathbf 0}& B
\end{bmatrix}.$ Then show that $ C$ is diagonalisable if and only if both $ A$ and $ B$ are diagonalisable.
  5. Let $ T :
{\mathbb{R}}^5 \longrightarrow {\mathbb{R}}^5$ be a linear transformation with $ {\mbox{rank }}(T - I) = 3$ and

    $\displaystyle {\cal N}(T) = \{ (x_1, x_2, x_3, x_4, x_5) \in {\mathbb{R}}^5
\mid x_1 + x_4 + x_5 = 0, \; x_2 + x_3 = 0 \}.$

    Then
    1. determine the eigenvalues of $T.$
    2. find the number of linearly independent eigenvectors corresponding to each eigenvalue.
    3. is $ T$ diagonalisable? Justify your answer.
  6. Let $ A$ be a non-zero square matrix such that $ A^2 = {\mathbf 0}.$ Show that $ A$ cannot be diagonalised. [Hint: Use Remark 6.2.2.]
  7. Are the following matrices diagonalisable?
    $i) \;\;\begin{bmatrix}1 & 3 & 2 & 1 \\ 0 & 2 & 3 & 1 \\ 0 & 0 & -1 & 1 \\ 0 & 0 & 0 & 4 \end{bmatrix}, \;\;\;\;$ $ii) \;\; \begin{bmatrix}1 & 0 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix},\;\;\;\;$ $iii) \;\;\begin{bmatrix}1 & -3 & 3 \\ 0 & -5 & 6 \\ 0 & -3 & 4 \end{bmatrix}.$
