Introduction and Definitions

In this chapter, the linear transformations are from a given finite dimensional vector space $ V$ to itself. Observe that in this case, the matrix of the linear transformation is a square matrix. So, in this chapter, all the matrices are square matrices and a vector $ {\mathbf x}$ means $ {\mathbf x}=(x_1,x_2,\ldots,x_n)^t$ for some positive integer $ n.$

EXAMPLE 6.1.1   Let $ A$ be a real symmetric matrix. Consider the following problem:

$\displaystyle {\mbox{Maximize (Minimize) }}\; {\mathbf x}^t A {\mathbf x}\;\; {\mbox{ such that }}\;\; {\mathbf x}\in {\mathbb{R}}^n \;{\mbox{ and }}\; {\mathbf x}^t {\mathbf x}= 1.$

To solve this, consider the Lagrangian

$\displaystyle L({\mathbf x}, \lambda) = {\mathbf x}^t A {\mathbf x}- \lambda ({\mathbf x}^t {\mathbf x}- 1) = \sum_{i=1}^n\sum_{j=1}^n a_{ij} x_i x_j - \lambda \bigl(\sum_{i=1}^n x_i^2 - 1\bigr).$

Partially differentiating $ L({\mathbf x}, \lambda)$ with respect to $ x_i$ for $ 1 \leq i \leq n,$ and using the symmetry $ a_{ij} = a_{ji},$ we get

$\displaystyle \frac{\partial L}{\partial x_1} = 2 a_{11} x_1 + 2 a_{12} x_2 + \cdots + 2 a_{1n} x_n - 2 \lambda x_1, $

$\displaystyle \frac{\partial L}{\partial x_2} = 2 a_{21} x_1 + 2 a_{22} x_2 + \cdots + 2 a_{2n} x_n - 2 \lambda x_2, $

and so on, till

$\displaystyle \frac{\partial L}{\partial x_n} = 2 a_{n1} x_1 + 2 a_{n2} x_2 + \cdots + 2 a_{nn} x_n - 2 \lambda x_n. $

Therefore, to get the points of extrema, we solve for

$\displaystyle (0,0,\ldots,0)^t = \Bigl(\frac{\partial L}{\partial x_1}, \frac{\partial L}{\partial x_2}, \ldots, \frac{\partial L}{\partial x_n}\Bigr)^t = \frac{\partial L}{\partial {\mathbf x}} = 2 (A {\mathbf x}- \lambda {\mathbf x}).$

Therefore, to solve the extremal problem, we need to find $ \lambda \in {\mathbb{R}}$ and $ {\mathbf 0}\neq {\mathbf x}\in {\mathbb{R}}^n$ such that $ A {\mathbf x}= \lambda {\mathbf x}.$
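As a numerical aside (not part of the original text), the following Python (numpy) sketch illustrates the conclusion of Example 6.1.1 for a sample symmetric matrix; the particular $ A$ is an assumption made purely for illustration. The values of $ {\mathbf x}^t A {\mathbf x}$ on the unit sphere always lie between the smallest and the largest eigenvalue of $ A,$ and the extrema are attained at eigenvectors.

    import numpy as np

    # A sample real symmetric matrix (an assumption for illustration).
    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    # For symmetric matrices, eigh returns real eigenvalues in ascending
    # order and orthonormal eigenvectors as columns; here they are 1 and 3.
    eigenvalues, eigenvectors = np.linalg.eigh(A)

    # x^t A x at random unit vectors stays between the extreme eigenvalues.
    rng = np.random.default_rng(0)
    for _ in range(5):
        x = rng.standard_normal(2)
        x /= np.linalg.norm(x)          # enforce the constraint x^t x = 1
        assert eigenvalues[0] - 1e-12 <= x @ A @ x <= eigenvalues[-1] + 1e-12

    # The extrema themselves are attained at the eigenvectors.
    print(eigenvectors[:, 0] @ A @ eigenvectors[:, 0])    # 1.0 (minimum)
    print(eigenvectors[:, -1] @ A @ eigenvectors[:, -1])  # 3.0 (maximum)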

EXAMPLE 6.1.2   Consider a system of $ n$ ordinary differential equations of the form

$\displaystyle \frac{d \; {\mathbf y}(t)}{d t} = A {\mathbf y}, \; t \geq 0;$ (6.1.1)

where $ A$ is a real $ n \times n$ matrix and $ {\mathbf y}$ is a column vector.
To get a solution, let us assume that

$\displaystyle {\mathbf y}(t) = {\mathbf c}e^{ {\lambda}t}$ (6.1.2)

is a solution of (6.1.1) and look into what $ {\lambda}$ and $ {\mathbf c}$ have to satisfy, i.e., we are looking for a necessary condition on $ {\lambda}$ and $ {\mathbf c}$ so that (6.1.2) is a solution of (6.1.1). Note here that (6.1.1) has the zero solution, namely $ {\mathbf y}(t) \equiv {\mathbf 0},$ and so we are looking for a non-zero $ {\mathbf c}.$ Differentiating (6.1.2) with respect to $ t$ and substituting in (6.1.1) leads to

$\displaystyle {\lambda}e^{{\lambda}t} {\mathbf c}= A e^{{\lambda}t} {\mathbf c}\;\; {\mbox{or equivalently}} \;\; (A - {\lambda}I) {\mathbf c}= {\mathbf 0}.$ (6.1.3)

So, (6.1.2) is a solution of the given system of differential equations if and only if $ {\lambda}$ and $ {\mathbf c}$ satisfy (6.1.3). That is, given an $ n \times n$ matrix $ A,$ we are thus led to find a pair $ ({\lambda}, {\mathbf c})$ such that $ {\mathbf c}\neq {\mathbf 0}$ and (6.1.3) is satisfied.
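The following Python (numpy) sketch, not in the original text, checks condition (6.1.3) for a sample matrix (an assumption made here) and verifies that $ {\mathbf y}(t) = {\mathbf c}e^{{\lambda}t}$ built from an eigenpair satisfies the system (6.1.1).

    import numpy as np

    # A sample 2x2 system y' = A y; this A (eigenvalues -1 and -2) is
    # an assumption chosen for illustration.
    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])

    # Each eigenpair (lam, c) of A yields a solution y(t) = c * exp(lam * t).
    lams, C = np.linalg.eig(A)
    lam, c = lams[0], C[:, 0]

    # Condition (6.1.3): (A - lam I) c = 0 with c nonzero.
    print(np.allclose((A - lam * np.eye(2)) @ c, 0))   # True

    # Verify d/dt [c e^{lam t}] = A [c e^{lam t}] at a sample time t.
    t = 0.7
    y = c * np.exp(lam * t)
    print(np.allclose(lam * y, A @ y))                 # True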

Let $ A$ be a matrix of order $ n.$ In general, we ask the question:
For what values of $ {\lambda}\in {\mathbb{F}}$ does there exist a non-zero vector $ {\mathbf x}\in {\mathbb{F}}^n$ such that

$\displaystyle A {\mathbf x}= \lambda {\mathbf x}?$ (6.1.4)

Here, $ {\mathbb{F}}^n$ stands for either the vector space $ {\mathbb{R}}^n$ over $ {\mathbb{R}}$ or $ {\mathbb{C}}^n$ over $ {\mathbb{C}}.$ Equation (6.1.4) is equivalent to the equation

$\displaystyle (A - \lambda I) {\mathbf x}= {\mathbf 0}.$

By Theorem 2.5.1, this system of linear equations has a non-zero solution if and only if

$\displaystyle {\mbox{rank }} (A - \lambda I) < n, \;\; {\mbox{ or equivalently }} \;\; \det(A - {\lambda}I) = 0.$

So, to solve (6.1.4), we are forced to choose those values of $ \lambda \in {\mathbb{F}}$ for which $ \det (A - \lambda I) = 0.$ Observe that $ \det(A - {\lambda}I)$ is a polynomial in $ {\lambda}$ of degree $ n.$ We are therefore led to the following definition.

DEFINITION 6.1.3 (Characteristic Polynomial)   Let $ A$ be a matrix of order $ n.$ The polynomial $ \det(A - {\lambda}I)$ is called the characteristic polynomial of $ A$ and is denoted by $ p({\lambda}).$ The equation $ p({\lambda}) = 0$ is called the characteristic equation of $ A.$ If $ {\lambda}\in {\mathbb{F}}$ is a solution of the characteristic equation $ p({\lambda}) = 0,$ then $ {\lambda}$ is called a characteristic value of $ A.$

Some books use the term EIGENVALUE in place of characteristic value.

THEOREM 6.1.4   Let $ A= [a_{ij}]; \; a_{ij} \in {\mathbb{F}}, \; {\mbox{ for }} 1 \leq i, j \leq n.$ Suppose $ \lambda = \lambda_0 \in {\mathbb{F}}$ is a root of the characteristic equation. Then there exists a non-zero $ {\mathbf v}\in {{\mathbb{F}}}^n$ such that $ A {\mathbf v}= \lambda_0 {\mathbf v}.$

Proof. Since $ \lambda_0$ is a root of the characteristic equation, $ \det (A - \lambda_0 I) = 0.$ This shows that the matrix $ A - \lambda_0 I$ is singular and therefore by Theorem 2.5.1 the linear system

$\displaystyle (A - \lambda_0 I_n) {\mathbf x}= {\mathbf 0}$

has a non-zero solution. $\blacksquare$

Remark 6.1.5   Observe that the linear system $ A {\mathbf x}= {\lambda}{\mathbf x}$ has a solution $ {\mathbf x}={\mathbf 0}$ for every $ {\lambda}\in {\mathbb{F}}.$ So, we consider only those $ {\mathbf x}\in {\mathbb{F}}^n$ that are non-zero and are solutions of the linear system $ A {\mathbf x}= {\lambda}{\mathbf x}.$

DEFINITION 6.1.6 (Eigenvalue and Eigenvector)   If the linear system $ A {\mathbf x}= {\lambda}{\mathbf x}$ has a non-zero solution $ {\mathbf x}\in {\mathbb{F}}^n$ for some $ {\lambda}\in {\mathbb{F}},$ then
  1. $ \lambda \in {\mathbb{F}}$ is called an eigenvalue of $ A,$
  2. $ {\mathbf 0}\neq {\mathbf x}\in {\mathbb{F}}^n$ is called an eigenvector corresponding to the eigenvalue $ {\lambda}$ of $ A,$ and
  3. the tuple $ (\lambda, {\mathbf x})$ is called an eigenpair.
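Numerically, eigenpairs can be computed and checked against this definition directly. A minimal Python (numpy) sketch, with the matrix an assumption made for illustration (it reappears as the fourth matrix in Example 6.1.9 below):

    import numpy as np

    # A sample matrix (an assumption for illustration).
    A = np.array([[1.0, 2.0],
                  [2.0, 1.0]])

    # eig returns the eigenvalues and a matrix whose columns are eigenvectors.
    eigenvalues, eigenvectors = np.linalg.eig(A)

    # Each pair (lambda, x) satisfies A x = lambda x with x nonzero.
    for lam, x in zip(eigenvalues, eigenvectors.T):
        print(lam, np.allclose(A @ x, lam * x))        # eigenvalue, True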

Remark 6.1.7   To understand the difference between a characteristic value and an eigenvalue, we give the following example.

Consider the matrix $ A = \begin{bmatrix}0 & 1 \\ -1 & 0 \end{bmatrix}.$ Then the characteristic polynomial of $ A$ is

$\displaystyle p({\lambda}) = {\lambda}^2 + 1.$

Given the matrix $ A,$ recall the linear transformation $ T_A: {\mathbb{F}}^2 {\longrightarrow}{\mathbb{F}}^2$ defined by

$\displaystyle T_A({\mathbf x}) = A {\mathbf x}\;\; {\mbox{ for every }} \;\; {\mathbf x}\in {\mathbb{F}}^2.$

  1. If $ \; {\mathbb{F}}= {\mathbb{C}},$ that is, if $ A$ is considered a COMPLEX matrix, then the roots of $ p({\lambda}) = 0$ in $ {\mathbb{C}}$ are $ \pm i.$ So, $ A$ has $ (i, (1,i)^t)$ and $ (-i, (i, 1)^t)$ as eigenpairs.
  2. If $ \; {\mathbb{F}}= {\mathbb{R}},$ that is, if $ A$ is considered a REAL matrix, then $ p({\lambda}) = 0$ has no solution in $ {\mathbb{R}}.$ Therefore, if $ {\mathbb{F}}= {\mathbb{R}},$ then $ A$ has no eigenvalue but it has $ \pm i$ as characteristic values.

Remark 6.1.8   Note that if $ (\lambda, {\mathbf x})$ is an eigenpair for an $ n \times n$ matrix $ A,$ then for any non-zero $ c \in {\mathbb{F}},$ the pair $ (\lambda, c {\mathbf x})$ is also an eigenpair for $ A.$ Similarly, if $ {\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_r$ are eigenvectors of $ A$ corresponding to the eigenvalue $ {\lambda},$ then for any non-zero $ (c_1, c_2, \ldots, c_r) \in {\mathbb{F}}^r$ with $ \sum\limits_{i=1}^r c_i {\mathbf x}_i \neq {\mathbf 0}$ , the vector $ \sum\limits_{i=1}^r c_i {\mathbf x}_i$ is also an eigenvector of $ A$ corresponding to the eigenvalue $ {\lambda}.$ Hence, when we talk of eigenvectors corresponding to an eigenvalue $ \lambda,$ we mean LINEARLY INDEPENDENT EIGENVECTORS.

Suppose $ {\lambda}_0 \in {\mathbb{F}}$ is a root of the characteristic equation $ \det(A - {\lambda}_0 I) = 0.$ Then $ A - {\lambda}_0 I$ is singular and $ {\mbox{rank }}(A - {\lambda}_0 I) < n.$ Suppose $ {\mbox{rank }}(A - {\lambda}_0 I)=r < n.$ Then by Corollary 4.3.9, the linear system $ (A - {\lambda}_0 I) {\mathbf x}= {\mathbf 0}$ has $ n-r$ linearly independent solutions. That is, $ A$ has $ n-r$ linearly independent eigenvectors corresponding to the eigenvalue $ {\lambda}_0$ whenever $ {\mbox{rank }}(A - {\lambda}_0 I)=r < n.$
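This observation can be checked numerically: the eigenvectors for $ {\lambda}_0$ are the non-zero vectors of the null space of $ A - {\lambda}_0 I,$ which has dimension $ n-r.$ A Python (numpy) sketch follows, with the matrix and the rank tolerance being assumptions made for illustration.

    import numpy as np

    # A sample matrix (an assumption) with the repeated eigenvalue 2.
    A = np.array([[2.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])
    lam0 = 2.0
    M = A - lam0 * np.eye(3)

    # A basis of the null space of M can be read off its SVD: the rows
    # of Vt beyond the numerical rank span the null space of M.
    U, s, Vt = np.linalg.svd(M)
    r = int(np.sum(s > 1e-10))      # rank of A - lam0 I; here r = 1
    null_basis = Vt[r:].T           # n - r = 2 independent eigenvectors

    print(r, null_basis.shape[1])                            # 1 2
    print(np.allclose(A @ null_basis, lam0 * null_basis))    # True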

EXAMPLE 6.1.9  
  1. Let $ A = {\mbox{diag}}(d_1, d_2, \ldots, d_n)$ with $ d_i \in {\mathbb{R}}$ for $ 1 \leq i \leq n.$ Then $ p({\lambda}) = \prod_{i=1}^n (d_i - {\lambda})$ is the characteristic polynomial. So, the eigenpairs are

    $\displaystyle (d_1, (1,0,\ldots, 0)^t),\; (d_2, (0,1,0,\ldots,0)^t),\; \ldots,\; (d_n, (0,\ldots, 0,1)^t).$

  2. Let $ A = \begin{bmatrix}1 & 1 \\ 0 & 1 \end{bmatrix}.$ Then $ \det (A - \lambda I_2) = (1-\lambda)^2.$ Hence, the characteristic equation has roots $ 1, 1.$ That is, $ 1$ is a repeated eigenvalue. Now check that the equation $ (A - I_2) {\mathbf x}= {\mathbf 0}$ for $ {\mathbf x}= (x_1, x_2)^{t}$ is equivalent to the equation $ x_2 = 0.$ And this has the solution $ {\mathbf x}= (x_1, 0)^{t}.$ Hence, from the above remark, $ (1, 0)^{t}$ is a representative for the eigenvector. Therefore, HERE WE HAVE TWO EIGENVALUES $ 1, 1$ BUT ONLY ONE EIGENVECTOR (this is checked numerically in the sketch following this example).
  3. Let $ A = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix}.$ Then $ \det (A - \lambda I_2) = (1-\lambda)^2.$ The characteristic equation has roots $ 1, 1.$ Here, the matrix that we have is $ I_2$ and we know that $ I_2 {\mathbf x}= {\mathbf x}$ for every $ {\mathbf x}\in {\mathbb{R}}^2.$ So, we can CHOOSE ANY TWO LINEARLY INDEPENDENT VECTORS $ {\mathbf x}, {\mathbf y}$ from $ {\mathbb{R}}^2$ to get $ (1, {\mathbf x})$ and $ (1, {\mathbf y})$ as the two eigenpairs.

    In general, if $ {\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_n$ are linearly independent vectors in $ {\mathbb{R}}^n,$ then $ (1, {\mathbf x}_1), \; (1, {\mathbf x}_2), \; \ldots, (1, {\mathbf x}_n)$ are eigenpairs for the identity matrix, $ I_n.$

  4. Let $ A = \begin{bmatrix}1 & 2 \\ 2 & 1 \end{bmatrix}.$ Then $ \det (A - \lambda I_2) = (\lambda- 3)(\lambda + 1).$ The characteristic equation has roots $ 3, -1.$ Now check that the eigenpairs are $ (3, (1,1)^{t}), $ and $ (-1, (1, -1)^{t}).$ In this case, we have TWO DISTINCT EIGENVALUES AND THE CORRESPONDING EIGENVECTORS ARE ALSO LINEARLY INDEPENDENT. The reader is required to prove the linear independence of the two eigenvectors.
  5. Let $ A = \begin{bmatrix}1 & -1 \\ 1 & 1 \end{bmatrix}.$ Then $ \det (A - \lambda I_2) = \lambda^2 - 2 \lambda + 2.$ The characteristic equation has roots $ 1 + i, 1 - i.$ Hence, over $ {\mathbb{R}},$ the matrix $ A$ has no eigenvalue. Over $ {\mathbb{C}},$ the reader is required to show that the eigenpairs are $ (1+i, (i,1)^t)$ and $ (1-i, (1,i)^t).$
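The second and fourth examples above can be verified with a short Python (numpy) sketch (not part of the original text): the matrix of eigenvector columns has rank one in the repeated-eigenvalue case but full rank in the distinct-eigenvalue case.

    import numpy as np

    # Example 2: repeated eigenvalue 1 with only one independent eigenvector.
    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    vals, vecs = np.linalg.eig(A)
    print(vals)                              # [1. 1.]
    print(np.linalg.matrix_rank(vecs))       # 1: the two columns coincide

    # Example 4: distinct eigenvalues 3 and -1, independent eigenvectors.
    B = np.array([[1.0, 2.0],
                  [2.0, 1.0]])
    vals, vecs = np.linalg.eig(B)
    print(np.sort(vals))                     # [-1.  3.]
    print(np.linalg.matrix_rank(vecs))       # 2: eigenvectors independent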

EXERCISE 6.1.10  
  1. Find the eigenvalues of a triangular matrix.
  2. Find eigenpairs over $ {\mathbb{C}},$ for each of the following matrices:
    $ \begin{bmatrix}1 & 0 \\ 0 & 0 \end{bmatrix}, \hspace{0.1in} \ldots, \hspace{0.1in} \begin{bmatrix}\cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{bmatrix},$ and $ \;\;\begin{bmatrix}\cos \theta & \sin \theta \\ \sin \theta & - \cos \theta \end{bmatrix}.$
  3. Let $ A$ and $ B$ be similar matrices.
    1. Then prove that $ A$ and $ B$ have the same set of eigenvalues.
    2. Let $ ({\lambda}, {\mathbf x})$ be an eigenpair for $ A$ and $ ({\lambda}, {\mathbf y})$ be an eigenpair for $ B.$ What is the relationship between the vectors $ {\mathbf x}$ and $ {\mathbf y}$ ?

      [Hint: Recall that if the matrices $ A$ and $ B$ are similar, then there exists a non-singular matrix $ P$ such that $ B = P A P^{-1}.$ ]

  4. Let $ A=(a_{ij})$ be an $ n \times n$ matrix. Suppose that for all $ i, \; 1 \leq i \leq n, \; \sum\limits_{j=1}^n a_{ij} = a.$ Then prove that $ a$ is an eigenvalue of $ A.$ What is the corresponding eigenvector?
  5. Prove that the matrices $ A$ and $ A^t$ have the same set of eigenvalues. Construct a $ 2 \times 2$ matrix $ A$ such that the eigenvectors of $ A$ and $ A^t$ are different.
  6. Let $ A$ be a matrix such that $ A^2 = A$ ($ A$ is called an idempotent matrix). Then prove that each of its eigenvalues is either 0 or $ 1.$
  7. Let $ A$ be a matrix such that $ A^k = {\mathbf 0}$ for some positive integer $ k$ ($ A$ is called a nilpotent matrix). Then prove that its eigenvalues are all 0.

THEOREM 6.1.11   Let $ A=[a_{ij}]$ be an $ n \times n$ matrix with eigenvalues $ \lambda_1, \lambda_2, \ldots, \lambda_n,$ not necessarily distinct. Then $ \det (A) = \prod\limits_{i=1}^n \lambda_i$ and $ {\mbox{tr}}(A) = \sum\limits_{i=1}^n a_{ii} = \sum\limits_{i=1}^n \lambda_i.$

Proof. Since $ \lambda_1, \lambda_2, \ldots, \lambda_n$ are the $ n$ eigenvalues of $ A,$ by definition,

$\displaystyle \det (A - \lambda I_n) = p({\lambda}) = (-1)^n (\lambda - \lambda_1) (\lambda - \lambda_2) \cdots (\lambda - \lambda_n).$ (6.1.5)

Equation (6.1.5) is an identity of polynomials in $ {\lambda}.$ Therefore, by substituting $ \lambda = 0$ in (6.1.5), we get

$\displaystyle \det (A) = (-1)^n (-1)^n \prod_{i=1}^n \lambda_i = \prod_{i=1}^n \lambda_i.$

Also,

$\displaystyle \det (A - \lambda I_n) = \det \begin{bmatrix}a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{bmatrix}$ (6.1.6)

$\displaystyle = a_0 - \lambda a_1 + \lambda^2 a_2 + \cdots + (-1)^{n-1} {\lambda}^{n-1} a_{n-1} + (-1)^n \lambda^n$ (6.1.7)

for some $ a_0, a_1, \ldots, a_{n-1} \in {\mathbb{F}}.$ Note that $ a_{n-1},$ the coefficient of $ (-1)^{n-1} {\lambda}^{n-1},$ comes from the product

$\displaystyle (a_{11} - {\lambda})(a_{22} - {\lambda}) \cdots (a_{nn} - {\lambda}).$

So, $ a_{n-1} = \sum\limits_{i=1}^n a_{ii} = {\mbox{tr}}(A)$ by definition of trace.

But, from (6.1.5) and (6.1.7), we get

$\displaystyle a_0 - \lambda a_1 + \lambda^2 a_2 + \cdots + (-1)^{n-1} {\lambda}^{n-1} a_{n-1} + (-1)^n \lambda^n = (-1)^n (\lambda - \lambda_1) (\lambda - \lambda_2) \cdots (\lambda - \lambda_n).$ (6.1.8)

Therefore, comparing the coefficient of $ (-1)^{n-1} \lambda^{n-1},$ we have

$\displaystyle {\mbox{tr}}(A)= a_{n-1} = (-1) \{ (-1)\sum\limits_{i=1}^n \lambda_i \} = \sum\limits_{i=1}^n \lambda_i.$

Hence, we get the required result. $\blacksquare$
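The theorem is easy to check numerically; in the following Python (numpy) sketch the matrix is an assumption made for illustration.

    import numpy as np

    # A sample matrix (an assumption); its eigenvalues are 1 and 5.
    A = np.array([[2.0, 3.0],
                  [1.0, 4.0]])
    vals = np.linalg.eigvals(A)

    # det(A) is the product and tr(A) the sum of the eigenvalues.
    print(np.isclose(np.linalg.det(A), np.prod(vals)))   # True
    print(np.isclose(np.trace(A), np.sum(vals)))         # True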

EXERCISE 6.1.12  
  1. Let $ A$ be a skew symmetric matrix of order $ 2n + 1.$ Then prove that 0 is an eigenvalue of $ A.$
  2. Let $ A$ be a $ 3 \times 3$ orthogonal matrix (i.e., $ A A^t = I$). If $ \det(A) = 1$, then prove that there exists a non-zero vector $ {\mathbf v}\in {\mathbb{R}}^3$ such that $ A {\mathbf v}= {\mathbf v}.$

Let $ A$ be an $ n \times n$ matrix. Then in the proof of the above theorem, we observed that the characteristic equation $ \det(A - {\lambda}I) = 0$ is a polynomial equation of degree $ n$ in $ {\lambda}.$ Also, for some numbers $ a_0, a_1, \ldots, a_{n-1} \in {\mathbb{F}},$ it has the form

$\displaystyle {\lambda}^n + a_{n-1} {\lambda}^{n-1} + a_{n-2} {\lambda}^{n-2} + \cdots + a_1 {\lambda}+ a_0 = 0.$

Note that, in the expression $ \det(A - {\lambda}I)= 0, \;\; {\lambda}$ is an element of $ {\mathbb{F}}.$ Thus, we can only substitute $ {\lambda}$ by elements of $ {\mathbb{F}}.$

It turns out that the expression

$\displaystyle A^n + a_{n-1} A^{n-1} + a_{n-2} A^{n-2} + \cdots + a_1 A + a_0 I = {\mathbf 0}$

holds true as a matrix identity. This is a celebrated theorem called the Cayley Hamilton Theorem. We state this theorem without proof and give some implications.

THEOREM 6.1.13 (Cayley Hamilton Theorem)   Let $ A$ be a square matrix of order $ n.$ Then $ A$ satisfies its characteristic equation. That is,

$\displaystyle A^n + a_{n-1} A^{n-1} + a_{n-2} A^{n-2} + \cdots + a_1 A + a_0 I = {\mathbf 0}$

holds true as a matrix identity.
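A numerical verification, not part of the original text: the Python (numpy) sketch below evaluates the characteristic polynomial at $ A$ itself and obtains the zero matrix. Note that np.poly returns the coefficients of $ \det({\lambda}I - A),$ leading coefficient first; the sample matrix is an assumption.

    import numpy as np

    # A sample matrix (an assumption for illustration).
    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

    # Coefficients of det(lambda I - A); here p(lambda) = lambda^2 - 5*lambda - 2.
    coeffs = np.poly(A)

    # Evaluate p(A) as a matrix polynomial using Horner's rule.
    n = A.shape[0]
    p_of_A = np.zeros_like(A)
    for c in coeffs:
        p_of_A = p_of_A @ A + c * np.eye(n)

    print(np.allclose(p_of_A, 0))            # True: A satisfies p(A) = 0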

Some of the implications of the Cayley Hamilton Theorem are as follows.

Remark 6.1.14  
  1. Let $ A = \left[\begin{array}{cc}0&1 \\ 0 & 0 \end{array}\right].$ Then its characteristic polynomial is $ p({\lambda}) = {\lambda}^2.$ Also, for the function $ f(x) = x,$ we have $ f(0) = 0$ and $ f(A) = A \neq {\mathbf 0}.$ This shows that the condition $ f({\lambda}) = 0$ for each eigenvalue $ {\lambda}$ of $ A$ does not imply that $ f(A) = {\mathbf 0}.$
  2. Suppose we are given a square matrix $ A$ of order $ n$ and we are interested in calculating $ A^{\ell}$ where $ \ell$ is large compared to $ n.$ Then we can use the division algorithm to find numbers $ {\alpha}_0, {\alpha}_1, \ldots, {\alpha}_{n-1}$ and a polynomial $ f({\lambda})$ such that

    $\displaystyle {\lambda}^{\ell} = f({\lambda}) \bigl( {\lambda}^n + a_{n-1} {\lambda}^{n-1} + a_{n-2} {\lambda}^{n-2} + \cdots + a_1 {\lambda}+ a_0 \bigr) + {\alpha}_0 + {\lambda}{\alpha}_1 + \cdots + {\lambda}^{n-1} {\alpha}_{n-1}.$

    Hence, by the Cayley Hamilton Theorem,

    $\displaystyle A^{\ell} = {\alpha}_0 I + {\alpha}_1 A + \cdots + {\alpha}_{n-1} A^{n-1}.$

    That is, we just need to compute the powers of $ A$ up to $ A^{n-1}.$

    In the language of graph theory, it says the following:
    ``Let $ G$ be a graph on $ n$ vertices. Suppose there is no path of length $ n-1$ or less from a vertex $ v$ to a vertex $ u$ of $ G.$ Then there is no path from $ v$ to $ u$ of any length. That is, the graph $ G$ is disconnected and $ v$ and $ u$ are in different components."

  3. Let $ A$ be a non-singular matrix of order $ n.$ Then note that $ a_0 = (-1)^n \det (A) \neq 0$ and

    $\displaystyle A^{-1} = \frac{-1}{a_0} [ A^{n-1} + a_{n-1} A^{n-2} + \cdots + a_1 I ].$

    This matrix identity can be used to calculate the inverse; a numerical sketch follows this remark.
    Note that $ A^{-1}$ (as an element of the vector space of all $ n \times n$ matrices) is a linear combination of the matrices $ I, A, \ldots, A^{n-1}.$
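As a sketch of the third item (with the first matrix of Exercise 6.1.15 below used as an illustrative assumption), write the monic characteristic polynomial as $ {\lambda}^n + c_1 {\lambda}^{n-1} + \cdots + c_n;$ then the Cayley Hamilton Theorem gives $ A^{-1} = \frac{-1}{c_n}\bigl(A^{n-1} + c_1 A^{n-2} + \cdots + c_{n-1} I\bigr).$ In Python (numpy):

    import numpy as np

    # The first matrix of Exercise 6.1.15, used here as an illustration.
    A = np.array([[2.0, 3.0, 4.0],
                  [5.0, 6.0, 7.0],
                  [1.0, 1.0, 2.0]])
    n = A.shape[0]

    # np.poly(A) = [1, c1, ..., cn] with p(lambda) = lambda^n + c1*lambda^{n-1}
    # + ... + cn, where cn = (-1)^n det(A) is nonzero since A is invertible.
    c = np.poly(A)

    # Horner's rule on 1, c1, ..., c_{n-1} builds
    # A^{n-1} + c1 A^{n-2} + ... + c_{n-1} I.
    inv = np.zeros_like(A)
    for coeff in c[:-1]:
        inv = inv @ A + coeff * np.eye(n)
    inv = -inv / c[-1]

    print(np.allclose(inv @ A, np.eye(n)))   # True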

EXERCISE 6.1.15   Find the inverse of the following matrices by using the Cayley Hamilton Theorem:

$\displaystyle i)\;\; \begin{bmatrix}2&3&4 \\ 5&6&7\\ 1&1&2 \end{bmatrix}, \;\;\; ii)\;\; \ldots, \;\;\; iii)\;\; \begin{bmatrix} 1 & -2 & -1 \\ -2 & 1 & -1 \\ 0 & -1 & 2 \end{bmatrix}.$

THEOREM 6.1.16   If $ {\lambda}_1, {\lambda}_2, \ldots, {\lambda}_k$ are distinct eigenvalues of a matrix $ A$ with corresponding eigenvectors $ {\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_k,$ then the set $ \{{\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_k\}$ is linearly independent.

Proof. The proof is by induction on the number $ m$ of eigenvalues. The result is obviously true if $ m = 1$ as the corresponding eigenvector is non-zero and we know that any set containing exactly one non-zero vector is linearly independent.

Let the result be true for $ m, \; 1 \leq m < k.$ We prove the result for $ m+1.$ We consider the equation

$\displaystyle c_1 {\mathbf x}_1 + c_2 {\mathbf x}_2 + \cdots + c_{m+1} {\mathbf x}_{m+1} = {\mathbf 0}$ (6.1.9)

for the unknowns $ c_1, c_2, \ldots, c_{m+1}.$ We have
$\displaystyle {\mathbf 0}= A {\mathbf 0}= A ( c_1 {\mathbf x}_1 + c_2 {\mathbf x}_2 + \cdots + c_{m+1} {\mathbf x}_{m+1} ) = c_1 A {\mathbf x}_1 + c_2 A {\mathbf x}_2 + \cdots + c_{m+1} A {\mathbf x}_{m+1} = c_1 \lambda_1 {\mathbf x}_1 + c_2 \lambda_2 {\mathbf x}_2 + \cdots + c_{m+1} \lambda_{m+1} {\mathbf x}_{m+1}.$ (6.1.10)

Multiplying Equation (6.1.9) by $ \lambda_1$ and subtracting it from Equation (6.1.10), we get

$\displaystyle c_2 (\lambda_2 - \lambda_1 ) {\mathbf x}_2 + c_3 (\lambda_3 - \lambda_1) {\mathbf x}_3 + \cdots + c_{m+1} (\lambda_{m+1} - \lambda_1) {\mathbf x}_{m+1} = {\mathbf 0}.$

This is an equation involving the $ m$ eigenvectors $ {\mathbf x}_2, \ldots, {\mathbf x}_{m+1}$ corresponding to the distinct eigenvalues $ \lambda_2, \ldots, \lambda_{m+1}.$ So, by the induction hypothesis, we have

$\displaystyle c_i (\lambda_i - \lambda_1) = 0 \;\; {\mbox{ for }} \;\; 2 \leq i \leq m+1.$

But the eigenvalues are distinct, so $ \lambda_i - {\lambda}_1 \neq 0$ for $ 2 \leq i \leq m+1.$ We therefore get $ c_i = 0$ for $ 2 \leq i \leq m+1.$ Also, $ {\mathbf x}_1 \neq {\mathbf 0},$ and therefore (6.1.9) gives $ c_1 = 0.$

Thus, we have the required result. $\blacksquare$

We are thus led to the following important corollary.

COROLLARY 6.1.17   The eigenvectors corresponding to distinct eigenvalues of an $ n \times n$ matrix $ A$ are linearly independent.
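A numerical check of the corollary, with the matrix an assumption made for illustration: the matrix whose columns are the eigenvectors has full rank, so the eigenvectors are linearly independent.

    import numpy as np

    # A sample matrix with distinct eigenvalues 1, 3 and 6 (an assumption).
    A = np.array([[1.0, 2.0, 0.0],
                  [0.0, 3.0, 1.0],
                  [0.0, 0.0, 6.0]])

    vals, vecs = np.linalg.eig(A)
    print(np.sort(vals))                          # [1. 3. 6.]
    print(np.linalg.matrix_rank(vecs) == 3)       # True: independent columns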

EXERCISE 6.1.18  
  1. For an $ n \times n$ matrix $ A,$ prove the following.
    1. $ A$ and $ A^{t}$ have the same set of eigenvalues.
    2. If $ {\lambda}$ is an eigenvalue of an invertible matrix $ A$ then $ \displaystyle\frac{1}{{\lambda}}$ is an eigenvalue of $ A^{-1}.$
    3. If $ {\lambda}$ is an eigenvalue of $ A$ then $ {\lambda}^{k}$ is an eigenvalue of $ A^k$ for any positive integer $ k.$
    4. If $ A$ and $ B$ are $ n \times n$ matrices with $ A$ nonsingular then $ B A^{-1}$ and $ A^{-1} B$ have the same set of eigenvalues.

      In each case, what can you say about the eigenvectors?

  2. Let $ A$ and $ B$ be $ 2 \times 2$ matrices for which $ \det(A) = \det(B)$ and $ {\mbox{tr}}(A) = {\mbox{tr}}(B).$
    1. Do $ A$ and $ B$ have the same set of eigenvalues?
    2. Give examples to show that the matrices $ A$ and $ B$ need not be similar.
  3. Let $ ({\lambda}_1, {\mathbf u})$ be an eigenpair for a matrix $ A$ and let $ ({\lambda}_2, {\mathbf u})$ be an eigenpair for another matrix $ B.$
    1. Then prove that $ ({\lambda}_1+{\lambda}_2, {\mathbf u})$ is an eigenpair for the matrix $ A+B.$
    2. Give an example to show that if $ {\lambda}_1, {\lambda}_2$ are respectively the eigenvalues of $ A$ and $ B,$ then $ {\lambda}_1+{\lambda}_2$ need not be an eigenvalue of $ A+B.$
  4. Let $ {\lambda}_i, 1 \leq i \leq n$ be distinct non-zero eigenvalues of an $ n \times n$ matrix $ A.$ Let $ {\mathbf u}_i, 1 \leq i \leq n$ be the corresponding eigenvectors. Then show that $ {\cal {B}} = \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n \}$ forms a basis of $ {\mathbb{F}}^n ({\mathbb{F}}).$ If $ [{\mathbf b}]_{{\cal B}} = (c_1, c_2, \ldots, c_n)^t$ then show that $ A {\mathbf x}= {\mathbf b}$ has the unique solution

    $\displaystyle {\mathbf x}= \frac{c_1}{{\lambda}_1} {\mathbf u}_1 + \frac{c_2}{{\lambda}_2} {\mathbf u}_2 + \cdots + \frac{c_n}{{\lambda}_n} {\mathbf u}_n.$

A K Lal 2007-09-12