Introduction and Definitions

In this chapter, the linear transformations are from a given finite dimensional vector space $ V$ to itself. Observe that in this case, the matrix of the linear transformation is a square matrix. So, in this chapter, all the matrices are square matrices and a vector $ {\mathbf x}$ means $ {\mathbf x}=(x_1,x_2,\ldots,x_n)^t$ for some positive integer $ n.$

EXAMPLE 6.1.1   Let $ A$ be a real symmetric matrix. Consider the following problem:

$\displaystyle {\mbox{Maximize (Minimize) }}\; {\mathbf x}^t A {\mathbf x}\;\; {\mbox{ such that }}\;\; {\mathbf x}\in {\mathbb{R}}^n \;{\mbox{ and }}\; {\mathbf x}^t {\mathbf x}= 1.$

To solve this, consider the Lagrangian

$\displaystyle L({\mathbf x}, \lambda) = {\mathbf x}^t A {\mathbf x}- \lambda ({\mathbf x}^t {\mathbf x}- 1) = \sum_{i=1}^n\sum_{j=1}^n a_{ij} x_i x_j - \lambda \bigl(\sum_{i=1}^n x_i^2 - 1\bigr).$

Partially differentiating $ L({\mathbf x}, \lambda)$ with respect to $ x_i$ for $ 1 \leq i \leq n,$ and using the symmetry $ a_{ij} = a_{ji},$ we get

$\displaystyle \frac{\partial L}{\partial x_1} = 2 a_{11} x_1 + 2 a_{12} x_2 + \cdots + 2 a_{1n} x_n - 2 \lambda x_1, $

$\displaystyle \frac{\partial L}{\partial x_2} = 2 a_{21} x_1 + 2 a_{22} x_2 + \cdots + 2 a_{2n} x_n - 2 \lambda x_2, $

and so on, till

$\displaystyle \frac{\partial L}{\partial x_n} = 2 a_{n1} x_1 + 2 a_{n2} x_2 + \cdots + 2 a_{nn} x_n - 2 \lambda x_n. $

Therefore, to get the points of extrema, we solve for

$\displaystyle (0,0,\ldots,0)^t = \Bigl(\frac{\partial L}{\partial x_1}, \frac{\partial L}{\partial x_2}, \ldots, \frac{\partial L}{\partial x_n}\Bigr)^t = \frac{\partial L}{\partial {\mathbf x}} = 2 (A {\mathbf x}- \lambda {\mathbf x}).$

Therefore, to solve the extremal problem, we need to find $ \lambda \in {\mathbb{R}}$ and $ {\mathbf 0}\neq {\mathbf x}\in {\mathbb{R}}^n$ such that $ A {\mathbf x}= \lambda {\mathbf x}.$
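As a numerical aside (not part of the original text), the following Python (numpy) sketch illustrates the conclusion of Example 6.1.1 for a sample symmetric matrix; the particular $ A$ is an assumption made purely for illustration. The values of $ {\mathbf x}^t A {\mathbf x}$ on the unit sphere always lie between the smallest and the largest eigenvalue of $ A,$ and the extrema are attained at eigenvectors.

    import numpy as np

    # A sample real symmetric matrix (an assumption for illustration).
    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

    # For symmetric matrices, eigh returns real eigenvalues in ascending
    # order and orthonormal eigenvectors as columns; here they are 1 and 3.
    eigenvalues, eigenvectors = np.linalg.eigh(A)

    # x^t A x at random unit vectors stays between the extreme eigenvalues.
    rng = np.random.default_rng(0)
    for _ in range(5):
        x = rng.standard_normal(2)
        x /= np.linalg.norm(x)          # enforce the constraint x^t x = 1
        assert eigenvalues[0] - 1e-12 <= x @ A @ x <= eigenvalues[-1] + 1e-12

    # The extrema themselves are attained at the eigenvectors.
    print(eigenvectors[:, 0] @ A @ eigenvectors[:, 0])    # 1.0 (minimum)
    print(eigenvectors[:, -1] @ A @ eigenvectors[:, -1])  # 3.0 (maximum)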

EXAMPLE 6.1.2   Consider a system of $ n$ ordinary differential equations of the form

$\displaystyle \frac{d \; {\mathbf y}(t)}{d t} = A {\mathbf y}, \; t \geq 0;$ (6.1.1)

where $ A$ is a real $ n \times n$ matrix and $ {\mathbf y}$ is a column vector.
To get a solution, let us assume that

$\displaystyle {\mathbf y}(t) = {\mathbf c}e^{ {\lambda}t}$ (6.1.2)

is a solution of (6.1.1) and look into what $ {\lambda}$ and $ {\mathbf c}$ have to satisfy, i.e., we are looking for a necessary condition on $ {\lambda}$ and $ {\mathbf c}$ so that (6.1.2) is a solution of (6.1.1). Note here that (6.1.1) has the zero solution, namely $ {\mathbf y}(t) \equiv {\mathbf 0},$ and so we are looking for a non-zero $ {\mathbf c}.$ Differentiating (6.1.2) with respect to $ t$ and substituting in (6.1.1) leads to

$\displaystyle {\lambda}e^{{\lambda}t} {\mathbf c}= A e^{{\lambda}t} {\mathbf c}\;\; {\mbox{or equivalently}} \;\; (A - {\lambda}I) {\mathbf c}= {\mathbf 0}.$ (6.1.3)

So, (6.1.2) is a solution of the given system of differential equations if and only if $ {\lambda}$ and $ {\mathbf c}$ satisfy (6.1.3). That is, given an $ n \times n$ matrix $ A,$ we are thus led to find a pair $ ({\lambda}, {\mathbf c})$ such that $ {\mathbf c}\neq {\mathbf 0}$ and (6.1.3) is satisfied.
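The following Python (numpy) sketch, not in the original text, checks condition (6.1.3) for a sample matrix (an assumption made here) and verifies that $ {\mathbf y}(t) = {\mathbf c}e^{{\lambda}t}$ built from an eigenpair satisfies the system (6.1.1).

    import numpy as np

    # A sample 2x2 system y' = A y; this A (eigenvalues -1 and -2) is
    # an assumption chosen for illustration.
    A = np.array([[0.0, 1.0],
                  [-2.0, -3.0]])

    # Each eigenpair (lam, c) of A yields a solution y(t) = c * exp(lam * t).
    lams, C = np.linalg.eig(A)
    lam, c = lams[0], C[:, 0]

    # Condition (6.1.3): (A - lam I) c = 0 with c nonzero.
    print(np.allclose((A - lam * np.eye(2)) @ c, 0))   # True

    # Verify d/dt [c e^{lam t}] = A [c e^{lam t}] at a sample time t.
    t = 0.7
    y = c * np.exp(lam * t)
    print(np.allclose(lam * y, A @ y))                 # True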

Let $ A$ be a matrix of order $ n.$ In general, we ask the question:
For what values of $ {\lambda}\in {\mathbb{F}}$ does there exist a non-zero vector $ {\mathbf x}\in {\mathbb{F}}^n$ such that

$\displaystyle A {\mathbf x}= \lambda {\mathbf x}?$ (6.1.4)

Here, $ {\mathbb{F}}^n$ stands for either the vector space $ {\mathbb{R}}^n$ over $ {\mathbb{R}}$ or $ {\mathbb{C}}^n$ over $ {\mathbb{C}}.$ Equation (6.1.4) is equivalent to the equation

$\displaystyle (A - \lambda I) {\mathbf x}= {\mathbf 0}.$

By Theorem 2.5.1, this system of linear equations has a non-zero solution if and only if

$\displaystyle {\mbox{rank }} (A - \lambda I) < n, \;\; {\mbox{ or equivalently }} \;\; \det(A - {\lambda}I) = 0.$

So, to solve (6.1.4), we are forced to choose those values of $ \lambda \in {\mathbb{F}}$ for which $ \det (A - \lambda I) = 0.$ Observe that $ \det(A - {\lambda}I)$ is a polynomial in $ {\lambda}$ of degree $ n.$ We are therefore led to the following definition.

DEFINITION 6.1.3 (Characteristic Polynomial)   Let $ A$ be a matrix of order $ n.$ The polynomial $ \det(A - {\lambda}I)$ is called the characteristic polynomial of $ A$ and is denoted by $ p({\lambda}).$ The equation $ p({\lambda}) = 0$ is called the characteristic equation of $ A.$ If $ {\lambda}\in {\mathbb{F}}$ is a solution of the characteristic equation $ p({\lambda}) = 0,$ then $ {\lambda}$ is called a characteristic value of $ A.$

Some books use the term EIGENVALUE in place of characteristic value.

THEOREM 6.1.4   Let $ A= [a_{ij}]; \; a_{ij} \in {\mathbb{F}}, \; {\mbox{ for }} 1 \leq i, j \leq n.$ Suppose $ \lambda = \lambda_0 \in {\mathbb{F}}$ is a root of the characteristic equation. Then there exists a non-zero $ {\mathbf v}\in {{\mathbb{F}}}^n$ such that $ A {\mathbf v}= \lambda_0 {\mathbf v}.$

Proof. Since $ \lambda_0$ is a root of the characteristic equation, $ \det (A - \lambda_0 I) = 0.$ This shows that the matrix $ A - \lambda_0 I$ is singular and therefore by Theorem 2.5.1 the linear system

$\displaystyle (A - \lambda_0 I_n) {\mathbf x}= {\mathbf 0}$

has a non-zero solution. $\blacksquare$

Remark 6.1.5   Observe that the linear system $ A {\mathbf x}= {\lambda}{\mathbf x}$ has a solution $ {\mathbf x}={\mathbf 0}$ for every $ {\lambda}\in {\mathbb{F}}.$ So, we consider only those $ {\mathbf x}\in {\mathbb{F}}^n$ that are non-zero and are solutions of the linear system $ A {\mathbf x}= {\lambda}{\mathbf x}.$

DEFINITION 6.1.6 (Eigenvalue and Eigenvector)   If the linear system $ A {\mathbf x}= {\lambda}{\mathbf x}$ has a non-zero solution $ {\mathbf x}\in {\mathbb{F}}^n$ for some $ {\lambda}\in {\mathbb{F}},$ then
  1. $ \lambda \in {\mathbb{F}}$ is called an eigenvalue of $ A,$
  2. $ {\mathbf 0}\neq {\mathbf x}\in {\mathbb{F}}^n$ is called an eigenvector corresponding to the eigenvalue $ {\lambda}$ of $ A,$ and
  3. the tuple $ (\lambda, {\mathbf x})$ is called an eigenpair.
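Numerically, eigenpairs can be computed and checked against this definition directly. A minimal Python (numpy) sketch, with the matrix an assumption made for illustration (it reappears as the fourth matrix in Example 6.1.9 below):

    import numpy as np

    # A sample matrix (an assumption for illustration).
    A = np.array([[1.0, 2.0],
                  [2.0, 1.0]])

    # eig returns the eigenvalues and a matrix whose columns are eigenvectors.
    eigenvalues, eigenvectors = np.linalg.eig(A)

    # Each pair (lambda, x) satisfies A x = lambda x with x nonzero.
    for lam, x in zip(eigenvalues, eigenvectors.T):
        print(lam, np.allclose(A @ x, lam * x))        # eigenvalue, True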

Remark 6.1.7   To understand the difference between a characteristic value and an eigenvalue, we give the following example.

Consider the matrix $ A = \begin{bmatrix}0 & 1 \\ -1 & 0 \end{bmatrix}.$ Then the characteristic polynomial of $ A$ is

$\displaystyle p({\lambda}) = {\lambda}^2 + 1.$

Given the matrix $ A,$ recall the linear transformation $ T_A: {\mathbb{F}}^2 {\longrightarrow}{\mathbb{F}}^2$ defined by

$\displaystyle T_A({\mathbf x}) = A {\mathbf x}\;\; {\mbox{ for every }} \;\; {\mathbf x}\in {\mathbb{F}}^2.$

  1. If $ \; {\mathbb{F}}= {\mathbb{C}},$ that is, if $ A$ is considered a COMPLEX matrix, then the roots of $ p({\lambda}) = 0$ in $ {\mathbb{C}}$ are $ \pm i.$ So, $ A$ has $ (i, (1,i)^t)$ and $ (-i, (i, 1)^t)$ as eigenpairs.
  2. If $ \; {\mathbb{F}}= {\mathbb{R}},$ that is, if $ A$ is considered a REAL matrix, then $ p({\lambda}) = 0$ has no solution in $ {\mathbb{R}}.$ Therefore, if $ {\mathbb{F}}= {\mathbb{R}},$ then $ A$ has no eigenvalue but it has $ \pm i$ as characteristic values.

Remark 6.1.8   Note that if $ (\lambda, {\mathbf x})$ is an eigenpair for an $ n \times n$ matrix $ A,$ then for any non-zero $ c \in {\mathbb{F}},$ the pair $ (\lambda, c {\mathbf x})$ is also an eigenpair for $ A.$ Similarly, if $ {\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_r$ are eigenvectors of $ A$ corresponding to the eigenvalue $ {\lambda},$ then for any non-zero $ (c_1, c_2, \ldots, c_r) \in {\mathbb{F}}^r$ with $ \sum\limits_{i=1}^r c_i {\mathbf x}_i \neq {\mathbf 0}$ , the vector $ \sum\limits_{i=1}^r c_i {\mathbf x}_i$ is also an eigenvector of $ A$ corresponding to the eigenvalue $ {\lambda}.$ Hence, when we talk of eigenvectors corresponding to an eigenvalue $ \lambda,$ we mean LINEARLY INDEPENDENT EIGENVECTORS.

Suppose $ {\lambda}_0 \in {\mathbb{F}}$ is a root of the characteristic equation $ \det(A - {\lambda}_0 I) = 0.$ Then $ A - {\lambda}_0 I$ is singular and $ {\mbox{rank }}(A - {\lambda}_0 I) < n.$ Suppose $ {\mbox{rank }}(A - {\lambda}_0 I)=r < n.$ Then by Corollary 4.3.9, the linear system $ (A - {\lambda}_0 I) {\mathbf x}= {\mathbf 0}$ has $ n-r$ linearly independent solutions. That is, $ A$ has $ n-r$ linearly independent eigenvectors corresponding to the eigenvalue $ {\lambda}_0$ whenever $ {\mbox{rank }}(A - {\lambda}_0 I)=r < n.$
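This observation can be checked numerically: the eigenvectors for $ {\lambda}_0$ are the non-zero vectors of the null space of $ A - {\lambda}_0 I,$ which has dimension $ n-r.$ A Python (numpy) sketch follows, with the matrix and the rank tolerance being assumptions made for illustration.

    import numpy as np

    # A sample matrix (an assumption) with the repeated eigenvalue 2.
    A = np.array([[2.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])
    lam0 = 2.0
    M = A - lam0 * np.eye(3)

    # A basis of the null space of M can be read off its SVD: the rows
    # of Vt beyond the numerical rank span the null space of M.
    U, s, Vt = np.linalg.svd(M)
    r = int(np.sum(s > 1e-10))      # rank of A - lam0 I; here r = 1
    null_basis = Vt[r:].T           # n - r = 2 independent eigenvectors

    print(r, null_basis.shape[1])                            # 1 2
    print(np.allclose(A @ null_basis, lam0 * null_basis))    # True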

EXAMPLE 6.1.9  
  1. Let $ A = {\mbox{diag}}(d_1, d_2, \ldots, d_n)$ with $ d_i \in {\mathbb{R}}$ for $ 1 \leq i \leq n.$ Then $ p({\lambda}) = \prod_{i=1}^n (d_i - {\lambda})$ is the characteristic polynomial. So, the eigenpairs are

    $\displaystyle (d_1, (1,0,\ldots, 0)^t),\; (d_2, (0,1,0,\ldots,0)^t),\; \ldots,\; (d_n, (0,\ldots, 0,1)^t).$

  2. Let $ A = \begin{bmatrix}1 & 1 \\ 0 & 1 \end{bmatrix}.$ Then $ \det (A - \lambda I_2) = (1-\lambda)^2.$ Hence, the characteristic equation has roots $ 1, 1.$ That is, $ 1$ is a repeated eigenvalue. Now check that the equation $ (A - I_2) {\mathbf x}= {\mathbf 0}$ for $ {\mathbf x}= (x_1, x_2)^{t}$ is equivalent to the equation $ x_2 = 0.$ And this has the solution $ {\mathbf x}= (x_1, 0)^{t}.$ Hence, from the above remark, $ (1, 0)^{t}$ is a representative for the eigenvector. Therefore, HERE WE HAVE TWO EIGENVALUES $ 1, 1$ BUT ONLY ONE EIGENVECTOR (this is checked numerically in the sketch following this example).
  3. Let $ A = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix}.$ Then $ \det (A - \lambda I_2) = (1-\lambda)^2.$ The characteristic equation has roots $ 1, 1.$ Here, the matrix that we have is $ I_2$ and we know that $ I_2 {\mathbf x}= {\mathbf x}$ for every $ {\mathbf x}\in {\mathbb{R}}^2.$ So, we can CHOOSE ANY TWO LINEARLY INDEPENDENT VECTORS $ {\mathbf x}, {\mathbf y}$ from $ {\mathbb{R}}^2$ to get $ (1, {\mathbf x})$ and $ (1, {\mathbf y})$ as the two eigenpairs.

    In general, if $ {\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_n$ are linearly independent vectors in $ {\mathbb{R}}^n,$ then $ (1, {\mathbf x}_1), \; (1, {\mathbf x}_2), \; \ldots, (1, {\mathbf x}_n)$ are eigenpairs for the identity matrix, $ I_n.$

  4. Let $ A = \begin{bmatrix}1 & 2 \\ 2 & 1 \end{bmatrix}.$ Then $ \det (A - \lambda I_2) = (\lambda- 3)(\lambda + 1).$ The characteristic equation has roots $ 3, -1.$ Now check that the eigenpairs are $ (3, (1,1)^{t}), $ and $ (-1, (1, -1)^{t}).$ In this case, we have TWO DISTINCT EIGENVALUES AND THE CORRESPONDING EIGENVECTORS ARE ALSO LINEARLY INDEPENDENT. The reader is required to prove the linear independence of the two eigenvectors.
  5. Let $ A = \begin{bmatrix}1 & -1 \\ 1 & 1 \end{bmatrix}.$ Then $ \det (A - \lambda I_2) = \lambda^2 - 2 \lambda + 2.$ The characteristic equation has roots $ 1 + i, 1 - i.$ Hence, over $ {\mathbb{R}},$ the matrix $ A$ has no eigenvalue. Over $ {\mathbb{C}},$ the reader is required to show that the eigenpairs are $ (1+i, (i,1)^t)$ and $ (1-i, (1,i)^t).$
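The second and fourth examples above can be verified with a short Python (numpy) sketch (not part of the original text): the matrix of eigenvector columns has rank one in the repeated-eigenvalue case but full rank in the distinct-eigenvalue case.

    import numpy as np

    # Example 2: repeated eigenvalue 1 with only one independent eigenvector.
    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    vals, vecs = np.linalg.eig(A)
    print(vals)                              # [1. 1.]
    print(np.linalg.matrix_rank(vecs))       # 1: the two columns coincide

    # Example 4: distinct eigenvalues 3 and -1, independent eigenvectors.
    B = np.array([[1.0, 2.0],
                  [2.0, 1.0]])
    vals, vecs = np.linalg.eig(B)
    print(np.sort(vals))                     # [-1.  3.]
    print(np.linalg.matrix_rank(vecs))       # 2: eigenvectors independent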

EXERCISE 6.1.10  
  1. Find the eigenvalues of a triangular matrix.
  2. Find eigenpairs over $ {\mathbb{C}},$ for each of the following matrices:
    $ \begin{bmatrix}1 & 0 \\ 0 & 0 \end{bmatrix}, \hspace{0.1in} \ldots, \hspace{0.1in} \begin{bmatrix}\cos \theta & - \sin \theta \\ \sin \theta & \cos \theta \end{bmatrix},$ and $ \;\;\begin{bmatrix}\cos \theta & \sin \theta \\ \sin \theta & - \cos \theta \end{bmatrix}.$
  3. Let $ A$ and $ B$ be similar matrices.
    1. Then prove that $ A$ and $ B$ have the same set of eigenvalues.
    2. Let $ ({\lambda}, {\mathbf x})$ be an eigenpair for $ A$ and $ ({\lambda}, {\mathbf y})$ be an eigenpair for $ B.$ What is the relationship between the vectors $ {\mathbf x}$ and $ {\mathbf y}$ ?

      [Hint: Recall that if the matrices $ A$ and $ B$ are similar, then there exists a non-singular matrix $ P$ such that $ B = P A P^{-1}.$ ]

  4. Let $ A=(a_{ij})$ be an $ n \times n$ matrix. Suppose that for all $ i, \; 1 \leq i \leq n, \; \sum\limits_{j=1}^n a_{ij} = a.$ Then prove that $ a$ is an eigenvalue of $ A.$ What is the corresponding eigenvector?
  5. Prove that the matrices $ A$ and $ A^t$ have the same set of eigenvalues. Construct a $ 2 \times 2$ matrix $ A$ such that the eigenvectors of $ A$ and $ A^t$ are different.
  6. Let $ A$ be a matrix such that $ A^2 = A$ ($ A$ is called an idempotent matrix). Then prove that each of its eigenvalues is either 0 or $ 1.$
  7. Let $ A$ be a matrix such that $ A^k = {\mathbf 0}$ for some positive integer $ k$ ($ A$ is called a nilpotent matrix). Then prove that its eigenvalues are all 0.

THEOREM 6.1.11   Let $ A=[a_{ij}]$ be an $ n \times n$ matrix with eigenvalues $ \lambda_1, \lambda_2, \ldots, \lambda_n,$ not necessarily distinct. Then $ \det (A) = \prod\limits_{i=1}^n \lambda_i$ and $ {\mbox{tr}}(A) = \sum\limits_{i=1}^n a_{ii} = \sum\limits_{i=1}^n \lambda_i.$

Proof. Since $ \lambda_1, \lambda_2, \ldots, \lambda_n$ are the $ n$ eigenvalues of $ A,$ by definition,

$\displaystyle \det (A - \lambda I_n) = p({\lambda}) = (-1)^n (\lambda - \lambda_1) (\lambda - \lambda_2) \cdots (\lambda - \lambda_n).$ (6.1.5)

Equation (6.1.5) is an identity of polynomials in $ {\lambda}.$ Therefore, by substituting $ \lambda = 0$ in (6.1.5), we get

$\displaystyle \det (A) = (-1)^n (-1)^n \prod_{i=1}^n \lambda_i = \prod_{i=1}^n \lambda_i.$

Also,

$\displaystyle \det (A - \lambda I_n) = \det \begin{bmatrix}a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{bmatrix}$ (6.1.6)

$\displaystyle = a_0 - \lambda a_1 + \lambda^2 a_2 + \cdots + (-1)^{n-1} {\lambda}^{n-1} a_{n-1} + (-1)^n \lambda^n$ (6.1.7)

for some $ a_0, a_1, \ldots, a_{n-1} \in {\mathbb{F}}.$ Note that $ a_{n-1},$ the coefficient of $ (-1)^{n-1} {\lambda}^{n-1},$ comes from the product

$\displaystyle (a_{11} - {\lambda})(a_{22} - {\lambda}) \cdots (a_{nn} - {\lambda}).$

So, $ a_{n-1} = \sum\limits_{i=1}^n a_{ii} = {\mbox{tr}}(A)$ by definition of trace.

But, from (6.1.5) and (6.1.7), we get

$\displaystyle a_0 - \lambda a_1 + \lambda^2 a_2 + \cdots + (-1)^{n-1} {\lambda}^{n-1} a_{n-1} + (-1)^n \lambda^n = (-1)^n (\lambda - \lambda_1) (\lambda - \lambda_2) \cdots (\lambda - \lambda_n).$ (6.1.8)

Therefore, comparing the coefficient of $ (-1)^{n-1} \lambda^{n-1},$ we have

$\displaystyle {\mbox{tr}}(A)= a_{n-1} = (-1) \{ (-1)\sum\limits_{i=1}^n \lambda_i \} = \sum\limits_{i=1}^n \lambda_i.$

Hence, we get the required result. $\blacksquare$
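The theorem is easy to check numerically; in the following Python (numpy) sketch the matrix is an assumption made for illustration.

    import numpy as np

    # A sample matrix (an assumption); its eigenvalues are 1 and 5.
    A = np.array([[2.0, 3.0],
                  [1.0, 4.0]])
    vals = np.linalg.eigvals(A)

    # det(A) is the product and tr(A) the sum of the eigenvalues.
    print(np.isclose(np.linalg.det(A), np.prod(vals)))   # True
    print(np.isclose(np.trace(A), np.sum(vals)))         # True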

EXERCISE 6.1.12  
  1. Let $ A$ be a skew symmetric matrix of order $ 2n + 1.$ Then prove that 0 is an eigenvalue of $ A.$
  2. Let $ A$ be a $ 3 \times 3$ orthogonal matrix (i.e., $ A A^t = I$). If $ \det(A) = 1$, then prove that there exists a non-zero vector $ {\mathbf v}\in {\mathbb{R}}^3$ such that $ A {\mathbf v}= {\mathbf v}.$

Let $ A$ be an $ n \times n$ matrix. Then in the proof of the above theorem, we observed that the characteristic equation $ \det(A - {\lambda}I) = 0$ is a polynomial equation of degree $ n$ in $ {\lambda}.$ Also, for some numbers $ a_0, a_1, \ldots, a_{n-1} \in {\mathbb{F}},$ it has the form

$\displaystyle {\lambda}^n + a_{n-1} {\lambda}^{n-1} + a_{n-2} {\lambda}^{n-2} + \cdots + a_1 {\lambda}+ a_0 = 0.$

Note that, in the expression $ \det(A - {\lambda}I)= 0, \;\; {\lambda}$ is an element of $ {\mathbb{F}}.$ Thus, we can only substitute $ {\lambda}$ by elements of $ {\mathbb{F}}.$

It turns out that the expression

$\displaystyle A^n + a_{n-1} A^{n-1} + a_{n-2} A^{n-2} + \cdots + a_1 A + a_0 I = {\mathbf 0}$

holds true as a matrix identity. This is a celebrated theorem called the Cayley Hamilton Theorem. We state this theorem without proof and give some implications.

THEOREM 6.1.13 (Cayley Hamilton Theorem)   Let $ A$ be a square matrix of order $ n.$ Then $ A$ satisfies its characteristic equation. That is,

$\displaystyle A^n + a_{n-1} A^{n-1} + a_{n-2} A^{n-2} + \cdots + a_1 A + a_0 I = {\mathbf 0}$

holds true as a matrix identity.
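A numerical verification, not part of the original text: the Python (numpy) sketch below evaluates the characteristic polynomial at $ A$ itself and obtains the zero matrix. Note that np.poly returns the coefficients of $ \det({\lambda}I - A),$ leading coefficient first; the sample matrix is an assumption.

    import numpy as np

    # A sample matrix (an assumption for illustration).
    A = np.array([[1.0, 2.0],
                  [3.0, 4.0]])

    # Coefficients of det(lambda I - A); here p(lambda) = lambda^2 - 5*lambda - 2.
    coeffs = np.poly(A)

    # Evaluate p(A) as a matrix polynomial using Horner's rule.
    n = A.shape[0]
    p_of_A = np.zeros_like(A)
    for c in coeffs:
        p_of_A = p_of_A @ A + c * np.eye(n)

    print(np.allclose(p_of_A, 0))            # True: A satisfies p(A) = 0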

Some of the implications of the Cayley Hamilton Theorem are as follows.

Remark 6.1.14  
  1. Let $ A = \left[\begin{array}{cc}0&1 \\ 0 & 0 \end{array}\right].$ Then its characteristic polynomial is $ p({\lambda}) = {\lambda}^2.$ Also, for the function $ f(x) = x,$ we have $ f(0) = 0$ and $ f(A) = A \neq {\mathbf 0}.$ This shows that the condition $ f({\lambda}) = 0$ for each eigenvalue $ {\lambda}$ of $ A$ does not imply that $ f(A) = {\mathbf 0}.$
  2. Suppose we are given a square matrix $ A$ of order $ n$ and we are interested in calculating $ A^{\ell}$ where $ \ell$ is large compared to $ n.$ Then we can use the division algorithm to find numbers $ {\alpha}_0, {\alpha}_1, \ldots, {\alpha}_{n-1}$ and a polynomial $ f({\lambda})$ such that

    $\displaystyle {\lambda}^{\ell} = f({\lambda}) \bigl( {\lambda}^n + a_{n-1} {\lambda}^{n-1} + a_{n-2} {\lambda}^{n-2} + \cdots + a_1 {\lambda}+ a_0 \bigr) + {\alpha}_0 + {\lambda}{\alpha}_1 + \cdots + {\lambda}^{n-1} {\alpha}_{n-1}.$

    Hence, by the Cayley Hamilton Theorem,

    $\displaystyle A^{\ell} = {\alpha}_0 I + {\alpha}_1 A + \cdots + {\alpha}_{n-1} A^{n-1}.$

    That is, we just need to compute the powers of $ A$ up to $ A^{n-1}.$

    In the language of graph theory, it says the following:
    ``Let $ G$ be a graph on $ n$ vertices. Suppose there is no path of length $ n-1$ or less from a vertex $ v$ to a vertex $ u$ of $ G.$ Then there is no path from $ v$ to $ u$ of any length. That is, the graph $ G$ is disconnected and $ v$ and $ u$ are in different components."

  3. Let $ A$ be a non-singular matrix of order $ n.$ Then note that $ a_0 = (-1)^n \det (A) \neq 0$ and

    $\displaystyle A^{-1} = \frac{-1}{a_0} [ A^{n-1} + a_{n-1} A^{n-2} + \cdots + a_1 I ].$

    This matrix identity can be used to calculate the inverse; a numerical sketch follows this remark.
    Note that $ A^{-1}$ (as an element of the vector space of all $ n \times n$ matrices) is a linear combination of the matrices $ I, A, \ldots, A^{n-1}.$
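As a sketch of the third item (with the first matrix of Exercise 6.1.15 below used as an illustrative assumption), write the monic characteristic polynomial as $ {\lambda}^n + c_1 {\lambda}^{n-1} + \cdots + c_n;$ then the Cayley Hamilton Theorem gives $ A^{-1} = \frac{-1}{c_n}\bigl(A^{n-1} + c_1 A^{n-2} + \cdots + c_{n-1} I\bigr).$ In Python (numpy):

    import numpy as np

    # The first matrix of Exercise 6.1.15, used here as an illustration.
    A = np.array([[2.0, 3.0, 4.0],
                  [5.0, 6.0, 7.0],
                  [1.0, 1.0, 2.0]])
    n = A.shape[0]

    # np.poly(A) = [1, c1, ..., cn] with p(lambda) = lambda^n + c1*lambda^{n-1}
    # + ... + cn, where cn = (-1)^n det(A) is nonzero since A is invertible.
    c = np.poly(A)

    # Horner's rule on 1, c1, ..., c_{n-1} builds
    # A^{n-1} + c1 A^{n-2} + ... + c_{n-1} I.
    inv = np.zeros_like(A)
    for coeff in c[:-1]:
        inv = inv @ A + coeff * np.eye(n)
    inv = -inv / c[-1]

    print(np.allclose(inv @ A, np.eye(n)))   # True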

EXERCISE 6.1.15   Find the inverse of the following matrices by using the Cayley Hamilton Theorem:

$\displaystyle i)\;\; \begin{bmatrix}2&3&4 \\ 5&6&7\\ 1&1&2 \end{bmatrix}, \;\;\; ii)\;\; \ldots, \;\;\; iii)\;\; \begin{bmatrix} 1 & -2 & -1 \\ -2 & 1 & -1 \\ 0 & -1 & 2 \end{bmatrix}.$

THEOREM 6.1.16   If $ {\lambda}_1, {\lambda}_2, \ldots, {\lambda}_k$ are distinct eigenvalues of a matrix $ A$ with corresponding eigenvectors $ {\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_k,$ then the set $ \{{\mathbf x}_1, {\mathbf x}_2, \ldots, {\mathbf x}_k\}$ is linearly independent.

Proof. The proof is by induction on the number $ m$ of eigenvalues. The result is obviously true if $ m = 1$ as the corresponding eigenvector is non-zero and we know that any set containing exactly one non-zero vector is linearly independent.

Let the result be true for $ m, \; 1 \leq m < k.$ We prove the result for $ m+1.$ We consider the equation

$\displaystyle c_1 {\mathbf x}_1 + c_2 {\mathbf x}_2 + \cdots + c_{m+1} {\mathbf x}_{m+1} = {\mathbf 0}$ (6.1.9)

for the unknowns $ c_1, c_2, \ldots, c_{m+1}.$ We have
$\displaystyle {\mathbf 0}= A {\mathbf 0}= A ( c_1 {\mathbf x}_1 + c_2 {\mathbf x}_2 + \cdots + c_{m+1} {\mathbf x}_{m+1} ) = c_1 A {\mathbf x}_1 + c_2 A {\mathbf x}_2 + \cdots + c_{m+1} A {\mathbf x}_{m+1} = c_1 \lambda_1 {\mathbf x}_1 + c_2 \lambda_2 {\mathbf x}_2 + \cdots + c_{m+1} \lambda_{m+1} {\mathbf x}_{m+1}.$ (6.1.10)

Multiplying Equation (6.1.9) by $ \lambda_1$ and subtracting it from Equation (6.1.10), we get

$\displaystyle c_2 (\lambda_2 - \lambda_1 ) {\mathbf x}_2 + c_3 (\lambda_3 - \lambda_1) {\mathbf x}_3 + \cdots + c_{m+1} (\lambda_{m+1} - \lambda_1) {\mathbf x}_{m+1} = {\mathbf 0}.$

This is an equation involving the $ m$ eigenvectors $ {\mathbf x}_2, \ldots, {\mathbf x}_{m+1}$ corresponding to the distinct eigenvalues $ \lambda_2, \ldots, \lambda_{m+1}.$ So, by the induction hypothesis, we have

$\displaystyle c_i (\lambda_i - \lambda_1) = 0 \;\; {\mbox{ for }} \;\; 2 \leq i \leq m+1.$

But the eigenvalues are distinct, so $ \lambda_i - {\lambda}_1 \neq 0$ for $ 2 \leq i \leq m+1.$ We therefore get $ c_i = 0$ for $ 2 \leq i \leq m+1.$ Also, $ {\mathbf x}_1 \neq {\mathbf 0},$ and therefore (6.1.9) gives $ c_1 = 0.$

Thus, we have the required result. $\blacksquare$

We are thus led to the following important corollary.

COROLLARY 6.1.17   The eigenvectors corresponding to distinct eigenvalues of an $ n \times n$ matrix $ A$ are linearly independent.
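A numerical check of the corollary, with the matrix an assumption made for illustration: the matrix whose columns are the eigenvectors has full rank, so the eigenvectors are linearly independent.

    import numpy as np

    # A sample matrix with distinct eigenvalues 1, 3 and 6 (an assumption).
    A = np.array([[1.0, 2.0, 0.0],
                  [0.0, 3.0, 1.0],
                  [0.0, 0.0, 6.0]])

    vals, vecs = np.linalg.eig(A)
    print(np.sort(vals))                          # [1. 3. 6.]
    print(np.linalg.matrix_rank(vecs) == 3)       # True: independent columns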

EXERCISE 6.1.18  
  1. For an $ n \times n$ matrix $ A,$ prove the following.
    1. $ A$ and $ A^{t}$ have the same set of eigenvalues.
    2. If $ {\lambda}$ is an eigenvalue of an invertible matrix $ A$ then $ \displaystyle\frac{1}{{\lambda}}$ is an eigenvalue of $ A^{-1}.$
    3. If $ {\lambda}$ is an eigenvalue of $ A$ then $ {\lambda}^{k}$ is an eigenvalue of $ A^k$ for any positive integer $ k.$
    4. If $ A$ and $ B$ are $ n \times n$ matrices with $ A$ nonsingular then $ B A^{-1}$ and $ A^{-1} B$ have the same set of eigenvalues.

      In each case, what can you say about the eigenvectors?

  2. Let $ A$ and $ B$ be $ 2 \times 2$ matrices for which $ \det(A) = \det(B)$ and $ {\mbox{tr}}(A) = {\mbox{tr}}(B).$
    1. Do $ A$ and $ B$ have the same set of eigenvalues?
    2. Give examples to show that the matrices $ A$ and $ B$ need not be similar.
  3. Let $ ({\lambda}_1, {\mathbf u})$ be an eigenpair for a matrix $ A$ and let $ ({\lambda}_2, {\mathbf u})$ be an eigenpair for another matrix $ B.$
    1. Then prove that $ ({\lambda}_1+{\lambda}_2, {\mathbf u})$ is an eigenpair for the matrix $ A+B.$
    2. Give an example to show that if $ {\lambda}_1, {\lambda}_2$ are respectively the eigenvalues of $ A$ and $ B,$ then $ {\lambda}_1+{\lambda}_2$ need not be an eigenvalue of $ A+B.$
  4. Let $ {\lambda}_i, 1 \leq i \leq n$ be distinct non-zero eigenvalues of an $ n \times n$ matrix $ A.$ Let $ {\mathbf u}_i, 1 \leq i \leq n$ be the corresponding eigenvectors. Then show that $ {\cal {B}} = \{{\mathbf u}_1, {\mathbf u}_2, \ldots, {\mathbf u}_n \}$ forms a basis of $ {\mathbb{F}}^n ({\mathbb{F}}).$ If $ [{\mathbf b}]_{{\cal B}} = (c_1, c_2, \ldots, c_n)^t$ then show that $ A {\mathbf x}= {\mathbf b}$ has the unique solution

    $\displaystyle {\mathbf x}= \frac{c_1}{{\lambda}_1} {\mathbf u}_1 + \frac{c_2}{{\lambda}_2} {\mathbf u}_2 + \cdots + \frac{c_n}{{\lambda}_n} {\mathbf u}_n.$

A K Lal 2007-09-12