Sylvester's Law of Inertia and Applications

DEFINITION 6.4.1 (Bilinear Form)   Let $ A$ be an $ n \times n$ matrix with real entries. A bilinear form in $ {\mathbf x}=(x_1, x_2, \ldots, x_n)^t, \; {\mathbf y}=(y_1, y_2, \ldots, y_n)^t$ is an expression of the type

$\displaystyle Q({\mathbf x}, {\mathbf y})= {\mathbf x}^t A {\mathbf y}=
\sum\limits_{i,j=1}^n a_{ij} x_i y_j.$

Observe that if $ A= I$ (the identity matrix) then the bilinear form reduces to the standard real inner product. Also, if we want $ Q$ to be symmetric in $ {\mathbf x}$ and $ {\mathbf y},$ that is, $ Q({\mathbf x}, {\mathbf y}) = Q({\mathbf y}, {\mathbf x})$ for all $ {\mathbf x}, {\mathbf y},$ then it is necessary and sufficient that $ a_{ij} = a_{ji}$ for all $ i,j = 1, 2, \ldots, n.$ Why? Hence, any symmetric bilinear form is naturally associated with a real symmetric matrix.
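
For readers who wish to experiment numerically, the following sketch (in Python with NumPy; the matrix and vectors are arbitrary choices, not from the text) evaluates a bilinear form $ Q({\mathbf x},{\mathbf y})= {\mathbf x}^t A {\mathbf y}$ and illustrates that symmetry of $ A$ is exactly what makes $ Q$ symmetric in its two arguments.

    import numpy as np

    # An arbitrary real symmetric matrix, chosen only for illustration.
    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])

    def Q(x, y, A):
        """Bilinear form Q(x, y) = x^t A y."""
        return x @ A @ y

    x = np.array([1.0, -2.0])
    y = np.array([0.5, 4.0])

    print(Q(x, y, A), Q(y, x, A))   # equal values, since A = A^t
    print(np.allclose(A, A.T))      # True: A is symmetric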

DEFINITION 6.4.2 (Sesquilinear Form)   Let $ A$ be an $ n \times n$ matrix with complex entries. A sesquilinear form in $ {\mathbf x}=(x_1, x_2, \ldots, x_n)^t, \; {\mathbf y}=(y_1, y_2, \ldots, y_n)^t$ is given by

$\displaystyle H({\mathbf x}, {\mathbf y})=
\sum\limits_{i,j=1}^n a_{ij} x_i {\overline{y_j}}.$

Note that if $ A= I$ (the identity matrix) then the sesquilinear form reduces to the standard complex inner product. It can also be easily seen that this form is `linear' in the first component and `conjugate linear' in the second component. Moreover, if we want $ H({\mathbf x}, {\mathbf y}) = {\overline{ H({\mathbf y}, {\mathbf x}) }}$ for all $ {\mathbf x}, {\mathbf y},$ then the matrix $ A$ needs to be a Hermitian matrix. Note that if $ a_{ij} \in {\mathbb{R}}$ and $ {\mathbf x}, {\mathbf y}\in {\mathbb{R}}^n$ , then the sesquilinear form reduces to a bilinear form.
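
The conjugate-symmetry condition above is easy to test numerically. The sketch below (Python/NumPy, not part of the original text; the matrix is borrowed from Example 6.4.3 further down and the vectors are arbitrary) evaluates $ H({\mathbf x}, {\mathbf y}) = \sum_{i,j} a_{ij} x_i \overline{y_j}$ and checks that $ H({\mathbf x}, {\mathbf y}) = \overline{H({\mathbf y}, {\mathbf x})}$ when $ A$ is Hermitian.

    import numpy as np

    A = np.array([[1, 2 - 1j],
                  [2 + 1j, 2]])          # a Hermitian matrix (A equals its conjugate transpose)

    def H(x, y, A):
        """Sesquilinear form H(x, y) = sum_{i,j} a_ij * x_i * conj(y_j)."""
        return x @ A @ np.conj(y)

    x = np.array([1 + 2j, -1j])
    y = np.array([2 - 1j, 3 + 0.5j])

    # Conjugate symmetry holds precisely because A is Hermitian.
    print(np.isclose(H(x, y, A), np.conj(H(y, x, A))))   # True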

The expression $ Q({\mathbf x},{\mathbf x})$ is called the quadratic form and $ H({\mathbf x},{\mathbf x})$ the Hermitian form. We generally write $ Q({\mathbf x}) $ and $ H({\mathbf x})$ in place of $ Q({\mathbf x},{\mathbf x})$ and $ H({\mathbf x},{\mathbf x})$ , respectively. It can be easily shown that for any choice of $ {\mathbf x},$ the Hermitian form $ H({\mathbf x})$ is a real number.

Therefore, in matrix notation, for a Hermitian matrix $ A$ , the Hermitian form can be rewritten as

$\displaystyle H({\mathbf x}) = {{\mathbf x}}^* A {\mathbf x}, \hspace{0.5in}{\mbox{ where }} {\mathbf x}= (x_1, x_2, \ldots, x_n)^t, \; {\mathbf x}^* = {\overline{{\mathbf x}}}^t {\mbox{ and }} A = [a_{ij}].$

EXAMPLE 6.4.3   Let $ A = \begin{bmatrix}1 & 2 - i \\ 2 + i & 2
\end{bmatrix}.$ Then check that $ A$ is an Hermitian matrix and for $ {\mathbf x}= (x_1, x_2)^t, $ the Hermitian form
$\displaystyle H({\mathbf x}) = {\mathbf x}^* A {\mathbf x}= ({\overline{x}_1}, {\overline{x}_2}) \begin{bmatrix}1 & 2 - i \\ 2 + i & 2 \end{bmatrix} \begin{pmatrix}x_1 \\ x_2\end{pmatrix} = {\overline{x}_1} x_1 + 2 {\overline{x}_2} x_2 + (2 - i) \overline{x}_1 x_2 + (2 + i) \overline{x}_2 x_1 = \vert x_1\vert^2 + 2 \vert x_2\vert^2 + 2 {\mbox{Re}}[(2 - i) \overline{x}_1 x_2],$

where `Re' denotes the real part of a complex number. This shows that for every choice of $ {\mathbf x}$ the Hermitian form is always real. Why?
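
A quick numerical check of this example (a Python/NumPy sketch, not part of the original text; the test vectors are arbitrary) evaluates $ H({\mathbf x}) = {\mathbf x}^* A {\mathbf x}$ for the matrix above and confirms that the value is real, up to floating point round-off.

    import numpy as np

    A = np.array([[1, 2 - 1j],
                  [2 + 1j, 2]])                 # the Hermitian matrix of Example 6.4.3

    def H(x, A):
        """Hermitian form H(x) = x^* A x, with x^* the conjugate transpose of x."""
        return np.conj(x) @ A @ x

    for x in [np.array([1 + 1j, 2 - 3j]), np.array([0.5j, -1 + 1j])]:
        value = H(x, A)
        print(value, abs(value.imag) < 1e-12)   # the imaginary part vanishes (numerically)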

The main idea is to express $ H({\mathbf x})$ as sum of squares and hence determine the possible values that it can take. Note that if we replace $ {\mathbf x}$ by $ c {\mathbf x},$ where $ c$ is any complex number, then $ H({\mathbf x})$ simply gets multiplied by $ \vert c\vert^2$ and hence one needs to study only those $ {\mathbf x}$ for which $ \Vert {\mathbf x}\Vert = 1,$ i.e., $ {\mathbf x}$ is a normalised vector.

From Exercise 6.3.11.3 one knows that if $ A = A^*$ ($ A$ is Hermitian) then there exists a unitary matrix $ U$ such that $ U^* A U = D$ ( $ D= diag(\lambda_1, \lambda_2, \ldots, \lambda_n)$ with the $ \lambda_i$ 's being the eigenvalues of $ A,$ which we know are real). So, taking $ {\mathbf z}= U^* {\mathbf x}$ (i.e., choosing the $ z_i$ 's as linear combinations of the $ x_j$ 's with coefficients coming from the entries of the matrix $ U^*$ ), one gets

$\displaystyle H({\mathbf x}) = {\mathbf x}^* A {\mathbf x}= {\mathbf z}^* U^* A U {\mathbf z} = {\mathbf z}^* D {\mathbf z} = \sum\limits_{i=1}^n \lambda_i \vert z_i \vert^2 = \sum\limits_{i=1}^n \lambda_i \left\vert \sum\limits_{j=1}^n {\overline{u_{ji}}}\, x_j \right\vert^2.$ (6.4.1)

Thus, one knows the possible values that $ H({\mathbf x})$ can take, depending on the eigenvalues of the Hermitian matrix $ A.$ Also, for $ 1 \leq i \leq n,$ the linear forms $ \sum\limits_{j=1}^n {\overline{u_{ji}}}\, x_j $ determine the principal axes of the associated conic in $ n$ -dimensional space.
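
A computational sketch of this diagonalisation (Python/NumPy, not from the original text): the routine `numpy.linalg.eigh` returns real eigenvalues and a unitary matrix $ U$ with $ A = U D U^*,$ so $ H({\mathbf x})$ can be recomputed as $ \sum_i \lambda_i \vert z_i \vert^2$ with $ {\mathbf z} = U^* {\mathbf x},$ exactly as in Equation (6.4.1).

    import numpy as np

    A = np.array([[1, 2 - 1j],
                  [2 + 1j, 2]])                    # Hermitian matrix from Example 6.4.3
    lam, U = np.linalg.eigh(A)                     # real eigenvalues and unitary U with A = U diag(lam) U^*

    x = np.array([1 + 1j, -2 + 0.5j])              # an arbitrary test vector
    z = U.conj().T @ x                             # z = U^* x

    sum_of_squares = np.sum(lam * np.abs(z) ** 2)  # sum_i lambda_i |z_i|^2
    direct = (np.conj(x) @ A @ x).real             # H(x) = x^* A x

    print(np.isclose(sum_of_squares, direct))      # True: Equation (6.4.1) in action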

Equation (6.4.1) gives one method of writing $ H({\mathbf x})$ as a sum of $ n$ absolute squares of linearly independent linear forms. One can easily show that there is more than one way of writing $ H({\mathbf x})$ as a sum of squares. The question arises: ``what can we say about the coefficients when $ H({\mathbf x})$ has been written as a sum of absolute squares?''

This question is answered by `Sylvester's law of inertia' which we state as the next lemma.

LEMMA 6.4.4   Every Hermitian form $ H({\mathbf x}) = {{\mathbf x}}^* A {\mathbf x}$ (with $ A$ an Hermitian matrix) in $ n$ variables can be written as

$\displaystyle H({\mathbf x}) = \vert y_1\vert^2 + \vert y_2\vert^2 + \cdots + \vert y_p\vert^2 - \vert y_{p+1}\vert^2 - \cdots - \vert y_r\vert^2$

where $ y_1, y_2, \ldots, y_r$ are linearly independent linear forms in $ x_1, x_2, \ldots, x_n,$ and the integers $ p$ and $ r,$ $ 0 \leq p \leq r \leq n,$ depend only on $ A.$

Proof. From Equation (6.4.1) it is easily seen that $ H({\mathbf x})$ has the required form. We need to show that $ p$ and $ r$ are uniquely determined by $ A.$

On the contrary, assume that there exist positive integers $ p, q, r, s$ with $ p > q$ such that

$\displaystyle H({\mathbf x}) = \vert y_1\vert^2 + \vert y_2\vert^2 + \cdots + \vert y_p\vert^2 - \vert y_{p+1}\vert^2 - \cdots - \vert y_r\vert^2 = \vert z_1\vert^2 + \vert z_2\vert^2 + \cdots + \vert z_q\vert^2 - \vert z_{q+1}\vert^2 - \cdots - \vert z_s\vert^2.$

Since $ {\mathbf y}=(y_1, y_2, \ldots, y_n)^t$ and $ {\mathbf z}=(z_1, z_2, \ldots, z_n)^t$ are linear combinations of $ x_1, x_2, \ldots, x_n,$ we can find a matrix $ B$ such that $ {\mathbf z}= B {\mathbf y}.$ Choose $ y_{p+1} = y_{p+2} = \cdots = y_r = 0$ . Then the conditions $ z_1=z_2 = \cdots=z_q = 0$ form a system of $ q$ homogeneous linear equations in the $ p$ unknowns $ y_1, y_2, \ldots, y_p.$ Since $ p > q,$ Theorem 2.5.1 guarantees a nonzero solution $ y_1, y_2, \ldots, y_p.$ For this choice, we get

$\displaystyle \vert y_1\vert^2 + \vert y_2\vert^2 + \cdots + \vert y_p\vert^2 = - (\vert z_{q+1}\vert^2 + \cdots + \vert z_s\vert^2).$

Now, the left hand side is nonnegative and the right hand side is nonpositive, so this can hold only if $ y_1 = y_2 = \cdots = y_p = 0,$ which contradicts the choice of a nonzero solution. Hence $ p = q.$

Similarly, the case $ r > s$ can be resolved. $ \blacksquare$




Note: The integer $ r$ is the rank of the matrix $ A$ and the number $ r- 2 p$ is sometimes called the inertial degree of $ A.$
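
The integers $ p$ and $ r$ can be read off from the signs of the eigenvalues of $ A.$ The sketch below (Python/NumPy, not part of the original text; the tolerance and the random congruence matrix are illustrative choices) computes the pair $ (p, r)$ and checks that it is unchanged under a congruence transformation $ S^* A S$ with $ S$ invertible, which is precisely what Sylvester's law asserts.

    import numpy as np

    def inertia(A, tol=1e-10):
        """Return (p, r): the number of positive eigenvalues and the rank of the Hermitian matrix A."""
        lam = np.linalg.eigvalsh(A)
        p = int(np.sum(lam > tol))
        r = int(np.sum(np.abs(lam) > tol))
        return p, r

    A = np.array([[1, 2 - 1j],
                  [2 + 1j, 2]])                    # Hermitian matrix from Example 6.4.3

    rng = np.random.default_rng(0)
    S = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))   # invertible with probability 1

    print(inertia(A))                              # (1, 2): one positive and one negative eigenvalue
    print(inertia(S.conj().T @ A @ S))             # the same pair (p, r)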

We complete this chapter by understanding the graph of

$\displaystyle a x^2 + 2h x y + b y^2 + 2 f x + 2 g y + c = 0$

for $ a,b,c,f,g,h \in {\mathbb{R}}.$ We first look at the following example.

EXAMPLE 6.4.5   Sketch the graph of $ 3 x^2 + 4 x y + 3 y^2 = 5.$

Solution: Note that

$\displaystyle 3 x^2 + 4 x y + 3 y^2 = [x, \;\;
y] \begin{bmatrix}3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix}.$

The eigenpairs for $ \begin{bmatrix}3 & 2 \\ 2 & 3
\end{bmatrix} $ are $ (5, (1,1)^t), \; (1,
(1,-1)^t).$ Thus,

$\displaystyle \begin{bmatrix}3 & 2 \\ 2 & 3 \end{bmatrix} = \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix}5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix}.$

Let

$\displaystyle \begin{bmatrix}u \\ v \end{bmatrix} = \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix}x \\ y \end{bmatrix} = \begin{bmatrix}\frac{x + y}{\sqrt{2}} \\ \frac{x - y}{\sqrt{2}} \end{bmatrix}.$

Then
$\displaystyle 3 x^2 + 4 x y + 3 y^2 = [x, \;\; y] \begin{bmatrix}3 & 2 \\ 2 & 3 \end{bmatrix}\begin{bmatrix}x\\ y\end{bmatrix} = [x, \;\; y] \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix}5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix}\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix}x\\ y\end{bmatrix} = \bigl[ u, \;\; v \bigr] \begin{bmatrix}5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix}u \\ v \end{bmatrix} = 5 u^2 + v^2.$

Thus the given graph reduces to

$\displaystyle 5 u^2 + v^2 = 5 \;\; {\mbox{ or equivalently }} \;\; u^2 +
\frac{v^2}{5} = 1.$

Therefore, the given graph represents an ellipse with the principal axes $ u = 0$ and $ v = 0.$ That is, the principal axes are

$\displaystyle y + x = 0 {\mbox{ and }} \;\; x - y = 0.$

The eccentricity of the ellipse is $ \;\;e = \frac{2}{\sqrt{5}},$ the foci are at the points $ S_1 = (-\sqrt{2}, \sqrt{2})$ and $ S_2=(\sqrt{2},- \sqrt{2}),$ and the equations of the directrices are $ x - y = \pm \frac{5}{\sqrt{2}}.$

Figure 6.1: Ellipse
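
The change of variables in this example is easy to verify numerically. The sketch below (Python/NumPy, not part of the original text; the test point is arbitrary) diagonalises the matrix of the quadratic form and confirms that $ 3x^2 + 4xy + 3y^2 = 5u^2 + v^2$ under the substitution $ u = \frac{x+y}{\sqrt{2}}, \; v = \frac{x-y}{\sqrt{2}}.$

    import numpy as np

    A = np.array([[3.0, 2.0],
                  [2.0, 3.0]])
    lam, P = np.linalg.eigh(A)              # columns of P are orthonormal eigenvectors
    print(lam)                              # [1. 5.]: eigh lists the eigenvalues in ascending order

    x, y = 1.3, -0.7                        # an arbitrary test point
    u, v = (x + y) / np.sqrt(2), (x - y) / np.sqrt(2)

    lhs = 3 * x**2 + 4 * x * y + 3 * y**2
    rhs = 5 * u**2 + v**2
    print(np.isclose(lhs, rhs))             # True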

DEFINITION 6.4.6 (Associated Quadratic Form)   Let $ a x^2 + 2 h x y + b y^2 + 2 g x + 2 f y + c = 0$ be the equation of a general conic. The quadratic expression

$\displaystyle a x^2 + 2h xy + b y^2 = \bigl[ x, \;\; y \bigr] \begin{bmatrix}a & h \\
h & b \end{bmatrix} \begin{bmatrix}x \\ y \end{bmatrix}$

is called the quadratic form associated with the given conic.

We now consider the general conic. We obtain conditions on the eigenvalues of the associated quadratic form to characterise the different conic sections in $ {\mathbb{R}}^2$ (endowed with the standard inner product).

PROPOSITION 6.4.7   Consider the general conic

$\displaystyle a x^2 + 2 h x y + b y^2 + 2 g x + 2 f y + c = 0.$

This conic represents
  1. an ellipse if $ ab - h^2 > 0,$
  2. a parabola if $ ab - h^2 = 0,$ and
  3. a hyperbola if $ ab - h^2 <0.$

Proof. Let $ A =\begin{bmatrix}a & h \\ h & b \end{bmatrix}.$ Then the associated quadratic form is

$\displaystyle a x^2 + 2 h xy + b y^2 = \bigl[x \;\; y \bigr] A \begin{bmatrix}
x \\ y \end{bmatrix}.$

As $ A$ is a symmetric matrix, by Corollary 6.3.7, the eigenvalues $ {\lambda}_1, {\lambda}_2$ of $ A$ are both real, the corresponding eigenvectors $ {\mathbf u}_1, {\mathbf u}_2$ are orthonormal and $ A$ is unitarily diagonalisable with

$\displaystyle A = \bigl[ {\mathbf u}_1 \;\; {\mathbf u}_2 \bigr] \begin{bmatrix}{\lambda}_1 & 0 \\ 0 & {\lambda}_2 \end{bmatrix} \bigl[ {\mathbf u}_1 \;\; {\mathbf u}_2 \bigr]^t.$ (6.4.2)

Let $ \begin{bmatrix}u \\ v \end{bmatrix} = \bigl[ {\mathbf u}_1 \;\; {\mathbf u}_2 \bigr]^t \begin{bmatrix}x \\ y \end{bmatrix}.$ Then

$\displaystyle a x^2 + 2 h xy + b y^2 = {\lambda}_1 u^2 + {\lambda}_2 v^2$

and the equation of the conic section in the $ (u,v)$ -plane reduces to

$\displaystyle {\lambda}_1 u^2 + {\lambda}_2 v^2 + 2 g_1 u + 2 f_1 v + c = 0.$

Now, depending on the eigenvalues $ {\lambda}_1, {\lambda}_2,$ we consider different cases:
  1. $ {\lambda}_1 = 0 = {\lambda}_2.$
    Substituting $ {\lambda}_1= {\lambda}_2 =0$ in (6.4.2) gives $ A = {\mathbf 0}.$ Thus, the given conic reduces to a straight line $ 2 g_1 u + 2 f_1 v + c = 0$ in the $ (u,v)$ -plane.
  2. $ \lambda_1 = 0, {\lambda}_2 \ne 0.$
    In this case, the equation of the conic reduces to

    $\displaystyle \lambda_2 (v + d_1)^2 = d_2 u + d_3 \;\; {\mbox{ for some }} \;\;
d_1, d_2, d_3 \in {\mathbb{R}}.$

    1. If $ d_2 = d_3 = 0,$ then in the $ (u,v)$ -plane, we get the pair of coincident lines $ v = - d_1$ .
    2. If $ d_2 = 0, \; d_3 \neq 0.$
      1. If $ {\lambda}_2 \cdot d_3 > 0,$ then we get a pair of parallel lines $ v = -d_1 \pm \displaystyle\sqrt{\frac{d_3}{{\lambda}_2}}.$
      2. If $ {\lambda}_2 \cdot d_3 < 0,$ the solution set corresponding to the given conic is an empty set.
    3. If $ d_2 \neq 0,$ then the given equation is of the form $ Y^2 = 4 a X$ for some translates $ X = u + {\alpha}$ and $ Y = v + \beta$ and thus represents a parabola.

      Also, observe that $ \lambda_1 = 0$ implies $ \det (A) = 0.$ That is, $ a b - h^2 = \det(A) = 0.$

  3. $ \lambda_1 > 0$ and $ \lambda_2 < 0.$
    Let $ \lambda_2 = - \alpha_2.$ Then the equation of the conic can be rewritten as

    $\displaystyle \lambda_1 (u + d_1)^2 - \alpha_2 (v + d_2)^2 = d_3 \;\; {\mbox{ for some }}
\;\; d_1, d_2, d_3 \in {\mathbb{R}}.$

    In this case, we have the following:
    1. Suppose $ d_3 = 0.$ Then the equation of the conic reduces to

      $\displaystyle \lambda_1 (u + d_1)^2 - \alpha_2 (v + d_2)^2= 0.$

      The left hand side factors as $ \bigl(\sqrt{\lambda_1}(u+d_1) - \sqrt{\alpha_2}(v+d_2)\bigr)\bigl(\sqrt{\lambda_1}(u+d_1) + \sqrt{\alpha_2}(v+d_2)\bigr)$ since $ {\lambda}_1, {\alpha}_2 > 0.$ Thus, in this case, the given equation represents a pair of intersecting straight lines in the $ (u,v)$ -plane.
    2. Suppose $ d_3 \neq 0.$ Without loss of generality, we can assume $ d_3 > 0.$ So, the equation of the conic reduces to

      $\displaystyle \frac{\lambda_1 (u + d_1)^2}{d_3} - \frac{\alpha_2 (v + d_2)^2}{d_3}= 1.$

      This equation represents a hyperbola in the $ (u,v)$ -plane, with principal axes

      $\displaystyle u + d_1 = 0 {\mbox{ and }} \;\; v + d_2 = 0.$

    As $ {\lambda}_1 {\lambda}_2 < 0,$ we have

    $\displaystyle ab - h^2 = \det(A) = {\lambda}_1 {\lambda}_2 < 0.$

  4. $ \lambda_1, \lambda_2 > 0.$
    In this case, the equation of the conic can be rewritten as

    $\displaystyle \lambda_1 (u + d_1)^2 + \lambda_2 (v + d_2)^2 = d_3, \;\; {\mbox{ for some }}
\;\; d_1, d_2, d_3 \in {\mathbb{R}}.$

    We now consider the following cases:
    1. Suppose $ d_3 = 0.$ Then the equation reduces to $ \lambda_1 (u + d_1)^2 + \lambda_2 (v + d_2)^2 = 0.$ As $ {\lambda}_1, {\lambda}_2 > 0,$ this forces $ u + d_1 = 0$ and $ v + d_2 = 0,$ so the solution set is the single point $ (-d_1, -d_2)$ in the $ (u,v)$ -plane, the point of intersection of the perpendicular lines $ u + d_1 = 0$ and $ v + d_2 = 0.$
    2. Suppose $ d_3 < 0.$ Then the given equation has no real solution. Hence, we do not get any real ellipse in the $ (u,v)$ -plane.
    3. Suppose $ d_3 > 0.$ In this case, the equation of the conic reduces to

      $\displaystyle \frac{\lambda_1 (u + d_1)^2}{d_3} + \frac{\lambda_2 (v + d_2)^2}{d_3}= 1.$

      This equation represents an ellipse in the $ (u,v)$ -plane, with principal axes

      $\displaystyle u + d_1 = 0 {\mbox{ and }} \;\; v + d_2 = 0.$

    Also, the condition $ {\lambda}_1 {\lambda}_2 > 0$ implies that

    $\displaystyle ab - h^2 = \det(A) = {\lambda}_1 {\lambda}_2 > 0.$

$ \blacksquare$
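
As a small illustration of the proposition (a Python sketch, not part of the original text; it only reports the generic type and ignores the degenerate cases analysed in the proof), the sign of $ ab - h^2 = \det(A)$ can be computed directly from the coefficients of the quadratic form.

    def classify_conic(a, h, b):
        """Classify a x^2 + 2h xy + b y^2 + ... = 0 by the sign of ab - h^2 (degenerate cases ignored)."""
        d = a * b - h * h                # equals det([[a, h], [h, b]])
        if d > 0:
            return "ellipse"
        if d == 0:
            return "parabola"
        return "hyperbola"

    print(classify_conic(3, 2, 3))       # 'ellipse'   (Example 6.4.5)
    print(classify_conic(1, 1, 1))       # 'parabola'  (Exercise 6.4.9.1: x^2 + 2xy + y^2 + ...)
    print(classify_conic(2, 3, 3))       # 'hyperbola' (Exercise 6.4.9.2: ab - h^2 = 6 - 9 < 0)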

Remark 6.4.8   Observe that the condition

$\displaystyle \begin{bmatrix}u \\ v \end{bmatrix} = \bigl[ {\mathbf u}_1 \;\; {\mathbf u}_2 \bigr]^t \begin{bmatrix}x \\ y \end{bmatrix}$

implies that the principal axes of the conic are functions of the eigenvectors $ {\mathbf u}_1$ and $ {\mathbf u}_2.$

EXERCISE 6.4.9   Sketch the graph of the following conics:
  1. $ x^2 + 2 x y + y^2 - 6 x - 10 y = 3.$
  2. $ 2x^2 + 6 x y + 3 y^2 -12 x -6 y = 5.$
  3. $ 4x^2 - 4 xy + 2 y^2 +12 x - 8 y = 10.$
  4. $ 2 x^2 - 6 xy +5 y^2 - 10 x + 4 y = 7.$

As a last application, we consider the following problem, which helps us understand quadrics. Let

$\displaystyle a x^2 + b y^2 + c z^2 + 2 d x y + 2e x z + 2 f y z + 2 l x + 2 m y + 2 n z + q = 0$ (6.4.3)

be a general quadric. Then we follow the steps given below to write the above quadric in standard form and thereby determine its shape (a computational sketch of these steps is given after the list). The steps are:
  1. Observe that this equation can be rewritten as

    $\displaystyle {\mathbf x}^t A {\mathbf x}+ {\mathbf b}^t {\mathbf x}+ q = 0,$

    where

    $\displaystyle A = \begin{bmatrix}a & d & e \\ d & b & f \\ e & f & c \end{bmatrix}, \;\; {\mathbf b}= \begin{bmatrix}2l \\ 2m \\ 2n \end{bmatrix} \;\; {\mbox{ and }} \;\; {\mathbf x}= \begin{bmatrix}x \\ y \\ z \end{bmatrix}.$

  2. As the matrix $ A$ is a symmetric matrix, find an orthogonal matrix $ P$ such that $ P^t A P$ is a diagonal matrix.
  3. Replace the vector $ {\mathbf x}$ by $ {\mathbf y}= P^t {\mathbf x}.$ Then writing $ {\mathbf y}^t = (y_1, y_2, y_3),$ the equation (6.4.3) reduces to

    $\displaystyle {\lambda}_1 y_1^2 + {\lambda}_2 y_2^2 + {\lambda}_3 y_3^2 + 2 l_1 y_1 + 2 l_2 y_2 + 2 l_3 y_3 + q^\prime = 0$ (6.4.4)

    where $ {\lambda}_1, \; {\lambda}_2, {\lambda}_3$ are the eigenvalues of $ A.$
  4. Complete the squares, if necessary, to write the equation (6.4.4) in terms of the variables $ z_1, z_2, z_3$ so that this equation is in the standard form.
  5. Use the condition $ {\mathbf y}= P^t {\mathbf x}$ to determine the centre and the planes of symmetry of the quadric in terms of the original system.
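
The computational sketch promised above (Python/NumPy, not part of the original text; it assumes all three eigenvalues are nonzero, i.e., a central quadric) follows the five steps literally: it builds $ A$ and $ {\mathbf b},$ diagonalises $ A,$ completes the squares and returns the centre. Applied to the quadric of Example 6.4.10 below, it reproduces the reduction obtained there by hand.

    import numpy as np

    def reduce_quadric(a, b, c, d, e, f, l, m, n, q):
        """Reduce a x^2 + b y^2 + c z^2 + 2d xy + 2e xz + 2f yz + 2l x + 2m y + 2n z + q = 0."""
        A = np.array([[a, d, e],
                      [d, b, f],
                      [e, f, c]], dtype=float)
        bvec = np.array([2 * l, 2 * m, 2 * n], dtype=float)

        lam, P = np.linalg.eigh(A)            # step 2: P^t A P = diag(lam)
        lin = bvec @ P                        # step 3: linear coefficients 2*l_i in the y-variables
        shift = lin / (2 * lam)               # step 4: complete the squares (assumes lam_i != 0)
        rhs = np.sum(lam * shift ** 2) - q    # lam_1 z_1^2 + lam_2 z_2^2 + lam_3 z_3^2 = rhs
        centre = P @ (-shift)                 # step 5: centre in the original (x, y, z) coordinates
        return lam, centre, rhs

    # The quadric of Example 6.4.10: 2x^2 + 2y^2 + 2z^2 + 2xy + 2xz + 2yz + 4x + 2y + 4z + 2 = 0.
    lam, centre, rhs = reduce_quadric(2, 2, 2, 1, 1, 1, 2, 1, 2, 2)
    print(lam)       # [1. 1. 4.]
    print(centre)    # [-0.75  0.25 -0.75]
    print(rhs)       # 0.75, i.e. 9/12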

EXAMPLE 6.4.10   Determine the quadric $ 2 x^2 + 2 y^2 + 2 z^2 + 2 x y + 2 x z + 2 y z + 4 x + 2 y + 4 z + 2 = 0.$
Solution: In this case, $ A = \begin{bmatrix}2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$ , $ {\mathbf b}= \begin{bmatrix}4 \\ 2 \\ 4 \end{bmatrix}$ and $ q = 2$ . Check that for the orthogonal matrix $ P = \begin{bmatrix}\frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{-1}{\sqrt{2}} & \frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & \frac{-2}{\sqrt{6}} \end{bmatrix}$ , we have $ P^t A P = \begin{bmatrix}4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.$ So, writing $ (y_1, y_2, y_3)^t = P^t (x, y, z)^t,$ the equation of the quadric reduces to

$\displaystyle 4 y_1^2 + y_2^2 + y_3^2 + \frac{10}{\sqrt{3}} y_1 + \frac{2}{\sqrt{2}} y_2 - \frac{2}{\sqrt{6}} y_3 + 2 = 0.$

Or equivalently,

$\displaystyle 4(y_1 + \frac{5}{4 \sqrt{3}})^2 + (y_2 + \frac{1}{\sqrt{2}})^2 + (y_3 - \frac{1}{\sqrt{6}})^2 = \frac{9}{12}.$

So, the equation of the quadric in standard form is

$\displaystyle 4 z_1^2 + z_2^2 + z_3^2 = \frac{9}{12},$

where the point $ (x, y, z)^t = P ( \frac{-5}{4 \sqrt{3}}, \frac{-1}{\sqrt{2}}, \frac{1}{\sqrt{6}})^t = ( \frac{-3}{4}, \frac{1}{4}, \frac{-3}{4})^t$ is the centre. Since all the coefficients in the standard form are positive, the quadric is an ellipsoid. The calculation of the planes of symmetry is left as an exercise to the reader.
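
A short numerical check of the matrix $ P$ and of the centre computed above (a Python/NumPy sketch, not part of the original text):

    import numpy as np

    s3, s2, s6 = np.sqrt(3), np.sqrt(2), np.sqrt(6)
    P = np.array([[1/s3,  1/s2,  1/s6],
                  [1/s3, -1/s2,  1/s6],
                  [1/s3,  0.0,  -2/s6]])
    A = np.array([[2.0, 1.0, 1.0],
                  [1.0, 2.0, 1.0],
                  [1.0, 1.0, 2.0]])

    print(np.allclose(P.T @ P, np.eye(3)))            # True: P is orthogonal
    print(np.round(P.T @ A @ P, 10))                  # diag(4, 1, 1)
    print(P @ np.array([-5/(4*s3), -1/s2, 1/s6]))     # the centre: [-0.75  0.25 -0.75]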
