The '''characteristic polynomial''' of a linear [[operator]] refers to the [[polynomial]] whose roots are the [[eigenvalue]]s of the operator. It carries much information about the operator; for instance, its coefficients encode the operator's [[trace]] and [[determinant]].
  
 
== Definition ==
Suppose <math>A</math> is an <math>n \times n</math> [[matrix]] (over a field <math>K</math>). Then the characteristic polynomial of <math>A</math> is defined as <math>P_A(t) = \det(tI - A)</math>, which is an <math>n</math>th degree polynomial in <math>t</math>. Here, <math>I</math> refers to the <math>n\times n</math> [[identity matrix]].
  
Written out, the characteristic polynomial is the [[determinant]]
  
 
<center><cmath>P_A(t) = \begin{vmatrix}t-a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & t-a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ -a_{n1} & -a_{n2} & \cdots & t-a_{nn}\end{vmatrix}</cmath></center>
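The definition is easy to experiment with on a computer. Below is a minimal Python sketch using the sympy library (an assumption; any computer algebra system works) that computes a characteristic polynomial directly from <math>\det(tI - A)</math>.

<pre>
from sympy import Matrix, eye, symbols

t = symbols('t')

# A sample 2x2 matrix
A = Matrix([[2, 1],
            [1, 2]])

# Characteristic polynomial straight from the definition det(tI - A)
p = (t * eye(2) - A).det()
print(p.expand())      # t**2 - 4*t + 3

# sympy also computes this directly
print(A.charpoly(t))   # PurePoly(t**2 - 4*t + 3, t, domain='ZZ')
</pre>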
== Properties ==
 
An [[eigenvector]] <math>\bold{v} \in K^n</math> is a non-zero vector that satisfies the relation <math>A\bold{v} = \lambda\bold{v}</math>, for some scalar <math>\lambda \in K</math>. In other words, applying a linear operator to an eigenvector causes the eigenvector to dilate. The associated number <math>\lambda</math> is called the [[eigenvalue]].
  
The eigenvalues of a square matrix are exactly the roots of its characteristic polynomial; in particular, there are at most <math>n</math> distinct eigenvalues. To prove this, we use the fact that the determinant of a matrix is <math>0</math> [[iff]] the column vectors of the matrix are [[linearly dependent]]. Observe that if <math>\lambda</math> satisfies <math>\det(\lambda I-A) = 0</math>, then the column vectors of <math>\lambda I - A</math> are linearly dependent. Hence, there exists a non-zero vector <math>\bold{v} \in K^n</math> such that <math>(\lambda I - A) \bold{v} = \bold{O}</math>. Distributing and rearranging, <math>A\bold{v} = \lambda\bold{v}</math>, as desired. In the other direction, if <math>A \bold{v} = \lambda \bold{v}</math> for some non-zero <math>\bold{v}</math>, then <math>\lambda I \bold{v} - A \bold{v} = \bold{O}</math>. But then, the column vectors of <math>\lambda I - A</math> are linearly dependent, so it follows that <math>\det(\lambda I - A) = 0</math>.
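Continuing the sketch above (again assuming sympy is available), one can check that the eigenvalues returned by the library are exactly the roots of the characteristic polynomial, and that each eigenvector satisfies <math>A\bold{v} = \lambda\bold{v}</math>:

<pre>
from sympy import Matrix, symbols, roots

t = symbols('t')
A = Matrix([[2, 1],
            [1, 2]])

# Roots of the characteristic polynomial, with multiplicities
print(roots(A.charpoly(t).as_expr(), t))   # {3: 1, 1: 1}

# They agree with the eigenvalues, and each eigenvector satisfies A*v = lam*v
for lam, mult, vecs in A.eigenvects():
    for v in vecs:
        assert A * v == lam * v
</pre>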
  
 
By the [[Hamilton-Cayley Theorem]], the characteristic polynomial of a square matrix applied to the square matrix itself is zero; that is, <math>P_A(A) = \bold{O}</math>.
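As a quick sanity check (a sympy sketch, not a proof), substituting the sample matrix above into its own characteristic polynomial <math>t^2 - 4t + 3</math> does give the zero matrix:

<pre>
from sympy import Matrix, eye, zeros

A = Matrix([[2, 1],
            [1, 2]])

# P_A(A) = A^2 - 4A + 3I should be the zero matrix
assert A**2 - 4*A + 3*eye(2) == zeros(2, 2)
</pre>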
  
 
== Linear recurrences ==
Consider a monic [[homogenous|homogeneous]] [[linear recurrence]] of the form

<center><cmath>x_{n} = c_1x_{n-1} + c_2x_{n-2} + \cdots + c_kx_{n-k}, \quad (*)</cmath></center>

where <math>x_n</math> is a sequence of real numbers, and <math>c_1, \ldots, c_k</math> are real constants. The characteristic polynomial of this recurrence is the polynomial

<center><cmath>P(x) = x^k - c_1x^{k-1} - c_2x^{k-2} - \cdots - c_{k-1}x - c_k.</cmath></center>
For example, let <math>F_n</math> be the <math>n</math>th [[Fibonacci number]] defined by <math>F_1 = F_2 = 1</math>, and

<center><cmath>F_n = F_{n-1} + F_{n-2} \Longleftrightarrow F_n - F_{n-1} - F_{n-2} = 0.</cmath></center>

Then, its characteristic polynomial is <math>x^2 - x - 1</math>. <br><br>
The roots of the polynomial can be used to write a closed form for the recurrence. If the roots of this polynomial <math>P</math> are distinct, say <math>r_1, r_2, \cdots, r_k</math>, then there exist real constants <math>a_1, a_2, \cdots, a_k</math> such that

<center><cmath>x_n = a_1 \cdot r_1^n + a_2 \cdot r_2^n + \cdots + a_k \cdot r_k^n.</cmath></center>

If we evaluate <math>k</math> different values of <math>x_i</math> (typically <math>x_0, x_1, \cdots, x_{k-1}</math>), we can find a linear system in the <math>a_i</math>s that can be solved for each constant. Refer to the [[#Problems|problems]] below to see an example of how to do this. In particular, for the Fibonacci numbers, this yields [[Binet's formula]]. <br><br>
If there are roots with multiplicity greater than <math>1</math>, suppose <math>r_1 = r_2 = \cdots = r_{k_1}</math>. Then we would replace the term <math>a_1 \cdot r_1^n + a_2 \cdot r_2^n + \cdots + a_{k_1} \cdot r_{k_1}^n</math> with the expression <math>(a_1 \cdot n^{k_1-1} + a_2 \cdot n^{k_1 - 2} + \cdots + a_{k_1-1} \cdot n + a_{k_1}) \cdot r_1^n</math>.

For example, consider the recurrence relation <math>x_n = -2x_{n-1} - x_{n-2} \Longleftrightarrow x_n + 2x_{n-1} + x_{n-2} = 0</math>. Its characteristic polynomial, <math>x^2 + 2x + 1</math>, has a double root at <math>-1</math>. Then, its closed form solution is of the type <math>x_n = (-1)^n(an + b)</math>.
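To illustrate the distinct-roots procedure, the following Python sketch (assuming the sympy library) finds the roots of <math>x^2 - x - 1</math>, solves the linear system for <math>a_1, a_2</math> from <math>F_1 = F_2 = 1</math>, and checks that the resulting closed form reproduces the Fibonacci numbers:

<pre>
from sympy import symbols, solve, simplify

x, n, a1, a2 = symbols('x n a1 a2')

# Roots of the characteristic polynomial x^2 - x - 1
r1, r2 = solve(x**2 - x - 1, x)

# Linear system from the initial values F_1 = F_2 = 1
sol = solve([a1*r1 + a2*r2 - 1,
             a1*r1**2 + a2*r2**2 - 1], [a1, a2])

# Closed form F_n = a1*r1^n + a2*r2^n (Binet's formula)
F = sol[a1]*r1**n + sol[a2]*r2**n
print([simplify(F.subs(n, k)) for k in range(1, 9)])   # [1, 1, 2, 3, 5, 8, 13, 21]
</pre>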
=== Proof ===
Note: The ideas expressed in this section can be transferred to the next section about differential equations.
The following proof is from [[linear algebra]]; it uses the fact that a matrix with distinct eigenvalues is diagonalizable. Let <math>\bold{y}_{n-1} = \begin{pmatrix} x_{n-1} \\ x_{n-2} \\ \vdots \\ x_{n-k}  \end{pmatrix}</math>. Then, we can express our linear recurrence as the matrix

<center><cmath>A = \begin{pmatrix}c_1 & c_2 & c_3 & \cdots & c_{k-1} & c_{k} \\ 1 & 0 & 0 & \cdots &0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{pmatrix},</cmath></center>

so that <math>\bold{y}_n = A\bold{y}_{n-1}</math> (try to verify this). The characteristic polynomial of <math>A</math> is precisely <math>P_A(t) = \det (tI - A) = t^k - c_1t^{k-1} - c_2t^{k-2} - \cdots - c_k</math> (again, try to verify this). If the roots of <math>P_A</math> are distinct, then there exists a [[basis]] (of <math>\mathbb{R}^{k}</math>) consisting of eigenvectors of <math>A</math>. That is, we can write
<center><cmath>A = U^{-1} D U</cmath></center>

for an invertible matrix <math>U</math> and a diagonal matrix <math>D</math>. Then, <math>A^2 = U^{-1} D U U^{-1} D U = U^{-1} D^2 U</math>, and in general, <math>A^n = U^{-1} D^n U</math>. Thus, <math>\bold{y}_{n} = A^{n}\bold{y}_0 = U^{-1}D^{n}U\bold{y}_0</math>. Here, <math>U^{-1}, U,</math> and <math>\bold{y}_0</math> are fixed (note that to find the values of <math>\bold{y}_0</math>, we may need to trace the recurrence backwards; we only take the <math>0</math>th index for simplicity). It follows that <math>x_{n}</math> is a linear combination of the <math>n</math>th powers of the diagonal entries of <math>D</math>, namely <math>r_1^n, r_2^n, \ldots, r_k^n</math>.
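To make the matrix formulation concrete, here is a small sympy sketch (for the Fibonacci recurrence, so <math>k = 2</math> and <math>c_1 = c_2 = 1</math>; the library is an assumption) that builds <math>A</math>, confirms its characteristic polynomial, and diagonalizes it:

<pre>
from sympy import Matrix, symbols, simplify, zeros

t = symbols('t')

# The matrix A for x_n = x_{n-1} + x_{n-2}
A = Matrix([[1, 1],
            [1, 0]])

print(A.charpoly(t))   # PurePoly(t**2 - t - 1, ...), i.e. t^k - c_1*t^(k-1) - c_2

# Diagonalize: A = P*D*P^(-1), hence A^n = P*D^n*P^(-1)
P, D = A.diagonalize()
assert simplify(P * D * P.inv() - A) == zeros(2, 2)

# y_1 = (x_1, x_0) = (1, 0); repeated multiplication walks the recurrence forward
y = Matrix([1, 0])
print((A**9 * y).T)    # Matrix([[55, 34]]), consecutive Fibonacci numbers
</pre>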
----

There are a couple of other ways to prove this. One is by [[induction]], though the proof is not very revealing; we can explicitly check that a sequence <math>\{a_1 \cdot r_1^n + a_2 \cdot r_2^n + \cdots + a_k \cdot r_k^n\}</math>, for real numbers <math>a_1, a_2, \ldots, a_k</math>, satisfies the linear recurrence relation <math>(*)</math>. If the two sequences are the same for the first <math>k</math> values of the sequence, it follows by induction that the two sequences must be exactly the same.
In particular, for <math>k=2</math>, we can check this by using the identity

<center><cmath>ax^{n+1} + by^{n+1} = (x+y)(ax^n + by^n) - xy(ax^{n-1} + by^{n-1}).</cmath></center> <br>
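The identity itself can be verified by direct expansion; a short symbolic check (again assuming sympy) confirms it:

<pre>
from sympy import symbols, expand

a, b, x, y, n = symbols('a b x y n')

lhs = a*x**(n+1) + b*y**(n+1)
rhs = (x + y)*(a*x**n + b*y**n) - x*y*(a*x**(n-1) + b*y**(n-1))

# Expanding the difference cancels every term
assert expand(lhs - rhs) == 0
</pre>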
A second method is to use [[partial fractions]]. Consider the [[generating function]] given by

<center><cmath>F(x) = \sum_{n \ge 0} x_nx^n = \frac{Q(x)}{1 - c_1x - c_2x^2 - \cdots - c_kx^k},</cmath></center>

where <math>Q</math> is a polynomial of degree less than <math>k</math> determined by the initial terms of the sequence. The denominator is <math>x^kP(1/x)</math>, so decomposing <math>F</math> into partial fractions over its roots <math>1/r_1, \ldots, 1/r_k</math> and expanding each term as a geometric series recovers the closed form for <math>x_n</math>.
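For instance, the generating function of the Fibonacci numbers is <math>x/(1 - x - x^2)</math>, and expanding it as a power series (a sympy sketch, assuming the library) recovers the sequence as coefficients:

<pre>
from sympy import symbols, series

x = symbols('x')

# Generating function for the Fibonacci numbers (with F_0 = 0)
F = x / (1 - x - x**2)
print(series(F, x, 0, 8))   # x + x**2 + 2*x**3 + 3*x**4 + 5*x**5 + 8*x**6 + 13*x**7 + O(x**8)
</pre>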
See these [http://www.artofproblemsolving.com/Forum/weblog_entry.php?t=250866 blog] [http://www.artofproblemsolving.com/Forum/weblog_entry.php?t=215833 posts] for further ideas.
== Differential equations ==
Given a monic linear [[homogenous|homogeneous]] [[differential equation]] of the form <math>D^ny +c_{n-1}D^{n-1}y + \cdots + c_1Dy + c_0y = 0</math>, the characteristic polynomial of the equation is the polynomial
<center><cmath>P(t) = t^n + c_{n-1}t^{n-1} + c_{n-2}t^{n-2} + \cdots + c_0.</cmath></center>

Here, <math>D = \frac{d}{dx}</math> is shorthand for the differential operator. <br><br>
If the roots of the polynomial are distinct, say <math>r_1, r_2, \cdots, r_n</math>, then the solutions of this differential equation are precisely the linear combinations <math>y(x) = a_1e^{r_1x} + a_2e^{r_2x} + \cdots + a_ne^{r_nx}</math>. Similarly, if there is a root of multiplicity greater than <math>1</math>, say <math>r_1 = r_2 = \cdots = r_k</math>, then we would replace the term <math>a_1e^{r_1x} + \cdots + a_ke^{r_kx}</math> with the expression <math>(a_1x^{k-1} + a_2x^{k-2} + \cdots + a_{k-1}x + a_k)e^{r_1x}</math>. <br><br>
In general, given a linear differential equation of the form <math>Ly = f</math>, where <math>L = c_nD^n + c_{n-1}D^{n-1} + \cdots + c_0</math> is a linear differential operator, the general solution is the sum of any particular solution to <math>Ly = f</math> and the general solution to the [[homogenous|homogeneous]] equation <math>Ly = 0</math>.
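Computer algebra systems implement exactly this recipe. For instance, a sympy sketch (assuming the library) solves the double-root analogue of the earlier recurrence example, <math>y'' + 2y' + y = 0</math>:

<pre>
from sympy import Function, Eq, dsolve, symbols

x = symbols('x')
y = Function('y')

# y'' + 2y' + y = 0 has characteristic polynomial t^2 + 2t + 1 = (t + 1)^2
ode = Eq(y(x).diff(x, 2) + 2*y(x).diff(x) + y(x), 0)
print(dsolve(ode, y(x)))   # Eq(y(x), (C1 + C2*x)*exp(-x))
</pre>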
=== Proof ===
We can apply an induction technique similar to the section on linear recurrences above.

From linear algebra, we can use the following fact. The operator factors as <math>D^n + c_{n-1}D^{n-1} + \cdots + c_0 = (D - r_1)(D - r_2) \cdots (D - r_n)</math>, and when the roots <math>r_i</math> are distinct, the [[kernel]] of the product is the direct sum of the kernels of the individual factors. Since the kernel of <math>D - r_i</math> consists precisely of the scalar multiples of <math>e^{r_ix}</math> (solve <math>y' = r_iy</math>), the solution set is exactly the span of <math>e^{r_1x}, \ldots, e^{r_nx}</math>.
  
 
== Problems ==
*Prove [[Binet's formula]]. Find a similar closed form equation for the [[Lucas sequence]], defined by the starting terms <math>L_1 = 2, L_2 = 1</math>, and <math>L_n = L_{n-1} + L_{n-2}</math>.
  
 
{{stub}}

[[Category:Linear algebra]]
