Diagonalizability

In linear algebra, diagonalizability refers to the ability to diagonalize a matrix $A$, i.e. the ability to find an invertible matrix $P$ and a diagonal matrix $D$ such that

\[A=P D P^{-1}\]

Along the way, this article also shows how to compute the eigenvalues and eigenvectors of a matrix.

How to diagonalize a matrix

Given an arbitrary $n \times n$ matrix $A$, one calculates the eigenvalues of $A$ by finding the solutions of the equation

\[\text{det} (A - \lambda I_n)=0\]

where $I_n$ denotes the $n \times n$ identity matrix and $\text{det}$ denotes the determinant of the matrix.
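This criterion comes directly from the definition of an eigenvalue: $\lambda$ is an eigenvalue of $A$ exactly when $A v = \lambda v$ has a nonzero solution $v$, that is, when $(A - \lambda I_n) v = 0$ has a nonzero solution, which happens precisely when the matrix $A - \lambda I_n$ is singular, i.e. when $\text{det} (A - \lambda I_n) = 0$.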

The polynomial $\text{det} (A - \lambda I_n)$ is known as the characteristic polynomial of the matrix.
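For instance, for a general $2 \times 2$ matrix the characteristic polynomial expands to

\[\text{det} \begin{bmatrix} a - \lambda & b \\ c & d - \lambda \\ \end{bmatrix} = \lambda ^2 - (a+d) \lambda + (ad-bc)\]

so its roots can be found with the quadratic formula.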

Now, order the eigenvalues (they may be complex). Counting multiplicities, there are $n$ of them: $\lambda _1 , \lambda _2 , \dots , \lambda _n$. Then, for each eigenvalue, find a corresponding eigenvector. If you cannot find $n$ linearly independent eigenvectors, then the next step fails ($P$ would not be invertible) and $A$ is not diagonalizable. Suppose instead that one obtains $n$ linearly independent eigenvectors $v_1, v_2, \dots , v_n$, with $v_i$ corresponding to $\lambda _i$; a repeated eigenvalue must contribute as many linearly independent eigenvectors as its multiplicity. Keeping the eigenvectors in the same order as the eigenvalues is essential. Then the matrix

\[P= \begin{bmatrix} v_1 & v_2 & \dots & v_n \\ \end{bmatrix}\]

and the diagonal matrix

\[D= \begin{bmatrix} \lambda _1 & 0 & \cdots & 0 \\ 0 & \lambda _2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda _n \\ \end{bmatrix}\]

satisfy the desired relation $A=P D P^{-1}$. (Note that the $v_i$ are column vectors.)
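To see why this construction works, note that $A = P D P^{-1}$ is equivalent to $AP = PD$. The $i$th column of $AP$ is $A v_i$, while the $i$th column of $PD$ is $\lambda _i v_i$, so the factorization holds exactly when $A v_i = \lambda _i v_i$ for every $i$, i.e. when each $v_i$ is an eigenvector for $\lambda _i$.

For experimentation, here is a minimal Python sketch of the procedure using NumPy (np.linalg.eig returns the eigenvalues together with a matrix whose columns are corresponding unit-length eigenvectors, so it produces $P$ and the diagonal of $D$ in one call). This is an illustration of the recipe above, not a robust general-purpose routine:

    import numpy as np

    A = np.array([[7.0, 2.0],
                  [-4.0, 1.0]])

    # Columns of P are eigenvectors; eigvals[i] pairs with the column P[:, i].
    eigvals, P = np.linalg.eig(A)
    D = np.diag(eigvals)

    # Verify the factorization A = P D P^{-1}.
    print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True

Because NumPy normalizes each eigenvector to unit length, its $P$ differs from a hand-computed one by a scaling of each column, but the product $P D P^{-1}$ comes out the same.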


It is best to illustrate with an example. Consider the matrix

\[A= \begin{bmatrix} 7 & 2 \\ -4 & 1 \\ \end{bmatrix}\]

(Note that we are using a small $2 \times 2$ matrix since it is easier to follow. In practice, you will most often encounter $3 \times 3$ matrices.)

We solve its characteristic equation:

\[\text{det} (A - \lambda I_2 )=0\]

\[\text{det} \begin{bmatrix} 7 - \lambda & 2 \\ -4 & 1 - \lambda \\ \end{bmatrix} = 0\]

\[\implies (7- \lambda )(1- \lambda ) - (2)(-4)=0\]

\[\implies \lambda ^2 -8 \lambda +15=0\]

\[\implies (\lambda -3)(\lambda -5)=0\]

\[\implies \lambda = 3, 5\]

We obtain eigenvalues of $5$ and $3$. Thus, we have $D= \begin{bmatrix} 5 & 0 \\ 0 & 3 \\ \end{bmatrix}$. Now we calculate the eigenvectors.

To do so, we first calculate the matrix $A- \lambda _1 I_2$. For $\lambda _1 = 5$, this is

\[\begin{bmatrix} 2 & 2 \\ -4 & -4 \\ \end{bmatrix}\]

We then calculate the solutions of the equation

\[(A- \lambda I_2) v=0\]

(here the unknown is the vector $v$). Normally, this would involve row reducing an augmented matrix, but in this case the system is simple enough that one can spot a solution by inspection. We see that $\begin{bmatrix} 1 \\ -1 \\ \end{bmatrix}$ is an eigenvector for $5$. Similarly, we can find rather easily that $\begin{bmatrix} 1 \\ -2 \\ \end{bmatrix}$ is an eigenvector for the eigenvalue $3$. Adjoining these eigenvectors as the columns of $P$, we see that

\[A=P D P^{-1}\]

for matrices

\[P= \begin{bmatrix} 1 & 1 \\ -1 & -2 \\ \end{bmatrix}\]

and

\[D= \begin{bmatrix} 5 & 0 \\ 0 & 3 \\ \end{bmatrix}\]

One can verify that this is true.
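Indeed, since $\text{det} P = (1)(-2) - (1)(-1) = -1$, the inverse is $P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & -1 \\ \end{bmatrix}$, and multiplying out gives

\[P D P^{-1} = \begin{bmatrix} 1 & 1 \\ -1 & -2 \\ \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 3 \\ \end{bmatrix} \begin{bmatrix} 2 & 1 \\ -1 & -1 \\ \end{bmatrix} = \begin{bmatrix} 5 & 3 \\ -5 & -6 \\ \end{bmatrix} \begin{bmatrix} 2 & 1 \\ -1 & -1 \\ \end{bmatrix} = \begin{bmatrix} 7 & 2 \\ -4 & 1 \\ \end{bmatrix} = A\]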

Uses of Diagonalization

As one saw, diagonalizing a matrix takes some work. So why do people do it? Because, as one can verify, powers of a diagonal matrix are extremely easy to compute (simply raise each diagonal entry to the power), while powers of an arbitrary matrix are tedious to compute directly.
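The key identity is that conjugation passes through powers, because the inner factors $P^{-1} P$ cancel when the power is expanded:

\[A^k = (P D P^{-1})^k = P D (P^{-1} P) D (P^{-1} P) \cdots (P^{-1} P) D P^{-1} = P D^k P^{-1}\]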

To illustrate, returning to the original example, one sees that to compute $A^{10}$, one has

\begin{align*} A^{10} &= (P D P^{-1})^{10} \\ &= P D^{10} P^{-1} \\ &= \begin{bmatrix} 1 & 1 \\ -1 & -2 \\ \end{bmatrix}  \begin{bmatrix} 5 & 0 \\ 0 & 3 \\ \end{bmatrix} ^{10} \begin{bmatrix} 1 & 1 \\ -1 & -2 \\ \end{bmatrix} ^{-1} \\ &= \begin{bmatrix} 1 & 1 \\ -1 & -2 \\ \end{bmatrix} \begin{bmatrix} 5^{10} & 0 \\ 0 & 3^{10} \\ \end{bmatrix} \begin{bmatrix} 2 & 1 \\ -1 & -1 \\ \end{bmatrix} \\ &= \begin{bmatrix} 19472201 & 9706576 \\ -19413152 & -9647527 \\ \end{bmatrix} \end{align*}

Despite what you might think, this computation is much simpler than repeated old-fashioned matrix multiplication (try computing $A^{10}$ directly to compare). Once matrix powers are cheap, one can also compute matrix limits, which can be used to analyze Markov chains (the details are beyond the scope of this article).
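As a sanity check, here is a short NumPy snippet (a sketch reusing the hand-computed $P$, $P^{-1}$, and $D$ from the example above) comparing the diagonalization shortcut against direct repeated multiplication:

    import numpy as np

    # Hand-computed factors from the example above.
    A = np.array([[7, 2], [-4, 1]])
    P = np.array([[1, 1], [-1, -2]])
    P_inv = np.array([[2, 1], [-1, -1]])
    d = np.array([5, 3])  # diagonal entries of D

    # Raising D to a power only raises its diagonal entries.
    A10 = P @ np.diag(d ** 10) @ P_inv

    # np.linalg.matrix_power multiplies A by itself repeatedly.
    print(np.array_equal(A10, np.linalg.matrix_power(A, 10)))  # True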

See Also

This article is a stub. Help us out by expanding it.