China Post 1: The problem of repeated roots
by t0rajir0u, Jul 19, 2008, 12:00 PM
This is the first in a series of three posts I'll have written while on vacation in China. Enjoy!
In my previous discussion of linear recurrences I restricted my attention to characteristic polynomials with no repeated roots. The structure of the corresponding solutions (sums of geometric series) was then straightforward and seen to be an exact analogue of partial fraction decomposition, among other phenomena. Let's explore what happens when this restriction is lifted.
The solutions to the corresponding recurrences surprised me when I first encountered them in the context of linear ODEs. For example, the characteristic polynomial $x^2 - 2x + 1 = (x - 1)^2$ has repeated roots, and the general solution to $a_{n+2} = 2a_{n+1} - a_n$ is given by $a_n = A + Bn$ for some constants $A$ and $B$. Correspondingly, the general solution to the differential equation $y'' - 2y' + y = 0$ is given by $y(t) = (A + Bt)e^t$. Plugging in and verifying is simple; are there deeper reasons why the repeated-root solutions take this form?
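(As a quick sanity check - a minimal sketch in Python, not part of the original post; the initial conditions are arbitrary choices:)

```python
# Verify that a_n = A + B*n solves a_{n+2} = 2*a_{n+1} - a_n.
# (Illustrative sketch, not from the original post.)
A, B = 3, 5
a = [A, A + B]  # a_0 = A, a_1 = A + B
for n in range(2, 10):
    a.append(2 * a[-1] - a[-2])

assert all(a[n] == A + B * n for n in range(10))
print(a)  # an arithmetic progression: 3, 8, 13, ...
```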
Another of my discussions shows that the first solution is actually something we should expect. Again letting $\mathbf{S}$ denote the shift operator on sequences, the recurrence $a_{n+2} - 2a_{n+1} + a_n = 0$ can be "factored" (as per the ideas here) as $(\mathbf{S} - \mathbf{I})^2 a_n = 0$. The operator $\mathbf{S} - \mathbf{I}$ (where $\mathbf{I}$ denotes the identity operator) is in fact the forward difference operator (on the space of sequences, not on the space we get after performing a "Newton transform"!), and the set of sequences with second finite difference zero is clearly the set of linear sequences $a_n = A + Bn$.
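(In code - again a small sketch, not from the original post - applying the forward difference twice annihilates exactly the linear sequences:)

```python
# Forward difference of (a finite prefix of) a sequence; applying it twice
# annihilates linear sequences, matching the factorization (S - I)^2.
def diff(seq):
    return [seq[i + 1] - seq[i] for i in range(len(seq) - 1)]

linear = [3 + 5 * n for n in range(10)]
print(diff(diff(linear)))  # all zeros
```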
So this particular result is no longer unexpected and leads to the corresponding result in linear ODEs. (The analogies between the forward difference and the derivative (i.e. the fact that the functions with second derivative zero are the linear functions), and the corresponding natural bases, are explored in the field of umbral calculus.) But this is as far as a repeated root of $1$ goes; the solutions to $(\mathbf{S} - \lambda \mathbf{I})^2 a_n = 0$ are given by $a_n = (A + Bn)\lambda^n$, the product of a linear and a geometric sequence, and now we must develop a more general technique.
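(A quick numerical check of this claim, with arbitrary constants:)

```python
# Check that a_n = (A + B*n) * lam**n solves the recurrence with
# characteristic polynomial (x - lam)^2, i.e.
# a_{n+2} = 2*lam*a_{n+1} - lam**2 * a_n. (Illustrative sketch.)
lam, A, B = 3, 2, 5
a = [(A + B * n) * lam**n for n in range(10)]
assert all(a[n + 2] == 2 * lam * a[n + 1] - lam**2 * a[n] for n in range(8))
```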
We saw before that using ordinary generating functions (i.e. thinking about ODEs and applying a Laplace transform) made life simple because it translated these problems to problems of partial fraction decomposition. Here our partial fraction decompositions involve terms of the form $\frac{1}{(1 - \lambda x)^k}$ for $k > 1$, but the corresponding sequences are simple: they are given by

$$\frac{1}{(1 - \lambda x)^k} = \sum_{n \ge 0} \binom{n + k - 1}{k - 1} \lambda^n x^n$$

as per this post. The important point here (that is exactly the property that we want) is that the coefficient of $x^n$ here is $\lambda^n$ times a polynomial in $n$ of degree $k - 1$, which is exactly the behavior we observed above for $k = 2$.
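(A series expansion makes this concrete; the following sketch assumes sympy, which is not a tool used in the original post.)

```python
# Expand 1/(1 - lam*x)**k as a power series in x and compare the coefficient
# of x**n against binomial(n + k - 1, k - 1) * lam**n. (Sketch using sympy.)
import sympy as sp

x, lam = sp.symbols('x lam')
k = 3
series = sp.series(1 / (1 - lam * x)**k, x, 0, 8).removeO()
for n in range(8):
    coeff = series.coeff(x, n)
    assert sp.simplify(coeff - sp.binomial(n + k - 1, k - 1) * lam**n) == 0
```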
We are now prepared to state the following
Proposition: Let $a_n$ be a sequence satisfying a recurrence given by a characteristic polynomial $P(x)$ with roots $\lambda_1, \lambda_2, \dots, \lambda_j$ and corresponding multiplicities $m_1, m_2, \dots, m_j$ (in other words, $P(x) = \prod_{i=1}^{j} (x - \lambda_i)^{m_i}$). Then

$$a_n = \sum_{i=1}^{j} Q_i(n) \lambda_i^n$$

where $Q_i$ is a polynomial of degree at most $m_i - 1$. (There are then $m_1 + m_2 + \cdots + m_j = \deg P$ coefficients to work with, exactly the number of initial conditions we expect to uniquely determine the solution.)
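(To see the Proposition in action on the example that appears later in this post, we can ask a computer algebra system to solve the recurrence directly; `rsolve` is sympy's recurrence solver, an assumption of this sketch rather than anything from the original.)

```python
# Solve a_{n+3} = 6*a_{n+2} - 12*a_{n+1} + 8*a_n, whose characteristic
# polynomial is (x - 2)^3; the answer is a quadratic in n times 2**n.
import sympy as sp

n = sp.symbols('n', integer=True)
a = sp.Function('a')
sol = sp.rsolve(a(n + 3) - 6 * a(n + 2) + 12 * a(n + 1) - 8 * a(n),
                a(n), {a(0): 1, a(1): 0, a(2): 0})
print(sp.expand(sol))  # of the form (A + B*n + C*n**2) * 2**n
```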
The proof by ordinary generating functions is straightforward and follows more or less the route I have given above. It is interesting that our proof above has a combinatorial interpretation: the appearance of polynomial coefficients becomes a consequence of a counting argument (the balls-and-urns problem) that amounts to taking powers of a particular generating function. This proof is uniquely easy in this particular domain: if we try to translate it back to the space of bare sequences, we see that the special property here is that the vector space structure common to all three domains we discussed earlier has been augmented by a convolution

$$(a * b)_n = \sum_{k=0}^{n} a_k b_{n-k}$$

that is the product operation on two ordinary generating functions (and is also the proper way to make the space of terminating sequences behave like polynomials). In other words, what we have now is a (commutative and) associative algebra. While the definition of the convolution of two sequences in this way has a natural algebraic motivation in terms of ordinary generating functions, note that its motivation in terms of sequences alone is combinatorial in nature: if $a_n$ counts the number of subsets of some set $A$ with "weight" $n$ (for an appropriate definition of "weight") and $b_n$ counts the number of subsets of some set $B$ with weight $n$, then $(a * b)_n$ counts the number of subsets of the disjoint union $A \sqcup B$ with weight $n$. Nevertheless, it is far easier to deal with this convolution as it applies to ordinary generating functions on a purely algebraic basis.
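(A direct implementation of this convolution, sketched for finite prefixes of sequences:)

```python
# Convolution of two (finite prefixes of) sequences, i.e. the coefficient
# sequence of the product of their ordinary generating functions.
def convolve(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

# (1 + x)^2 * (1 + x) = (1 + x)^3, i.e. rows of Pascal's triangle multiply.
print(convolve([1, 2, 1], [1, 1]))  # [1, 3, 3, 1]
```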
So much for a combinatorial proof. Let me motivate a second proof by working with exponential generating functions (ODEs) instead.
In my previous post I connected the Fibonacci sequence to a matrix I called the Fibonacci matrix. This connection can be generalized, and its construction can be motivated in two ways, both of which I'd prefer to demonstrate by example. The recurrence $a_{n+3} = 6a_{n+2} - 12a_{n+1} + 8a_n$ (which has characteristic polynomial $x^3 - 6x^2 + 12x - 8 = (x - 2)^3$) is equivalent to the differential equation $y''' = 6y'' - 12y' + 8y$, which we will solve with the substitutions (exactly the ones I used before)

$$x = y'$$

$$z = x' = y''$$

which gives the system (written in matrix notation)

$$\left[ \begin{array}{c} y' \\ x' \\ z' \end{array} \right] = \left[ \begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 8 & -12 & 6 \end{array} \right] \left[ \begin{array}{c} y \\ x \\ z \end{array} \right].$$
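(As a sanity check on this translation, sympy's ODE solver - an assumption of this sketch - returns exactly the promised solution shape:)

```python
# Solve y''' = 6*y'' - 12*y' + 8*y symbolically; the general solution is
# (C1 + C2*t + C3*t**2) * exp(2*t), matching the repeated root 2.
import sympy as sp

t = sp.symbols('t')
y = sp.Function('y')
ode = sp.Eq(y(t).diff(t, 3),
            6 * y(t).diff(t, 2) - 12 * y(t).diff(t) + 8 * y(t))
print(sp.dsolve(ode))  # y(t) = (C1 + C2*t + C3*t**2)*exp(2*t)
```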
The matrix appearing in this system, which we will call $\mathbf{M}$, is (the transpose of) the companion matrix of $x^3 - 6x^2 + 12x - 8$. Its characteristic and minimal polynomials are both $(x - 2)^3$ (this generalizes). As before, the solution to the above system is

$$\left[ \begin{array}{c} y(t) \\ x(t) \\ z(t) \end{array} \right] = e^{\mathbf{M} t} \left[ \begin{array}{c} y(0) \\ x(0) \\ z(0) \end{array} \right],$$

so the powers of the companion matrix give us the solution to our linear recurrence, which we already know to be in the form $a_n = (A + Bn + Cn^2) 2^n$. What we are interested in is how this form can be derived from the form of $\mathbf{M}$ alone.
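(Concretely - a numpy sketch, with numpy itself being an assumption here - the powers of $\mathbf{M}$ march the recurrence forward:)

```python
# M maps the window (a_n, a_{n+1}, a_{n+2}) to (a_{n+1}, a_{n+2}, a_{n+3})
# for the recurrence a_{n+3} = 6*a_{n+2} - 12*a_{n+1} + 8*a_n.
import numpy as np

M = np.array([[0, 1, 0],
              [0, 0, 1],
              [8, -12, 6]])

a = [1, 0, 0]  # arbitrary initial conditions a_0, a_1, a_2
for _ in range(7):
    a.append(6 * a[-1] - 12 * a[-2] + 8 * a[-3])

state = np.array(a[:3])
for n in range(7):
    assert np.array_equal(np.linalg.matrix_power(M, n) @ state, a[n:n + 3])
```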
To determine the way the Fibonacci matrix behaved (and therefore to prove Binet's formula) we diagonalized it by finding its eigendecomposition. Unfortunately, it's not possible to diagonalize $\mathbf{M}$; the eigenspace associated with the eigenvalue $2$ has dimension $1$ (in fact, it is spanned by $(1, 2, 4)^T$), not $3$. So what can we do instead of diagonalizing?
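(A numerical confirmation of this failure to diagonalize, again assuming numpy:)

```python
# The eigenvalue 2 has algebraic multiplicity 3 but geometric multiplicity 1.
import numpy as np

M = np.array([[0., 1., 0.],
              [0., 0., 1.],
              [8., -12., 6.]])

print(np.linalg.eigvals(M))                          # approximately [2, 2, 2]
print(3 - np.linalg.matrix_rank(M - 2 * np.eye(3)))  # eigenspace dimension: 1
print(M @ np.array([1., 2., 4.]))                    # [2, 4, 8] = 2 * (1, 2, 4)
```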
It turns out that in general the "closest" we can get to diagonalizing is a nearly-diagonal form called Jordan normal form. The Jordan normal form of $\mathbf{M}$ is composed of the single Jordan block

$$\left[ \begin{array}{ccc} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{array} \right]$$

and therefore to study the powers of $\mathbf{M}$ (and companion matrices of polynomials with repeated roots in general) it suffices to study the powers of Jordan blocks. In fact, we should expect that such matrices are not diagonalizable; if they were, the corresponding linear recurrences would be describable as if their characteristic polynomials did not have repeated roots. Hence the need for polynomial coefficients in the general solution to the recurrence problem is intimately related to the need for Jordan blocks in the general diagonalization problem.
We expect that the powers of Jordan blocks are describable in terms of polynomials (equivalently, binomial coefficients) of degree at most $m - 1$, where $m$ is the size of the block. A few computations suggest that

$$\left[ \begin{array}{ccc} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{array} \right]^n = \left[ \begin{array}{ccc} 2^n & {n \choose 1} 2^{n - 1} & {n \choose 2} 2^{n - 2} \\ 0 & 2^n & {n \choose 1} 2^{n - 1} \\ 0 & 0 & 2^n \end{array} \right]$$

which is certainly a pattern we expect from our previous discussion of partial fractions, but can we justify this pattern using matrix methods alone? (The inductive proof offers some insight - the recurrence involved is just Pascal's rule - but I'd like to go deeper.)
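(The "few computations" are easy to automate; a brief check with numpy and `math.comb`:)

```python
# Compare the conjectured formula for powers of the Jordan block against
# direct matrix exponentiation. (Illustrative sketch.)
import numpy as np
from math import comb

J = np.array([[2, 1, 0],
              [0, 2, 1],
              [0, 0, 2]])

for n in range(2, 9):
    predicted = np.array([
        [2**n, comb(n, 1) * 2**(n - 1), comb(n, 2) * 2**(n - 2)],
        [0,    2**n,                    comb(n, 1) * 2**(n - 1)],
        [0,    0,                       2**n],
    ])
    assert np.array_equal(np.linalg.matrix_power(J, n), predicted)
```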
The motivation we need is to go back to the notion of forward difference, which corresponds to Jordan blocks with eigenvalue $1$. We saw that the forward difference operator could be written as $\mathbf{S} - \mathbf{I}$ where $\mathbf{S}$ is the shift operator (on infinite sequences). Now note that our Jordan block can be written as

$$\left[ \begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{array} \right] + 2 \mathbf{I}_3$$

where the operator on the left takes $(x, y, z)$ to $(y, z, 0)$ - in other words, it is a finite shift operator (as opposed to the infinite shift operator we have been working with!). We will denote this matrix $\mathbf{S}_3$, as it is closely related to $\mathbf{S}$. Unlike the infinite shift operator, a finite shift operator is nilpotent - in particular, $\mathbf{S}_3^3 = \mathbf{0}$, which makes the powers of the Jordan block immediately transparent. Now just the observation that
$$\mathbf{S}_3^2 = \left[ \begin{array}{ccc} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right]$$

is enough for us to deduce the form of the powers of a Jordan block:

$$(2 \mathbf{I}_3 + \mathbf{S}_3)^n = 2^n \mathbf{I}_3 + {n \choose 1} 2^{n - 1} \mathbf{S}_3 + {n \choose 2} 2^{n - 2} \mathbf{S}_3^2.$$

(Since $\mathbf{I}_3$ and $\mathbf{S}_3$ commute, the binomial theorem applies, and every term involving $\mathbf{S}_3^k$ for $k \ge 3$ vanishes.)
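(A symbolic check of this truncated binomial expansion, assuming sympy:)

```python
# The binomial expansion of (2*I + S_3)**n truncates after three terms
# because S_3**3 = 0; check it against explicit powers for small n.
import sympy as sp

n = sp.symbols('n', integer=True, nonnegative=True)
S3 = sp.Matrix([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
I3 = sp.eye(3)
J = 2 * I3 + S3

expansion = (2**n * I3
             + sp.binomial(n, 1) * 2**(n - 1) * S3
             + sp.binomial(n, 2) * 2**(n - 2) * S3**2)

for k in range(2, 8):
    assert expansion.subs(n, k) == J**k
```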
Of course, all this discussion generalizes to arbitrary Jordan blocks, which are really just the matrices $\lambda \mathbf{I}_m + \mathbf{S}_m$ and are intimately related to our discussion of the shift operator on sequences (in particular, note that a linear recurrence can be factored into operators of the form $(\mathbf{S} - \lambda \mathbf{I})^m$). This is the essential ingredient in the matrix proof of our Proposition that we wanted, and we get a little extra - we started off characterizing homogeneous linear recurrences and we have now characterized the powers of any matrix (through Jordan normal form). This justifies the statement in the Wikipedia article that "in principle the Jordan form could give a closed-form expression for the exponential exp(A)."
So we've shown that the behavior of a homogeneous linear recurrence is determined by the powers of the companion matrix of its characteristic polynomial. The natural question now is whether every matrix is similar to a companion matrix, but the answer is clearly no. A companion matrix has the property that its characteristic and minimal polynomials are identical, which is far from the case in general, so the matrix solution to the problem of homogeneous linear recurrences is just a special case of the description of the powers of matrices in general. In other words, disappointingly enough, our Proposition above cannot be used to prove the Jordan normal form theorem (so I am not quite Terence Tao!).
Nevertheless, the connection between partial fraction decomposition and powers of Jordan blocks is instructive. While I cannot offer a full proof of the Jordan normal form theorem along these lines, let me discuss a useful technique. To analyze an $n \times n$ matrix $\mathbf{A}$, pick a vector $\mathbf{v}$. The sequence of vectors $\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2 \mathbf{v}, \dots$ sits in a vector space of dimension $n$, so there exists a nontrivial linear dependence among them. In particular, some polynomial $P(x)$ (which we may take to be monic of minimal degree) exists such that $P(\mathbf{A}) \mathbf{v} = \mathbf{0}$.
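(Here is this technique in computational form - a sympy sketch that finds the first linear dependence among $\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \dots$ for the matrix $\mathbf{M}$ above:)

```python
# Find the minimal polynomial of M relative to v by detecting the first
# linear dependence among v, M*v, M^2*v, ... (Sketch using sympy.)
import sympy as sp

M = sp.Matrix([[0, 1, 0], [0, 0, 1], [8, -12, 6]])
v = sp.Matrix([1, 0, 0])

krylov = [v]
while True:
    krylov.append(M * krylov[-1])
    dependence = sp.Matrix.hstack(*krylov).nullspace()
    if dependence:
        coeffs = dependence[0]  # coefficients of the annihilating polynomial
        break

x = sp.symbols('x')
P = sum(c * x**i for i, c in enumerate(coeffs))
print(sp.factor(P / coeffs[-1]))  # (x - 2)**3 after making it monic
```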
The span of the vectors $\mathbf{v}, \mathbf{A}\mathbf{v}, \mathbf{A}^2\mathbf{v}, \dots$ is (I found out) known as the $\mathbf{A}$-cyclic subspace generated by $\mathbf{v}$. Relative to the basis $\mathbf{v}, \mathbf{A}\mathbf{v}, \dots, \mathbf{A}^{d-1}\mathbf{v}$ (where $d = \deg P$), the matrix of $\mathbf{A}$ (restricted to this subspace) is precisely the companion matrix of $P(x)$. (These ideas lead to a very natural interpretation of the rational canonical form or Frobenius normal form, which generalizes the companion matrix construction.) If the base field contains the eigenvalues of $\mathbf{A}$ and if $\mathbf{v}$ is an eigenvector of $\mathbf{A}$ with eigenvalue $\lambda$, then $P(x) = x - \lambda$, a linear polynomial. It is readily seen, therefore, that

$$\prod_{\lambda} (\mathbf{A} - \lambda \mathbf{I}) \, \mathbf{v} = \mathbf{0}$$

where the product is taken over all eigenvalues $\lambda$ of $\mathbf{A}$ and $\mathbf{v}$ is any vector in the span of the eigenvectors of $\mathbf{A}$. In the case that $\mathbf{A}$ has $n$ distinct eigenvalues, the eigenvectors span all of $\mathbb{R}^n$ and the polynomial on the LHS is clearly the characteristic (and minimal) polynomial of $\mathbf{A}$ (in other words, here is a special case of the Cayley-Hamilton theorem). As before, however, if the eigenvectors of $\mathbf{A}$ do not span $\mathbb{R}^n$ (which is equivalent to the presence of repeated roots in the minimal polynomial of $\mathbf{A}$), we have at best a polynomial that divides the minimal polynomial of $\mathbf{A}$, which in turn divides the characteristic polynomial of $\mathbf{A}$.
We already know the structure that the characteristic and minimal polynomials have. The natural extension from here is to augment the set of eigenvectors with the set of generalized eigenvectors, that is, vectors $\mathbf{v}$ such that $(\mathbf{A} - \lambda \mathbf{I})^k \mathbf{v} = \mathbf{0}$ for some power $k$. The appropriate values of $k$ then give us both the minimal and characteristic polynomials of $\mathbf{A}$, and the appropriate generalized eigenvectors form a basis for $\mathbb{R}^n$ which can be used to prove the Jordan normal form theorem computationally. Unfortunately, it is here that I am out of details.
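(For our running example, the kernels of $(\mathbf{M} - 2\mathbf{I})^k$ grow until $k = 3$, and sympy's `jordan_form` - again an assumption of this sketch - recovers the single Jordan block directly:)

```python
# Generalized eigenvectors of M for the eigenvalue 2: the kernel of
# (M - 2I)**k has dimension k for k = 1, 2, 3. (Sketch using sympy.)
import sympy as sp

M = sp.Matrix([[0, 1, 0], [0, 0, 1], [8, -12, 6]])
N = M - 2 * sp.eye(3)
for k in range(1, 4):
    print(k, len((N**k).nullspace()))  # prints 1 1, then 2 2, then 3 3

P, J = M.jordan_form()
print(J)  # the single 3x3 Jordan block with eigenvalue 2
```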
Practice Problem 1: Find a matrix whose minimal polynomial is not its characteristic polynomial (without cheating and starting from its Jordan normal form!).
Practice Problem 2: The main proposition proven here appears in the following guise in the Princeton Lectures in Analysis:
Let $f = p/q$ be a rational function with $\deg q \ge \deg p + 2$ and with no poles on the real axis. Prove that if $\beta_1, \dots, \beta_k$ are the roots of $q$ in the upper half-plane, then there exist polynomials $P_1, \dots, P_k$, where $P_j$ has degree less than the multiplicity of $\beta_j$, such that

$$\hat{f}(\xi) = \sum_{j=1}^{k} P_j(\xi) e^{-2\pi i \beta_j \xi}$$

when $\xi < 0$, where $\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi} \, dx$ denotes the Fourier transform.