New location!

by t0rajir0u, Apr 30, 2009, 5:16 PM

It took much longer than I anticipated, but I've finally put up a new post on WordPress!

From now on, all new posts will be on WordPress.

A question about automata

by t0rajir0u, Apr 12, 2009, 9:20 PM

Question: Let $A$ be a finite alphabet and let $L$ be a language on $A$ . Let $L(x) = \sum_{n \ge 0} L_n x^n$ , where $L_n$ is the number of words in $L$ of length $n$ .

Is it true that $L(x)$ is a ratio of integer polynomials if and only if there exists another language $L'$ such that $L_n' = L_n$ for all $n$ and such that $L'$ can be recognized by a finite automaton (equivalently, a regular expression)?

For example, the set of palindromes cannot be recognized by a finite automaton, but there is a bijection from palindromes to strings where each letter except the last letter is repeated exactly twice in a row, and that language can be recognized.

The way I really want to phrase the $L'$ requirement is that $L$ and $L'$ , regarded as species, should be related by a natural transformation, but I don't know whether this is strictly stronger than the requirement I've written down. Anyway, I'm reasonably sure that as it stands the answer is "yes."

Here's an actual question which I wish I had the CS background to answer: characterize the languages $L$ with the property that if $w \in L$ then every (edit:) contiguous substring of $w$ is in $L$ , i.e. describe what mode of computation recognizes such languages.

Some updates

by t0rajir0u, Mar 16, 2009, 7:19 AM

Apologies for not updating in awhile; MIT has, as usual, been quite busy. But I will hopefully have lots to talk about very soon. Anyway, some bullet points:

- Within the next two weeks I plan on moving this blog to Wordpress. Among other things, it'll look nicer and a little more professional and should be a lot easier to manage. Once I do so, I've got several posts waiting to be finished, so expect several updates during spring break.

- I've finally started work on my UROP. The motivation for the project is a little involved, but the project itself involves studying a sequence of ideals $I_m$ in a polynomial ring over $4n$ variables invariant under a finite group $G$ of symplectic matrices of size $2n$ in a particular way. So far we're still trying to reproduce, using MAGMA, certain computations done by Eric Rains at Caltech. Once we've verified that the code we have works, we'll move on to some other computations and hopefully get enough evidence for a conjecture of some kind.

- WOOT has kindly asked me to write a handout for the class. It'll be on generating functions, and I think it'll be a good opportunity to collect together a lot of example solutions. This, hopefully, will also be done within the next two weeks.

- MIT's Campus Preview Weekend is coming up! If you are visiting, feel free to PM me any questions you might have, and if you'd like to find me to discuss math or anything else, just let me know!

Fibonacci numbers @ HMMT

by t0rajir0u, Feb 21, 2009, 10:11 PM

Today I gave a talk at the Harvard-MIT Math Tournament as a mini-event on the Fibonacci numbers. The slides are available here (~1 MB).

The focus of the talk was the material from this post and this post and, to a lesser extent, this post. Briefly, there are three ways to think about the proof of Binet's formula:

1. The space of sequences satisfying $F_{n+2} = F_{n+1} + F_n$ is ia vector space of dimension $2$ , and there exists a basis consisting of the two geometric sequences $\phi^n, \varphi^n$ .

2. The generating function $\sum_{n \ge 0} F_n x^n$ has a partial fraction decomposition.

3. The powers of the matrix $\left[ \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right]$ describe the Fibonacci numbers and this matrix is diagonalizable.

As I discussed in some of the posts above, these proofs are essentially the same (although it is the third that leads into the material from this post, which I unfortunately didn't have enough time to cover); each is about relating geometric series to the eigenvectors of a shift operator. See the posts above or the slides for details.

The combinatorics of basic calculus

by t0rajir0u, Feb 19, 2009, 4:53 AM

Much of this post is inspired by Doron Zeilberger's insightful entry in the Princeton Companion to Mathematics.

If we all learned combinatorics before we learned calculus perhaps we would be more inclined to ask the following very basic question: where do the factorials in a Taylor series expansion really come from?

Factorials must clearly count permutations, and yet it is never really explained what permutations have to do with power series expansions. The proof that the $n^{th}$ derivative of $x^n$ is $n!$ is elementary enough, and yet that only changes the question: what combinatorial significance does differentiation have?

This question clearly has something to do with why exponential generating functions are useful and important. One of the reason one prefers an EGF to an OGF is that the underlying object being counted is "labeled": for example, the number of labeled sets with $n$ elements (and $n$ labels) is $a_n = n!$ , which has exponential generating function

$A(x) = \sum_{n \ge 0} \frac {a_n}{n!} x^n = \frac {1}{1 - x}$ .

By comparison, there is not much useful one can say about the ordinary generating function $\sum n! x^n$ . Its radius of convergence is $0$ , and even in the combinatorial frame of mind where we don't care about convergence this is bad news since we know it won't actually correspond to a function we can work with.

I'd like to work a few more examples. The number of words of length $n$ selected from an alphabet of $s$ letters is $s^n$ , which has exponential generating function

$S(x) = \sum_{n \ge 0} \frac {s^n}{n!} x^n = e^{sx}$ .

In particular, the number of words of length $n$ selected from an alphabet of $1$ letter is our good friend the exponential. What significance does this have for differentiation? It's easy to count words of length $n$ by recursion: you start with a word of length $n - 1$ and add a letter to the end. In other words,

$s^n = s \cdot s^{n - 1}.$

In the language of exponential generating functions, differentiation corresponds to a shift in index (this is what we're really going after) and the above is equivalent to the identity

$\frac {d}{dx} e^{sx} = s e^{sx}.$

So something genuinely combinatorial is going on here. Let's try something else. The number of words of length $n$ from an alphabet of $s + r$ letters is, by the above definition, $ e^{(s + r)}x} = e^{sx} e^{rx}$. We've seen the natural definition of multiplying exponential generating functions before: it is the convolution

$c_n = \sum_{k = 0}^{n} {n \choose k} a_k b_{n - k}$

that picks a labeled object of size $k$ from $A$ and a labeled object of size $n - k$ from $B$ to form a labeled object of size $n$ and then decides where to permute the two accordingly. Here, a word from an alphabet of $s + r$ letters is obtained from two words, one from an alphabet of $s$ letters and one from an alphabet of $r$ letters, which are then permuted. So we can think of the factorial factors as accounting for these permutations.

The above definition is missing something. Given only a definition of $a_k$ , what is being "permuted"? The answer is that we should regard at least some of the EGFs of labeled objects as counting the number of ways we can put "connected components" of some other basic object together. Here, the basic object is a letter, of which there are $s$ of size one: hence the EGF of the set of letters is $sx$ . Given an EGF $a(x)$ for a set of basic objects, $a(x)^n$ (by the above argument) counts the number of ordered $n$ -tuples of objects from $a$ of a given size: we then divide out by $n!$ (this is very promising!) because we only care about the collection itself, and then sum over all $n$ to obtain

The Fundamental Theorem of Exponential Generating Functions: If $a(x)$ is the EGF of a collection of basic objects subject to the condition $a(0) = 0$ , let $A(x)$ denote the EGF of disjoint unions of basic objects. Then

$A(x) = \sum_{n \ge 0} \frac {a(x)^n}{n!} = e^{a(x)}.$

(The condition that $a(0) = 0$ is necessary to ensure that the computation of $A(x)$ occurs formally: otherwise the computation of $A(0)$ requires the notion of convergence.)

This is marvelous! Notice how beautifully this subsumes all of our previous discussion. Given two EGFs $a(x), b(x)$ of two kinds of connected components (say, two alphabets with disjoint letters), $a(x) + b(x)$ is just their union, and the exponential of that counts the number of ways we can put $a$ -components and $b$ -components together, hence the above provides a combinatorial proof that $e^{a + b} = e^a e^b$ . Even the first example we treated can be understood in this manner: $n!$ counts the number of permutations, a permutation is a disjoint union of cycles, and the number of cycles of length $k$ is (fixing a base point) $(k - 1)!$ , hence

$a(x) = \sum_{k \ge 0} \frac {(k - 1)!}{k!} x^k = - \ln (1 - x)$

and

$A(x) = e^{a(x)} = \frac {1}{1 - x}$

exactly as originally discussed. Hence we have a combinatorial proof of the power series of the natural log. (Since it takes non-connected things to connected things, there is a sense in which the power series of the natural log is another instance of PIE; we'll see a fun application of this later.) Returning to differentiation, one basic calculus notion is the product rule

$(fg)' = f g' + f' g$

which we'd like to prove combinatorially. If we know what $f, g$ count, we know what $fg$ counts. It was suggested earlier that $f', g'$ are just index shifts; to be more precise, if $f$ is the EGF of $f_n$ then $f'$ is the EGF of $f_{n + 1}$ . Following the approach the Wikipedia article on combinatorial species, we can think of this as adding an extra element before counting: hence the product rule reduces to the very simple observation that when we differentiate $fg$ we can add an extra element that is either of $f$ type or of $g$ type. For example, applied to $e^{sx}, e^{rs}$ it gives

$(e^{(s + r)x})' = se^{sx} e^{rx} + re^{sx} e^{rx}$

which is the very simple statement that when we add an extra letter it can be from the alphabet of $s$ letters or from the alphabet of $r$ letters. For this reason it is also fruitful to think of the product rule as a special case of the chain rule

$f(g)' = f'(g) g'$

which we would, again, like to prove combinatorially. When $f = e^x$ , this is the statement that

$(e^g)' = e^g g'$ ;

in other words, adding an element to the set of disjoint unions of "basic elements" $g$ is done by starting with a smaller disjoint union of basic elements $e^g$ and then adding another basic element (after reindexing appropriately). There is an interpretation of composition for more general EGFs $f$ where $f(g)$ denotes the set of " $f$ -structures on $g$ ": when $f = e^x$ , the EGF of the number of words on a singleton alphabet, this reduces to the fundamental theorem; the Wiki article has the details.

One last remark. It now appears (at least to me) that the theory of ordinary generating functions is a special case of the theory of exponential generating functions, for the following reason: whenever $a_n$ counts an unlabeled combinatorial family, we can attach labels to form the sequence $n! a_n$ whose EGF is precisely the OGF of the original sequence. Thus the multiplication of OGFs is a special case of the multiplication of EGFs (as is readily verified) and so forth. For example, the Catalan numbers count rooted binary trees with $n + 1$ leaves, and labeling those leaves gets us the quadruple factorial numbers. In the language of species, the sequences for which OGFs are nice have permutation group all of $S_n$ .

Some applications.

As mentioned before, $\frac {1}{1 - x}$ is the EGF of $f_n = n!$ ; that is, it counts the number of permutations, or labeled sets, of $n$ elements. How many labeled sets with "basic objects" of size $1$ or $2$ are there? For labeling reasons, the EGF to use here is $g(x) = x + x^2$ because we can think of the basic object of size $2$ with $2!$ sets of labels on it; this fits well with the idea that OGFs are EGFs in disguise. (A precise definition of "label" is given in the Wiki article: roughly, there is a way to make precise the notion that it doesn't matter what the labels are; that is, the definition of a species is functorial. They can be numbers or colors or other combinatorial objects.) This gives the answer as

$f(g) = \frac {1}{1 - x - x^2} = \sum_{n \ge 0} F_{n + 1} x^n$ ,

precisely the (shifted) Fibonacci numbers as we already knew, that is, the number of tilings of a $1 \times n$ grid with $1 \times 1$ and $1 \times 2$ tiles (where we have multiplied by $n!$ because of labeling). This directly suggests the identity

$F_{n + 1} = \sum_{j \ge 0} {n - j \choose j}$

obtained by writing $f(g) = \sum_{n \ge 0} (x + x^2)^n$ and applying the binomial theorem; we've already seen the combinatorial proof here. The cool thing about this technique is that it allows us to prove combinatorial identities about a linear recurrence without computing the roots of its characteristic polynomial: for example, if we want basic objects of size $1$ or $3$ , we get $g(x) = x + x^3$ and the answer is

$f(g) = \frac {1}{1 - x - x^3} = \sum_{n \ge 0} G_n x^n = \sum_{n \ge 0} (x + x^3)^n$

which immediately gives, by the binomial theorem or combinatorial reasoning, the identity

$G_n = \sum_{j \ge 0} {n - 2j \choose j}$

even though we haven't computed the roots of the characteristic polynomial or the analogue of the Fibonacci polynomials. If we want basic objects of any positive integer size, we get $g(x) = x + x^2 + ... = \frac {x}{1 - x}$ , which gives

$f(g) = \frac {1}{1 - \frac {x}{1 - x}} = \frac {1 - x}{1 - 2x} = 1 + \sum_{n \ge 1} 2^{n - 1} x^n.$

What does it mean to ask for a tiling with tiles of any size? It simply means to place a number of dividers between the tiles, and the number of ways to do this is obviously $2^\text{number of spots between tiles}$ (or $1$ if there aren't any tiles). Can you extend this to a combinatorial proof that the set of fractional linear transformations is closed under composition? (The fractional linear transformations are precisely the generating functions of geometric series with a possibly different first term.) An "infinite" extension is given in the Practice Problems.

If we allow two "colors" of basic objects, we can ask, for example, what happens if we allow $r$ types of basic red object of size $1$ and $s$ types of basic blue object of size $1$ : then the generating function should be the product

$\frac {1}{(1 - rx)(1 - sx)} = \frac {1}{r - s} \left( \frac {1}{1 - rx} - \frac {1}{1 - sx} \right) = \sum_{n \ge 0} \frac {r^n - s^n}{r - s} x^n$

although we have to keep in mind that what this product counts is the number of ordered pairs of red tilings and blue tilings with a fixed total size. This generating function is easily explained as follows: it is the sum $\sum_{k = 0}^{n - 1} r^k s^{n - 1 - k}$ , so we have yet again provided a combinatorial proof of the geometric series formula. Can you extend this to a combinatorial proof of partial fraction decomposition?

Now let's think about unlabeled sets. The Bell numbers $B_n$ count the number of partitions of $\{ 1, 2, ... n \} = [n]$ , equivalently, the number of equivalence relations. We can think of a partition as a disjoint union of nonempty sets. The EGF of the nonempty sets is $e^x - 1$ (there is no nonempty set of size $0$ ), hence by the fundamental theorem we obtain

$\sum_{n \ge 0} B_n \frac {x^n}{n!} = e^{e^x - 1}$ .

Taking the derivative tells us that

$B_{n + 1} = \sum_{k = 0}^{n} {n \choose k} B_k$

which we can also deduce combinatorially as follows: given a partition of $[n + 1]$ , the partition that $n + 1$ belongs to has some size $(n + 1) - k$ , and removing it gives a partition of some $k$ -element subset of $[n]$ .

To further elucidate the distinction between labeled and unlabeled sets, let's see what happens if we look at the "number of labeled collections of unlabeled nonempty sets," whatever that means. The EGF is

$f(g) = \frac {1}{2 - e^x} = \frac {1}{2} \sum_{k \ge 0} \frac {e^{kx}}{2^k} = \sum_{\ge 0} \frac {M_n}{n!} x^n$

which gives $M_n = \sum_{k \ge 0} \frac {k^n}{2^k}$ . This is interesting! You might recognize $M_1$ as the expected number of coin tosses $k$ before you toss a heads, and generally you can think of $M_k$ as the expected value of $k^n$ , which is a certain kind of moment.

Before we analyze this EGF, let's stop and think about what the powers of $g$ mean. The powers of $e^x$ are obvious: if $e^x$ is the number of unlabeled sets, $(e^x)^k = e^{kx}$ is the number of ways we can take $k$ unlabeled sets (each with objects that are considered disjoint from the others) - say, in $k$ colors - and permute them together, hence the number of words on $k$ letters as we saw before. When we take powers of $e^x - 1$ we disallow the empty set for any given letter (color), i.e.

$(e^x - 1)^k = \sum_{j = 0}^{k} ( - 1)^j {k \choose j} e^{jx}$

counts the number of words on $k$ letters such that each letter is used at least once. The above identity is then an instance of the Principle of Inclusion-Exclusion.

We can now describe what $M_n$ counts. If we number the colors (letters) $1, 2, 3, ...$ (there are countably many of them now), then $M_n$ is the number of words of length $n$ such that if letter $k$ is used at least once, then every letter $i, i < k$ is also used. This sequence does not appear to have a great name, but it is the number of ways $n$ competitors can place in a competition if we allow ties (where letter $k$ becomes $k^{th}$ place), which is a very succinct interpretation. The places define a weak order on $n$ elements, although it's not clear what all this has to do with moments of coin tosses. I'd be interested if someone could explain this to me. As for the closed form, I'll go with a link for now.

Another example is due to Zeilberger: returning to the observation that a permutation is a disjoint union of cycles, we'd like to count the number of permutations with exactly $k$ cycles. This can be done by assigning a weight $y$ to each cycle, i.e. to replace $- \ln (1 - x)$ with $- y \ln (1 - x)$ , which then gives the generating function

$e^{ - y \ln (1 - x)} = \frac {1}{(1 - x)^y} = \sum_{n \ge 0} { - y \choose n} ( - x)^n = \frac {y...(y + (n - 1))}{n!} x^n$

which gives the answer as the coefficient of $y^k$ in a rising factorial; this is the definition of the Stirling numbers of the first kind, up to sign. Compare with the earlier discussion on Newton polynomials. (There is, according to Stanley, a sense in which negative binomial coefficients having combinatorial significance is the result of a combinatorial reciprocity theorem, but I am far from understanding this point.)

Now suppose we'd instead like to count the number of permutations with exactly $k$ fixed points. This can be done by assigning a weight $y$ to the $1$ -cycles, i.e. to replace $- \ln (1 - x)$ with $- \ln (1 - x) - x + yx$ . This gives

$e^{ - \ln(1 - x) - x + yx} = \frac {1}{1 - x} e^{yx - x} = \sum_{n \ge 0} \frac {F_n(y)}{n!} x^n$

where $F_n(y) = \sum_{k = 0}^{n} D_{n,k} y^k = \sum_{\pi \in S_n} y^{\text{Fix}(\pi)}$ is the polynomial generating function of the rencontres numbers. On the other hand, clearly

$ \frac {1}{1 - x} e^{yx - x} = \frac {1}{1 - x} \sum_{n \ge 0} \left( \sum_{k = 0}^{n} {n \choose k} y^k ( - 1)^{n - k}} \right) \frac {x^n}{n!}$

which gives $D_{n,k} = \sum_{i = 0}^{n} {i \choose k} ( - 1)^{i - k} \frac {n!}{i!}$ , another instance of PIE. In the case $k = 0$ we obtain $D_{n,0} = n! \sum_{i = 0}^{n} ( - 1)^i \frac {1}{i!} \approx \frac {n!}{e}$ , a well-known formula for derangements. Substituting $y = 0$ gives the corresponding generating function.

Let $I_n$ denote the permutations of $[n]$ of order dividing $2$ (i.e. involutions). Such a permutation consists of $1$ -cycles and $2$ -cycles only. These have EGF $x + \frac {x^2}{2}$ , hence by the fundamental theorem the answer is

$\sum_{n \ge 0} \frac {I_n}{n!} x^n = e^{x + \frac {x^2}{2} }.$

While this is not a familiar function, having this generating function makes it easy to deduce other properties of $I_n$ . Differentiating, we obtain the identity

$I_{n + 1} = I_n + n I_{n - 1}$

which we can deduce combinatorially as follows: in an involution of $[n + 1], n + 1$ is either a fixed point or belongs to a cycle of length $2$ . Generally, the number of permutations of $[n]$ of order dividing $k$ has EGF

$\text{exp} \left( \sum_{d | k} \frac {x^d}{d} \right).$

The EGF of the number of involutions in $S_n$ with no fixed points is

$\sum_{n \ge 0} \frac {I_n'}{n!} x^n = e^{\frac {x^2}{2} }$ .

It's not hard to see why: this implies that $I_{2n}' = \frac {(2n!)}{2^n n!} = 1 \cdot 3 \cdot 5 \cdot ... \cdot (2n - 1)$ , which is easy to give a combinatorial interpretation to via the recurrence $I_{2n} = (2n - 1) I_{2n - 2}$ : in an involution of $[2n]$ with no fixed points, $2n$ belongs to a $2$ -cycle. I'm curious what all this has to do with the normal distribution.

Finally, let me give an example of using the fundamental theorem in the other direction. The number of labeled simple graphs on $n$ vertices is obviously $2^{\frac {n(n - 1)}{2} }$ (two pairs of vertices either have or don't have an edge); note that this is much easier than talking about the number of unlabeled simple graphs since we must deal with graph isomorphisms. This generating function is

$G(x) = \sum_{n \ge 0} 2^{\frac {n(n - 1)}{2} } \frac {x^n}{n!}$ .

This suggests that the logarithm of the above generating function, which begins

$\log G(x) = x + \frac {x^2}{2!} + \frac {4x^3}{3!} + \frac {38x^4}{4!} + \frac {728x^5}{5!} + ...$ ,

must count the number of (nonempty) connected simple graphs on $n$ vertices, which in fact is true. There is, as far as I know, no simpler way to describe this generating function.

Practice Problem 1: Compute the coefficients of $e^{\frac {x}{1 - x} }$ ; what does this EGF count?

Practice Problem 2: The Euler zigzag numbers count the number of permutations $\delta_1, ... \delta_n$ such that $\delta_1 < \delta_2 > \delta_3 < \delta_4 > ...$ . Show that

$2A_{n + 1} = \sum_{k = 0}^{n} {n \choose k} A_k A_{n - k}$

and from there deduce the exponential generating function $\sum_{n \ge 0} \frac {A_n}{n!} x^n$ . (It is a sum of two functions with which you should be very familiar!)

Practice Problem 3: Prove that a sequence $s_n$ that satisfies a linear recurrence with polynomial (rather than constant) coefficients has EGF satisfying a differential equation, also with polynomial coefficients.

Practice Problem 4: Keeping in mind the discussion of $f(x) = \frac {1}{1 - x}$ , prove that

$\sum_{n \ge 0} C_n x^n = \frac {1}{1 - \frac {x}{1 - \frac {x}{1 - \frac {x}{1 - ...}}}}$

where $C_n$ are the Catalan numbers. What do the convergents, interpreted as OGFs, count?

*Practice Problem 5: Prove, first by generating functions, and then combinatorially, the identity

$\sum_{\pi \in S_n} \text{Fix}(\pi)^k = B_{k,n} n!$

where $B_{k,n}$ is the number of partitions of $[k]$ into at most $n$ parts and $\text{Fix}(\pi)$ is the number of fixed points of $\pi$ .

*Practice Problem 6 (Putnam 2005 B6): Let $\sigma(\pi)$ denote the signature of a permutation $\pi \in S_n$ . Show that

$\sum_{\pi \in S_n} \frac {\sigma(\pi)}{\text{Fix}(\pi) + 1} = ( - 1)^{n + 1} \frac {n}{n + 1}$ .

Hint: See if you can modify the generating function of the rencontres numbers.

Practice Problem 7: See this thread.

A question and some thoughts II

by t0rajir0u, Feb 12, 2009, 10:33 PM

Question: Does there exist a satisfying combinatorial interpretation of non-integer eigenvalues of a graph?

A little background. The adjacency matrix $\mathbf{A}(G)$ associated to a graph $G$ has the property that its powers describe the number of walks of certain lengths between vertices of $G$ , so to find closed forms for these numbers it is easiest to diagonalize $\mathbf{A}(G)$ and work with its eigenvalues instead. The question is whether there is generally a way we can deduce these eigenvalues combinatorially without computing them.

This may seem a little backwards, so let me explain. Many nice graphs have integer eigenvalues: for example, the complete graph $K_n$ with self-loops has eigenvalues $n, 0, 0, ... 0$ and it's trivial to see why, since a walk on $K_n$ is just a list of vertices. (It's only slightly harder to figure out why the complete graph $K_n$ without self-loops has eigenvalues $n - 1, - 1, - 1, ... - 1$ .) I've gone on and on about what eigenvalue $0$ means. What I'm interested in is whether non-integer eigenvalues can still be given combinatorial meaning.

Let me demonstrate with two problems I've been thinking of.

Problem 1: Find a combinatorial proof of Binet's formula, $2^{n - 1} F_n = \sum_{k = 0}^{\lfloor \frac {n - 1}{2} \rfloor } {n \choose 2k + 1} 5^k$ .

This amounts to finding a combinatorial interpretation of the eigenvalues of the graph with adjacency matrix

$\mathbf{F} = \left[ \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right]$ .

I have discussed the relationship between the Fibonacci numbers and walks on this graph (i.e. tilings with $1 \times 1$ and $1 \times 2$ dominoes) previously. The $2^{n - 1}$ factor corresponds to coloring such a tiling or equivalently adding edges to the graph. The question is to explain the $5^k$ factor.

It is likely that the answer to this question is well-known. It may not even be terribly difficult, but I have been working on it for some time without success.

Problem 2: Find a combinatorial proof of the identity $\sum_{k = 0}^{\lfloor \frac {n}{3} \rfloor} {n \choose 3k} = \frac {2^n + (1 + \omega)^n + (1 + \omega^2)^n}{3}$ .

This amounts to finding a combinatorial interpretation of the eigenvalues of the graph with adjacency matrix

$\mathbf{S} = \left[ \begin{array}{ccc} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{array} \right]$ .

We've seen this matrix connected to this problem before. The bijection is as follows: a walk on the above graph has the property that you have two choices at each vertex, either stay or move to the next vertex. Interpret "stay at a vertex during step $k$ " as "don't add $k$ to a subset" and interpret "move to the next vertex during step $k$ " as "add $k$ to a subset." Then the number of closed walks from vertex $v_1$ to itself is the number of walks where you have to move a number of times congruent to $0 \bmod 3$ , i.e. the number of subsets with cardinality divisible by $3$ , and similarly for the other two. To explain the eigenvalues here, one needs to either exhibit a counting argument that depends on the value of $n \bmod 6$ or explain what will, after simplification, turn out to be a $( - 3)^k$ factor. (The former might actually be quite easy; nevertheless, I have not been able to make progress on it.)

The above two examples can be summarized, a little tongue-in-cheek, as follows: is there a combinatorial proof of the quadratic formula? (This would allow us to explain the role of the discriminant.) Perhaps I really mean a categorical proof, but that's taking me too far afield.

The difficulty I have in resolving either of these questions is that I cannot imagine a way in which a solution would naturally generalize. While I think there might be a nice proof in the Fibonacci case, it is difficult to generalize beyond two vertices. As for the other, it is a special property of the third roots of unity that $1 + \omega, 1 + \omega^2$ are still (sixth) roots of unity; the generalization that handles what I would call "a combinatorial interpretation of the roots of unity filter," on the other hand, would have to deal with eigenvalues that are not roots of unity (or multiples thereof, such as $1 + i$ ) and hence do not correspond (in a straightforward way) to some kind of periodic behavior. Computing the "real forms" of these identities would be even harder; one has to study quadratic subfields of cyclotomic fields and I really cannot imagine a combinatorial theory making these computations.

There is a sense in which what I've been looking for in these last two posts is a "categorified graph theory" something along the lines of what John Baez calls the groupoidification of linear algebra, but equipped to handle the notion of similarity. I definitely need more background to see if this idea makes any sense, though.

More questions:

The spectral theorem tells us that undirected graphs have real eigenvalues. Can this be deduced from purely graph-theoretic concerns, i.e. the number of walks of length $l$ as $l$ grows is either a sum of components which are either strictly increasing, strictly decreasing, or have period $2$ ? In other words, can we prove (a special case of) the spectral theorem graph-theoretically?

Directed acyclic graphs have a natural interpretation as the Hasse diagrams of finite posets. Are two Hasse diagrams for the same poset isospectral? Do two posets with isospectral Hasse diagrams have the same width? Can we prove a weaker version of the nilpotent Jordan normal form theorem via Dilworth's theorem? (Part of this question should be settled quickly by some computations I should get around to doing; it will be very obvious if the answer to either of the first two questions is "no.")

The Fibonacci discussion above generalizes as follows: it appears that any problem asking for the number of words of length $l$ on a finite alphabet avoiding certain local patterns such as substrings (i.e. no consecutive $B$ s) can be rephrased as a question about walks on a certain graph, but it is also possible to impose global patterns (the number of $B$ s is a multiple of $3$ ) as in the third-roots-of-unity discussion. Is there a concise description of problems of this sort that reduce to walks on graphs? (At least one type of global pattern - for example, that the number of $A$ s be equal to the number of $B$ s - is definitely not of this form.)

A question and some thoughts

by t0rajir0u, Jan 30, 2009, 7:01 AM

Question: Does there exist a satisfying graph-theoretic interpretation of the condition that two graphs are isospectral?

I've been thinking about this question for some time now. I don't really have the background to discuss it in detail, but briefly, the spectrum of a graph is the set of eigenvalues of its adjacency matrix; the powers of those eigenvalues then describes the powers of the adjacency matrix, hence the distribution of paths of length $n$ on the graph. Intuitively, I think of the spectrum as describing "flows" on a graph.

Two graphs that are isomorphic are also isospectral, but the converse does not hold. This is not surprising: two graphs are isomorphic if and only if their adjacency matrices are similar up to permutation matrices, while two graphs are isospectral if and only if their adjacency matrices are similar up to any invertible matrix in $\mathbb{C}^n$ . Needless to say, this is a much weaker condition, and it's hard to come up with a convincing description of it in graph-theoretic terms.

I have a vague idea that some intermediate notion of isomorphism is possible first by considering weighted graphs and then by finding some way of formalizing what it means to conjugate an adjacency matrix by something other than a (weighted) permutation matrix. But I don't really know what's going on here.

Anyway, the motivation for this question is the following characterization of a class of nilpotent matrices:

Observation. A matrix $M$ with non-negative real coefficients is nilpotent if and only if it is the weighted adjacency matrix of a directed acyclic graph.

I like this because it is highly intuitive: there are no paths of length greater than $n$ on a directed acyclic graph on $n$ vertices, or else they wouldn't be acyclic. A nicely behaved class of directed acyclic graphs is the set of directed trees with edges pointing away from or towards the root. This observation is the basis of the following

Observation. The nilpotent Jordan normal form theorem is (almost) equivalent to the statement that every (weighted) directed acyclic graph is isospectral to a directed [url=hhttp://en.wikipedia.org/wiki/Path_graph]path graph[/url].

Again, I like this because it is highly intuitive. A path is the simplest example of a directed acyclic graph, and to suggest that they are representative of the set of directed acyclic graphs in a particular way is a statement that I think is manageable by other means; I also think such statements should generalize to representative statements about other classes, and it would be very interesting to see the extent to which such statements translated into pure linear algebra.

Along those lines, I have become very interested lately with the idea that linear algebra is really just combinatorics, but I don't think I have the background to understand what Baez really means when he says that. Nevertheless, it is interesting to think about, and is one of the motivations for the above question.

Coefficients in the closed forms of linear recurrences

by t0rajir0u, Jan 12, 2009, 10:39 PM

Terence Tao recently wrote a post that describes, among other things, how to use test cases to quickly recover parts of formulas. Exercise 1, in particular, should remind you of a recent post here where the same trick was used to tease out the coefficients of the DFT matrix of order $n$ , but using it to derive Lagrange interpolation is something I had missed when writing up the Lagrange interpolation post, and it reminded me of another context in which I'd seen the same trick.

But first, to clarify: let $P(x)$ be a polynomial of degree $n$ . Suppose we are given that $P(x_i) = y_i$ for $i = 0, 1, 2, ... n$ where the $x_i$ are distinct (so the interpolation problem is well-posed). Write

$\frac {P(x)}{\prod (x - x_i)} = \sum \frac {c_i}{x - x_i}$ .

What are the coefficients $c_i$ ? The test case $x \to x_k$ for a particular $k$ tells us that the residue at $x_i$ on both sides must be equal; equivalently, multiplying by $x - x_k$ on both sides gives

$\frac {P(x)}{\prod_{i \neq k} (x - x_i)} = c_k + \sum \frac {c_i(x - x_k)}{x - x_i}$

and substituting $x = x_k$ gives

$\frac {y_k}{\prod_{i \neq k} (x_k - x_i)} = c_k$ ,

precisely the same Lagrange interpolation coefficient we can find by other methods. The actual interpolation is given by multiplying

$P(x) = \sum c_k \prod_{i \neq k} (x - x_i)$

as usual. The DFT problem is recovered by considering the interpolation question of what polynomial $P(x)$ of degree at most $n$ satisfies

$P(e^{\frac {2 \pi i k}{n} }) = e^{\frac {2 \pi i km}{n} }, k = 0, 1, ... n$

which is, of course, $P(x) = x^m$ , and consequently finding the partial fraction decomposition of $\frac {x^m}{x^n - 1}$ .

Note the similarity between this and the way the DFT problem was posed earlier in terms of finding the "closed form" of a sequence satisfying $s_{k + n} = s_k$ . In my previous discussions of the general issue of understanding homogeneous linear recurrences, I've neglected the exact coefficients of the various terms in the closed form in favor of a clear understanding of the bases of the exponentials that appear in those terms. It's been enough to know that the coefficients are a linear function of the initial values. There are certain special cases, however, where the value of the coefficient becomes important besides the DFT case.

Problem: A six-sided die is rolled repeatedly and its values tallied up. Let $p_n$ denote the probability that at some point the sum of the values of the rolls is $n$ . Compute $\lim_{n \to \infty} p_n$ .

Setup. Before I get into the computation, let's take a moment to think about this problem from an elementary perspective. As $n$ gets large, the partial sums increase by $\frac {7}{2}$ on average (the expected value of a dice roll) and it's intuitive (although not immediately obvious) that the probability $p_n$ becomes uniform (I should say that the limit exists); in fact, the limit should clearly be $\frac {2}{7}$ . Generally, given an arbitrary die we should expect the limit to be one over the expected value.

Can we be more rigorous? A moment's thought reveals that the probability $p_n$ satisfies a linear recurrence: the partial sum can only be $n$ if it was one of $n - 1, n - 2, ... n - 6$ before, so we have the straightforward recurrence

$p_n = \frac {1}{6} \left( p_{n - 1} + p_{n - 2} + ... + p_{n - 6} \right)$ .

The characteristic polynomial of this recurrence is $f(x) = x^6 - \frac {x^5 + x^4 + x^3 + x^2 + x + 1}{6}$ . Here we run into a problem: we don't know the roots of this polynomial (other than the obvious one, $1$ ). The root $1$ should control the limiting behavior of the sequence, so we want to show a few things:

1) All the other roots have absolute value strictly less than $1$ . This (almost) ensures that the limit exists.

2) The root $1$ occurs with multiplicity $1$ . This ensures that the limit exists. (If you're not sure why any of this is true, review the previous post on this subject.)

3) We can compute the coefficient of $1$ in the closed form of $p_n$ . This tells us what the limit is.

At this point, there are two related approaches we can take.

Solution 1. 1) can be reworded as follows: the companion matrix $\mathbf{A}$ of $f(x)$ (which acts on the column vector with components $p_{1 + k}, p_{2 + k}, ... p_{6 + k}$ ) has eigenvalue $1$ with multiplicity $1$ and all other eigenvalues have absolute value less than $1$ . This sort of statement generalizes significantly; the Perron-Frobenius theorem states that this is true of any matrix with strictly positive entries (that is, there exists a strictly maximal eigenvalue). Of course, $\mathbf{A}$ isn't of this form, but it's enough to show that some power of it is.

Regard $\mathbf{A}$ as the adjacency matrix of a weighted, directed graph on $n$ vertices. We begin with a directed path of length $6$ (where each vertex $v_i$ points to the next vertex $v_{i + 1}$ ) and in addition specify weighted edges from $v_6$ to every vertex, each of weight $\frac {1}{6}$ . The powers $\mathbf{A}^k$ describe the number of paths of length $k$ between vertices.

These paths are easy to exhibit explicitly: the only paths between two vertices $v_i, v_j, i, j < 6$ occur when we either travel directly between the two (when $i < j$ ) or when we travel to $v_6$ , stay there for awhile, and travel back. In other words, for $k > 6$ there exist paths between any two vertices of length exactly $k$ . (It is instructive to count exactly how many paths there are as a polynomial in $\frac {1}{6}$ ; the actual entry in the corresponding power of $\mathbf{A}$ is weighted by the number of times the path stays at $v_6$ .) This lets us apply the Perron-Frobenius theorem to $\mathbf{A}^k$ and then we have our result.

$\mathbf{A}$ has a special property that admits an addditional interpretation: it is the transition matrix of a finite Markov chain, and the vertices of the above graph describe its state space. That is, our Markov process consists of $6$ states where the first five states transition to the next and the sixth state transitions, with uniform probability, to any of the other states. The Perron-Frobenius applied to a stochastic matrix tells us that $1$ is a strictly maximal eigenvalue, which is exactly the nice result we'd like for a Markov process: it tells us that there exists a stationary distribution.

We are done here if we can compute the stationary distribution. The Perron-Frobenius theorem gets us 2) and, in fact, also gives us 3): it says that there is a unique row eigenvector $\mathbf{v}$ with entries summing to $1$ such that

$\lim_{n \to \infty} \mathbf{A}^n = \mathbf{1} \mathbf{v}$

and $\mathbf{v}$ is the row eigenvector associated with the eigenvalue $1$ . At this point it's only a quick calculation to verify that this eigenvector is

$\mathbf{v} = \left < \frac {6}{21}, \frac {5}{21}, ... \frac {1}{21} \right >$

and now we only have to compute the initial values. Considering how much work I will do to avoid this calculation, it is not in fact a difficult one, but instead I will motivate a different calculation that will get us the answer without needing to compute the initial values. The important result to keep in mind is that we did not have to compute the other five eigenvectors of $\mathbf{A}$ (or the corresponding eigenvalues).

Solution 2. In the interest of pursuing the original point and avoiding the use of the Perron-Frobenius theorem, the proof of 1) is fairly straightforward. Suppose $z$ is a root of $f(x)$ . If $|z| \ge 1$ , then

$|z|^6 = \frac {|z|^6 + ... + |z|^6}{6} \ge \frac {|z|^5 + ... + 1}{6} \ge \left| \frac {z^5 + ... + 1}{6} \right|$

with equality if and only if $|z| = 1$ and $z^5, z^4, ... z, 1$ all point in the same direction (the last step is the triangle inequality), which is true if and only if $z = 1$ . Alternately, $\frac {z^5 + ... + 1}{6}$ is a convex combination of points that lie on the unit circle, which itself lies on the unit circle if and only if it one of the vertices of the convex hull of $z^5, z^4, ... 1$ (which is just the points themselves). This implies that $z$ must be a root of unity of order at most $6$ , and then one can verify directly that only $1$ works. 2) can be proven by a similar argument applied to $f'(x)$ or by dividing by $x - 1$ and substituting in $x = 1$ . How should we calculate 3)?

Considering I built up this discussion by talking about partial fraction decompositions, let's take the generating functions approach and try to figure out what $F(x) = \sum_{n \ge 0} p_n x^n$ should look like. The generating function that describes the probability distribution of the sums of $k$ independent dice rolls is just

$\left( \frac {x^6 + ... + x}{6} \right)^k$

and since the number of independent dice rolls takes values over all $k$ , the generating function we want is

$F(x) = \sum_{k \ge 0} \left( \frac {x^6 + ... + x}{6} \right)^k = \frac {1}{1 - \frac {x^6 + ... + x}{6} }$ .

It's worth pointing out that $F(x)$ is easy to calculate even if the initial conditions aren't because of the way the problem is set up. Its denominator has roots which are the reciprocal of those of $f(x)$ as expected. Now, the important observation: we don't have to compute the complete partial fraction decomposition to isolate the coefficient of $1$ ! All we need to do is to factor $1 - x$ out of the denominator, like so:

$\frac {1}{6} (1 - x)(6 + 5x + ... + x^5)$

and then we can compute the partial fraction decomposition using these factors alone. To be even lazier, the residue at $1$ can be evaluated using the same trick as we used before: multiplying by $1 - x$ , this is just

$\frac {6}{6 + 5 + 4 + ... + 1} = \frac {2}{7}$

exactly as we expected. Note that the computation we are performing here is, by, l'Hopital's rule, the evaluation of the derivative of the denominator of $F(x)$ at $1$ , which is just the expected value of a single dice roll. This observation, as well as the rest of the argument, generalizes (see the Practice Problem).

The main result to take away from this discussion is the following

Proposition: Let $s_k$ be a sequence satisfying a homogeneous linear recurrence with characteristic polynomial $P(x)$ of degree $n$ , and suppose that $a$ is a known root of $P(x)$ . Then the coefficient of $a^k$ (which is in general a polynomial in $k$ ) in the closed form of $s_k$ is a rational function of the initial conditions and $a$ (and in particular does not depend on the nature of the rest of the roots of $P(x)$ ).

I mention this largely because it violated my intuition: I had previously thought that it wasn't possible to find the coefficients of the closed form of a given recurrence without computing every root of the characteristic polynomial.

Despite the manner in which we first arrived at this statement, it can be proven in a very straightforward manner. We'll suppose for brevity that $a$ has multiplicity $1$ . Write $P(x) = (x - a) Q(x)$ and write $s_k = ca^k + t_k$ . Then $t_n$ satisfies the (shorter) recurrence with characteristic polynomial $Q(x)$ rather than $P(x)$ . Given, therefore, the initial conditions

$s_0 = c + t_0$
$s_1 = ca + t_1$
...
$s_{n - 1} = ca^{n - 1} + t_{n - 1}$

along with the expression of $t_{n - 1}$ in terms of $t_0, t_1, ... t_{n - 2}$ provides a system of $n + 1$ equations in $n + 1$ variables which we can solve. In particular, $Q(x)$ has some coefficients $q_1, q_2, ... q_n$ such that

$t_{n - 1} = q_1 t_{n - 2} + q_2 t_{n - 3} + ... + q_{n - 1} t_0$

hence such that (adding appropriate multiples of the equations above)

$s_{n - 1} - q_1 s_{n - 2} - ... = c Q(a) \implies$
$c = \frac { s_{n - 1} - q_1 s_{n - 2} - ... }{Q(a)}$ .

The numerator can be understood as "quotienting out" the initial values by the roots of $Q(x)$ whereas the denominator can be understood either as a Lagrange interpolation coefficient or as $P'(a)$ .

Clearly there is an interesting connection between the Lagrange interpolation problem and the problem of figuring out the coefficients of the closed form of a linear recurrence. Well, perhaps it's not too interesting. Approaching the problem "forwardly" (that is, using the most basic tools first), let's set up a system. If the roots of $P(x)$ are given by $r_1, r_2, ... r_n$ (forgive me for switching around notation all the time), then the problem of determining the coefficients in

$s_k = c_1 r_1^k + c_2 r_2^k + ... + c_n r_n^k$

(where again I have assumed that all roots occur with multiplicity $1$ ) is simply the problem of solving the system

$\left[ \begin{array}{cccc} 1 & 1 & \hdots & 1 \\ r_1 & r_2 & \hdots & r_n \\ r_1^2 & r_2^2 & \hdots & r_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ r_1^{n - 1} & r_2^{n - 1} & \hdots & r_n^{n - 1} \\ \end{array} \right] \left[ \begin{array}{c} c_1 \\ c_2 \\ c_3 \\ \vdots \\ c_n \end{array} \right] = \left[ \begin{array}{c} s_0 \\ s_1 \\ s_2 \\ \vdots \\ s_{n - 1} \end{array} \right]$

and what is the matrix that should appear but the transpose of the Vandermonde matrix of the roots! We have already seen that the finding-coefficients problem and the Lagrange interpolation problem are in some sense adjoint to each other in the special case of the DFT problem; how should we generalize?

One interpretation uses the partial fractions approach. In the Lagrange interpolation problem we are given the values of a polynomial $P(x)$ at some points; these values determine the coefficients of the partial fraction decomposition of a certain rational function with $P(x)$ , from which the coefficients of $P(x)$ can be computed. In the linear recurrence problem we are given the initial values of a recurrence $s_n$ ; these values determine the coefficients of the numerator of a certain rational function, from which the coefficients of its partial fraction decomposition can be computed.

Practice Problem 1: Generalize the conclusion of the solution to Problem 1. That is, given a die with probability distribution $q_1, q_2, ...$ (where $q_k$ describes the probability of rolling a $k$ ) such that $q_k \ge 0$ and $\sum q_k = 1$ , what is the limiting behavior of the probability that partial sums of repeated rolls hits a value of $n$ ? (In particular, when does this sequence converge? The answer is not "always.") This is equivalent to the problem of finding the outcome of a sequence of repeated weighted averages

$p_n = q_1 p_{n - 1} + q_2 p_{n - 2} + ...$

when the number of faces of the die is finite. (The first thing you should do is generalize the verification of condition 1) and examine the cases in which it fails.)

Practice Problem 2: Interpret the $n \times n$ Jordan block with eigenvalue $1$ as an adjacency matrix and compute its powers by a counting argument. How does this prove 1) the binomial theorem and 2) the balls-and-urns formula? Compare with the ordinary generating functions proof.

Practice Problem 3: In keeping with our observation that for special cases the ordinary generating function is easier to compute than the initial conditions, verify that the identity $\frac {P'(x)}{P(x)} = \sum_{P(a) = 0} \frac {1}{x - a}$ is equivalent to Newton's sums.

(Can you find a proof of Newton's sums by computing the trace of the $n^{th}$ power of the companion matrix of $P$ ? in a clever way? I can't.

)

Practice Problem 4: One way to state the fundamental theorem of symmetric polynomials is that any function of the roots of an irreducible polynomial $P(x)$ invariant under any permutation of the roots is a function of its coefficients. The discriminant is an example of such a function. Prove this fact by taking tensor powers of the companion matrix of $P(x)$ . (I don't actually know if this works.

)

The pretty picture post (crystals and cyclotomics)

by t0rajir0u, Dec 25, 2008, 7:00 AM

Today I'd like to talk about tesselations of the plane. With regards to regular polygons, it is not hard to see that only triangles, squares, and hexagons work, and of course one may extend this to any shearing or stretching of those shapes (parallelograms). What happens if we don't restrict ourselves to regular polygons? Still if we confine ourselves to a single repeating unit, we find that the symmetries obeyed by the resulting pattern continue to be 2-fold, 3-fold, 4-fold, or 6-fold.

Invalid image file

For example, the above Escher drawing exhibits 3-fold symmetry. This simple observation turns out to be fundamental in describing the structure of crystals, Nature's very own tesselations of space.

Invalid image file

The above depicts the crystal structure of diamond, which is cubic; specifically, it's known among crystallographers as the face-centered cubic Bravais lattice. Up to shearing and stretching, the crystal systems on which Bravais lattices are based are either cubic or hexagonal. Why do no other crystal systems (and hence no other Bravais lattices) appear? Why, for example, are there no icosahedral lattices?

To investigate this question, we'll make the notion of a lattice more precise as follows and end up proving the result known as the crystallographic restriction theorem. Define a lattice in $\mathbb{R}^n$ to be a discrete subgroup of $\mathbb{R}^n$ that spans it as a vector space. More concretely, we can describe a lattice in $\mathbb{R}^n$ by specifying $n$ linearly independent vectors $\mathbf{v}_1, \mathbf{v}_2, ... \mathbf{v}_n$ and considering the set

$\Lambda = \{ a_1 \mathbf{v}_1 + a_2 \mathbf{v}_2 + ... + a_n \mathbf{v}_n | a_i \in \mathbb{Z} \forall i\}.$

For example, the set of points with integer coordinates forms a lattice, and is in some sense the only lattice (but not for our purposes). We relate lattices to tesselations as follows: in any tesselation of a single repeating unit, we can identify a point in that repeating unit. The set of all points in the entire tesselation should form a lattice because we want tesselations to have translational symmetry in $n$ independent directions. Now that we have mathematically formulated the notion of a lattice, what can we say about its symmetries?

By "symmetries" we mean orientation-preserving isometries that map every point in the lattice to another point in the lattice; this is intuitive. The isometries of $\mathbb{R}^n$ are always linear, so we are talking about a subgroup of the Euclidean group that we can exhibit as a m atrix group. A lattice is said to have $k$ -fold symmetry there exists an orientation-preserving isometry $T$ (in two and three dimensions, rotations about a point and an axis, respectively) such that $T^k = I$ , the identity map. We now wish to show that the only possible orders in dimensions $2$ and $3$ occur when $k = 2, 3, 4, 6$ .

Proof. Written in the basis $\mathbf{v}_1, \mathbf{v}_2, ... \mathbf{v}_n$ , every symmetry of the lattice must take each basis vector to an integer combination of the other basis vectors, hence any such symmetry must have integer coordinates in this basis, in other words, must be an element of $GL_n(\mathbb{Z})$ .

In dimensions $2$ and $3$ , the minimal polynomial of such a matrix has degree at most $3$ , has integer coefficients, and must divide the polynomial $x^k - 1$ . On the other hand, the monic irreducible factors of $x^k - 1$ are precisely the cyclotomic polynomials. The cyclotomic polynomials of degree at most $3$ (in fact, exactly $2$ ) are precisely $\Phi_1(x), \Phi_2(x), \Phi_3(x), \Phi_4(x), \Phi_6(x)$ . Hence the only nontrivial finite orders an element of $GL_2(\mathbb{Z})$ can have are $2, 3, 4, 6$ as desired.

The beauty of the above approach is that it is absolutely general. For example, the possible symmetries of a lattice in $4$ dimensions occur when $k = 2, 3, 4, 5, 6, 8, 12$ .

Corollary: Let $p, q$ be positive integers. If $\cos \frac {p}{q} \pi$ is rational, then it can only be $0, \pm 1, \pm \frac {1}{2}$ . (See also MellowMelon's demonstration using Chebyshev polynomials.) This is not, strictly speaking, a corollary; one needs the additional observation that the irreducible factors of $x^k - 1$ are still the cyclotomic polynomials over $\mathbb{Q}$ as well as over $\mathbb{Z}$ , that is, Gauss's lemma.

There are a few interesting questions to ponder from here. One is purely mathematical: what interest are lattices to a mathematician? I hope I have already convinced you of the power of lattice methods in number theory. Indeed, you may have spotted that the lattices with the symmetries described above could be said to include the Gaussian and Eisenstein integers, which have shown up more than once in this blog.

//cdn.artofproblemsolving.com/images/5b286fb5a734af6137460a801cac015a13e52c47.png

//cdn.artofproblemsolving.com/images/5b286fb5a734af6137460a801cac015a13e52c47.png

In Lie theory, lattices play the role of root systems, a method by which Lie groups and Lie algebras are analyzed and classified. Certain extremely symmetrical lattices correspond to what are called the exceptional Lie groups, and these extremely symmetrical objects occur in physical theories; for example, $E_8$ has some tantalizing connections to theoretical physics.

Lattices in the complex plane also occur in the theory of elliptic curves via a type of doubly periodic function called an elliptic function. Elliptic functions parameterize elliptic curves over $\mathbb{C}$ and have periods which are two complex numbers whose ratio is not real. Since a doubly periodic function has the same value if its domain is taken "modulo" the lattice $\Lambda$ spanned by its periods, we can identify the range of the elliptic function - that is, the curve itself - with a fundamental parallelogram of the lattice with its sides identified - which is a torus!

Invalid image file

And now you know why all those specials on Fermat's Last Theorem tell you that elliptic curves are doughnuts.

The other question we could ask is scientific: we assumed that crystals are described by lattices with $3$ independent translational symmetries. Do there exist crystals that do not have this property? Scientists did not discover physical evidence of such quasicrystals until the 80s, but

- the mathematical community was aware that these structures could, in principle, exist decades earlier, and
- Islamic architects were already using aperiodic tilings centuries earlier!

//cdn.artofproblemsolving.com/images/de0a2f696caf7f6e5da5b51ca04d224c3972bc7c.gif

//cdn.artofproblemsolving.com/images/de0a2f696caf7f6e5da5b51ca04d224c3972bc7c.gif

Some aperiodic tilings occur with $5$ -fold, $8$ -fold, or $12$ -fold symmetry, and that's because they are projections down from $4$ -dimensional lattices along non-lattice planes, which explains both the large amount of structure and the lack of translational symmetry.

This scientific point brings up another mathematical question. Crystals tend to organize themselves into lattices because that is the structure that maximizes interaction among the atoms and therefore minimizes energy and excess volume. In other words, the relationship between crystals and lattice structures is related to the circle-packing problem. Is it always the case, then, that the densest packing in $n$ dimensions is a lattice packing?

This seems a silly question to ask at first. Intuitively, any irregularity in a packing correlates to some excess that could be removed by increasing the regularity of the packing. However, even the statement for dimension $3$ was only proven very recently, and this question is actually open in dimensions $4$ and higher.

Practice Problem 1: Prove the general form of Minkowski's theorem. (Try to find a proof other than the one given in the Wikipedia article! In two dimensions, there is a proof using Pick's theorem.)

Practice Problem 2: Let $u_0 = 0, u_1 = 4$ and $u_{n + 2} = \frac {6}{5} u_{n + 1} - u_n$ . Show that $|u_n| < 5$ for all $n$ . Show, on the other hand, that $\limsup_{n \to \infty} |u_n| = 5$ .

Practice Problem 3: The generic example of a potential function gives the energy of two spherical charges with charges $q_1, q_2$ at a distance $r$ from each other as $\frac {q_1 q_2}{r}$ . Compute the average potential energy per charge of an infinite lattice of alternately charged circles of radius $1$ and charges $\pm 1$ in the plane (with circles centered at lattice points with even coordinates). This is essentially a Madelung constant. (This sum obviously doesn't converge absolutely, so be careful. Also, I haven't done this problem and I'm not sure the answer comes out nicely at all

)

Fun with Newton polynomials, Part III

by t0rajir0u, Dec 1, 2008, 11:20 PM

This post isn't really about Newton polynomials

First, we will need the following

Proposition: Let $(a_n), (b_n)$ be two sequences with exponential generating functions $A(x) = \sum a_n \frac {x^n}{n!}, B(x) = \sum b_n \frac {x^n}{n!}$ and suppose that $C(x) = A(x) B(x) = \sum c_n \frac {x^n}{n!}$ . Then

$c_n = \sum_{k = 0}^{n} {n \choose k} a_k b_{n - k}.$

Exponential generating functions are used to study sequences like the Bell numbers whose growth rate makes their ordinary generating functions bad (for example, they might converge nowhere). The multiplication of exponential generating functions, as opposed to ordinary generating functions, occurs in situations where order counts. Certainly there's much more to be said about this point.

But the interesting thing here is the correspondence between the convolution above and an operation I referred to as a "Newton transform" (this is not standard terminology) that assigns to a sequence $(a_n)$ the function

$\sum a_n {x \choose n}.$

If we shift our focus of attention instead to the sequence

$c_n = \sum_{k = 0}^{n} a_k {n \choose k}$

we can find in it a combinatorial interpretation: if $a_k$ is the number of ways to arrange $k$ objects in some way such that order matters, $c_n$ is the number of ways to arrange some objects from among $n$ distinguishable objects in that way. It is now obvious that

$C(x) = e^x A(x).$

This is poweful! If $A(x)$ is a polynomial, it consists of a finite number of terms whose coefficients specify the coefficients, in Newton form, of a polynomial whose positive integer values are the coefficients of the LHS! It appears that multiplication by $e^x$ alone is enough for us to automatically compute values of a polynomial by specifying only its Newton coefficients. You may also recall a combinatorial identity whose (non-combinatorial!) proof is now trivial: letting $a_n = r! {n \choose r}$ gives

$A(x) = \sum_{n = r}^{\infty} \frac {x^n}{(n - r)!} = x^r e^x$

hence $C(x) = x^r e^{2x}$ and

$\frac {c_n}{r!} = \sum_{k = 0}^{n} {k \choose r} {n \choose k} = {n \choose r} 2^{n - r}$

as before. Also note that the combinatorial interpretation of $c_n$ I gave above is consistent with the interpretation I gave in that post.

Earlier I noted another special case when $a_k = p^k$ was a geometric series, which gives $A(x) = e^{px}$ and

$C(x) = e^{(p + 1)x}$

exactly as shown earlier. Now let's talk about the roots of unity filter again: this corresponds to the case where

$a_k = a_{m + k}, a_i = 1, a_j = 0, j \not \equiv i \bmod m$ .

For example, if $i = 0, m = 3$ , the corresponding generating function is $A(x) = 1 + \frac {x^3}{3!} + \frac {x^6}{6!} + ...$ . The product $e^x A(x)$ will then tell us how to evaluate the sum

$c_n = \sum_{k \equiv i \bmod m}^{n} {n \choose k}$ .

By inspection, $A(x)$ is a solution to the differential equation $A^{(m)}(x) = A(x)$ , so it is a linear combination of functions of the form $e^{\zeta_m^i x}$ where $\zeta_m = e^{\frac {2 \pi i}{m} }$ and $0 \le i < m$ . This is all material we have more or less encountered already. An application of the DFT gives us

$a_n = \frac {1}{m} \sum_{k = 0}^{m} \zeta_m^{k(n - i)} \implies$
$A(x) = \frac {1}{m} \sum_{k = 0}^{m} \zeta_m^{ - ki} \exp(\zeta_m^k x) \implies$
$C(x) = \frac {1}{m} \sum_{k = 0}^{m} \zeta_m^{ - ki} \exp( (\zeta_m^k + 1) x) \implies$
$c_n = \frac {1}{m} \sum_{k = 0}^{m} \zeta_m^{ - ki} (\zeta_m^k + 1)^n$ .

When $i = 0, m = 3$ , this gives

$c_n = \frac {2^n + (1 + \omega)^n + (1 + \omega^2)^n}{3}$

as before. Now, these generating functions are a little unwieldy (complex numbers as an argument in an exponent?), so we might like to see what ordinary generating functions have to say about all of this. But what happens to the exponential convolution? We could use the properties of the Laplace transform, but let's work backwards instead: we want to find a relationship between the generating functions

$F(x) = \sum_{n = 0}^{\infty} a_n x^n$

and

$G(x) = \sum_{n = 0}^{\infty} \left( \sum_{k = 0}^{n} {n \choose k} a_k \right) x^n.$

Exchanging the order of summation in $G(x)$ , we obtain

$G(x) = \sum_{k = 0}^{\infty} a_k \left( \sum_{n = 0}^{\infty} {n + k \choose k} x^{n + k} \right).$

We've seen a generating function very much like this one before; it gives

$\sum_{n = 0}^{\infty} {n + k \choose k} x^n = \frac {1}{(1 - x)^{k + 1}}$

and therefore

$G(x) = \sum_{k = 0}^{\infty} a_k \frac {x^k}{(1 - x)^{k + 1}} = \frac {1}{1 - x} F \left( \frac {x}{1 - x} \right)$ .

This is either very nice or very ugly depending on your point of view; in any case, it agrees with the answer you get using the properties of the Laplace transform ( $\mathcal{L} \{e^t f(t) \} = F(s - 1)$ ). We're interested in the function $F(x) = \frac {x^i}{1 - x^n}$ , which gives

$G(x) = \frac {1}{1 - x} \cdot \frac {x^i (1 - x)^{n - i} }{(1 - x)^n - x^n}.$

The factorization of the denominator into linear terms of the form $(1 - zx)$ gives $z = \zeta_m^k + 1$ as before, which is very nice, but to find the coefficients we found before requires a partial fraction decomposition, either of $G$ or of $F$ . We can compute the partial fraction decomposition of $F$ using the DFT as before - which is already a nice result - but we can be tricky: observe that if we write

$\frac {x^i}{1 - x^n} = \sum_{k = 0}^{n} \frac {r_k}{1 - \zeta_m^k x}$

then we can compute the residue $r_k$ by computing

$\lim_{x \to \zeta_m^{ - k}} (1 - \zeta_m^k x) F(x) = \lim_{x \to \zeta_m^{ - k}} \frac {x^i (1 - \zeta_m^k x)}{1 - x^n}.$

l'Hospital's rule then gives

$\lim_{x \to \zeta_m^{ - k}} \frac {x^i ( - \zeta_m^k)}{ - nx^{n - 1}} = \zeta_m^{ - ki} \zeta_m^k$

which is in exact agreement with our previous answer, although we need to rewrite the partial fraction decomposition as

$\frac {r_k}{1 - \zeta_m^k x} = \frac {\zeta_m^{ - ki} }{\zeta_m^{ - k} - x}$

to see the agreement. This can be understood as an alternate proof that the DFT works; the proof sketched in a previous post was about orthogonality and can be understood in the context of Parseval's theorem.

So it seems that both ordinary and exponential generating functions have something to bring to the table. Exponential generating functions reveal that a convolution with $e^x$ is what makes the "Newton transform" work (see binomial transform), but there were difficulties - we could not explicitly write down $A(x)$ in terms of a well-known function. The convolution became a Mobius transformation of the ordinary generating function, but there we had partial fractions and analysis available.

Practice Problem 1: Repeat the above computation using the formula

$\frac {P'(x)}{P(x)} = \sum_{P(r) = 0} \frac {1}{x - r}$ .

Practice Problem 2: Show that $\mathbb{Z}$ cannot be partitioned into a finite number of disjoint arithmetic sequences with each common difference distinct.

Practice Problem 3: The Bell numbers $B_n$ count the number of partitions of a set with $n$ elements, where neither the order of the partitions nor the order of the elements of a partition matter. Show that they satisfy the recurrence

$B_{n + 1} = \sum_{k = 0}^{n} {n \choose k} B_k$ .

Hence compute the exponential generating function $\sum_{n = 0}^{\infty} B_n \frac {x^n}{n!}$ .

Submit

orz $~~~~$
by clarkculus, Jan 10, 2025, 4:13 PM
Insanely auraful
by centslordm, Jan 1, 2025, 11:17 PM
Fly High
by Siddharthmaybe, Oct 22, 2024, 8:34 PM
Dang it he is gone ( /revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive/revive.
by 799786, Aug 4, 2022, 1:56 PM
annoying precision
by centslordm, May 16, 2021, 7:34 PM
rip t0rajir0u
by OlympusHero, Dec 5, 2020, 9:29 PM
Shoutbox bump xD
by DuoDuoling0, Oct 4, 2020, 2:25 AM
dang hes gone
by OlympusHero, Jul 28, 2020, 3:52 AM
First shout in July
by smartguy888, Jul 20, 2020, 3:08 PM
https://artofproblemsolving.com/community/c2448

has more.

-πφ
by piphi, Jun 12, 2020, 8:20 PM
wait hold up 310,000
people visited this man?!?!??
by srisainandan6, May 29, 2020, 5:16 PM
first shout in 2020
by OlympusHero, Apr 4, 2020, 1:15 AM
in his latest post he says he moved to wordpress
by MelonGirl, Nov 16, 2019, 2:43 AM
Please revive!
by AopsUser101, Oct 30, 2019, 7:10 PM
first shout in october fj9odiais
by bulbasaur., Oct 14, 2019, 1:14 AM
t0rajir0u is beast dude. I bet he'll make IMO this year
by #H34N1, Mar 26, 2008, 2:37 AM
"annoying precision" is somewhat of an understatement, don't you think?
by xscapezaer, Mar 23, 2008, 10:06 PM
Wow. Just by looking at the page--I gained IQ points. That's uncanny. Keep it up!
by The_Scintillator, Mar 21, 2008, 2:20 AM
Mmm t0r... Are you a senior? You're too genius Nice Blog btw... But for the sake of some less advanced people, go about defining some terms...
by resurrection, Jan 31, 2008, 1:08 AM
weird
by Airjohn6, Jan 30, 2008, 11:52 PM
pls help me with integrals i cant private message u b/c i just joined mathlinks, write me back at memena@loyno.edu
by memena, Jan 22, 2008, 2:18 AM
Can you post more problems about number theory? I'd like to suggest you about combinatorial problems!
by ghjk, Jan 10, 2008, 7:03 AM
The article: A Digression is nice
by FOURRIER, Nov 5, 2007, 4:47 PM
Interested in number theory? Mmm... I am bad in number theory but I need to take it. Hope that you can help me.
by ifai, Oct 13, 2007, 7:14 AM
your blog is very useful and i like it.
by lovejrz, Oct 12, 2007, 9:26 AM
madness
by madness, Oct 4, 2007, 4:48 PM
t0r, I recommend you watch out at the RSI-PROMYS frisbee game this weekend. You betrayed both MOP and PROMYS for RSI. I'm not going to be there, but let's just say there's a hitman after you.
by K81o7, Jul 27, 2007, 3:27 AM
hello.
by ajai, Jul 13, 2007, 12:52 AM
just drop by to say hello! and digging for stuffs that I could understand
by jeez123, Jun 30, 2007, 5:02 PM