A question about automata
by t0rajir0u, Apr 12, 2009, 9:20 PM
Question: Let $\Sigma$ be a finite alphabet and let $L$ be a language on $\Sigma$. Let $f_L(x) = \sum_{n \ge 0} |L_n| x^n$, where $|L_n|$ is the number of words in $L$ of length $n$.
Is it true that $f_L(x)$ is a ratio of integer polynomials if and only if there exists another language $M$ such that $|L_n| = |M_n|$ for all $n$ and such that $M$ can be recognized by a finite automaton (equivalently, a regular expression)?
For example, the set of palindromes cannot be recognized by a finite automaton, but there is a bijection from palindromes to strings where each letter except the last letter is repeated exactly twice in a row, and that language can be recognized.
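To make the bijection concrete, here is a minimal Python sketch (the function name `palindrome_to_doubled` is mine, not from the post): a palindrome is determined by its first half, so doubling each letter of the first half, and letting the middle letter of an odd-length palindrome stand alone at the end, gives a length-preserving injection into the doubled-letter language.

```python
def palindrome_to_doubled(w):
    """Map a palindrome w to a word in which every letter of the first half
    is repeated exactly twice in a row. Length is preserved."""
    assert w == w[::-1], "input must be a palindrome"
    half, odd = w[:len(w) // 2], len(w) % 2 == 1
    out = "".join(c + c for c in half)  # double each letter of the first half
    if odd:
        out += w[len(w) // 2]           # lone middle letter goes last
    return out

# Every palindrome of length 5 over {a, b} maps to a distinct word
# of the form xxyyz, and the image language is regular.
from itertools import product
pals = [w for w in ("".join(p) for p in product("ab", repeat=5)) if w == w[::-1]]
images = [palindrome_to_doubled(p) for p in pals]
assert len(set(images)) == len(pals)    # the map is injective
print(sorted(images))
```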
The way I really want to phrase the $|L_n| = |M_n|$ requirement is that $L$ and $M$, regarded as species, should be related by a natural transformation, but I don't know whether this is strictly stronger than the requirement I've written down. Anyway, I'm reasonably sure that as it stands the answer is "yes."
Here's an actual question which I wish I had the CS background to answer: characterize the languages $L$ with the property that if $w \in L$ then every (edit:) contiguous substring of $w$ is in $L$, i.e. describe what mode of computation recognizes such languages.
Some updates
by t0rajir0u, Mar 16, 2009, 7:19 AM
Apologies for not updating in a while; MIT has, as usual, been quite busy. But I will hopefully have lots to talk about very soon. Anyway, some bullet points:
- Within the next two weeks I plan on moving this blog to Wordpress. Among other things, it'll look nicer and a little more professional and should be a lot easier to manage. Once I do so, I've got several posts waiting to be finished, so expect several updates during spring break.
- I've finally started work on my UROP. The motivation for the project is a little involved, but the project itself involves studying a sequence of ideals $I_n$ in a polynomial ring over $2n$ variables invariant under a finite group $G$ of symplectic matrices of size $2n \times 2n$ in a particular way. So far we're still trying to reproduce, using MAGMA, certain computations done by Eric Rains at Caltech. Once we've verified that the code we have works, we'll move on to some other computations and hopefully get enough evidence for a conjecture of some kind.
- WOOT has kindly asked me to write a handout for the class. It'll be on generating functions, and I think it'll be a good opportunity to collect together a lot of example solutions. This, hopefully, will also be done within the next two weeks.
- MIT's Campus Preview Weekend is coming up! If you are visiting, feel free to PM me any questions you might have, and if you'd like to find me to discuss math or anything else, just let me know!
Fibonacci numbers @ HMMT
by t0rajir0u, Feb 21, 2009, 10:11 PM
Today I gave a talk at the Harvard-MIT Math Tournament as a mini-event on the Fibonacci numbers. The slides are available here (~1 MB).
The focus of the talk was the material from this post and this post and, to a lesser extent, this post. Briefly, there are three ways to think about the proof of Binet's formula:
1. The space of sequences satisfying $s_{n + 2} = s_{n + 1} + s_n$ is a vector space of dimension $2$, and there exists a basis consisting of the two geometric sequences $\left( \frac{1 + \sqrt{5}}{2} \right)^n, \left( \frac{1 - \sqrt{5}}{2} \right)^n$.
2. The generating function $\sum_{n \ge 0} F_n x^n = \frac{x}{1 - x - x^2}$ has a partial fraction decomposition.
3. The powers of the matrix $\left[ \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right]$ describe the Fibonacci numbers and this matrix is diagonalizable.
As I discussed in some of the posts above, these proofs are essentially the same (although it is the third that leads into the material from this post, which I unfortunately didn't have enough time to cover); each is about relating geometric series to the eigenvectors of a shift operator. See the posts above or the slides for details.
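As a quick numerical sanity check of the third point of view (my own addition, not part of the talk), one can compare entries of powers of the Fibonacci matrix against Binet's formula:

```python
def matmul(A, B):
    """2x2 integer matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

F = [[1, 1], [1, 0]]                            # the Fibonacci matrix
phi, psi = (1 + 5 ** 0.5) / 2, (1 - 5 ** 0.5) / 2

M = [[1, 0], [0, 1]]
for n in range(1, 11):
    M = matmul(M, F)                            # M = F^n; its (0,1) entry is F_n
    binet = (phi ** n - psi ** n) / 5 ** 0.5    # Binet's formula
    assert M[0][1] == round(binet)
    print(n, M[0][1])
```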
The combinatorics of basic calculus
by t0rajir0u, Feb 19, 2009, 4:53 AM
Much of this post is inspired by Doron Zeilberger's insightful entry in the Princeton Companion to Mathematics.
If we all learned combinatorics before we learned calculus perhaps we would be more inclined to ask the following very basic question: where do the factorials in a Taylor series expansion really come from?
Factorials must clearly count permutations, and yet it is never really explained what permutations have to do with power series expansions. The proof that the $n$th derivative of $x^n$ is $n!$ is elementary enough, and yet that only changes the question: what combinatorial significance does differentiation have?
This question clearly has something to do with why exponential generating functions are useful and important. One of the reasons one prefers an EGF to an OGF is that the underlying object being counted is "labeled": for example, the number of labeled sets with $n$ elements (and $n$ labels) is $n!$, which has exponential generating function $\sum_{n \ge 0} n! \frac{x^n}{n!} = \frac{1}{1 - x}$.
By comparison, there is not much useful one can say about the ordinary generating function $\sum_{n \ge 0} n! x^n$. Its radius of convergence is $0$, and even in the combinatorial frame of mind where we don't care about convergence this is bad news since we know it won't actually correspond to a function we can work with.
I'd like to work a few more examples. The number of words of length $n$ selected from an alphabet of $s$ letters is $s^n$, which has exponential generating function $\sum_{n \ge 0} s^n \frac{x^n}{n!} = e^{sx}$.
In particular, the number of words of length $n$ selected from an alphabet of $1$ letter is our good friend the exponential. What significance does this have for differentiation? It's easy to count words of length $n + 1$ by recursion: you start with a word of length $n$ and add a letter to the end. In other words,
$w_{n + 1} = s \cdot w_n.$
In the language of exponential generating functions, differentiation corresponds to a shift in index (this is what we're really going after) and the above is equivalent to the identity
$\frac{d}{dx} e^{sx} = s e^{sx}.$
So something genuinely combinatorial is going on here. Let's try something else. The number of words of length $n$ from an alphabet of $s + r$ letters is, by the above definition, $e^{(s + r)x} = e^{sx} e^{rx}$. We've seen the natural definition of multiplying exponential generating functions before: it is the convolution
$\left( \sum_{n \ge 0} a_n \frac{x^n}{n!} \right) \left( \sum_{n \ge 0} b_n \frac{x^n}{n!} \right) = \sum_{n \ge 0} \left( \sum_{k = 0}^{n} {n \choose k} a_k b_{n - k} \right) \frac{x^n}{n!}$
that picks a labeled object of size $k$ from $a$ and a labeled object of size $n - k$ from $b$ to form a labeled object of size $n$ and then decides how to permute the two accordingly. Here, a word from an alphabet of $s + r$ letters is obtained from two words, one from an alphabet of $s$ letters and one from an alphabet of $r$ letters, which are then permuted. So we can think of the factorial factors as accounting for these permutations.
The above definition is missing something. Given only a definition of $e^{sx}$, what is being "permuted"? The answer is that we should regard at least some of the EGFs of labeled objects as counting the number of ways we can put "connected components" of some other basic object together. Here, the basic object is a letter, of which there are $s$ of size one: hence the EGF of the set of letters is $sx$. Given an EGF $f(x)$ for a set of basic objects, $f(x)^n$ (by the above argument) counts the number of ordered $n$-tuples of objects from $f$ of a given size: we then divide out by $n!$ (this is very promising!) because we only care about the collection itself, and then sum over all $n$ to obtain
The Fundamental Theorem of Exponential Generating Functions: If $f(x)$ is the EGF of a collection of basic objects subject to the condition $f(0) = 0$, let $g(x)$ denote the EGF of disjoint unions of basic objects. Then
$g(x) = e^{f(x)}.$
(The condition that $f(0) = 0$ is necessary to ensure that the computation of $e^{f(x)}$ occurs formally: otherwise the computation of each coefficient requires the notion of convergence.)
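Here is a short Python sketch of the fundamental theorem as a statement about formal power series (my own illustration; the helper `formal_exp` is a hypothetical name, not a library function). Exponentiating the EGF of cycles recovers the EGF of permutations, and exponentiating the EGF of nonempty sets recovers the Bell numbers discussed below.

```python
from fractions import Fraction
from math import factorial

N = 8  # number of coefficients to compute

def formal_exp(f):
    """exp of a power series f (list of coefficients, f[0] == 0),
    computed via the ODE g' = f' * g, so g[n] depends on earlier terms only."""
    assert f[0] == 0
    g = [Fraction(1)] + [Fraction(0)] * (len(f) - 1)
    for n in range(1, len(f)):
        # n * g[n] = sum_{k=1}^{n} k * f[k] * g[n - k]
        g[n] = sum(k * f[k] * g[n - k] for k in range(1, n + 1)) / n
    return g

# EGF of cycles: sum_{n>=1} (n-1)! x^n / n! = sum_{n>=1} x^n / n
cycles = [Fraction(0)] + [Fraction(1, n) for n in range(1, N)]
perms = formal_exp(cycles)
print([int(perms[n] * factorial(n)) for n in range(N)])  # 1, 1, 2, 6, 24, ... = n!

# EGF of nonempty sets: e^x - 1
sets = [Fraction(0)] + [Fraction(1, factorial(n)) for n in range(1, N)]
bell = formal_exp(sets)
print([int(bell[n] * factorial(n)) for n in range(N)])   # 1, 1, 2, 5, 15, 52, 203, 877
```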
This is marvelous! Notice how beautifully this subsumes all of our previous discussion. Given two EGFs $f_1(x), f_2(x)$ of two kinds of connected components (say, two alphabets with disjoint letters), $f_1(x) + f_2(x)$ is just their union, and the exponential of that counts the number of ways we can put $f_1$-components and $f_2$-components together, hence the above provides a combinatorial proof that $e^{f_1(x) + f_2(x)} = e^{f_1(x)} e^{f_2(x)}$. Even the first example we treated can be understood in this manner: $\frac{1}{1 - x}$ counts the number of permutations, a permutation is a disjoint union of cycles, and the number of cycles of length $n$ is (fixing a base point) $(n - 1)!$, hence
$\sum_{n \ge 1} (n - 1)! \frac{x^n}{n!} = \sum_{n \ge 1} \frac{x^n}{n} = \ln \frac{1}{1 - x}$
and
$e^{\ln \frac{1}{1 - x}} = \frac{1}{1 - x}$
exactly as originally discussed. Hence we have a combinatorial proof of the power series of the natural log. (Since it takes non-connected things to connected things, there is a sense in which the power series of the natural log is another instance of PIE; we'll see a fun application of this later.) Returning to differentiation, one basic calculus notion is the product rule
$\left( f(x) g(x) \right)' = f'(x) g(x) + f(x) g'(x)$
which we'd like to prove combinatorially. If we know what $f(x), g(x)$ count, we know what $f(x) g(x)$ counts. It was suggested earlier that derivatives are just index shifts; to be more precise, if $f(x)$ is the EGF of $a_n$ then $f'(x)$ is the EGF of $a_{n + 1}$. Following the approach of the Wikipedia article on combinatorial species, we can think of this as adding an extra element before counting: hence the product rule reduces to the very simple observation that when we differentiate $f(x) g(x)$ we can add an extra element that is either of $f$ type or of $g$ type. For example, applied to $e^{sx} e^{rx}$ it gives
$\left( e^{sx} e^{rx} \right)' = s e^{sx} e^{rx} + r e^{sx} e^{rx}$
which is the very simple statement that when we add an extra letter it can be from the alphabet of $s$ letters or from the alphabet of $r$ letters. For this reason it is also fruitful to think of the product rule as a special case of the chain rule
$\left( f(g(x)) \right)' = f'(g(x)) g'(x)$
which we would, again, like to prove combinatorially. When $f(x) = e^x$, this is the statement that
$\left( e^{g(x)} \right)' = e^{g(x)} g'(x);$
in other words, adding an element to the set of disjoint unions of "basic elements" (counted by $e^{g(x)}$) is done by starting with a smaller disjoint union of basic elements and then adding another basic element (after reindexing appropriately). There is an interpretation of composition for more general EGFs $f(g(x))$ where $f(g(x))$ denotes the set of "$f$-structures on $g$-structures": when $f(x) = e^x$, the EGF of the number of words on a singleton alphabet, this reduces to the fundamental theorem; the Wiki article has the details.
One last remark. It now appears (at least to me) that the theory of ordinary generating functions is a special case of the theory of exponential generating functions, for the following reason: whenever $a_n$ counts an unlabeled combinatorial family, we can attach labels to form the sequence $n! a_n$ whose EGF is precisely the OGF of the original sequence. Thus the multiplication of OGFs is a special case of the multiplication of EGFs (as is readily verified) and so forth. For example, the Catalan numbers count rooted binary trees with $n + 1$ leaves, and labeling those leaves gets us the quadruple factorial numbers. In the language of species, the sequences for which OGFs are nice have permutation group all of $S_n$.
Some applications.
As mentioned before, $\frac{1}{1 - x}$ is the EGF of $n!$; that is, it counts the number of permutations, or labeled sets, of $n$ elements. How many labeled sets with "basic objects" of size $1$ or $2$ are there? For labeling reasons, the EGF to use here is
$\frac{1}{1 - x - x^2}$
because we can think of the basic object of size $2$ with $2!$ sets of labels on it; this fits well with the idea that OGFs are EGFs in disguise. (A precise definition of "label" is given in the Wiki article: roughly, there is a way to make precise the notion that it doesn't matter what the labels are; that is, the definition of a species is functorial. They can be numbers or colors or other combinatorial objects.) This gives the answer as
$n! F_{n + 1},$
precisely the (shifted) Fibonacci numbers as we already knew, that is, the number of tilings of a $1 \times n$ grid with $1 \times 1$ and $1 \times 2$ tiles (where we have multiplied by $n!$ because of labeling). This directly suggests the identity
$F_{n + 1} = \sum_{k \ge 0} {n - k \choose k}$
obtained by writing
$\frac{1}{1 - x - x^2} = \sum_{k \ge 0} x^k (1 + x)^k$
and applying the binomial theorem; we've already seen the combinatorial proof here. The cool thing about this technique is that it allows us to prove combinatorial identities about a linear recurrence without computing the roots of its characteristic polynomial: for example, if we want basic objects of size $1$ or $3$, we get $\frac{1}{1 - x - x^3} = \sum_{k \ge 0} x^k (1 + x^2)^k$ and the answer is the number $a_n$ of tilings of a $1 \times n$ grid with tiles of size $1$ and $3$, which immediately gives, by the binomial theorem or combinatorial reasoning, the identity
$a_n = \sum_{k \ge 0} {n - 2k \choose k}$
even though we haven't computed the roots of the characteristic polynomial or the analogue of the Fibonacci polynomials. If we want basic objects of any positive integer size, we get $\frac{1}{1 - (x + x^2 + x^3 + \cdots)} = \frac{1 - x}{1 - 2x}$, which gives
$a_n = 2^{n - 1}.$
What does it mean to ask for a tiling with tiles of any size? It simply means to place a number of dividers between the tiles, and the number of ways to do this is obviously $2^{n - 1}$ (or $1$ if there aren't any tiles). Can you extend this to a combinatorial proof that the set of fractional linear transformations is closed under composition? (The fractional linear transformations are precisely the generating functions of geometric series with a possibly different first term.) An "infinite" extension is given in the Practice Problems.
If we allow two "colors" of basic objects, we can ask, for example, what happens if we allow
types of basic red object of size
and
types of basic blue object of size
: then the generating function should be the product

although we have to keep in mind that what this product counts is the number of ordered pairs of red tilings and blue tilings with a fixed total size. This generating function is easily explained as follows: it is the sum
, so we have yet again provided a combinatorial proof of the geometric series formula. Can you extend this to a combinatorial proof of partial fraction decomposition?
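A direct check of this expansion (my own sketch): the coefficient of $x^n$, computed as the sum over ordered pairs of red and blue tilings, agrees with the closed form.

```python
a, b = 3, 5  # two arbitrary numbers of colors
for n in range(10):
    pairs = sum(a ** k * b ** (n - k) for k in range(n + 1))  # red size k, blue size n-k
    closed = (a ** (n + 1) - b ** (n + 1)) // (a - b)         # exact integer division
    assert pairs == closed
print("coefficients agree")
```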
Now let's think about unlabeled sets. The Bell numbers $B_n$ count the number of partitions of $\{ 1, 2, \ldots, n \} = [n]$, equivalently, the number of equivalence relations. We can think of a partition as a disjoint union of nonempty sets. The EGF of the nonempty sets is $e^x - 1$ (there is no nonempty set of size $0$), hence by the fundamental theorem we obtain
$\sum_{n \ge 0} B_n \frac{x^n}{n!} = e^{e^x - 1}.$
Taking the derivative tells us that
$B_{n + 1} = \sum_{k = 0}^{n} {n \choose k} B_k$
which we can also deduce combinatorially as follows: given a partition of $[n + 1]$, the block that $n + 1$ belongs to has some size $n + 1 - k$, and removing it gives a partition of some $k$-element subset of $[n]$.
To further elucidate the distinction between labeled and unlabeled sets, let's see what happens if we look at the "number of labeled collections of unlabeled nonempty sets," whatever that means. The EGF is
$\frac{1}{1 - (e^x - 1)} = \frac{1}{2 - e^x}$
which gives
$a_n = \sum_{k \ge 0} \frac{k^n}{2^{k + 1}}.$
This is interesting! You might recognize $\sum_{k \ge 0} \frac{k}{2^{k + 1}}$ as the expected number of coin tosses before you toss a heads, and generally you can think of $\sum_{k \ge 0} \frac{k^n}{2^{k + 1}}$ as the expected value of $k^n$, which is a certain kind of moment.
Before we analyze this EGF, let's stop and think about what the powers of $e^x - 1$ mean. The powers of $e^x$ are obvious: if $e^x$ is the EGF of the number of unlabeled sets, $e^{kx}$ is the number of ways we can take $k$ unlabeled sets (each with objects that are considered disjoint from the others) - say, in $k$ colors - and permute them together, hence the number of words on $k$ letters as we saw before. When we take powers of $e^x - 1$ we disallow the empty set for any given letter (color), i.e.
$(e^x - 1)^k = \sum_{j = 0}^{k} {k \choose j} ( - 1)^{k - j} e^{jx}$
counts the number of words on $k$ letters such that each letter is used at least once. The above identity is then an instance of the Principle of Inclusion-Exclusion.
We can now describe what $\frac{1}{2 - e^x} = \sum_{k \ge 0} (e^x - 1)^k$ counts. If we number the colors (letters) $1, 2, 3, \ldots$ (there are countably many of them now), then $a_n$ is the number of words of length $n$ such that if letter $i$ is used at least once, then every letter $j < i$ is also used. This sequence does not appear to have a great name, but it is the number of ways $n$ competitors can place in a competition if we allow ties (where letter $i$ becomes $i$th place), which is a very succinct interpretation. The places define a weak order on $n$ elements, although it's not clear what all this has to do with moments of coin tosses. I'd be interested if someone could explain this to me. As for the closed form, I'll go with a link for now.
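The two descriptions can be played against each other numerically (my own sketch): summing the PIE count of surjective words over all $k$ and truncating the moment sum $\sum_{k \ge 0} \frac{k^n}{2^{k + 1}}$ give the same sequence $1, 1, 3, 13, 75, 541, \ldots$

```python
from math import comb

def surjections(n, k):
    """Words of length n using each of k letters at least once (PIE)."""
    return sum((-1) ** (k - j) * comb(k, j) * j ** n for j in range(k + 1))

for n in range(1, 8):
    ordered_bell = sum(surjections(n, k) for k in range(n + 1))
    moment = sum(k ** n / 2 ** (k + 1) for k in range(200))  # truncated infinite sum
    assert ordered_bell == round(moment)
    print(n, ordered_bell)
```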
Another example is due to Zeilberger: returning to the observation that a permutation is a disjoint union of cycles, we'd like to count the number of permutations with exactly $k$ cycles. This can be done by assigning a weight $y$ to each cycle, i.e. to replace $\ln \frac{1}{1 - x}$ with $y \ln \frac{1}{1 - x}$, which then gives the generating function
$e^{y \ln \frac{1}{1 - x}} = (1 - x)^{ - y} = \sum_{n \ge 0} y (y + 1) \cdots (y + n - 1) \frac{x^n}{n!}$
which gives the answer as the coefficient of $y^k$ in a rising factorial; this is the definition of the Stirling numbers of the first kind, up to sign. Compare with the earlier discussion on Newton polynomials. (There is, according to Stanley, a sense in which negative binomial coefficients having combinatorial significance is the result of a combinatorial reciprocity theorem, but I am far from understanding this point.)
Now suppose we'd instead like to count the number of permutations with exactly $k$ fixed points. This can be done by assigning a weight $y$ to the $1$-cycles, i.e. to replace $\ln \frac{1}{1 - x}$ with $\ln \frac{1}{1 - x} + (y - 1) x$. This gives
$e^{\ln \frac{1}{1 - x} + (y - 1) x} = \frac{1}{1 - x} e^{yx - x} = \sum_{n \ge 0} D_n(y) \frac{x^n}{n!}$
where $D_n(y)$ is the polynomial generating function of the rencontres numbers. On the other hand, clearly
$\frac{1}{1 - x} e^{yx - x} = \frac{1}{1 - x} \sum_{n \ge 0} \left( \sum_{k = 0}^{n} {n \choose k} y^k ( - 1)^{n - k} \right) \frac{x^n}{n!}$
which gives
$D_{n, k} = \frac{n!}{k!} \sum_{j = 0}^{n - k} \frac{( - 1)^j}{j!},$
another instance of PIE. In the case $k = 0$ we obtain
$D_n = n! \sum_{j = 0}^{n} \frac{( - 1)^j}{j!},$
a well-known formula for derangements. Substituting $y = 0$ gives the corresponding generating function.
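Another brute-force check (my own sketch), using the equivalent closed form $D_{n, k} = {n \choose k} D_{n - k}$ with $D_m$ the derangement numbers:

```python
from itertools import permutations
from math import comb

def subfactorial(m):
    """Derangement numbers via D_m = m * D_{m-1} + (-1)^m."""
    d = 1
    for i in range(1, m + 1):
        d = i * d + (-1) ** i
    return d

n = 6
for k in range(n + 1):
    formula = comb(n, k) * subfactorial(n - k)   # choose fixed points, derange the rest
    brute = sum(1 for p in permutations(range(n))
                if sum(p[i] == i for i in range(n)) == k)
    assert formula == brute
print("rencontres numbers check out for n =", n)
```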
Let $I_n$ denote the number of permutations of $[n]$ of order dividing $2$ (i.e. involutions). Such a permutation consists of $1$-cycles and $2$-cycles only. These have EGF $x + \frac{x^2}{2}$, hence by the fundamental theorem the answer is
$\sum_{n \ge 0} I_n \frac{x^n}{n!} = e^{x + \frac{x^2}{2}}.$
While this is not a familiar function, having this generating function makes it easy to deduce other properties of $I_n$. Differentiating, we obtain the identity
$I_{n + 1} = I_n + n I_{n - 1}$
which we can deduce combinatorially as follows: in an involution of $[n + 1]$, $n + 1$ is either a fixed point or belongs to a cycle of length $2$. Generally, the number of permutations of $[n]$ of order dividing $k$ has EGF
$\exp \left( \sum_{d | k} \frac{x^d}{d} \right).$
The EGF of the number of involutions in $S_n$ with no fixed points is $e^{\frac{x^2}{2}}$.
It's not hard to see why: this implies that the number of such involutions of $[2n]$ is $1 \cdot 3 \cdot 5 \cdots (2n - 1)$, which is easy to give a combinatorial interpretation to via the recurrence $a_{2n} = (2n - 1) a_{2n - 2}$: in an involution of $[2n]$ with no fixed points, $2n$ belongs to a $2$-cycle. I'm curious what all this has to do with the normal distribution.
Finally, let me give an example of using the fundamental theorem in the other direction. The number of labeled simple graphs on $n$ vertices is obviously $2^{ {n \choose 2} }$ (each pair of vertices either has or doesn't have an edge); note that this is much easier than talking about the number of unlabeled simple graphs since we must deal with graph isomorphisms. This generating function is
$\sum_{n \ge 0} 2^{ {n \choose 2} } \frac{x^n}{n!}.$
This suggests that the logarithm of the above generating function, which begins
$x + \frac{x^2}{2!} + 4 \frac{x^3}{3!} + 38 \frac{x^4}{4!} + \cdots,$
must count the number of (nonempty) connected simple graphs on $n$ vertices, which in fact is true. There is, as far as I know, no simpler way to describe this generating function.
Practice Problem 1: Compute the coefficients of
; what does this EGF count?
Practice Problem 2: The Euler zigzag numbers $A_n$ count the number of permutations $a_1, a_2, \ldots, a_n$ such that $a_1 < a_2 > a_3 < a_4 > \cdots$. Show that
$2 A_{n + 1} = \sum_{k = 0}^{n} {n \choose k} A_k A_{n - k}$
and from there deduce the exponential generating function $\sum_{n \ge 0} A_n \frac{x^n}{n!} = \sec x + \tan x$. (It is a sum of two functions with which you should be very familiar!)
Practice Problem 3: Prove that a sequence $a_n$ that satisfies a linear recurrence with polynomial (rather than constant) coefficients has EGF satisfying a differential equation, also with polynomial coefficients.
Practice Problem 4: Keeping in mind the discussion of $\frac{1 - x}{1 - 2x}$, prove that
$\sum_{n \ge 0} C_n x^n = \cfrac{1}{1 - \cfrac{x}{1 - \cfrac{x}{1 - \cdots}}}$
where $C_n$ are the Catalan numbers. What do the convergents, interpreted as OGFs, count?
*Practice Problem 5: Prove, first by generating functions, and then combinatorially, the identity

where
is the number of partitions of
into at most
parts and
is the number of fixed points of
.
*Practice Problem 6 (Putnam 2005 B6): Let $\sigma(\pi)$ denote the signature of a permutation $\pi \in S_n$ and let $\nu(\pi)$ denote its number of fixed points. Show that
$\sum_{\pi \in S_n} \frac{\sigma(\pi)}{\nu(\pi) + 1} = ( - 1)^{n + 1} \frac{n}{n + 1}.$
Hint: See if you can modify the generating function of the rencontres numbers.
Practice Problem 7: See this thread.
A question and some thoughts II
by t0rajir0u, Feb 12, 2009, 10:33 PM
Question: Does there exist a satisfying combinatorial interpretation of non-integer eigenvalues of a graph?
A little background. The adjacency matrix $\mathbf{A}$ associated to a graph $G$ has the property that its powers describe the number of walks of certain lengths between vertices of $G$, so to find closed forms for these numbers it is easiest to diagonalize $\mathbf{A}$ and work with its eigenvalues instead. The question is whether there is generally a way we can deduce these eigenvalues combinatorially without computing them.
This may seem a little backwards, so let me explain. Many nice graphs have integer eigenvalues: for example, the complete graph $K_n$ with self-loops has eigenvalues $n, 0, \ldots, 0$ and it's trivial to see why, since a walk on $K_n$ is just a list of vertices. (It's only slightly harder to figure out why the complete graph $K_n$ without self-loops has eigenvalues $n - 1, - 1, \ldots, - 1$.) I've gone on and on about what an integer eigenvalue means. What I'm interested in is whether non-integer eigenvalues can still be given combinatorial meaning.
Let me demonstrate with two problems I've been thinking of.
Problem 1: Find a combinatorial proof of Binet's formula,
$F_n = \frac{1}{\sqrt{5}} \left( \left( \frac{1 + \sqrt{5}}{2} \right)^n - \left( \frac{1 - \sqrt{5}}{2} \right)^n \right).$
This amounts to finding a combinatorial interpretation of the eigenvalues of the graph with adjacency matrix
$\mathbf{F} = \left[ \begin{array}{cc} 1 & 1 \\ 1 & 0 \end{array} \right].$
I have discussed the relationship between the Fibonacci numbers and walks on this graph (i.e. tilings with $1 \times 1$ and $1 \times 2$ dominoes) previously. The $2^n$ factor corresponds to coloring such a tiling or equivalently adding edges to the graph. The question is to explain the $\sqrt{5}$ factor.
It is likely that the answer to this question is well-known. It may not even be terribly difficult, but I have been working on it for some time without success.
Problem 2: Find a combinatorial proof of the identity
$\sum_{k \ge 0} {n \choose 3k} = \frac{2^n + 2 \cos \frac{n \pi}{3}}{3}.$
This amounts to finding a combinatorial interpretation of the eigenvalues of the graph with adjacency matrix
$\mathbf{S} = \left[ \begin{array}{ccc} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{array} \right].$
We've seen this matrix connected to this problem before. The bijection is as follows: a walk on the above graph has the property that you have two choices at each vertex, either stay or move to the next vertex. Interpret "stay at a vertex during step $i$" as "don't add $i$ to a subset" and interpret "move to the next vertex during step $i$" as "add $i$ to a subset." Then the number of closed walks from a vertex to itself is the number of walks where you have to move a number of times congruent to $0 \bmod 3$, i.e. the number of subsets with cardinality divisible by $3$, and similarly for the other two. To explain the eigenvalues here, one needs to either exhibit a counting argument that depends on the value of $n \bmod 6$ or explain what will, after simplification, turn out to be a $2 \cos \frac{n \pi}{3}$ factor. (The former might actually be quite easy; nevertheless, I have not been able to make progress on it.)
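Whatever the right combinatorial explanation, the identity itself is easy to verify numerically (a sketch of mine, using the matrix $\mathbf{S}$ above):

```python
from math import comb, cos, pi
import numpy as np

S = np.array([[1, 0, 1], [1, 1, 0], [0, 1, 1]])
M = np.identity(3, dtype=int)
for n in range(1, 13):
    M = M @ S                                                 # M = S^n
    subsets = sum(comb(n, 3 * k) for k in range(n // 3 + 1))  # |subset| divisible by 3
    closed = (2 ** n + 2 * cos(n * pi / 3)) / 3               # eigenvalue form
    assert subsets == M[0, 0] == round(closed)                # closed walks count subsets
print("identity verified for n = 1..12")
```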
The above two examples can be summarized, a little tongue-in-cheek, as follows: is there a combinatorial proof of the quadratic formula? (This would allow us to explain the role of the discriminant.) Perhaps I really mean a categorical proof, but that's taking me too far afield.
The difficulty I have in resolving either of these questions is that I cannot imagine a way in which a solution would naturally generalize. While I think there might be a nice proof in the Fibonacci case, it is difficult to generalize beyond two vertices. As for the other, it is a special property of the third roots of unity that $1 + \omega$ and $1 + \omega^2$ are still (sixth) roots of unity; the generalization that handles what I would call "a combinatorial interpretation of the roots of unity filter," on the other hand, would have to deal with eigenvalues that are not roots of unity (or multiples thereof, such as $1 + i$) and hence do not correspond (in a straightforward way) to some kind of periodic behavior. Computing the "real forms" of these identities would be even harder; one has to study quadratic subfields of cyclotomic fields and I really cannot imagine a combinatorial theory making these computations.
There is a sense in which what I've been looking for in these last two posts is a "categorified graph theory," something along the lines of what John Baez calls the groupoidification of linear algebra, but equipped to handle the notion of similarity. I definitely need more background to see if this idea makes any sense, though.
More questions:
The spectral theorem tells us that undirected graphs have real eigenvalues. Can this be deduced from purely graph-theoretic concerns, i.e. the number of walks of length $n$ as $n$ grows is a sum of components which are either strictly increasing, strictly decreasing, or have period $2$? In other words, can we prove (a special case of) the spectral theorem graph-theoretically?
Directed acyclic graphs have a natural interpretation as the Hasse diagrams of finite posets. Are two Hasse diagrams for the same poset isospectral? Do two posets with isospectral Hasse diagrams have the same width? Can we prove a weaker version of the nilpotent Jordan normal form theorem via Dilworth's theorem? (Part of this question should be settled quickly by some computations I should get around to doing; it will be very obvious if the answer to either of the first two questions is "no.")
The Fibonacci discussion above generalizes as follows: it appears that any problem asking for the number of words of length $n$ on a finite alphabet avoiding certain local patterns such as substrings (i.e. no two consecutive $1$s) can be rephrased as a question about walks on a certain graph, but it is also possible to impose global patterns (the number of $1$s is a multiple of $3$) as in the third-roots-of-unity discussion. Is there a concise description of problems of this sort that reduce to walks on graphs? (At least one type of global pattern - for example, that the number of $0$s be equal to the number of $1$s - is definitely not of this form.)
A question and some thoughts
by t0rajir0u, Jan 30, 2009, 7:01 AM
Question: Does there exist a satisfying graph-theoretic interpretation of the condition that two graphs are isospectral?
I've been thinking about this question for some time now. I don't really have the background to discuss it in detail, but briefly, the spectrum of a graph is the set of eigenvalues of its adjacency matrix; the powers of those eigenvalues then describe the powers of the adjacency matrix, hence the distribution of paths of length $n$ on the graph. Intuitively, I think of the spectrum as describing "flows" on a graph.
Two graphs that are isomorphic are also isospectral, but the converse does not hold. This is not surprising: two graphs are isomorphic if and only if their adjacency matrices are similar up to permutation matrices, while two graphs are isospectral if and only if their adjacency matrices are similar up to any invertible matrix in $GL_n(\mathbb{C})$. Needless to say, this is a much weaker condition, and it's hard to come up with a convincing description of it in graph-theoretic terms.
I have a vague idea that some intermediate notion of isomorphism is possible first by considering weighted graphs and then by finding some way of formalizing what it means to conjugate an adjacency matrix by something other than a (weighted) permutation matrix. But I don't really know what's going on here.
Anyway, the motivation for this question is the following characterization of a class of nilpotent matrices:
Observation. A matrix $\mathbf{A}$ with non-negative real coefficients is nilpotent if and only if it is the weighted adjacency matrix of a directed acyclic graph.
I like this because it is highly intuitive: there are no paths of length greater than $n - 1$ on a directed acyclic graph on $n$ vertices, or else they wouldn't be acyclic. A nicely behaved class of directed acyclic graphs is the set of directed trees with edges pointing away from or towards the root. This observation is the basis of the following
Observation. The nilpotent Jordan normal form theorem is (almost) equivalent to the statement that every (weighted) directed acyclic graph is isospectral to a directed [url=http://en.wikipedia.org/wiki/Path_graph]path graph[/url].
Again, I like this because it is highly intuitive. A path is the simplest example of a directed acyclic graph, and to suggest that they are representative of the set of directed acyclic graphs in a particular way is a statement that I think is manageable by other means; I also think such statements should generalize to representative statements about other classes, and it would be very interesting to see the extent to which such statements translated into pure linear algebra.
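To illustrate the first observation numerically (my own sketch, with an arbitrary random seed): writing a DAG's vertices in topological order makes its weighted adjacency matrix strictly upper triangular, and the $n$th power of such a matrix vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
# weighted adjacency matrix of a DAG: edges only go "forward" in a
# topological order, so the matrix is strictly upper triangular
A = np.triu(rng.random((n, n)), k=1)

P = np.linalg.matrix_power(A, n)
print(np.allclose(P, 0))  # True: a DAG on n vertices has no walks of length n
```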
Along those lines, I have become very interested lately with the idea that linear algebra is really just combinatorics, but I don't think I have the background to understand what Baez really means when he says that. Nevertheless, it is interesting to think about, and is one of the motivations for the above question.
Coefficients in the closed forms of linear recurrences
by t0rajir0u, Jan 12, 2009, 10:39 PM
Terence Tao recently wrote a post that describes, among other things, how to use test cases to quickly recover parts of formulas. Exercise 1, in particular, should remind you of a recent post here where the same trick was used to tease out the coefficients of the DFT matrix of order $n$, but using it to derive Lagrange interpolation is something I had missed when writing up the Lagrange interpolation post, and it reminded me of another context in which I'd seen the same trick.
But first, to clarify: let $P(x)$ be a polynomial of degree $n - 1$. Suppose we are given that $P(x_i) = y_i$ for $1 \le i \le n$ where the $x_i$ are distinct (so the interpolation problem is well-posed). Write
$\frac{P(x)}{(x - x_1)(x - x_2) \cdots (x - x_n)} = \sum_{i = 1}^{n} \frac{c_i}{x - x_i}.$
What are the coefficients $c_i$? The test case $x \to x_i$ for a particular $i$ tells us that the residue at $x_i$ on both sides must be equal; equivalently, multiplying by $x - x_i$ on both sides gives
$\frac{P(x)}{\prod_{j \neq i} (x - x_j)} = c_i + (x - x_i) \sum_{j \neq i} \frac{c_j}{x - x_j}$
and substituting $x = x_i$ gives
$c_i = \frac{y_i}{\prod_{j \neq i} (x_i - x_j)},$
precisely the same Lagrange interpolation coefficient we can find by other methods. The actual interpolation is given by multiplying back through by $(x - x_1) \cdots (x - x_n)$:
$P(x) = \sum_{i = 1}^{n} y_i \prod_{j \neq i} \frac{x - x_j}{x_i - x_j}$
as usual. The DFT problem is recovered by considering the interpolation question of what polynomial $P(x)$ of degree at most $n - 1$ satisfies
$P(\zeta^i) = s_i, \quad 0 \le i \le n - 1,$
where $\zeta$ is a primitive $n$th root of unity - which is, of course, the inverse DFT problem - and consequently finding the partial fraction decomposition of $\frac{P(x)}{x^n - 1}$.
Note the similarity between this and the way the DFT problem was posed earlier in terms of finding the "closed form" of a sequence satisfying $s_{i + n} = s_i$. In my previous discussions of the general issue of understanding homogeneous linear recurrences, I've neglected the exact coefficients of the various terms in the closed form in favor of a clear understanding of the bases of the exponentials that appear in those terms. It's been enough to know that the coefficients are a linear function of the initial values. There are certain special cases, however, where the value of the coefficient becomes important besides the DFT case.
Problem: A six-sided die is rolled repeatedly and its values tallied up. Let $p_n$ denote the probability that at some point the sum of the values of the rolls is $n$. Compute $\lim_{n \to \infty} p_n$.
Setup. Before I get into the computation, let's take a moment to think about this problem from an elementary perspective. As $n$ gets large, the partial sums increase by $3.5$ on average (the expected value of a dice roll) and it's intuitive (although not immediately obvious) that the probability $p_n$ becomes uniform (I should say that the limit exists); in fact, the limit should clearly be $\frac{2}{7}$. Generally, given an arbitrary die we should expect the limit to be one over the expected value.
Can we be more rigorous? A moment's thought reveals that the probability $p_n$ satisfies a linear recurrence: the partial sum can only be $n$ if it was one of $n - 1, n - 2, \ldots, n - 6$ before, so we have the straightforward recurrence
$p_n = \frac{p_{n - 1} + p_{n - 2} + \cdots + p_{n - 6}}{6}.$
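Running the recurrence is a few lines of Python (my sketch; $p_0 = 1$ since the empty sum is $0$), and the convergence to $\frac{2}{7} \approx 0.2857$ is visible almost immediately:

```python
p = {0: 1.0}  # the empty sum is 0 with probability 1
for n in range(1, 31):
    p[n] = sum(p.get(n - i, 0.0) for i in range(1, 7)) / 6
print([round(p[n], 4) for n in (1, 2, 3, 10, 20, 30)])  # tends to 0.2857...
```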
The characteristic polynomial of this recurrence is
$P(\lambda) = \lambda^6 - \frac{\lambda^5 + \lambda^4 + \lambda^3 + \lambda^2 + \lambda + 1}{6}.$
Here we run into a problem: we don't know the roots of this polynomial (other than the obvious one, $\lambda = 1$). The root $\lambda = 1$ should control the limiting behavior of the sequence, so we want to show a few things:
1) All the other roots have absolute value strictly less than $1$. This (almost) ensures that the limit exists.
2) The root $\lambda = 1$ occurs with multiplicity $1$. This ensures that the limit exists. (If you're not sure why any of this is true, review the previous post on this subject.)
3) We can compute the coefficient of $1^n$ in the closed form of $p_n$. This tells us what the limit is.
At this point, there are two related approaches we can take.
Solution 1. 1) can be reworded as follows: the companion matrix $\mathbf{M}$ of $P(\lambda)$ (which acts on the column vector with components $p_n, p_{n + 1}, \ldots, p_{n + 5}$) has eigenvalue $1$ with multiplicity $1$ and all other eigenvalues have absolute value less than $1$. This sort of statement generalizes significantly; the Perron-Frobenius theorem states that this is true of any matrix with strictly positive entries (that is, there exists a strictly maximal eigenvalue). Of course, $\mathbf{M}$ isn't of this form, but it's enough to show that some power of it is.
Regard $\mathbf{M}$ as the adjacency matrix of a weighted, directed graph on $6$ vertices. We begin with a directed path $v_1 \to v_2 \to \cdots \to v_6$ (where each vertex $v_i$ points to the next vertex $v_{i + 1}$) and in addition specify weighted edges from $v_6$ to every vertex, each of weight $\frac{1}{6}$. The powers $\mathbf{M}^n$ describe the number of paths of length $n$ between vertices.
These paths are easy to exhibit explicitly: the only paths between two vertices $v_i, v_j$ occur when we either travel directly between the two (when $i < j$) or when we travel to $v_6$, stay there for awhile, and travel back. In other words, for $n \ge 6$ there exist paths between any two vertices of length exactly $n$. (It is instructive to count exactly how many paths there are as a polynomial in $n$; the actual entry in the corresponding power of $\mathbf{M}$ is weighted by the number of times the path stays at $v_6$.) This lets us apply the Perron-Frobenius theorem to $\mathbf{M}^6$ and then we have our result.
$\mathbf{M}$ has a special property that admits an additional interpretation: it is the transition matrix of a finite Markov chain, and the vertices of the above graph describe its state space. That is, our Markov process consists of $6$ states where the first five states transition to the next and the sixth state transitions, with uniform probability, to any of the six states. The Perron-Frobenius theorem applied to a stochastic matrix tells us that $1$ is a strictly maximal eigenvalue, which is exactly the nice result we'd like for a Markov process: it tells us that there exists a stationary distribution.
We are done here if we can compute the stationary distribution. The Perron-Frobenius theorem gets us 2) and, in fact, also gives us 3): it says that there is a unique row eigenvector $\mathbf{v}$ with entries summing to $1$ such that
$\mathbf{v} \mathbf{M} = \mathbf{v}$
and $\mathbf{v}$ is the row eigenvector associated with the eigenvalue $1$. At this point it's only a quick calculation to verify that this eigenvector is
$\mathbf{v} = \frac{1}{21} \left[ \begin{array}{cccccc} 1 & 2 & 3 & 4 & 5 & 6 \end{array} \right]$
and now we only have to compute the initial values. Considering how much work I will do to avoid this calculation, it is not in fact a difficult one, but instead I will motivate a different calculation that will get us the answer without needing to compute the initial values. The important result to keep in mind is that we did not have to compute the other five eigenvectors of $\mathbf{M}$ (or the corresponding eigenvalues).
Solution 2. In the interest of pursuing the original point and avoiding the use of the Perron-Frobenius theorem, the proof of 1) is fairly straightforward. Suppose $\lambda$ is a root of $P(\lambda)$. If $|\lambda| \ge 1$, then
$|\lambda|^6 = \frac{|\lambda^5 + \lambda^4 + \lambda^3 + \lambda^2 + \lambda + 1|}{6} \le \frac{|\lambda|^5 + |\lambda|^4 + |\lambda|^3 + |\lambda|^2 + |\lambda| + 1}{6} \le |\lambda|^5$
with equality if and only if $1, \lambda, \ldots, \lambda^5$ all point in the same direction (the first inequality is the triangle inequality), which is true if and only if $\lambda = 1$. Alternately, $\lambda^6$ is a convex combination of points that lie on the unit circle, which itself lies on the unit circle if and only if it is one of the vertices of the convex hull of $1, \lambda, \ldots, \lambda^5$ (which is just the points themselves). This implies that $\lambda$ must be a root of unity of order at most $6$, and then one can verify directly that only $\lambda = 1$ works. 2) can be proven by a similar argument applied to $P'(\lambda)$ or by dividing by $\lambda - 1$ and substituting in $\lambda = 1$. How should we calculate 3)?
Considering I built up this discussion by talking about partial fraction decompositions, let's take the generating functions approach and try to figure out what $\sum_{n \ge 0} p_n x^n$ should look like. The generating function that describes the probability distribution of the sums of $k$ independent dice rolls is just
$\left( \frac{x + x^2 + x^3 + x^4 + x^5 + x^6}{6} \right)^k$
and since the number of independent dice rolls takes values over all $k \ge 0$, the generating function we want is
$\sum_{n \ge 0} p_n x^n = \sum_{k \ge 0} \left( \frac{x + x^2 + x^3 + x^4 + x^5 + x^6}{6} \right)^k = \frac{1}{1 - \frac{x + x^2 + x^3 + x^4 + x^5 + x^6}{6}}.$
It's worth pointing out that this generating function is easy to calculate even if the initial conditions aren't because of the way the problem is set up. Its denominator has roots which are the reciprocals of those of $P(\lambda)$ as expected. Now, the important observation: we don't have to compute the complete partial fraction decomposition to isolate the coefficient of $\frac{1}{1 - x}$! All we need to do is to factor $1 - x$ out of the denominator, like so:
$1 - \frac{x + x^2 + x^3 + x^4 + x^5 + x^6}{6} = (1 - x) \cdot \frac{6 + 5x + 4x^2 + 3x^3 + 2x^4 + x^5}{6}$
and then we can compute the partial fraction decomposition using these factors alone. To be even lazier, the residue at $x = 1$ can be evaluated using the same trick as we used before: multiplying by $1 - x$, this is just
$\lim_{x \to 1} \frac{6}{6 + 5x + 4x^2 + 3x^3 + 2x^4 + x^5} = \frac{6}{21} = \frac{2}{7}$
exactly as we expected. Note that the computation we are performing here is, by l'Hopital's rule, the evaluation of the derivative of the denominator of $\frac{1}{1 - \frac{x + \cdots + x^6}{6}}$ at $x = 1$, which is just the expected value of a single dice roll. This observation, as well as the rest of the argument, generalizes (see the Practice Problems).
The main result to take away from this discussion is the following
Proposition: Let $s_n$ be a sequence satisfying a homogeneous linear recurrence with characteristic polynomial $P(\lambda)$ of degree $d$, and suppose that $r$ is a known root of $P$. Then the coefficient of $r^n$ (which is in general a polynomial in $n$) in the closed form of $s_n$ is a rational function of the initial conditions and $r$ (and in particular does not depend on the nature of the rest of the roots of $P$).
I mention this largely because it violated my intuition: I had previously thought that it wasn't possible to find the coefficients of the closed form of a given recurrence without computing every root of the characteristic polynomial.
Despite the manner in which we first arrived at this statement, it can be proven in a very straightforward manner. We'll suppose for brevity that $r$ has multiplicity $1$. Write $P(\lambda) = (\lambda - r) Q(\lambda)$ where $Q(\lambda) = \lambda^{d - 1} + q_{d - 2} \lambda^{d - 2} + \cdots + q_0$, and write $s_n = c r^n + t_n$. Then $t_n$ satisfies the (shorter) recurrence with characteristic polynomial $Q(\lambda)$ rather than $P(\lambda)$. Given, therefore, the initial conditions
$s_0 = c + t_0$
$s_1 = c r + t_1$
...
$s_{d - 1} = c r^{d - 1} + t_{d - 1},$
along with the expression of $t_{d - 1}$ in terms of $t_0, \ldots, t_{d - 2}$, provides a system of $d + 1$ equations in $d + 1$ variables which we can solve. In particular, $Q$ has coefficients $q_0, \ldots, q_{d - 2}$ such that
$t_{d - 1} + q_{d - 2} t_{d - 2} + \cdots + q_0 t_0 = 0,$
hence such that (adding appropriate multiples of the equations above)
$s_{d - 1} + q_{d - 2} s_{d - 2} + \cdots + q_0 s_0 = c \left( r^{d - 1} + q_{d - 2} r^{d - 2} + \cdots + q_0 \right)$
$c = \frac{s_{d - 1} + q_{d - 2} s_{d - 2} + \cdots + q_0 s_0}{Q(r)}.$
The numerator can be understood as "quotienting out" the initial values by the roots of $Q$ whereas the denominator can be understood either as a Lagrange interpolation coefficient or as $P'(r)$.
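To see the proposition in action (a sketch of mine, not from the post), take the Fibonacci recurrence: with $P(\lambda) = \lambda^2 - \lambda - 1$ and known root $r = \phi$, we have $Q(\lambda) = \lambda + q_0$ with $q_0 = \phi - 1$, and the formula produces the coefficient $\frac{1}{\sqrt{5}}$ of $\phi^n$ without ever touching the second root:

```python
phi = (1 + 5 ** 0.5) / 2          # known root r of P(x) = x^2 - x - 1
q0 = phi - 1                      # Q(x) = x + q0, since P(x) = (x - r) Q(x)
s0, s1 = 0, 1                     # Fibonacci initial values

c = (s1 + q0 * s0) / (phi + q0)   # = (s_1 + q_0 s_0) / Q(r); here Q(r) = P'(r)
print(c, 1 / 5 ** 0.5)            # both are 1/sqrt(5)

# sanity check: rounding c * phi^n reproduces F_n, the decaying term being small
for n in range(2, 10):
    print(n, round(c * phi ** n))  # 1, 2, 3, 5, 8, 13, 21, 34
```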
Clearly there is an interesting connection between the Lagrange interpolation problem and the problem of figuring out the coefficients of the closed form of a linear recurrence. Well, perhaps it's not too interesting. Approaching the problem "forwardly" (that is, using the most basic tools first), let's set up a system. If the roots of $P$ are given by $r_1, r_2, \ldots, r_n$ (forgive me for switching around notation all the time), then the problem of determining the coefficients in
$s_m = c_1 r_1^m + c_2 r_2^m + \cdots + c_n r_n^m$
(where again I have assumed that all roots occur with multiplicity $1$) is simply the problem of solving the system
$\left[ \begin{array}{cccc} 1 & 1 & \cdots & 1 \\ r_1 & r_2 & \cdots & r_n \\ r_1^2 & r_2^2 & \cdots & r_n^2 \\ \vdots & \vdots & \ddots & \vdots \\ r_1^{n - 1} & r_2^{n - 1} & \cdots & r_n^{n - 1} \end{array} \right] \left[ \begin{array}{c} c_1 \\ c_2 \\ c_3 \\ \vdots \\ c_n \end{array} \right] = \left[ \begin{array}{c} s_0 \\ s_1 \\ s_2 \\ \vdots \\ s_{n - 1} \end{array} \right]$
and what is the matrix that should appear but the transpose of the Vandermonde matrix of the roots! We have already seen that the finding-coefficients problem and the Lagrange interpolation problem are in some sense adjoint to each other in the special case of the DFT problem; how should we generalize?
One interpretation uses the partial fractions approach. In the Lagrange interpolation problem we are given the values of a polynomial $P(x)$ at some points; these values determine the coefficients of the partial fraction decomposition of a certain rational function with denominator $(x - x_1) \cdots (x - x_n)$, from which the coefficients of $P(x)$ can be computed. In the linear recurrence problem we are given the initial values of a recurrence $s_0, s_1, \ldots, s_{n - 1}$; these values determine the coefficients of the numerator of a certain rational function, from which the coefficients of its partial fraction decomposition can be computed.
Practice Problem 1: Generalize the conclusion of the solution to Problem 1. That is, given a die with probability distribution $p_1, p_2, \ldots, p_k$ (where $p_i$ describes the probability of rolling $i$) such that $p_i \ge 0$ and $p_1 + p_2 + \cdots + p_k = 1$, what is the limiting behavior of the probability that partial sums of repeated rolls hit a value of $n$? (In particular, when does this sequence converge? The answer is not "always.") This is equivalent to the problem of finding the outcome of a sequence of repeated weighted averages
$s_n = p_1 s_{n - 1} + p_2 s_{n - 2} + \cdots + p_k s_{n - k}$
when the number of faces of the die is finite. (The first thing you should do is generalize the verification of condition 1) and examine the cases in which it fails.)
Practice Problem 2: Interpret the $n \times n$ Jordan block with eigenvalue $\lambda$ as an adjacency matrix and compute its powers by a counting argument. How does this prove 1) the binomial theorem and 2) the balls-and-urns formula? Compare with the ordinary generating functions proof.
Practice Problem 3: In keeping with our observation that for special cases the ordinary generating function is easier to compute than the initial conditions, verify that the identity
$\sum_{n \ge 0} (r_1^n + r_2^n + \cdots + r_k^n) x^n = \sum_{i = 1}^{k} \frac{1}{1 - r_i x}$
is equivalent to Newton's sums.
(Can you find a proof of Newton's sums by computing the trace of the $n$th power of the companion matrix of a polynomial in a clever way? I can't.)
Practice Problem 4: One way to state the fundamental theorem of symmetric polynomials is that any function of the roots of an irreducible polynomial $P$ invariant under any permutation of the roots is a function of its coefficients. The discriminant is an example of such a function. Prove this fact by taking tensor powers of the companion matrix of $P$. (I don't actually know if this works.)
The pretty picture post (crystals and cyclotomics)
by t0rajir0u, Dec 25, 2008, 7:00 AM
Today I'd like to talk about tessellations of the plane. With regard to regular polygons, it is not hard to see that only triangles, squares, and hexagons work, and of course one may extend this to any shearing or stretching of those shapes (parallelograms). What happens if we don't restrict ourselves to regular polygons? Still, if we confine ourselves to a single repeating unit, we find that the symmetries obeyed by the resulting pattern continue to be 2-fold, 3-fold, 4-fold, or 6-fold.
[Image: an Escher tessellation]
For example, the above Escher drawing exhibits 3-fold symmetry. This simple observation turns out to be fundamental in describing the structure of crystals, Nature's very own tessellations of space.
[Image: the diamond crystal structure]
The above depicts the crystal structure of diamond, which is cubic; specifically, it's known among crystallographers as the face-centered cubic Bravais lattice. Up to shearing and stretching, the crystal systems on which Bravais lattices are based are either cubic or hexagonal. Why do no other crystal systems (and hence no other Bravais lattices) appear? Why, for example, are there no icosahedral lattices?
To investigate this question, we'll make the notion of a lattice more precise as follows and end up proving the result known as the crystallographic restriction theorem. Define a lattice in $\mathbb{R}^n$ to be a discrete subgroup of $\mathbb{R}^n$ that spans it as a vector space. More concretely, we can describe a lattice in $\mathbb{R}^n$ by specifying $n$ linearly independent vectors $\mathbf{v}_1, \dots, \mathbf{v}_n$ and considering the set

$$L = \{ a_1 \mathbf{v}_1 + a_2 \mathbf{v}_2 + \cdots + a_n \mathbf{v}_n : a_i \in \mathbb{Z} \}.$$
For example, the set of points with integer coordinates forms a lattice, and is in some sense the only lattice (but not for our purposes). We relate lattices to tessellations as follows: in any tessellation by a single repeating unit, we can identify a point in that repeating unit. The set of all such points in the entire tessellation should form a lattice, because we want tessellations to have translational symmetry in $n$ independent directions. Now that we have mathematically formulated the notion of a lattice, what can we say about its symmetries?
By "symmetries" we mean orientation-preserving isometries that map every point in the lattice to another point in the lattice; this is intuitive. The isometries of
are always linear, so we are talking about a subgroup of the Euclidean group that we can exhibit as a m atrix group. A lattice is said to have
-fold symmetry there exists an orientation-preserving isometry
(in two and three dimensions, rotations about a point and an axis, respectively) such that
, the identity map. We now wish to show that the only possible orders in dimensions
and
occur when
.
Proof. Written in the basis $\mathbf{v}_1, \dots, \mathbf{v}_n$, every symmetry of the lattice must take each basis vector to an integer combination of the basis vectors; hence any such symmetry must have integer coordinates in this basis, in other words, must be an element of $GL_n(\mathbb{Z})$.
In dimensions $2$ and $3$, the minimal polynomial of such a matrix has degree at most $3$, has integer coefficients, and must divide the polynomial $x^k - 1$. On the other hand, the monic irreducible factors of $x^k - 1$ are precisely the cyclotomic polynomials. The cyclotomic polynomials of degree at most $3$ (in fact, exactly $1$ or $2$) are precisely $\Phi_1, \Phi_2, \Phi_3, \Phi_4, \Phi_6$. Hence the only nontrivial finite orders an element of $GL_2(\mathbb{Z})$ or $GL_3(\mathbb{Z})$ can have are $2, 3, 4, 6$, as desired.
The beauty of the above approach is that it is absolutely general. For example, the possible $k$-fold symmetries of a lattice in $n$ dimensions can occur only when $\varphi(k) \le n$.
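As a quick illustration (a sketch of mine, not from the original post), we can list the admissible orders in low dimensions by filtering on Euler's totient:

```python
# Sketch: order-k symmetry of an n-dimensional lattice requires the degree-
# phi(k) cyclotomic polynomial to fit inside an n x n integer matrix, so we
# can enumerate the admissible k directly.
from sympy import totient

print([k for k in range(1, 50) if totient(k) <= 2])  # dimensions 2, 3: [1, 2, 3, 4, 6]
print([k for k in range(1, 50) if totient(k) <= 4])  # dimension 4 adds 5, 8, 10, 12
```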
Corollary: Let $m, n$ be positive integers. If $\cos \frac{2 \pi m}{n}$ is rational, then it can only be $0, \pm \frac{1}{2}, \pm 1$. (See also MellowMelon's demonstration using Chebyshev polynomials.) This is not, strictly speaking, a corollary; one needs the additional observation that the irreducible factors of $x^n - 1$ are still the cyclotomic polynomials over $\mathbb{Q}$ as well as over $\mathbb{Z}$, that is, Gauss's lemma.
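A brute-force check of the corollary is easy with sympy (my sketch; exact trigonometric values are auto-simplified where they are rational):

```python
# Sketch: scan cos(2*pi*m/n) for small n and collect the values sympy can
# certify as rational.
from sympy import cos, pi

rational_values = set()
for n in range(1, 25):
    for m in range(n):
        c = cos(2 * pi * m / n)
        if c.is_rational:  # None (undecided) and False both fall through
            rational_values.add(c)
print(sorted(rational_values))  # [-1, -1/2, 0, 1/2, 1]
```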
There are a few interesting questions to ponder from here. One is purely mathematical: what interest are lattices to a mathematician? I hope I have already convinced you of the power of lattice methods in number theory. Indeed, you may have spotted that the lattices with the symmetries described above could be said to include the Gaussian and Eisenstein integers, which have shown up more than once in this blog.

In Lie theory, lattices play the role of root systems, a method by which Lie groups and Lie algebras are analyzed and classified. Certain extremely symmetrical lattices correspond to what are called the exceptional Lie groups, and these extremely symmetrical objects occur in physical theories; for example, $E_8$ has some tantalizing connections to theoretical physics.
Lattices in the complex plane also occur in the theory of elliptic curves via a type of doubly periodic function called an elliptic function. Elliptic functions parameterize elliptic curves over $\mathbb{C}$ and have periods which are two complex numbers whose ratio is not real. Since a doubly periodic function has the same value if its domain is taken "modulo" the lattice $\Lambda$ spanned by its periods, we can identify the range of the elliptic function - that is, the curve itself - with a fundamental parallelogram of the lattice with its sides identified - which is a torus!
[Image: a fundamental parallelogram with opposite sides identified, forming a torus]
And now you know why all those specials on Fermat's Last Theorem tell you that elliptic curves are doughnuts.
The other question we could ask is scientific: we assumed that crystals are described by lattices with $3$ independent translational symmetries. Do there exist crystals that do not have this property? Scientists did not discover physical evidence of such quasicrystals until the 1980s, but
- the mathematical community was aware that these structures could, in principle, exist decades earlier, and
- Islamic architects were already using aperiodic tilings centuries earlier!

Some aperiodic tilings occur with $5$-fold, $8$-fold, or $10$-fold symmetry, and that's because they are projections down from higher-dimensional lattices along non-lattice planes, which explains both the large amount of structure and the lack of translational symmetry.
This scientific point brings up another mathematical question. Crystals tend to organize themselves into lattices because that is the structure that maximizes interaction among the atoms and therefore minimizes energy and excess volume. In other words, the relationship between crystals and lattice structures is related to the circle-packing problem. Is it always the case, then, that the densest packing in $n$ dimensions is a lattice packing?
This seems a silly question to ask at first. Intuitively, any irregularity in a packing correlates to some excess that could be removed by increasing the regularity of the packing. However, even the statement for dimension $3$ was only proven very recently, and this question is actually open in dimensions $4$ and higher.
Practice Problem 1: Prove the general form of Minkowski's theorem. (Try to find a proof other than the one given in the Wikipedia article! In two dimensions, there is a proof using Pick's theorem.)
Practice Problem 2: Let
and
. Show that
for all
. Show, on the other hand, that
.
Practice Problem 3: The generic example of a potential function gives the energy of two spherical charges with charges $q_1, q_2$ at a distance $r$ from each other as $\frac{q_1 q_2}{r}$ (in suitable units). Compute the average potential energy per charge of an infinite lattice of alternately charged circles of radius $1$ and charges $\pm 1$ in the plane (with circles centered at lattice points with even coordinates). This is essentially a Madelung constant. (This sum obviously doesn't converge absolutely, so be careful. Also, I haven't done this problem and I'm not sure the answer comes out nicely at all.)
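I haven't worked the circles version either, but here is a sketch of mine, using point charges instead of circles, of how one might sum the 2D alternating lattice energy under one conditional-convergence-friendly grouping:

```python
# Exploratory sketch (not from the post): point charges (-1)^(i+j) at integer
# points (i, j); the energy per charge is the sum of (-1)^(i+j) / |r| over all
# other sites. The sum converges only conditionally, so we fix a grouping:
# expanding squares.
from math import hypot

def square_partial_sum(R):
    total = 0.0
    for i in range(-R, R + 1):
        for j in range(-R, R + 1):
            if (i, j) != (0, 0):
                total += (-1) ** ((i + j) % 2) / hypot(i, j)
    return total

for R in (10, 50, 200):
    print(R, square_partial_sum(R))  # drifts toward roughly -1.6155, a 2D Madelung-type constant
```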
Fun with Newton polynomials, Part III
by t0rajir0u, Dec 1, 2008, 11:20 PM
This post isn't really about Newton polynomials.
First, we will need the following
Proposition: Let $\{a_n\}$ and $\{b_n\}$ be two sequences with exponential generating functions $A(x) = \sum_{n \ge 0} a_n \frac{x^n}{n!}$ and $B(x) = \sum_{n \ge 0} b_n \frac{x^n}{n!}$, and suppose that $C(x) = A(x) B(x) = \sum_{n \ge 0} c_n \frac{x^n}{n!}$. Then

$$c_n = \sum_{k=0}^{n} \binom{n}{k} a_k b_{n-k}.$$
Exponential generating functions are used to study sequences like the Bell numbers whose growth rate makes their ordinary generating functions bad (for example, they might converge nowhere). The multiplication of exponential generating functions, as opposed to ordinary generating functions, occurs in situations where order counts. Certainly there's much more to be said about this point.
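Here is a quick computational check of the proposition, a sketch of mine using the hypothetical sequences $a_n = 2^n$ and $b_n = 1$:

```python
# Sketch: the EGFs of a_n = 2^n and b_n = 1 are e^{2x} and e^x; their product
# e^{3x} should have coefficients c_n = sum_k C(n, k) 2^k = 3^n, the binomial
# convolution of the two sequences.
from math import comb, factorial
from sympy import symbols, exp, series

x = symbols('x')
C = series(exp(2 * x) * exp(x), x, 0, 6).removeO()
for n in range(6):
    c_n = C.coeff(x, n) * factorial(n)
    assert c_n == sum(comb(n, k) * 2 ** k for k in range(n + 1)) == 3 ** n
```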
But the interesting thing here is the correspondence between the convolution above and an operation I referred to as a "Newton transform" (this is not standard terminology) that assigns to a sequence $\{a_k\}$ the function

$$f(n) = \sum_{k \ge 0} \binom{n}{k} a_k.$$

If we shift our focus of attention instead to the sequence $f(0), f(1), f(2), \dots$, we can find in it a combinatorial interpretation: if $a_k$ is the number of ways to arrange $k$ objects in some way such that order matters, $f(n)$ is the number of ways to arrange some objects from among $n$ distinguishable objects in that way. It is now obvious that

$$\sum_{n \ge 0} f(n) \frac{x^n}{n!} = e^x A(x).$$
This is powerful! If $A(x)$ is a polynomial, it consists of a finite number of terms whose coefficients specify the coefficients, in Newton form, of a polynomial whose positive integer values are the coefficients of the LHS! It appears that multiplication by $e^x$ alone is enough for us to automatically compute values of a polynomial by specifying only its Newton coefficients. You may also recall a combinatorial identity whose (non-combinatorial!) proof is now trivial: letting $a_k = k$ gives $A(x) = x e^x$, hence $e^x A(x) = x e^{2x}$ and

$$\sum_{k=0}^{n} \binom{n}{k} k = n \, 2^{n-1}$$

as before. Also note that the combinatorial interpretation of $f(n)$ I gave above is consistent with the interpretation I gave in that post.
Earlier I noted another special case, when $\{a_k\}$ was a geometric series $a_k = q^k$, which gives $A(x) = e^{qx}$ and

$$\sum_{k=0}^{n} \binom{n}{k} q^k = (1 + q)^n$$

exactly as shown earlier. Now let's talk about the roots of unity filter again: this corresponds to the case where $b_k = 1$ when $m \mid k$ and $b_k = 0$ otherwise.
For example, if $m = 3$, the corresponding generating function is $B(x) = \sum_{3 \mid k} \frac{x^k}{k!}$. The product $A(x) B(x)$ will then tell us how to evaluate the sum $\sum_{3 \mid k} \binom{n}{k} a_{n-k}$.
By inspection, $B(x)$ is a solution to the differential equation $y''' = y$, so it is a linear combination of functions of the form $e^{\zeta x}$ where $\zeta$ ranges over the cube roots of unity and the coefficients are determined by the initial conditions $B(0) = 1$, $B'(0) = B''(0) = 0$. This is all material we have more or less encountered already. An application of the DFT gives us

$$B(x) = \frac{e^x + e^{\omega x} + e^{\omega^2 x}}{3}, \qquad \omega = e^{2 \pi i / 3}.$$
When $a_n = 1$ for all $n$ (so that $A(x) = e^x$), this gives

$$\sum_{3 \mid k} \binom{n}{k} = \frac{2^n + (1 + \omega)^n + (1 + \omega^2)^n}{3}$$

as before. Now, these generating functions are a little unwieldy (complex numbers as an argument in an exponent?), so we might like to see what ordinary generating functions have to say about all of this. But what happens to the exponential convolution? We could use the properties of the Laplace transform, but let's work backwards instead: we want to find a relationship between the generating functions

$$F(x) = \sum_{n \ge 0} f(n) x^n$$

and

$$G(x) = \sum_{k \ge 0} a_k x^k.$$
Exchanging the order of summation in $F(x) = \sum_{n \ge 0} \left( \sum_k \binom{n}{k} a_k \right) x^n$, we obtain

$$F(x) = \sum_{k \ge 0} a_k \sum_{n \ge k} \binom{n}{k} x^n.$$

We've seen a generating function very much like this one before; it gives

$$\sum_{n \ge k} \binom{n}{k} x^n = \frac{x^k}{(1 - x)^{k + 1}}$$

and therefore $F(x) = \frac{1}{1 - x} G\!\left( \frac{x}{1 - x} \right)$.
This is either very nice or very ugly depending on your point of view; in any case, it agrees with the answer you get using the properties of the Laplace transform (essentially the shift rule $\mathcal{L}\{ e^t g(t) \}(s) = \mathcal{L}\{ g \}(s - 1)$). We're interested in the function $f(n) = \sum_{3 \mid k} \binom{n}{k}$, the transform of the indicator sequence of multiples of $3$, so $G(x) = \frac{1}{1 - x^3}$, which gives

$$F(x) = \frac{1}{1 - x} \cdot \frac{1}{1 - \left( \frac{x}{1 - x} \right)^3} = \frac{(1 - x)^2}{(1 - x)^3 - x^3}.$$

The factorization of the denominator into linear terms of the form $1 - (1 + \omega^j) x$ gives the roots $2, 1 + \omega, 1 + \omega^2$ as before, which is very nice, but to find the coefficients we found before requires a partial fraction decomposition, either of $F(x)$ or of $G(x)$. We can compute the partial fraction decomposition of $G(x)$ using the DFT as before - which is already a nice result - but we can be tricky: observe that if we write

$$F(x) = \frac{P(x)}{Q(x)} = \sum_j \frac{\rho_j}{x - x_j}$$

(where the $x_j$ are the roots of $Q$), then we can compute the residue $\rho_j$ by computing

$$\rho_j = \lim_{x \to x_j} (x - x_j) \frac{P(x)}{Q(x)}.$$

l'Hôpital's rule then gives

$$\rho_j = \frac{P(x_j)}{Q'(x_j)},$$

which is in exact agreement with our previous answer, although we need to rewrite the partial fraction decomposition as

$$F(x) = \sum_j \frac{\rho_j}{x - x_j} = \sum_j \frac{- \rho_j / x_j}{1 - x / x_j}$$

to see the agreement.
to see the agreement. This can be understood as an alternate proof that the DFT works; the proof sketched in a previous post was about orthogonality and can be understood in the context of Parseval's theorem.
So it seems that both ordinary and exponential generating functions have something to bring to the table. Exponential generating functions reveal that a convolution with $e^x$ is what makes the "Newton transform" work (see the binomial transform), but there were difficulties - we could not explicitly write down $B(x) = \sum_{3 \mid k} \frac{x^k}{k!}$ in terms of a well-known function. The convolution became a Möbius transformation of the ordinary generating function, but there we had partial fractions and analysis available.
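Both routes are easy to sanity-check numerically; here is a sketch of mine (using the reconstructed names $F$, $G$, and $\omega$ from above):

```python
# Sketch: check the DFT closed form for the filtered binomial sum and the OGF
# substitution F(x) = 1/(1-x) * G(x/(1-x)) with G(x) = 1/(1 - x^3).
from math import comb
from sympy import symbols, series

# DFT closed form: sum over 3 | k of C(n, k) = (2^n + (1+w)^n + (1+w^2)^n)/3
w = complex(-0.5, 3 ** 0.5 / 2)  # primitive cube root of unity
for n in range(10):
    direct = sum(comb(n, k) for k in range(0, n + 1, 3))
    closed = (2 ** n + (1 + w) ** n + (1 + w.conjugate()) ** n) / 3
    assert abs(direct - closed.real) < 1e-9

# OGF route: the coefficients of (1-x)^2 / ((1-x)^3 - x^3) are the same sums.
x = symbols('x')
F = series((1 - x) ** 2 / ((1 - x) ** 3 - x ** 3), x, 0, 10).removeO()
assert all(F.coeff(x, n) == sum(comb(n, k) for k in range(0, n + 1, 3))
           for n in range(10))
```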
Practice Problem 1: Repeat the above computation using the formula
.
Practice Problem 2: Show that $\mathbb{Z}^+$ cannot be partitioned into a finite number, greater than one, of disjoint arithmetic sequences with each common difference distinct.
Practice Problem 3: The Bell numbers $B_n$ count the number of partitions of a set with $n$ elements, where neither the order of the parts nor the order of the elements of a part matters. Show that they satisfy the recurrence

$$B_{n+1} = \sum_{k=0}^{n} \binom{n}{k} B_k.$$

Hence compute the exponential generating function

$$\sum_{n \ge 0} B_n \frac{x^n}{n!}.$$
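A quick check of the recurrence, compared (spoiler) against the known EGF $e^{e^x - 1}$, in a sketch of mine:

```python
# Sketch: build B_n from the recurrence B_{n+1} = sum_k C(n, k) B_k and compare
# against the coefficients of the exponential generating function e^{e^x - 1}.
from math import comb, factorial
from sympy import symbols, exp, series

B = [1]  # B_0 = 1
for n in range(7):
    B.append(sum(comb(n, k) * B[k] for k in range(n + 1)))
print(B)  # [1, 1, 2, 5, 15, 52, 203, 877]

x = symbols('x')
egf = series(exp(exp(x) - 1), x, 0, 8).removeO()
assert [egf.coeff(x, n) * factorial(n) for n in range(8)] == B
```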
