An "elementary" perspective on Dirichlet's theorem
by t0rajir0u, May 16, 2008, 4:15 AM
In the following discussion,
always denotes a prime. Most of the discussion on convergence is not completely rigorous. Much of (the rigorous part of) this material is taken from this paper.
So we've talked about how to prove special cases of Dirichlet's theorem using polynomials. (The comment I posted contains a generalization). Let's talk about (the easy parts of) the classic proof of the theorem itself.
The Amazing Euler Product:
. (We can consider this either as a formal identity or as an actual sum equality for
.)
Proof. Write
. The product on the LHS then takes the form
where the product is taken over all primes and
are arbitrary non-negative integers. But by the Fundamental Theorem of Algebra the numbers
are precisely the natural numbers, once each.
The Euler Product is one of the most beautiful identities I know. It relates the Riemann Zeta function (of Riemann Hypothesis fame) to a product over the primes through prime factorization. Magical. Now we need a lemma.
Lemma: Let
be a sequence of complex numbers such that
. Suppose that
converges. Then
converges if and only if
converges.
Proof. See this thread. (Yes, I cheated.
) I needed to verify this lemma as part of this discussion; the intuitive notion is that
.
Proposition: There are an infinite number of primes. In fact,
diverges. (The latter implies the former.)
Proof. By our lemma,
converges if and only if
converges (for
), which is asymptotic (by comparison) to
.
Now,
turns out to be just the harmonic series, which diverges. Hence
also diverges.
N.B. It's more elementary to prove that there are an infinite number of primes alone; as
we know
approaches the harmonic series, so the Euler product must contain an infinite number of factors (or else it is finite).
This is a proof of a quite different flavor than Euclid's proof that there are an infinite number of primes (generalized by the polynomial proof from the other post). It is strangely analytic. Such ideas belong to the fascinating realm of analytic number theory, another example of which is the celebrated Riemann hypothesis.
A short excursion. One of the first successes of analytic number theory was the prime number theorem, which states (in an equivalent form) that the
prime
is asymptotic to
, and we know that
diverges. The proof is usually done by the integral test, but the more illuminating proof is by Cauchy condensation, which reduces the above to the question of the convergence of the harmonic series.
The substitution required to perform Cauchy condensation is analogous to (the inverse of) our use of the natural log to transform the Euler product into the sum of the reciprocal of the primes. Indeed, the argument we used to prove the divergence of the prime harmonic series suggests that, just as
, we have
.
A second short excursion.
Problem: Let
be a polynomial of even degree. Show that the set
cannot contain the primes.
Solution.
converges and contains a finite number of negative terms. Its positive terms therefore cannot contain the primes.
Again, this solution has a strange flavor. We have not appealed to any facts intrinsic to polynomials (other than their loose connection to the
-series test). We have not required that the coefficients be integer or even rational. We have not talked about divisibility (which is how we show that an integer polynomial takes on composite values infinitely often).
The above does suggest that there may exist multivariate polynomials (with no terms of degree
) whose values at integers include the primes. Indeed, by the four-square theorem, such a polynomial is
, whose values also contain all the non-negative integers. Amazingly enough, a 26-variable polynomial exists whose positive values at integer inputs is precisely the set of primes.
This is all thanks to the magic of the Euler product. Of course, my use of "the" Euler product is misleading. Let me show you what I mean.
Another Euler Product:
.
Proof. The integers congruent to
are precisely the integers with an odd number of prime factors congruent to
(counting multiplicity). The rest follows by a similar argument as the first identity.
Proposition: There are an infinite number of primes congruent to
and an infinite number of primes congruent to
. In fact,
and
both diverge.
Proof. We are going to call the above sum
for reasons that will be explained later. Observe that it is essentially an alternating harmonic series and therefore converges as
, and moreover it is nonzero (because we can write it as a sum of positive terms). Now take its natural log. Our lemma allows us to conclude that the sum

converges as
. On the other hand,

diverges. Adding and then subtracting the two sums, we have our result.
This seems like a rather powerful technique. Here it is motivated; by changing the sign of the part of the product corresponding to primes congruent to
we simply distinguish the integers congruent to
. But mod larger numbers we don't have such simple patterns. To generalize this result, we need a more general way to construct Euler products.
The (Fully Multiplicative) Euler product: Let
be a function such that
. Then
.
The proof is identical to the proof of the original Euler product. So what function
did we choose to create our second Euler product?
It turns out that the only functions
we want to look at (for the purposes of Dirichlet's Theorem) are very special functions called Dirichlet characters, and the corresponding series are very special functions called Dirichlet L-functions. A Dirichlet character relative to a positive integer
repeats its values
and has the following properties:
- Whenever
, we have
.
-
.
- For
we have $ \chi(a)^{\varphi(n)} = \chi(a^{\varphi(n)}}) = 1$. In other words,
is a
root of unity.
The Dirichlet characters relative to a particular
form a group under the multiplication
with the identity element
; we denote this group
. In fact, we can show that this group has order
. We denote the inverse of
by
. Note that
.
The character we chose (with respect to
) was actually the Jacobi symbol (here just the Legendre symbol). The corresponding group has order
, so its only elements are the Jacobi symbol and the identity. (This is also true when
.)
We have the following two related sums (which are left as an exercise):
-
if
and
otherwise.
-
if
and
otherwise.
We are now somewhat ready to discuss Dirichlet's Theorem.
Dirichlet's Theorem: For
and
such that
, there are an infinite number of primes congruent to
.
Sketch. Let
be a character relative to
.
The hardest part of the proof of Dirichlet's Theorem is showing that whenever
we know that
is nonzero and converges. In other words,
is not a zero of the function
. The natural way to ask this question is within the setting of complex analysis; in fact, the generalized Riemann hypothesis is precisely about the zeroes of Dirichlet L-functions. Details on (one approach to) this step can be found in the paper I linked to initially; we will ignore them, as our focus is on the elementary parts of the proof.
Define
. Our lemma allows us to conclude that
converges as
if and only if
converges and is nonzero, which is whenever
. When
we recover a sum that is almost the usual zeta function, but the numbers relatively prime to
have been removed. This sum still diverges as
. Now consider the sum
$ \begin{eqnarray*} \sum_{\chi \in X_n} N(s, \chi) \bar{\chi}(a) & = & \sum_{p} \frac {1}{p^s} \left( \sum_{\chi \in X_n} \chi(pa^{ - 1}) \right) \\ & = & \sum_{p \equiv a \bmod n} \frac {\varphi(n)}{p^s}. \\ \end{eqnarray*}$.
The LHS has a single term that diverges (when
); the result follows. Notice how this generalizes our argument for
. The Dirichlet characters act as a roots of unity filter, causing the corresponding coefficient to evaluate to
unless
. QED. 
So here we've got two very different approaches to proving that certain congruence classes contain infinitely many primes. One uses divisibility properties of polynomials; this is something solid. Primes relate to divisibility. We can sink our teeth into this method of proof. Unfortunately, it's not enough.
The other proceeds by showing that certain sums are nonzero and therefore certain other sums diverge; this is a very powerful technique, but very far removed from what we usually consider "number theory." Can we strengthen the first technique, then?
Let's return to the main step of the proof from the other post, which is about primes that divide
for some integers
. We know that such primes either divide
or are congruent to
. Unfortunately, no similar divisibility relations exist for primes congruent to
generally. So what do we do?
We saw some interesting consequences of this divisibility when
. When
,

and when
,
.
This viewpoint is generalized by the ring of integers of a cyclotomic field. Now instead of doing analytic number theory, we're doing algebraic number theory. Watch what happens.
When
, the corresponding cyclotomic field is
, the ring of integers is the Gaussian integers that we know and love, and the primes congruent to
factor into a product of primes
. (These are primes over
, rather than primes over
.) The primes dividing
(here, just
) also factor:
.
When
, the corresponding cyclotomic field is
, the ring of integers is the Eisenstein integers, and the primes congruent to
again factor into a product of primes
. The primes dividing
(here, just
) again factor:
.
This is interesting. It suggests the following: given
, we can partition the ordinary (rational) primes into classes depending on whether (and how) they factor any further, and quite possibly those classes correspond to congruence classes
(with a special class for primes dividing
). Two special cases:
Case:
: if
, then

over the cyclotomic integers
. If
is prime over
, then it must divide one of these factors, which is clearly absurd. Therefore
must be composite. (I believe it's possible to show that, at least when we have prime factorization,
factors into
factors.)
Case:
. Because
, it divides
for some
. But then

and the primitive
roots are clearly contained in
, so again
must factor.
These results are easy enough to see explicitly when
because in those cases the degree of
(which is
in general) is
, so we can write an arbitrary element as
or as
, respectively. This is more difficult when
is larger. An example:
,
so
(a prime) factors into four factors in
. (The only such primes are
and the primes congruent to
.)
These notions (divisibility! We like divisibility!) can be generalized to give a proof of Dirichlet's theorem through the magic of Chebotarev's density theorem, of which Dirichlet's theorem is but a small, small subcase.
Practice Problem 1: Prove the Dirichlet character sums.
Practice Problem 2: Factor
in
. (Try looking at other values of
.) How about
?
Practice Problem 3 (if I recall properly; this is very hard): Show that
.

So we've talked about how to prove special cases of Dirichlet's theorem using polynomials. (The comment I posted contains a generalization). Let's talk about (the easy parts of) the classic proof of the theorem itself.
The Amazing Euler Product:


Proof. Write




The Euler Product is one of the most beautiful identities I know. It relates the Riemann Zeta function (of Riemann Hypothesis fame) to a product over the primes through prime factorization. Magical. Now we need a lemma.
Lemma: Let





Proof. See this thread. (Yes, I cheated.


Proposition: There are an infinite number of primes. In fact,

Proof. By our lemma,




Now,


N.B. It's more elementary to prove that there are an infinite number of primes alone; as


This is a proof of a quite different flavor than Euclid's proof that there are an infinite number of primes (generalized by the polynomial proof from the other post). It is strangely analytic. Such ideas belong to the fascinating realm of analytic number theory, another example of which is the celebrated Riemann hypothesis.
A short excursion. One of the first successes of analytic number theory was the prime number theorem, which states (in an equivalent form) that the




The substitution required to perform Cauchy condensation is analogous to (the inverse of) our use of the natural log to transform the Euler product into the sum of the reciprocal of the primes. Indeed, the argument we used to prove the divergence of the prime harmonic series suggests that, just as


A second short excursion.
Problem: Let


Solution.

Again, this solution has a strange flavor. We have not appealed to any facts intrinsic to polynomials (other than their loose connection to the

The above does suggest that there may exist multivariate polynomials (with no terms of degree


This is all thanks to the magic of the Euler product. Of course, my use of "the" Euler product is misleading. Let me show you what I mean.
Another Euler Product:

Proof. The integers congruent to


Proposition: There are an infinite number of primes congruent to




Proof. We are going to call the above sum



converges as


diverges. Adding and then subtracting the two sums, we have our result.
This seems like a rather powerful technique. Here it is motivated; by changing the sign of the part of the product corresponding to primes congruent to


The (Fully Multiplicative) Euler product: Let



The proof is identical to the proof of the original Euler product. So what function

It turns out that the only functions



- Whenever


-

- For



The Dirichlet characters relative to a particular








The character we chose (with respect to



We have the following two related sums (which are left as an exercise):
-



-



We are now somewhat ready to discuss Dirichlet's Theorem.
Dirichlet's Theorem: For




Sketch. Let


The hardest part of the proof of Dirichlet's Theorem is showing that whenever




Define








$ \begin{eqnarray*} \sum_{\chi \in X_n} N(s, \chi) \bar{\chi}(a) & = & \sum_{p} \frac {1}{p^s} \left( \sum_{\chi \in X_n} \chi(pa^{ - 1}) \right) \\ & = & \sum_{p \equiv a \bmod n} \frac {\varphi(n)}{p^s}. \\ \end{eqnarray*}$.
The LHS has a single term that diverges (when





So here we've got two very different approaches to proving that certain congruence classes contain infinitely many primes. One uses divisibility properties of polynomials; this is something solid. Primes relate to divisibility. We can sink our teeth into this method of proof. Unfortunately, it's not enough.
The other proceeds by showing that certain sums are nonzero and therefore certain other sums diverge; this is a very powerful technique, but very far removed from what we usually consider "number theory." Can we strengthen the first technique, then?
Let's return to the main step of the proof from the other post, which is about primes that divide





We saw some interesting consequences of this divisibility when



and when


This viewpoint is generalized by the ring of integers of a cyclotomic field. Now instead of doing analytic number theory, we're doing algebraic number theory. Watch what happens.
When

![$ \mathbb{Q}[\imath]$](http://latex.artofproblemsolving.com/5/9/e/59ecf36888a095cd045ea7fcf2b2a79cd445ce43.png)


![$ \mathbb{Z}[ \imath]$](http://latex.artofproblemsolving.com/7/f/f/7ffa45ad423293a7307407fbdcee23331c3bf115.png)




When

![$ \mathbb{Q}[\omega]$](http://latex.artofproblemsolving.com/a/2/a/a2a00b836d4048dfe93f6b5a77ba911a1e45bc19.png)





This is interesting. It suggests the following: given
![$ \mathbb{Z}[\zeta_n]$](http://latex.artofproblemsolving.com/2/f/4/2f4970148c6742ff4e2e27edce791b688b7a71a9.png)


Case:



over the cyclotomic integers
![$ \mathbb{Z}[\zeta_n]$](http://latex.artofproblemsolving.com/2/f/4/2f4970148c6742ff4e2e27edce791b688b7a71a9.png)

![$ \mathbb{Z}[\zeta_n]$](http://latex.artofproblemsolving.com/2/f/4/2f4970148c6742ff4e2e27edce791b688b7a71a9.png)



Case:





and the primitive

![$ \mathbb{Z}[\zeta_n]$](http://latex.artofproblemsolving.com/2/f/4/2f4970148c6742ff4e2e27edce791b688b7a71a9.png)

These results are easy enough to see explicitly when

![$ \mathbb{Q}[\zeta_n] : \mathbb{Q}$](http://latex.artofproblemsolving.com/1/c/0/1c02d6cc1c0d7ca0909447b38841783c97439df5.png)






so

![$ \mathbb{Z}[\zeta_5]$](http://latex.artofproblemsolving.com/c/9/d/c9de224aab8988fcf162463b422a46280df686f2.png)


These notions (divisibility! We like divisibility!) can be generalized to give a proof of Dirichlet's theorem through the magic of Chebotarev's density theorem, of which Dirichlet's theorem is but a small, small subcase.
Practice Problem 1: Prove the Dirichlet character sums.
Practice Problem 2: Factor

![$ \mathbb{Z}[\zeta_5]$](http://latex.artofproblemsolving.com/c/9/d/c9de224aab8988fcf162463b422a46280df686f2.png)


Practice Problem 3 (if I recall properly; this is very hard): Show that
