A formal look at formal differentiation
by t0rajir0u, Sep 10, 2008, 1:47 AM
MIT is keeping me very busy! Here's a post I started several weeks ago and haven't gotten around to finishing until now.
The average student first encounters differentiation on polynomials in the context of calculus, where the derivative of a function describes the slopes of its tangent lines. However, it is often desirable to compute the derivative of polynomials in a calculus-free context (for example, over a finite field, where limits make no sense) for a few reasons. It is therefore customary to introduce the formal derivative, defined as the operation on polynomials in $R[x]$, for an arbitrary ring $R$, that mimics the usual notion of derivative.
I find the usual presentation of the formal derivative unsatisfying: it is simply defined by its action on the standard basis $1, x, x^2, \dots$ of $R[x]$, that is, $\frac{d}{dx} x^n = n x^{n-1}$, with the only motivation being the analytic notion of a derivative. In other words, because we know the derivative has certain properties when defined analytically, it also has the same properties when defined formally.
This is personally deeply unsatisfying. Besides, we have already seen that as far as generating functions go, the derivative is intimately related to the binomial theorem and nice simple ideas like that, so there ought to be a more elegant way of defining the formal derivative. The following formulation relies on an observation about intuitive infinitesimal logic:
Given a polynomial $f(x)$, we can compute

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

by ignoring any terms in the numerator where $h$ appears with degree $2$ or larger. This is because after dividing by $h$, any remaining terms containing an $h$ approach zero. In other words, since the standard notion of a derivative only wants a first-order approximation in $h$, we can "mod out" by second-order terms $h^2$ and higher. This observation can be made precise, but first we will need to emphasize the distinction between a polynomial as an abstract member of $R[x]$ and as a function $R \to R$ defined by the so-called "evaluation map." Because we can compose polynomials, it is better to think of a polynomial as a function from $R[x]$ to $R[x]$.
Therefore, we will extend $R[x]$ by working over $R[x, h]/(h^2)$ and think of polynomials $f(x)$ as functions from $R[x, h]/(h^2)$ to $R[x, h]/(h^2)$.

Note the similarity to how the complex numbers can be defined abstractly as $\mathbb{R}[x]/(x^2 + 1)$. This time, $h^2$ is not an irreducible polynomial, so the resulting ring contains zero divisors, but the important part is that we can now define the derivative as the unique polynomial $f'(x)$ satisfying

$$f(x + h) = f(x) + f'(x) h$$

and immediately recover all the familiar properties of the derivative without any further work.
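To see the construction in action, here is a small Python sketch (my own illustration, not part of the original post): elements of $R[x, h]/(h^2)$ are stored as pairs of coefficient lists $(p, q)$ standing for $p + qh$, and the derivative falls out of evaluating $f$ at $x + h$.

```python
def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def dual_mul(u, v):
    # (p1 + q1 h)(p2 + q2 h) = p1 p2 + (p1 q2 + q1 p2) h, since h^2 = 0
    (p1, q1), (p2, q2) = u, v
    return (poly_mul(p1, p2),
            poly_add(poly_mul(p1, q2), poly_mul(q1, p2)))

def derivative(coeffs):
    """Evaluate f (coefficients a_0, a_1, ...) at x + h in R[x, h]/(h^2);
    then f(x + h) = f(x) + f'(x) h, so f' is the coefficient of h."""
    xh = ([0, 1], [1])    # the element x + h
    acc = ([0], [0])      # running value of f(x + h)
    power = ([1], [0])    # current power (x + h)^k
    for a in coeffs:
        acc = (poly_add(acc[0], [a * c for c in power[0]]),
               poly_add(acc[1], [a * c for c in power[1]]))
        power = dual_mul(power, xh)
    d = acc[1]
    while d and d[-1] == 0:  # strip trailing zeros
        d.pop()
    return d

# f(x) = 1 + 3x + 5x^2  ->  f'(x) = 3 + 10x
print(derivative([1, 3, 5]))  # [3, 10]
```

Note that the code never uses the formula $\frac{d}{dx} x^n = n x^{n-1}$; the rule emerges on its own from arithmetic modulo $h^2$.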
There is a nice intuitive explanation as to why $R[h]/(h^2)$ contains zero divisors; since $(a + bh)\left( \frac{1}{a} - \frac{b}{a^2} h \right) = 1$ whenever $a$ is invertible, the only non-invertible elements are the ones with $a = 0$; in other words, one cannot divide by $h$. Intuitively, because $h$ represents an infinitesimal change, its inverse would be an infinitely large change. While this is all in accord with our intuition about limits and derivatives, again, there is no need to actually cite any notion of a limit.
What properties of the formal derivative remain useful over an arbitrary ring? Well, the product rule, for one. The proofs here are easier than in (standard) analysis because we don't have to worry about limits or convergence; we just write, for example, $f(x + h) = f(x) + f'(x) h$, where $f'(x) h$ is the part of $f(x + h)$ divisible by $h$, and make the observation that the product of any two such $h$-parts is defined to be zero, so

$$f(x + h) g(x + h) = f(x) g(x) + \left( f'(x) g(x) + f(x) g'(x) \right) h.$$
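As a concrete sanity check (my own sketch, not from the original post), the product rule can be verified directly on coefficient lists using the explicit formula $x^k \mapsto k x^{k-1}$:

```python
def d(p):
    # formal derivative: the coefficient a_k of x^k becomes k * a_k at x^(k-1)
    return [k * a for k, a in enumerate(p)][1:] or [0]

def add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# check (fg)' = f'g + fg' for f = 1 + 2x and g = 3 + x^2
f, g = [1, 2], [3, 0, 1]
print(d(mul(f, g)))                     # [6, 2, 6]
print(add(mul(d(f), g), mul(f, d(g))))  # [6, 2, 6]
```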
The product rule implies another, much more useful property which is priceless in number theory, which is that (at least when $R$ is a field) the formal derivative of a polynomial detects repeated roots. In other words, we have the following

Proposition: If $f(x) = (x - r)^m g(x)$ where $m \geq 1$, then $f'(x) = (x - r)^{m - 1} u(x)$ where $u(x) = m\, g(x) + (x - r) g'(x)$. In particular, if $m \geq 2$, then $x - r$ divides $\gcd(f, f')$.
As the Wikipedia article mentions, this proposition holds true regardless of whether $r$ is actually in $R$; in other words, although it may not be obvious that a given polynomial has repeated roots over some field $F$, the appropriate field extension will reveal repeated roots that the formal derivative detects without any "knowledge" of field extensions.
The formal derivative is used to develop the theory of finite fields, but instead I'd like to discuss an interesting theorem. Fermat's last theorem may be extremely hard on the integers, but for polynomials it is not only true but a straightforward exercise. The main tool here is the Mason-Stothers theorem, which states that if relatively prime polynomials $a(x), b(x), c(x)$, not all constant, with greatest degree $n$ satisfy

$$a(x) + b(x) = c(x)$$

then the number of distinct roots of $a(x) b(x) c(x)$ is at least $n + 1$. This theorem is proven with formal differentiation (and is left as an exercise), but two important points need to be made.
- The first step is the statement that $a'(x) + b'(x) = c'(x)$, which follows from the linearity of the formal derivative.
- No corresponding theorem is known about the prime factors of integers $a, b, c$ such that $a + b = c$.
There is only a conjecture, known as the ABC conjecture, that implies Fermat's Last Theorem on the integers for sufficiently large exponents; it has resisted proof even though Fermat's Last Theorem itself has been proven!
The statement of the ABC conjecture makes use of a notion called the radical of an integer: $\operatorname{rad}(n)$ is defined as the product of the distinct prime factors of $n$. The corresponding notion for polynomials, the product of the distinct irreducible factors, is given by $\operatorname{rad}(f) = \frac{f}{\gcd(f, f')}$.
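For integers, the radical is straightforward to compute by trial division; a quick sketch of my own:

```python
def rad(n):
    """Radical of n: the product of its distinct prime factors."""
    r, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            r *= p              # record the prime once...
            while n % p == 0:   # ...and discard its multiplicity
                n //= p
        p += 1
    if n > 1:                   # leftover prime factor > sqrt(original n)
        r *= n
    return r

print(rad(72))  # 72 = 2^3 * 3^2, so rad(72) = 2 * 3 = 6
```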
The degree of the radical is the number of distinct factors, and this is essentially the motivation for the statement of the ABC conjecture. When we try to deduce a corresponding formula for the radical of an integer, we are led to the following question: can we define a formal derivative on the integers that detects multiple roots?
The biggest obstruction to a satisfactory answer to this problem is that the formal derivative is linear on polynomials, but any linear function on the integers is determined by its value at $1$, so any such formal derivative is necessarily nonlinear and we lose a big tool (linear algebra) in passing from Mason's theorem to the ABC conjecture. Nevertheless, let's persevere. Without even the $R[h]/(h^2)$ construction to help us out, we must be abstract. A general notion of derivative is characterized by the fact that it obeys the product rule

$$D(ab) = D(a)\, b + a\, D(b).$$
Any function $D : \mathbb{Z} \to \mathbb{Z}$ that obeys the product rule is therefore determined by its values at the primes. Going back to polynomials, over an algebraic closure the derivative of a "prime" (that is, a linear factor $x - r$) is $1$, so let's take $D(p) = 1$ for every prime $p$ as our starting point. This uniquely defines a function known as the number derivative, which has an explicit formula as follows: if $n = \prod_i p_i^{e_i}$, then

$$D(n) = n \sum_i \frac{e_i}{p_i}.$$
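The formula translates directly into integer arithmetic: each occurrence of a prime factor $p$ in $n$ contributes $n/p$ to the sum. A sketch of my own:

```python
def number_derivative(n):
    """Arithmetic ("number") derivative: D(1) = 0, D(p) = 1 for primes p,
    extended by the product rule D(ab) = D(a) b + a D(b).
    Explicitly, D(n) = n * sum(e_i / p_i) over n = prod(p_i^e_i)."""
    if n <= 1:
        return 0
    total, m, p = 0, n, 2
    while p * p <= m:
        while m % p == 0:
            total += n // p   # each factor of p contributes n / p
            m //= p
        p += 1
    if m > 1:                 # remaining prime factor
        total += n // m
    return total

print(number_derivative(6))   # D(6) = D(2)*3 + 2*D(3) = 5
print(number_derivative(4))   # D(2^2) = 2 * 2 = 4
```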
Compare with the formula

$$f'(x) = f(x) \sum_i \frac{m_i}{x - r_i}$$

where $f(x) = \prod_i (x - r_i)^{m_i}$, which is a straightforward consequence of logarithmic differentiation. Unfortunately, because we cannot distinguish between the multiplicities $e_i$ and the prime factors $p_i$ themselves, things like the radical formula and the Proposition above about repeated roots do not remain true; for example, $D(p^p) = p^p$, so the radical $\operatorname{rad}(n)$ does not have a simple expression in terms of $n$ and $D(n)$, and there is no way to detect the repeated prime factors of $p^p$ this way. The number derivative appears, nevertheless, to have deep properties, and if it were well-understood we might gain valuable insight into problems such as the Goldbach conjecture. Unfortunately, the number derivative does not relate to the ABC conjecture in the sense that the formal derivative relates to Mason's theorem because, again, the number derivative is not linear, so the first step in the proof of Mason's theorem fails.
The structure of $R[x]$ is therefore "richer" in some sense than the structure of $\mathbb{Z}$; it has vector space structure that $\mathbb{Z}$ lacks entirely, and this can be understood as one reason Fermat's Last Theorem is much harder to prove over the integers than over polynomial rings. Linearity makes computing the "formal indefinite integral" (the set of polynomials with a given formal derivative) on polynomials easy, whereas the "number integral" (the set of integers with a given number derivative) is notoriously difficult to understand, although insight in this regard would be extremely interesting.
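To illustrate how easy the formal indefinite integral is by comparison, here is a one-line sketch over $\mathbb{Q}$ (my own; note that the division by $k + 1$ fails in characteristic $p$):

```python
from fractions import Fraction

def antiderivatives(coeffs):
    """Formal antiderivatives of sum a_k x^k: the constant term is a free
    parameter (set to 0 here), and linearity forces every other
    coefficient to be a_k / (k + 1)."""
    return [Fraction(0)] + [Fraction(a, k + 1) for k, a in enumerate(coeffs)]

# integrating f'(x) = 3 + 10x recovers 3x + 5x^2 (up to a constant)
print(antiderivatives([3, 10]))  # [Fraction(0, 1), Fraction(3, 1), Fraction(5, 1)]
```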
Practice Problem 1: Show that $R[x, h]/(h^2) \simeq \left( R[h]/(h^2) \right)[x]$.
Practice Problem 2: Prove the Mason-Stothers theorem.
Practice Problem 3: Let $p$ be a prime. Show that an irreducible polynomial over a field of prime characteristic $p$ can have repeated roots only if each exponent is divisible by $p$. (Do all of these polynomials in fact have repeated roots?)