A formal look at formal differentiation

by t0rajir0u, Sep 10, 2008, 1:47 AM

MIT is keeping me very busy! Here's a post I started several weeks ago and haven't gotten around to finishing until now.


The average student first encounters differentiation of polynomials in the context of calculus, where the derivative of a function describes the slopes of its tangent lines. However, it is often desirable to compute the derivative of a polynomial in a calculus-free context (for example, $ \bmod p$). It is therefore customary to introduce the formal derivative, defined as the operation on polynomials in $ R[x]$, for an arbitrary ring $ R$, that mimics the usual notion of the derivative.

I find the usual presentation of the formal derivative unsatisfying: it is simply defined by its action on the standard basis $ \{ x^k \}$ of $ R[x]$, that is, $ x^k \mapsto kx^{k - 1}$, with the only motivation being the analytic notion of a derivative. In other words, because we know the derivative has certain properties when defined analytically, it is asserted to have the same properties when defined formally.

To me this is deeply unsatisfying. Besides, we have already seen that, as far as generating functions go, the derivative is intimately related to the binomial theorem and other nice, simple ideas, so there ought to be a more elegant way of defining the formal derivative. The following formulation relies on an observation about intuitive infinitesimal reasoning:

Given a polynomial $ f(x)$, we can compute $ f'(x) = \lim_{h \to 0} \frac {f(x + h) - f(x)}{h}$ by ignoring any terms in the numerator in which $ h$ appears with degree $ 2$ or higher.
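
For example, take $ f(x) = x^3$. Then

$ f(x + h) - f(x) = 3x^2 h + 3x h^2 + h^3$,

and discarding the terms containing $ h^2$ or $ h^3$ and dividing what remains by $ h$ gives $ f'(x) = 3x^2$.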

This is because, after dividing by $ h$, any remaining terms containing an $ h$ approach zero. In other words, since the standard notion of a derivative only wants a first-order approximation in $ h$, we can "mod out" by second-order terms $ h^2$ and higher. This observation can be made precise, but first we will need to emphasize the distinction between a polynomial as an abstract member of $ R[x]$ and as a function $ R \to R$ defined by the so-called "evaluation map." Because we can substitute one polynomial into another, it is better to think of a polynomial as a function from $ R[x]$ to $ R[x]$.

Therefore, we will extend $ R[x]$ by working over $ R[x, h] / (h^2)$ and think of polynomials in $ R[x]$ as functions from $ R[x, h] / (h^2)$ to itself.

Note the similarity to how the complex numbers can be defined abstractly, as $ \mathbb{R}[x]/(x^2 + 1)$. This time, $ h^2$ is not an irreducible polynomial, so the resulting ring contains zero divisors, but the important part is that we can now define the derivative as

$ f'(x) = \frac {f(x + h) - f(x)}{h}$

and immediately recover all the familiar properties of the derivative without any further work. (Since $ h$ is not actually invertible in this ring, the definition should be read as follows: $ f'(x)$ is the unique polynomial such that $ f(x + h) = f(x) + f'(x) h$ in $ R[x, h]/(h^2)$.)
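
To make this concrete, here is a minimal Python sketch (not from the original post) that does arithmetic directly in $ R[x, h]/(h^2)$, using sympy only to keep the polynomial coefficients tidy; the names Dual and formal_derivative are just labels for this demo.

```python
import sympy as sp

x = sp.symbols('x')

class Dual:
    """An element a + b*h of R[x][h]/(h^2), stored as the pair (a, b)."""
    def __init__(self, a, b=0):
        self.a, self.b = sp.expand(a), sp.expand(b)

    def __add__(self, other):
        return Dual(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a1 + b1*h)(a2 + b2*h) = a1*a2 + (a1*b2 + b1*a2)*h, since h^2 = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

def formal_derivative(coeffs):
    """coeffs[k] is the coefficient of x^k; evaluate f at x + h and read off the h-part."""
    x_plus_h = Dual(x, 1)
    value, power = Dual(0), Dual(1)
    for c in coeffs:
        value = value + Dual(c) * power   # accumulate c_k * (x + h)^k
        power = power * x_plus_h
    return value.b                        # the coefficient of h is exactly f'(x)

print(formal_derivative([5, -2, 0, 1]))   # f = x^3 - 2x + 5  ->  3*x**2 - 2
```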


There is a nice intuitive explanation of the zero divisors in $ R[h]/(h^2)$: since $ (a + bh)(a - bh) = a^2$, the only non-invertible elements (when $ R$ is a field) are the ones with $ a = 0$; in other words, one cannot divide by $ h$. Intuitively, because $ h$ represents an infinitesimal change, its inverse would be an infinitely large change. While this is all in accord with our intuition about limits and derivatives, again, there is no need to actually cite any notion of a limit.
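
Indeed, when $ a$ is invertible we can write down the inverse of $ a + bh$ explicitly:

$ (a + bh)^{-1} = a^{-1} - a^{-2} b h$,

since multiplying this by $ a + bh$ gives $ 1 - a^{-2} b^2 h^2 = 1$.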


What properties of the formal derivative remain useful over an arbitrary ring? Well, the product rule, for one. The proofs here are easier than in (standard) analysis because we don't have to worry about limits or convergence; we just write, for example, $ f(x + h) = f(x) + df$ where $ df$ is the part of $ f(x + h)$ divisible by $ h$, and observe that the product of any two terms divisible by $ h$ is zero, so $ f(x + h) g(x + h) - f(x) g(x) = f(x)\, dg + g(x)\, df$.
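
Spelled out: writing $ g(x + h) = g(x) + dg$ as well,

$ (f(x) + df)(g(x) + dg) = f(x) g(x) + f(x)\, dg + g(x)\, df + df\, dg = f(x) g(x) + f(x)\, dg + g(x)\, df$,

because $ df\, dg$ is divisible by $ h^2$ and is therefore zero.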

The product rule implies another, much more useful property, one that is priceless in number theory: (at least when $ R$ is a field) the formal derivative of a polynomial detects repeated roots. In other words, we have the following

Proposition: If $ f(x) = (x - a)^k g(x)$ where $ g(a) \neq 0$ (and $ k$ is not divisible by the characteristic of $ R$; see Practice Problem 3), then $ f'(x) = (x - a)^{k - 1} h(x)$ where $ h(a) \neq 0$.

As the Wikipedia article mentions, this proposition holds true regardless of whether $ a$ is actually in $ R$; in other words, although it may not be obvious that a given polynomial has repeated roots over some $ R$, the appropriate field extension will reveal repeated roots that the formal derivative detects without any "knowledge" of field extensions.
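
Here is a quick sympy illustration of the Proposition (sympy's diff and gcd simply stand in for the formal derivative and polynomial gcd):

```python
import sympy as sp

x = sp.symbols('x')

f = sp.expand((x - 1)**3 * (x + 2))   # repeated root at x = 1
fp = sp.diff(f, x)                    # the formal derivative
print(sp.factor(sp.gcd(f, fp)))       # (x - 1)**2 : the repeated factor shows up in gcd(f, f')
```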

The formal derivative is used to develop the theory of finite fields, but instead I'd like to discuss an interesting theorem. Fermat's Last Theorem may be extremely hard on the integers, but for polynomials it is not only true but a straightforward exercise. The main tool here is the Mason-Stothers theorem, which states that if relatively prime polynomials $ a(x), b(x), c(x)$, not all constant, with greatest degree $ d$ satisfy

$ a(x) + b(x) = c(x)$

then the number of distinct roots of $ a(x) b(x) c(x)$ (in an algebraic closure) is at least $ d + 1$. This theorem is proven with formal differentiation (and is left as an exercise), but two important points need to be made.

- The first step is the statement that $ a'(x) + b'(x) = c'(x)$.
- No corresponding theorem is known about the prime factors of integers such that $ a + b = c$.
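
As a small sanity check of the statement (an illustration only, not the proof), here is one instance over $ \mathbb{Q}$, again using sympy:

```python
import sympy as sp

x = sp.symbols('x')

# a + b = c with a, b, c relatively prime and not all constant
a = x**2
b = 2*x + 1
c = sp.expand((x + 1)**2)
assert sp.expand(a + b - c) == 0

d = max(sp.degree(a, x), sp.degree(b, x), sp.degree(c, x))   # greatest degree, here 2
abc = sp.expand(a * b * c)
print(d, len(sp.roots(abc, x)))   # 2 3 -- abc has 3 distinct roots, and 3 >= d + 1
```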

There is only a conjecture, known as the ABC conjecture, that would imply Fermat's Last Theorem on the integers for sufficiently large exponents, and it has resisted proof even though Fermat's Last Theorem itself has been proven!

The statement of the ABC conjecture makes use of a notion called the radical of an integer. $ \text{rad}(n)$ is defined as the product of the distinct prime factors of $ n$. The corresponding notion for polynomials, the product of the distinct irreducible factors, is given by

$ \text{rad}(f(x)) = \frac {f(x)}{\gcd(f(x), f'(x))}$.
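
A short sympy illustration of this formula (characteristic zero assumed; the radical is only determined up to a constant factor):

```python
import sympy as sp

x = sp.symbols('x')

def rad(f):
    """Radical of f: the product of its distinct irreducible factors, via f / gcd(f, f')."""
    return sp.cancel(f / sp.gcd(f, sp.diff(f, x)))

f = sp.expand(x**2 * (x + 1)**3 * (x - 3))
print(sp.factor(rad(f)))   # x*(x - 3)*(x + 1), up to a constant
```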

The degree of the radical is the number of distinct roots (in an algebraic closure), and this is essentially the motivation for the statement of the ABC conjecture. When we try to deduce a corresponding formula for the radical of an integer, we are led to the following question: can we define a formal derivative on the integers that detects repeated factors?

The biggest obstruction to a satisfactory answer to this problem is that the formal derivative is linear on polynomials, but any linear function on the integers is determined by its value at $ 1$ (it is just multiplication by a constant, which is useless here), so any such formal derivative is necessarily nonlinear, and we lose a big tool (linear algebra) in passing from Mason's theorem to the ABC conjecture. Nevertheless, let's persevere. Without even the $ R[h]/(h^2)$ construction to help us out, we must be abstract. A general notion of derivative is characterized by the fact that it obeys the product rule

$ (ab)' = a'b + b'a$.

Any function $ \mathbb{N} \to \mathbb{N}$ that obeys the product rule is therefore determined by its values at the primes. Going back to polynomials, over an algebraic closure the derivative of a "prime" (that is, a linear factor $ x - a$) is $ 1$, so let's take that as our starting point. This uniquely defines a function known as the number derivative, which has an explicit formula as follows: if $ n = p_1^{c_1} p_2^{c_2} ... p_m^{c_m}$, then

$ \frac {n'}{n} = \frac {c_1}{p_1} + \frac {c_2}{p_2} + ... + \frac {c_m}{p_m}$.

Compare with the formula

$ \frac {f'(x)}{f(x)} = \frac {c_1}{(x - r_1)} + \frac {c_2}{(x - r_2)} + ... + \frac {c_n}{(x - r_n)}$

where $ f(x) = (x - r_1)^{c_1} (x - r_2)^{c_2} ... (x - r_n)^{c_n}$, which is a straightforward consequence of logarithmic differentiation. Unfortunately, because we cannot distinguish between the multiplicities $ c_k$ and the prime factors themselves, things like the radical formula and the Proposition above about repeated roots do not remain true; for example, $ 4' = 4$, so the radical $ \text{rad}(4) = 2$ does not have a simple expression in terms of $ 4$ and $ 4'$ and there is no way to detect the repeated roots of $ 4$ this way. The number derivative appears, nevertheless, to have deep properties, and if it were well-understood we might gain valuable insight into problems such as the Goldbach conjecture. Unfortunately, the number derivative does not relate to the ABC conjecture in the sense that the formal derivative relates to Mason's theorem because, again, the number derivative is not linear, so the first step in the proof of Mason's Theorem fails.
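
For concreteness, here is a small Python sketch of the number derivative (sympy's factorint supplies the factorization; the function name is just a label for this demo):

```python
import sympy as sp

def number_derivative(n):
    """Arithmetic ("number") derivative of a positive integer: n' = sum of c * n / p over n = prod p^c."""
    if n <= 1:
        return 0   # 1' = 0 is forced by the product rule
    return sum(c * (n // p) for p, c in sp.factorint(n).items())

print(number_derivative(7))    # 1 -- the derivative of a prime is 1
print(number_derivative(4))    # 4 -- so rad(4) = 2 is not visible from 4 and 4' alone
print(number_derivative(15))   # 8 -- product rule: 3'*5 + 3*5' = 5 + 3
```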

The structure of $ R[x]$ is therefore "richer" in some sense than the structure of $ \mathbb{Z}$; it has vector space structure that $ \mathbb{Z}$ lacks entirely, and this can be understood as one reason Fermat's Last Theorem is much harder to prove over the integers than over polynomial rings. Linearity makes computing the "formal indefinite integral" (the set of polynomials with a given formal derivative) on polynomials easy, whereas the "number integral" (the set of integers with a given number derivative) is notoriously difficult to understand, although insight in this regard would be extremely interesting.


Practice Problem 1: Show that $ R[x, h] / (h^2) \simeq \left( R[h]/(h^2) \right)[x]$.

Practice Problem 2: Prove the Mason-Stothers theorem.

Practice Problem 3: Let $ p$ be a prime. Show that an irreducible polynomial over a field of characteristic $ p$ can have repeated roots (in an extension field) only if each exponent appearing in it is divisible by $ p$. (Do all of these polynomials in fact have repeated roots?)


3 Comments

Maybe I'm missing something, but for the third exercise, how can an irreducible polynomial over $ \mathbb{Z}/p\mathbb{Z}$ have roots over $ \mathbb{Z}/p\mathbb{Z}$? Or are we looking at roots in a field extension or something?

by tjhance, Sep 10, 2008, 2:03 AM

Yes. Perhaps I should've made that clearer, but formal differentiation detects multiple roots in the algebraic closure.

by t0rajir0u, Sep 10, 2008, 5:18 AM

The way I always preferred to look at differentiation of polynomials is: if $ f(x) = c(x - r_1)(x - r_2) \ldots (x - r_n) = c \prod (x - r_i)$, then the product rule generalized to $ n$ factors gives $ f'(x) = c \sum_{i=1}^n \prod_{j \neq i} (x - r_j)$. This gives immediate proofs of propositions such as the fact that a polynomial with no repeated roots shares no roots with its derivative.

by MellowMelon, Sep 12, 2008, 11:12 PM
