The first digits of powers of 2 and Benford's Law

by greenturtle3141, Dec 5, 2023, 4:08 PM

Reading Difficulty: 3/5

Intro

The distribution of the last digits of powers of $2$ is pretty easy to find. After $2^0$, the last digits are $2, 4, 8, 6$, repeating thereafter. Hence, $25\%$ of powers of two end in $2$, $25\%$ end in $4$, $25\%$ end in $8$, and $25\%$ end in $6$. To make this "percentage" more formal, you would use a limit. For example,
$$\text{Fraction of powers that end in 2} := \lim_{n \to \infty} \frac{\text{\# of integers in $\{2^0,2^1,\cdots,2^{n-1}\}$ that end in 2}}{n} = \frac14.$$What if I told that the distribution of the first digit is... also not terribly difficult to obtain? This surprised me, since I expected the first digits of powers of $2$ to be ill-behaved and hard to tame. It turns out that with a little trick, you can reason about the first digits with ease!

I think the trick is a pretty cute, so I'll let you try to come up with it yourself via the following exercise before we continue.

Exercise for you: Prove that there exists a power of $2$ whose first nine digits are $123456789\ldots$.

Hint 1

Hint 2

Solution


Weyl Equidistribution Theorem and Ergodic Theory

Studying successive multiplication by $2$ is unwholesome. Therefore, the (first) key idea was to take a log! We convert the problem to studying the arithmetic sequence
$$\log_{10}2,2\log_{10}2,3\log_{10}2,\cdots$$taken "mod 1" (i.e. taking the fractional parts). This is because, for example, $2^k$ begins with a $5$ if and only if $\log_{10} 5 \leq \{\log_{10}(2^k)\} < \log_{10} 6$, and conveniently $\log_{10}(2^k) = k\log_{10}2$.

A nice way to visualize the sequence $\{k\log_{10}2\}_k$ "mod 1" is to draw a circle with circumference $1$. Then, the fractional parts of $k\log_{10}2$ form an infinite sequence of points on the circle, formed by making "jumps" of length $\log_{10}2$ along the circumference. If you keep jumping forever, the points you land on intuitively form a dense subset of the circle.

It's also intuitive that the points we land on are "equally distributed". Over the infinitely many points we jump on, there shouldn't be more points near $x$ than $y$. This intuition is formalized by the following theorem, called the Weyl Equidistribution Theorem.

Theorem (Weyl): Suppose we jump around a circle of unit circumference, with each jump being of length $x$ where $x$ is an irrational number. Then, for any arc $I$, we have that the proportion of jumps that land in $I$ is precisely the length of $I$.

More formally... Click to reveal hidden text

Using this theorem, we can compute the exact proportion of powers of $2$ that begin with the digit $5$! The numbers between $\log_{10}5$ and $\log_{10}6$ form an interval of length $\log_{10}6 - \log_{10}5$. On the circle, that corresponds to an arc of the same length. So by the Weyl Equidistribution Theorem, the proportion of powers of $2$ that begin with $5$ is exactly $\log_{10}6 - \log_{10}5$.

More generally, the proportion of powers of $2$ that begin with the digit $d$ is $\log_{10}(d+1) - \log_{10}d = \log_{10}(1+1/d)$.

There is a surprising connection with ergodic theory, which is essentially the study of dynamical systems. Essentially, we can view "jumping by $x$" along the circle as iterated compositions of the transformation function $T(y) = y+x \pmod{1}$, which is a setting that is handled incredibly well by ergodic theory. It turns out that the transformation $T$ is ergodic when $x$ is irrational, and results in ergodic theory can be used to prove the Weyl Equidistribution theorem.

Benford's Law

If you look at sets of data that span various magnitudes, such as population sizes, it's quite likely that the most common first digit in the data is $1$. In fact, the proportion of numbers that begin with $1$ in such data sets is typically around $\log_{10}2$. The pattern continues for other possible starting digits:
  • $2$: $\log_{10}(3/2)$
  • $3$: $\log_{10}(4/3)$
  • $4$: $\log_{10}(5/4)$
  • $5$: $\log_{10}(6/5)$
  • $\cdots$
  • $9$: $\log_{10}(10/9)$
The phenomenon that the first digits tend to follow this distribution is called Benford's Law. If the numbers here look familiar, that's because you just saw them like 30 seconds ago. The distribution of the first digits of powers of $2$ satisfy Benford's Law exactly!

Beyond the First Digit

Reading Difficulty: lol/10

What proportion of powers of $2$ begin with $3141$? We can follow the arguments we've made up to now to see that this is not hard at all. The $n$th power of $2$ begins with $3141$ exactly when $n\log_{10}2$ lies within the "arc" between $\log_{10}3141$ and $\log_{10}3142$. So by Weyl Equidistribution, the proportion is exactly $\log_{10}3142 - \log_{10}3141 = \log_{10}\left(1 + \frac{1}{3141}\right)$. This generalizes in an obvious way: For any positive integer $N$, the proportion of powers of $2$ that "begin with $N$" is exactly $\log_{10}\left(1+\frac1N\right)$.

Now here's a fun question: What is the distribution of the $k$th digit? Does the distribution of the $k$th digit approach some limiting distribution as $k \to \infty$? It turns out that it does! It approaches the uniform $(1/10,\cdots,1/10)$ distribution exponentially fast.

Theorem: Let $F_k$ be the proportion of powers of $2$ whose $k$th digit is $d$. Then $\lim_{k \to \infty} F_k = \frac1{10}$.

Proof. For $k \geq 2$, the $k$th digit is $d$ exactly when the first $k$ digits end in $d$. That is, the first $k$ digits are $10i+d$ where $i$ is a $(k-1)$-digit number. For such an $i$, the proportion of powers of $2$ that begin with $10i+d$ is given by
$$\log_{10}\left(1+\frac{1}{10i+d}\right),$$as we discussed previously. Since $i$ ranges over $(k-1)$-digit numbers, it will range from $10^{k-2}$ to $10^{k-1}-1$. Hence the proportion of powers whose $k$th digit is $d$ is given exactly by
$$F_k = \sum_{i=10^{k-2}}^{10^{k-1}-1}\log_{10}\left(1+\frac{1}{10i+d}\right).$$Base $10$ is stupid, so write this as
$$F_k = \frac{1}{\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1}\log\left(1+\frac{1}{10i+d}\right).$$We now use the bounds $x - \frac{x^2}2< \log(1+x) < x$ for $x > 0$. This gives
$$\frac{1}{\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{10i+d} - \frac{1}{10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{2(10i+d)^2} < F_k < \frac{1}{\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{10i+d}.$$The error term $\frac{1}{10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{2(10i+d)^2}$ is extremely small. Indeed, the sloppy bounds
$$\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{(10i+d)^2} \leq \sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{(10^{k-1})^2} \leq \frac{10^{k-1}}{(10^{k-1})^2} \to 0$$show that it vanishes rapidly as $k \to \infty$. Thus $\lim_k F_k = \lim_k \frac{1}{\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{10i+d}$. To handle the new limit, we write the sum as
$$\frac{1}{\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{10i+d} = \frac{1}{10\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{i+d/10}$$and use the standard integral bound
$$\int_{i}^{i+1} \frac{1}{x+d/10}\,dx \leq \frac{1}{i+d/10} \leq \int_{i-1}^{i} \frac{1}{x+d/10}\,dx$$to find that
$$\frac{1}{10\log 10}\int_{10^{k-2}}^{10^{k-1}} \frac{1}{x+d/10}\,dx \leq \frac{1}{10\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{i+d/10} \leq \frac{1}{10\log 10}\int_{10^{k-2}-1}^{10^{k-1}-1} \frac{1}{x+d/10}\,dx,$$or
$$\frac{1}{10\log 10} \log\left(\frac{d/10 + 10^{k-1}}{d/10 + 10^{k-2}}\right) \leq \frac{1}{10\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{i+d/10} \leq \frac{1}{10\log 10} \log\left(\frac{d/10 + 10^{k-1}-1}{d/10 + 10^{k-2}-1}\right).$$But it's clear that
$$\lim_{k \to \infty} \log\left(\frac{d/10 + 10^{k-1}}{d/10 + 10^{k-2}}\right) = \lim_{k \to \infty} \log\left(\frac{d/10 + 10^{k-1}-1}{d/10 + 10^{k-2}-1}\right) = \log 10,$$hence
$$\lim_{k \to \infty} F_k = \lim_k \frac{1}{10\log 10}\sum_{i=10^{k-2}}^{10^{k-1}-1} \frac{1}{i+d/10} = \frac{1}{10 \log 10} \log 10 = \frac{1}{10},$$as needed. $\square$

One Last Thing...

Throughout the whole post, we were talking about powers of two. But is two actually important here? Absolutely not! As long as $b$ is not a power of $10$, this entire post still holds for powers of $b$!

So, for instance, the first digits of $9001420^n$ satisfy Benford's Law, and the $k$th digits of the powers of $9001420$ approach a uniform distribution as $k \to \infty$.
This post has been edited 4 times. Last edited by greenturtle3141, Dec 5, 2023, 10:05 PM

Comment

0 Comments

Turtle math!

avatar

greenturtle3141
Archives
+ October 2024
Shouts
Submit
  • Can you give some thought to dropping a guide to STS? Just like how you presented your research (in your paper), what your essays were about, etc. Also cool blog!

    by Shreyasharma, Mar 13, 2025, 7:03 PM

  • this is so good

    by purpledonutdragon, Mar 4, 2025, 2:05 PM

  • orz usamts grader

    by Lhaj3, Jan 23, 2025, 7:43 PM

  • Entertaining blog

    by eduD_looC, Dec 31, 2024, 8:57 PM

  • wow really cool stuff

    by kingu, Dec 4, 2024, 1:02 AM

  • Although I had a decent college essay, this isn't really my specialty so I don't really have anything useful to say that isn't already available online.

    by greenturtle3141, Nov 3, 2024, 7:25 PM

  • Could you also make a blog post about college essay writing :skull:

    by Shreyasharma, Nov 2, 2024, 9:04 PM

  • what gold

    by peace09, Oct 15, 2024, 3:39 PM

  • oh lmao, i was confused because of the title initially. thanks! great read

    by OlympusHero, Jul 20, 2024, 5:00 AM

  • It should be under August 2023

    by greenturtle3141, Jul 11, 2024, 11:44 PM

  • does this blog still have the post about your math journey? for some reason i can't find it

    by OlympusHero, Jul 10, 2024, 5:41 PM

  • imagine not tortoise math

    no but seriously really interesting blog

    by fruitmonster97, Apr 2, 2024, 12:39 AM

  • W blog man

    by s12d34, Jan 24, 2024, 11:37 PM

  • very nice blog greenturtle it is very descriptive and fascinating to pay attention to :-D

    by StarLex1, Jan 3, 2024, 3:12 PM

  • orz blog

    by ryanbear, Dec 6, 2023, 9:23 PM

67 shouts
Tags
About Owner
  • Posts: 3555
  • Joined: Oct 14, 2014
Blog Stats
  • Blog created: Oct 23, 2021
  • Total entries: 54
  • Total visits: 41119
  • Total comments: 126
Search Blog
a