2023 CMO Problems/Problem 3

Given a prime number $p \geq 5$ , let $\Omega=\{1,2, \ldots, p\}$ . For any $x, y \in \Omega$ , define: $r(x, y)= \begin{cases}y-x, & \text { if } y \geq x \\ y-x+p, & \text { if } y<x\end{cases}$

For a non-empty subset $A$ of $\Omega$ , define: $f(A)=\sum_{x \in A} \sum_{y \in A}(r(x, y))^2$

A subset $A$ of $\Omega$ is called a "good subset" if $0<|A|<p$ and for any subset $B$ of $\Omega$ with $|B|=|A|$ , it holds that $f(B) \geq f(A)$ . Find the largest positive integer $L$ such that there exist $L$ pairwise distinct good subsets $A_1, A_2, \ldots, A_L$ of $\Omega$ satisfying $A_1 \subseteq A_2 \subseteq \cdots \subseteq A_L$

Solution 1

Given a prime $p$ not less than 5 , define the directed distance between two points $x, y \in$ $\Omega:=\{1,2, \ldots, p\}$ as $D(x, y):=(y-x \bmod p)$ , taking the smallest non-negative remainder. For any non-empty proper subset $A$ of $\Omega$ , define $\mu(A):=\sum_{x \in A} \sum_{y \in A}(D(x, y))^2 .$

We call a non-empty proper subset $A$ a "good set" if and only if $\forall B:(B \subseteq \Omega, \# B=\# A) \rightarrow \mu(B) \geq \mu(A) .$ (where the \# symbol denotes the number of elements). Find the largest positive integer $l$ such that there exists a chain of good sets $A_1 \subsetneq A_2 \subsetneq \cdots \subsetneq A_l .$

Related Problems/Lemma 1: Given positive integers $m$ and $n$ , let $a_1, a_2, \ldots, a_m$ be integers summing to $n$ . Find the necessary and sufficient condition for $a_1^2+a_2^2+\cdots+a_m^2$ to attain its minimum possible value. Proof of Lemma 1: Since there are only finitely many ways to choose $a_i$ , there must be a minimum value. The assertion is: For the minimum value, for any $i \neq j, a_i-a_j \leq 1$ . Otherwise, suppose $a_i-a_j \geq 2$ . Then we can replace $a_i$ and $a_j$ with $a_i-1$ and $a_j+1$ , respectively, reducing the objective function, which contradicts minimality. Thus, the assertion holds, and the minimum is achieved when there are $r$ numbers equal to $k+1$ and $(n-r)$ numbers equal to $k$ , where $k$ and $r$ come from the division with remainder: $n=k m+r, \quad 0 \leq r<m$

Lemma 2: For any $1 \leq m \leq p-1$ and any real number $1 \leq x<2$ , the set consisting of the following $m$ elements is a good set: $a_t:=\left\lfloor x+\frac{t p}{m}\right\rfloor, \quad t=0,1,2, \ldots, m-1 .$ (The notation $\lfloor\cdot\rfloor$ denotes the floor function; indices taken modulo $m$ ). Furthermore, if the values of all $a_t$ are taken modulo $p$ to the smallest positive remainder, the conclusion still holds for any real number $x$ .

Proof of Lemma 2: For integers $s: 1 \leq s<m$ , we have $D\left(a_t, a_{t+s}\right)=\left(\left\lfloor x+\frac{(t+s) p}{m}\right\rfloor-\left\lfloor x+\frac{t p}{m}\right\rfloor\right) \bmod p$ , which is at least $\frac{s p}{m}-1$ and at most $\frac{s p}{m}+1$ . Note that $\frac{s p}{m} \notin \mathbb{Z}$ (using that $p$ is prime), meaning there are at most two adjacent values. For a set $B=\left\{b_0, b_1, \ldots, b_{m-1}\right\} \subseteq \Omega$ with $b_j<b_{j+1}$ , we have $\mu(B)=\sum_{s=1}^{m-1} \sum_{t=0}^{m-1} D\left(b_t, b_{t+s}\right)^2$ interpreted modulo $m$ . It is easy to verify that $\sum_{t=0}^{m-1} D\left(b_t, b_{t+s}\right)=s p$ . According to Lemma 1 , the minimum possible value of $\sum_{t=0}^{m-1} D\left(b_t, b_{t+s}\right)^2$ is achieved when $b_t=a_t$ (sufficient but not necessarily necessary in general). Thus, under the constraint of fixed element numbers, these $a_t$ minimize $\mu$ , proving Lemma 2.

Lemma 3: Continuing from Lemma 2's notation, if $B=\left\{b_0, b_1, \ldots, b_{m-1}\right\} \subseteq \Omega$ is a good set with $b_j<b_{j+1}$ , then the multiset $\left\{D\left(b_t, b_{t+s}\right), 0 \leq t<m\right\}=\left\{D\left(a_t, a_{t+s}\right), 0 \leq t<m\right\}$ and both satisfy the necessary and sufficient condition from Lemma 1. Proof of Lemma 3: A good set must satisfy $\mu(A)=\mu(B)=\sum_{s=1}^{m-1} \sum_{t=0}^{m-1} D\left(b_t, b_{t+s}\right)^2 \geq \sum_{s=1}^{m-1} \sum_{t=0}^{m-1} D\left(a_t, a_{t+s}\right)^2=\mu(A),$ thus all $\sum_{t=0}^{m-1} D\left(b_t, b_{t+s}\right)^2=\sum_{t=0}^{m-1} D\left(a_t, a_{t+s}\right)^2$ and by Lemma 1 (necessary and sufficient condition for minimizing the sum of squares), they are equal as multisets.

Lemma 4: The complement of a good set is also a good set.

Proof of Lemma 4: $\mu(A)-\mu(\Omega \backslash A) =\sum_{x \in A} \sum_{y \in A}(D(x, y))^2-\sum_{x \notin A} \sum_{y \notin A}(D(x, y))^2$ $=\sum_{x \in A} \sum_y(D(x, y))^2-\sum_{x \notin A} \sum_y(D(x, y))^2=(2 \# A-p)\left(1^2 \# A-p\right)\left(1^2+\cdots+(p-1)^2\right).$ which only depends on \#A, not the specific elements. Thus, the minimality of $\mu(A)$ is equivalent to the minimality of $\mu(\Omega \backslash A)$ when fixing the number of elements, making both good sets.

Returning to the original problem: First, in the case of non-Mersenne primes (i.e., $p \neq 2^k-1$ ), construct a chain of good sets of length $l=2\left(\left\lfloor\log _2 p\right\rfloor\right)=2\left(\left\lfloor\log _2(p+1)\right\rfloor\right)$ . Choose any $x$ and denote $a_{t, m}:=\left\lfloor x+\frac{t p}{m}\right\rfloor \bmod p, \quad t=0,1,2, \ldots, m-1$ taking the smallest positive remainder. According to Lemma 2, for each given $m$ , the above $a_{t, m}$ forms a good set denoted as $S_m$ . Construct a chain of good sets of length $\frac{l}{2}$ (choosing the same $x$ ): $A_1=S_1, A_2=S_2, A_3=S_4, \ldots, A_{\left[\log _2 p\right]}=S_{2^{\left[\log _2 p\right]-1}} .$

Their containment is quite obvious: in fact, $a_{t, m}=a_{2 t, 2 m}$ . The key is how to connect the good set of $2^{\left[\log _2 p\right]-1}$ elements to the good set of $p-$ $2^{\left[\log _2 p\right]-1}$ elements. In fact, let $B:=S_{2\left(\left[\log _2 p\right]-1\right)+1} \bmod p($ taking the smallest positive remainder). It is easy to verify $B \cap A_{1+\left[\log _2 p\right]}=\emptyset$ . Obviously, $B$ is also a good set, so $\Omega-$ $B$ is also a good set with $p-2^{\left[\log _2 p\right]-1}$ elements and contains $A_{1+\left[\log _2 p\right]}$ . We only need to let $A_{\left[\log _2 p\right]+j}:=\Omega \backslash\left(A_{\left[\log _2 p\right]+1-j}+1 \bmod p\right), \quad j=1,2, \ldots,\left\lfloor\log _2 p\right\rfloor,$ to complete the construction of the chain of good sets of length $l$ . For Mersenne primes (i.e., $p=2^k-1$ ), we need to construct a chain of good sets of length $l=2 k$ . Choose any $x$ , and according to Lemma 2 , the good set of elements $2^{k-1}=\frac{p+1}{2}$ is denoted as $S_{2^{k-1}}$ as $A_{k+1}$ . Let $A_k=\Omega-\left(A_{k+1}+1 \bmod p\right)$ . Note that the intervals between elements of $A_{k+1}$ are 1 and all 2 , so $A_k \subseteq A_{k+1}$ and contains one less element. We alternately split the elements of $A_{k+1}$ into two parts: $\left\{\left\lfloor x+\frac{t p}{2^{k-1}}\right\rfloor \bmod p\right\}=\left\{\left\lfloor x+\frac{t p}{2^{k-2}}\right\rfloor \bmod p\right\} \cup\left\{\left\lfloor x+\frac{1}{2^{k-1}}+\frac{t p}{2^{k-2}}\right\rfloor \bmod p\right\}$ which are disjoint and both good sets (Lemma 2), with one part contained in $A_k$ (corresponding to $y=x$ or $y=x+\frac{1}{2^{k-1}}$ ). Define them as $A_{k-1}$ and let $A_j:=S_{2^{j-1}}(y)=\left\{\left\lfloor y+\frac{t p}{2^{j-1}}\right\rfloor \bmod p\right\}, \quad j=1,2, \ldots, k-1,$ and then let $A_{2 k+1-j}:=\Omega-\left(A_j+1 \bmod p\right), \quad j=1,2, \ldots, k-1,$ which are all good sets by Lemma 4, thus constructing a chain of good sets of length $l=$ $2 k$ .

Now prove the optimality of $l=2\left(\left\lfloor\log _2(p+1)\right\rfloor\right)$ . We attempt to estimate the gap: if $1 \leq a<b<\frac{p}{3}$ , and $A \subseteq B$ are both good sets with $\# A=a$ and $\# B=b$ , we will prove $b \geq 2 a$ . We use contradiction. Suppose $b<2 a$ . Consider the interval composition of elements of $A$ (in the $\bmod p$ ring), assuming they are some number $t(\geq 4)$ and some number $t+1$ . Since the number of new points in $B$ (i.e., $B \backslash A$ ) is fewer than in $A$ , they cannot exist between each pair of elements in $A$ , so the interval composition of elements in $B$ must include $t$ or $t+1$ . If the interval composition of $B$ includes $t+1$ , then splitting the interval in $A$ will result in an interval less than $t$ , contradicting $B$ being a good set (Lemma 1). Thus, the interval composition of elements in $B$ can only be $t$ and $t-1 \geq 3$ (note $b<\frac{p}{3}$ ). But splitting an interval of $t+1$ in $A$ results in intervals still within $t$ and $t-1$ , so $t=3$ , contradiction.

If $1 \leq a<b<\frac{p}{2}$ , and $A \subseteq B$ are both good sets with $\# A=a$ and $\# B=b$ but $b<2 a$ , similarly to the above reasoning, we infer: the interval composition of elements in $A$ must be 4 or 3 , and the interval composition of elements in $B$ must be 3 or 2 . The elements of $B \backslash A$ must all divide the intervals of length 4 in $A$ .

We further assert that $B$ (or $A$ ) does not contain two adjacent intervals of length 3. Otherwise, the "sum of lengths of two adjacent segments" in $B$ will simultaneously include 4, 5, and 6, contradicting the characterization of good sets by Lemma 3.

We further assert: the number of intervals of length 3 in $B$ (or $A$ ) must be 1 . Otherwise, let the number of intervals of length 3 be $k$ , and the number of intervals of length 4 between any two elements in $A$ be positive integers $c_1, c_2, \ldots, c_k$ (they split into twice as many intervals of length 2 in $B$ ), satisfying $3 k+4\left(c_1+c_2+\cdots+c_k\right)=p .$

Clearly, these $c_i$ are not all equal (since $p$ is prime). Let $c=\min c_i$ , consider the "sum of lengths of $2 c+2$ adjacent segments" in $B$ , which can span over $2 c$ intervals of length 2 and extend by one interval of length 3 on each side (note that some $c_i=c$ ), resulting in a total length of $4 c+6$ . But since not all $c_i$ are equal, at least one $c_i \geq c+1$ , so there exists $2 c+2$ adjacent intervals of length 2 in $B$ , with a total length of $4 c+4$ . This contradicts the characterization of good sets by Lemma 3.

That is, "in most cases", if $1 \leq a<b<\frac{p}{2}$ , and $A \subseteq B$ are both good sets with \#A $=a$ and \#B=b, then $b \geq 2 a$ . The only exceptional case is $p=4 t-1, a=t$ , and $b=2 t-1$ .

Within the range of element numbers not exceeding $\frac{p}{2}$ , the number of elements in each $A_k$ in the chain of good sets is $\{1\},\{2,3\},\{4,5,6,7\}, \ldots,\left\{2^{\left[\log _2 p\right]-1}, \ldots, \frac{p-1}{2}\right\} .$

When $p$ is a non-Mersenne prime, at most one number can appear in each of these sets; for Mersenne primes, the last set may contain at most 2 , while the others can contain only 1. This upper chain has a length of $\left[\log _2(p+1)\right]$ .

By Lemma 4, within the range of element numbers exceeding $\frac{p}{2}$ , the upper limit of the length of the chain of good sets is also $\left[\log _2(p+1)\right]$ , proving the overall upper limit.

~xiaohuangya|szm

2023 CMO(CHINA) (Problems • Resources)
Preceded by Problem 2	Followed by Problem 4
1 • 2 • 3 • 4 • 5 • 6
All CMO(CHINA) Problems and Solutions

Page

Toolbox

Search

2023 CMO Problems/Problem 3

Solution 1

See Also