Mathcamp 2021 Qualifying Quiz
Go back to the Math Jam ArchiveJoin the Canada/USA Mathcamp staff for this exciting Math Jam, where they discuss solutions to the 2021 Mathcamp Qualifying Quiz! Mathcamp is an intensive 5-week-long summer program for mathematically talented high school students. The Qualifying Quiz is the centerpiece of the Mathcamp application each year, and the problems are designed to be fun, challenging, and thought-provoking. We invite anybody who's interested to come to this event, whether you applied to Mathcamp this year or just want to talk about some cool problems with Mathcamp instructors. It'll be even more fun for you if you bring your own solutions so you can participate in the discussion during the Jam!
Copyright © 2024 AoPS Incorporated. This page is copyrighted material. You can view and print this page for your own use, but you cannot share the contents of this file with others.
Facilitator: MiraBernstein
Welcome to the Canada/USA Mathcamp Qualifying Quiz Math Jam! The Math Jam will begin at 7:00 pm ET (4:00 pm PT).
If you are enrolled in a course and trying to attend it, then you should leave the classroom now, click Classroom in the Online School dropdown, and then choose the appropriate link on the next page.
The classroom is moderated: students can type into the classroom, but only the moderators can choose a comment to drop into the classroom. So, when you send a message, it will not appear immediately, and may not appear at all.
Hello and welcome to the Canada/USA Mathcamp Qualifying Quiz Math Jam!
Before I introduce our guests, let me briefly explain how our online classroom works.
This room is moderated, which means that all your questions and comments come to the moderators. We may share your comments with the whole room if we so choose.
Canada/USA Mathcamp is an intensive five-week-long summer program for high-school students interested in mathematics, designed to expose students to the beauty of advanced mathematical ideas and to new ways of thinking. You can learn more about Canada/USA Mathcamp here: www.mathcamp.org
Many AoPS instructors, assistants, and students are alumni of this outstanding program!
Tonight we'll have Marisa Debowsky leading our discussion!
Marisa Debowsky (MarisaD) is the Executive Director of Mathcamp. She's been teaching Topological Graph Theory and singing pop songs at Mathcamp every summer since 2006.
And now, I'll hand it off to Marisa!
Hi, everybody, and welcome to the annual Mathcamp Qualifying Quiz Jam!
A big thanks as always to @AngelBell, @vapodaca, @rrusczyk, and the AoPS team for hosting us.
My Mathcamp colleagues and I are here to talk about the Mathcamp 2021 QQ. To follow along, you should all have the quiz open in another window: https://www.mathcamp.org/qualifying_quiz/current_quiz/
The Quiz problems are written by Mathcamp alumni, staff, and friends each year, and the solutions we'll be walking through today are a collaboration by lots of Mathcamp staff (with good ideas from the applicants, too!).
If you applied this year, I strongly recommend having your own solutions open. That way, you can reply more quickly to the questions we ask of the room. And we're expecting you all to pitch in to the solutions!
Students can use LaTeX in this classroom, just like on the message boards. Specifically, place your math LaTeX code inside dollar signs. For example, type:
We claim that \$\frac{1}{2} + \frac{1}{3} = \frac{5}{6}\$.
This should give you:
We claim that $\frac{1}{2} +\frac{1}{3} = \frac{5}{6}$.
Also, as @AngelBell and @vapodaca pointed out: this chat room is moderated. That means your messages go only to us, and we will choose which to pass on, so please don't be shy to contribute and/or ask questions about the problems at any time (and we'll do our best to answer).
Today, we'll just be talking about the Quiz. If you have questions about Mathcamp itself, or the application process, you'll find lots of info on our website (e.g., at http://mathcamp.org/gettoknowmathcamp/), or check out the AoPS Jam about the program and the application process from a few months ago: https://artofproblemsolving.com/school/mathjams-transcripts?id=572
If we don't end up getting to your questions, feel free to post them on the Mathcamp forum on AoPS: http://www.artofproblemsolving.com/community/c135
We've got a lot to cover in the next two hours, so let's get started! I'll turn it over to Assaf to kick us off with Problem #1.
Hello there! I’m Assaf, and I’ll be presenting problem 1. This problem has to do with the circle of lies as described below:
In other words, mentors always lie to campers, campers always lie to JCs, and JCs always lie to mentors. All other interactions are truthful.
Fun fact: all of the people mentioned in this problem have actually been mentors, campers, and JCs. Some have even lied to me.
Part (a): Sachi and Yuval have a conversation. Sachi says to Yuval, "At least one of us is a mentor." Yuval replies to Sachi, "At least one of us is a camper." What can you deduce about Yuval?
There’s various approaches to solving this problem. We can use a truth table, or analyze all possible role pairs for Yuval and Sachi, or analyze whether Yuval or Sachi are lying. I’ll show the latter.
First, notice that it’s impossible for both Yuval and Sachi to be lying.
Sachi is lying: If Sachi is lying, then neither Yuval nor Sachi are mentors. What does this mean? *consults circle of lies*
Yuval is a JC
sachi is a camper and yuvi is a jc
Sachi is camper, Yuval is JC
If Sachi is lying, then Sachi must be a camper and Yuval must be a JC
Sachi is a camper and Yuval is a JC
Sachi is a camper and Yuval is a JC
yuval is lying: if yuval is lying, then neither yuval nor sachi are campers. what does the circle of lies tell us?
Yuval is a JC and Sachi is a Mentor
Yuval is a JC and Sachi is a mentor
Yuval is JC and Sachi is a mentor
Yuval is a JC and Sachi is a mentor
Yuval is a JC and Sachi is a mentor
Yuval is a JC, Sachi is a mentor
If Yuval is lying then Sachi must be a mentor and Yuval must be a JC
Neither Yuval nor Sachi are lying: In this case, one of them is a mentor, and one is a camper, but then one would lie to the other! This can’t happen
In all possible cases, Yuval is a JC
Part (b): You are a camper and you meet someone named Miranda at camp. What question can you ask Miranda about herself to determine whether she is a JC?
Some answers were of the form: I, a camper, will say: “Your name is Miranda”, and if the universe will implode in a paradox if Miranda is a JC.
We did not accept answers that imploded the universe in a paradox.
What happens if you asked Miranda “Are you a JC?”
JC would say yes if they are a JC, but yes if they are a mentor as well
So that’s no good - it detects campers instead of JCs. What question would detect JCs?
If we ask: “Are you a camper?” only a JC would say “no”. We can also ask “are you staff,” and only a JC would say “yes.”
Part (c) Don and Viv are having a conversation as you pass by. Is there a single statement you can overhear from that conversation from which you can deduce that Don and Viv are both in the same group of people (both campers, or both JCs, or both mentors) without giving you any additional information about which group it is?
We’d like to find a single statement that essentially says: “we are in the same group”. What distinguishes people in the same group?
they trust each other
do not lie to each other
They wont lie to each other
they both tell truth to each other
They tell the truth both ways
They bot tell the truth.
The two people tell the truth to each other
they tell the truth to each other
They don't lie to each other.
they tell each other the truth
They tell the truth to each other
there's no lying between the two of them
they don't lie to eachother
They don't lie to each other.
they'll both say the truth
They both will lie to a certain group of people
they say the truth to eachother
They both tell the truth.
Two people are in the same group if and only if they both tell the truth to each other. Can anyone in a lying pair say that both of them tell the truth to each other?
A lot of people have the right idea here!
Without loss of generality, assume that Viv says: “you are not lying to me.” If Viv was lying, then Don must be lying. However, by the circle of lies, Don must be telling the truth, a contradiction.
If Viv was telling the truth, then Don must also be telling the truth.
So if we overhear “you are not lying to me,” we know that Viv and Don are in the same group.
A super-short solution can be: “I agree” because if we overhear that, it’s equivalent to overhearing “you are not lying to me.”
Thanks for coming by! Enjoy problem 2!
Hi all, I'm Kevin, and I'll be presenting Problem 2, which involves a lot of probability!
Since the full problem with all of its parts is quite long, we'll keep only parts of the problem stickied at any given time. So let's dive in:
Problem 2 Statement:
Kai has a collection of magic, shape-changing dice. When he rolls a blue die, it changes shape to have a number of sides equal to the number rolled minus one. For example, if a blue die starts out with 12 sides, then Kai has an equal probability of rolling any of the twelve integers from 1 through 12. If he rolls an 8, then the blue die changes shape to have 7 sides, and the next roll will have an equal probability of rolling any of the seven integers from 1 through 7. After rolling the same blue die over and over, eventually Kai will roll a 1, at which point the die will disappear.
Part a: Suppose the blue die starts with 4 sides (and so it has an equal probability of rolling any of the four integers from 1 through 4). How many rolls, on average, will it take for the die to disappear?
Part a (and b, coming up soon), are warm-ups with explicit numbers rather than variables, though they can still be tricky if probability is somewhat new to you, and then c, d, and e will involve proofs.
Time for some classic expected value
Indeed! "Expected value" is a fancier way of saying "average" for our purposes, since we don't necessarily expect people working on the quiz to have a background in probability
The pace that we move through this problem will accelerate as we go through all the parts, but the transcript will be available online shortly after this Math Jam if you'd like more time to puzzle over some of the trickier details.
Here in Part a, Kai's rolling a 4-sided die, which means there are four outcomes: he rolls 1, 2, 3, or 4. If he rolls a 1, then we're done after 1 roll; for any other roll, we'll need to know how many rolls it takes for the die to disappear starting from 1, 2, or 3 sides. So let's build this up step by step!
If Kai rolls a 1-sided die: The die always disappears after 1 roll, so the average number of rolls with a 1-sided die is 1.
What about a 2-sided die? How many rolls on average before it disappears?
I'm seeing a bunch of answers between 1 and 2 -- 1 and 2 are both possibilities, so the average number should be between them.
1.5
$\frac{3}{2}$
1.5
$\frac{3}{2}$
1.5
3/2
1.5
1.5
We'll write this out carefully:
2-sided die: There's a 50% chance to roll a 1, meaning the die disappears after 1 roll. This contributes 1/2*1 to the average number of rolls. There's a 50% chance to roll a 2, meaning the die becomes 1-sided, and it disappears after one more roll. This contributes 1/2*(1+1) to the average. In total, then, the average number of rolls is 3/2.
What about a 3-sided die?
(I might not give you quite enough time to compute this from scratch, but hopefully you have a chance to think about it a bit!)
I'm seeing a lot of 2s -- and it's very close to 2, but not quite.
11/6
1 + (0 + 1 + 3/2)/3
\frac{11}{6}
$\frac{11}{6}$
3-sided die: Let's list this out quickly:
* 1/3 chance to roll a 1, and the die disappears. This contributes 1/3*1 to the average.
* 1/3 chance to roll a 2, and the die becomes 1-sided, which we've computed. This contributes 1/3*(1+1).
* 1/3 chance to roll a 3, and the die becomes 2-sided, which we've computed. This contributes 1/3*(1+3/2), where the 3/2 is the average number of rolls for a 2-sided die to disappear.
Adding up all these contributions, we get that the average number of rolls is 11/6.
(The way w_aops wrote it out is very nice: it was the first roll, then a 1/3 chance each of 0, 1, or 3/2 further rolls!)
Okay, now let's actually do part 4. In the interest of time, I'll write it out quickly:
(Though some of you are beating me to it!)
$\frac{25}{12}$
25/12
4-sided die: Computing as before, rolling a 1 contributes 1/4*1, rolling a 2 contributes 1/4*(1+1), rolling a 3 contributes 1/4*(1+3/2), and rolling a 4 contributes 1/4*(1+11/6). Adding them up, we get a final answer of 25/12 rolls.
1+(0+1+3/2+11/6)/4
This is very similar to how we wrote it out, and the analog of what w_aops wrote for 3-sided dice!
If you play with these numbers a bit, you might discover that $25/12 = 1 + 1/2 + 1/3 + 1/4$. Sums of reciprocals like this are called harmonic numbers, and there are several ways to prove that the $N$th harmonic number gives the average number of rolls before an $N$-sided die disappears. Play around with this more later if it seems interesting, but let's keep moving for now!
So, that's one of our warmups in Part a; now we're going to have Kai start playing games with Helena.
Now Kai is playing a game with Helena. Kai rolls a blue die first (causing it to change shape), then Helena rolls the same blue die (which changes shape again), and then they continue to take turns rolling the same die until one of them rolls a 1. Whoever rolls the 1 loses.
Part b: Suppose the blue die starts with 4 sides. What is the probability that Kai wins?
Part c: Suppose the blue die starts with $N$ sides. What is the probability that Kai wins?
Let's define $p(N)$ to be the probability that the first player wins the game with an $N$-sided die.
Note that I said "first player," not "Kai," because if Kai rolls, say, a 4 on his first roll, then Helena now is acting as if she's the first player in a game with a 3-sided die, so Helena wins with probability $p(3)$ from here, and hence Kai wins this specific case with probability $1 - p(3)$.
Okay, let's now try to compute $p(4)$, which is Part b. Like with Part a, let's build this up from smaller values:
What's $p(1)$? In other words, what's the chance Kai wins a game with a 1-sided die?
0
0!
0
0% I mean
0
0
0
(I'm assuming the "0!" is enthusiasm, not factorial )
And yes:
* $p(1) = 0$, as Kai is guaranteed to lose immediately with a 1-sided die.
What about $p(2)$?
1/2
50%
$\frac{1}{2}$
$\frac{1}{2}$
$\tfrac{1}{2}$
50
1/2
1/2 either Kai rolls a 1 or gives Helena P(1)
Great explanation!
* $p(2) = 1/2$, as there's a 50% chance Kai rolls a 1 and loses immediately, and a 50% chance Kai rolls a 2 and hands Helena a 1-sided die, causing her to lose immediately.
$p(3)$? (getting trickier...)
I'm seeing some interesting guesses: remember that even if Kai doesn't immediately lose (by rolling a 1), that doesn't necessarily mean he wins
1/2
1/2?
Also $\frac{1}{2}$
1/2 again
I'll write out the full logic:
* For $p(3)$, there's a 1/3 chance of rolling a 3, handing Helena a 2-sided die, where we already know each player has a 1/2 chance of winning. Then there's a 1/3 chance of rolling a 2, causing Helena to lose instantly, and a 1/3 chance of rolling a 1, causing Kai to lose instantly. Thus $p(3) = 1/3*(1/2 + 1 + 0) = 1/2$.
$1/3*(0+1/2+1)$
1/3*0+1/3*1+1/3*1/2
You may not send the same message twice in a row
Uh oh, that might be an issue with this problem
is it the case that it is 1/2 with any die?
Hmm, things are getting suspicious, aren't they?
Indeed, we can do a similar calculation for $p(4)$, and we'll get 1/2
1/4(0+1+1/2+1/2)
It's always(except for 1) $\frac{1}{2}$.
$\frac{1}{2}$ wao fun game
Fair games are fun! Maybe.
So we're starting to suspect that, other than $N=1$, it's always a completely fair game: 1/2 chance for Kai to win, 1/2 for Helena.
Induction to prove for all die?
This was the most popular strategy in the applications, but we'll do some induction later, so we're going to present a different argument.
Often in math, if you want to prove that two sets have the same size, you can do this by constructing a bijection between them.
what is a bijection
. What's a bijection
Excellent question!
1-1 correspondence
Yes: a bijection is a way to have a 1 to 1 correspondence between the two sets, so that everything in each set corresponds to exactly one object in the other.
When working with probability, we can similarly use a bijection to prove that something happens 1/2 the time, but we have to be careful: we have to make sure that our bijection always takes one successful outcome to an equally likely unsuccessful outcome.
So, let's construct a bijection between possible sequences of die rolls where Kai wins and ones where Helena wins, making sure that our bijection preserves probabilities. Any ideas?
So I'm seeing some ideas about rolling a 1 vs rolling a 2, and that's a key idea here.
rolling 1 vs rolling a 2
That's exactly what I just said
(But gordonhero did say it just before me, I promise)
If, say, Kai rolls a 2 at any point, then Helena automatically loses. However, Kai is just as likely to roll a 1 at that point instead of a 2, causing him to lose.
This gives us a pair of equally likely games, one of which Kai wins, and one of which Helena wins.
Similarly, if Helena rolls a 2, this is just as likely as her rolling a 1 at that point, and we get a similar pair of equally likely games.
So, that's our bijection: every game that Kai wins is in bijection to an equally likely game where Helena wins, simply by inserting or removing a roll of 2.
The existence of this bijection immediately proves that $p(N) = 1/2$ for $N \ge 2$!
This kind of argument can be a little tricky if you haven't seen something like it before, so I encourage you to think about it more if you need to.
and if they roll another number n it can simply be considered as a new game with n-1 sides?
So we're thinking about entire games here: for example, if Kai rolls 5, then Helena 3, then Kai 1, that sequence of rolls has exactly the same probability of occurring as if Kai rolled 5, then Helena 3, then Kai 2, then Helena 1, but now the winner has swapped.
You may have some pondering to do, but since we have a lot to cover, we're going to move onwards to some fancier green dice!
Kai also has green dice, which work slightly differently from blue dice. When he rolls a green die, it changes shape to have a number of sides equal to the number rolled (so if Kai rolls the maximum possible value, the green die doesn't change shape at all).
Part d: He plays the same game with Helena, but now with an $N$-sided green die; the first player to roll a 1 still loses. What is the probability that Kai wins?
So: green dice are very similar to blue dice, but now the number of sides doesn't automatically decrease.
It's similar to the last game, so we might expect the probability to be close to 1/2, but it's unclear if it'll be exactly 1/2, like it was before.
So let's look for a pattern!
Keeping $p(N)$ as the probability that the first player wins, what's $p(2)$?
(Already this is a little trickier to compute!)
$\frac{1}{3}$
1/3
1/3
We can think of it this way: Kai has a 50% chance of losing immediately, and a 50% chance of handing Helena a 2-sided die, where Helena has by definition a $p(2)$ chance of winning, and thus Kai has a $1-p(2)$ chance of winning.
Putting it together, $p(2) = 1/2*(0 + 1-p(2))$, or $3/2 p(2) = 1/2$, or $p(2) = 1/3$.
(Sorry for the ugly latex *, my bad!)
In the interest of time, we won't explicitly compute more values, but we'd get $p(3) = 5/12$ and $p(4) = 9/20$. Notice any pattern?
All slightly less than 1/2
denominator is n(n-1)
Close to $\frac{1}{2}$
always a unit smaller than a half?
Closer and closer to 2. Maybe $\frac{1}{2}-\frac{1}{(n)(n+1)}$?
Aha! An explicit conjecture!
And indeed, bronzetruck2016's formula works with all our examples so far.
Let's think about Kai's first roll based on whether he rolls $N$ or not:
isn't it similar to the previous problem with an extra case?
It is, but that extra case -- where we can repeat the number that was just rolled -- makes it pretty tricky!
Okay, back to Kai's first roll:
* Kai rolls $N$, which occurs $1/N$ of the time: We're now playing exactly the same game (since the die still has $N$ sides), but now Helena goes first, so Kai wins with probability $(1 - p(N))$.
* Kai rolls something else, which occurs $(N-1)/N$ of the time: In this case, Kai is equally likely to roll each of the numbers 1 through $N-1$. But we have notation for Kai's win chance when he's equally likely to roll $1$ through $N-1$: it's simply $p(N-1)$. So $p(N-1)$ is the chance that Kai wins in this case.
Putting these together, the total probability that Kai wins is
$$
p(N) = \frac{1}{N} (1 - p(N)) + \frac{N-1}{N} p(N-1)
$$
This is great, as we can now solve for $p(N)$ in terms of $p(N-1)$!
$$
p(N) = \frac{1}{N+1} + \frac{N-1}{N+1} p(N-1)
$$
Recurrence relations like this are an excellent place to use induction to prove that it works for all possible values of $N$. If you're unfamiliar with induction, we won't be able to do a full introduction here, but the strategy is to prove a "base case" (generally the smallest value of $N$ we need to show), then prove that if it works for $N-1$, it works for $N$.
What's our base case?
N=1
$N=1$
n=2
Either $N=1$ or $N=2$ is fine! (As long as you've shown it for both $N=1$ and $N=2$, it doesn't really matter which you choose as your base case.)
for $N=1$, we have $p(1) = 0$ (since Kai immediately loses), which does agree with bronzetruck2016's conjectured formula (which I've pinned)
for $N=2$, we computed $p(2) = 1/3$, which also agrees with the formula.
Now we must carry out the inductive step: assuming our conjecture is true for $p(N-1)$, show that it's true for $p(N)$.
So let's plug in $p(N-1) = \frac{1}{2} - \frac{1}{(N-1)*N}$ into our recurrence!
$$
p(N) = \frac{1}{N+1} + \frac{N-1}{N+1} \left(\frac{1}{2} - \frac{1}{(N-1)N}\right)
$$
why $*$
I keep being bad at latex -- I just mean multiplication
We've plugged in $p(N-1)$ and now we can simplify:
$$
p(N) = \frac{N+1}{2(N+1)} - \frac{1}{(N+1)N} = \frac{1}{2} - \frac{1}{N(N+1)}
$$
And that's exactly the formula we wanted to compute for $p(N)$, which completes the inductive step and the proof!
Okay! If induction is unfamiliar to you, that might have been fast and/or mysterious. But we'll skip induction for this last part
Part e: Suppose now that Kai and Helena modify their game: the first person to roll one of the numbers 1, 2, …, $k$ loses. If they start with an $N$-sided green die, what is the probability that Kai wins?
isn't the base case k now..???
Yes! That is the key difference in how the math will play out.
If $p(N)$ is Kai's probability of winning in this new game, the exact same logic that led to our recurrence relation still holds as long as $N > k$, and so the exact same recurrence relation holds for $N > k$!
So indeed, the recurrence relation is the same, but our starting point is now $N = k$, not $N=1$.
So since the smallest instance of our recurrence relation writes $p(k+1)$ in terms of $p(k)$, we'll need to know $p(k)$ to get started. What is it?
0?
0
0%
0
0
Right: in this new game, any roll from 1 through $k$ is an instant loss, so Kai cannot survive the first roll of a $k$-sided die!
Starting with $p(k) = 0$ and using the recurrence for $p(k+1)$, we see $p(k+1) = 1/(k+2)$, and it gets messier from there. With enough experimentation, we could come up with a correct conjecture, but we're going to use a bit of a trick to simplify our recurrence.
In Part d, the answer turned out to be that Kai wins $1/(N(N+1))$ less than a fair game. Inspired by this, let's define $q(N) = 1/2 - p(N)$, so that $q(N)$ measures how far from fair the game is. We can rewrite our recurrence
$$
p(N) = \frac{1}{N} (1 - p(N)) + \frac{N-1}{N} p(N-1)
$$
by substituting $p(N) = 1/2 - q(N)$, and we hope it'll look nicer:
$$
1/2 - q(N) = \frac{1}{N} (1/2 + q(N)) + \frac{N-1}{N} (1/2 - q(N-1))
$$
And indeed, after simplifying:
$$
q(N) = \frac{N-1}{N+1} q(N-1)
$$
This is fantastic, as we can build up values of $q$ via a telescoping product! Working our way from $q(N)$ down to $q(k) = 1/2$ (corresponding to $p(k) = 0$ that we talked about earlier),
(oops, let's try that again)
$$
q(N) = \frac{(N-1)(N-2) \dots (k+1)k}{(N+1)(N) \dots (k+3)(k+2)} q(k)
$$
(whew! better this time )
This is the really fun part where you get to cancel (almost) everything!
And plugging in $q(k) = 1/2$, we end up with
$$
q(N) = \frac{k(k+1)}{N(N+1)} \frac{1}{2}.
$$
Since Kai's win probability is $p(N) = 1/2 - q(N)$, we have now shown that this is
$$
p(N) = \frac{1}{2} - \frac{k(k+1)}{2N(N+1)}
$$
and we're done!!
No induction needed, because our recurrence in terms of $q$ was so nice
So it is $\frac{k(k+1)}{2N(N+1)}$ from a fair game?
Indeed! So it becomes very close to fair as $N$ gets very big
$P(N)=\tfrac{T_n-T_k}{2T_n},$ which is pretty interesting.
Yes! There are a lot of interesting things to think more about
(There are some really nice ways to go through all this in terms of generating functions, if you've seen those before!)
We've had to go quickly, but the transcript will be available online for you to go through the details at a more reasonable pace.
But since we have a very finite amount of time tonight, let's pass things on to Sara for Problem 3!
Hi everyone! I’m Sara and I’ll be presenting Problem 3. Here is the problem statement:
Problem 3: A construction company is designing a new apartment complex. They have an $n \times n$ lot in which each $ 1 \times 1 $ square can either occupy an apartment or a garden.
The apartments must all have a scenic view: if an apartment is not on the edge of the apartment complex, then one of the $4$ adjacent lots must have a garden.
For reference, consider the first two diagrams below: diagram (i) shows a valid design, while the design in diagram (ii) has one apartment without a scenic view.
Problem 3a: What is the minimal number of gardens if $n = 6$?
Oh I remember this question, it was my favorite one
Mine too! That's why I'm happy to be presenting it
I'm seeing some correct answers already—let's go through the solution together.
Let's draw a $6 \times 6$ grid:
Which squares automatically have scenic views?
outside edge
the sides
ones on the border
the border
the outer squares
the ones on the border
the outline
Right! The squares on the outer edge already have scenic views, so we may as well place happy apartments there:
How many gardens do we need to satisfy the apartments in the inner $4 \times 4$ square? (Some of you beat me to this question.)
4
4
4
4
Center 4
4
4
4
4
4
4
4
Yep! Two ways we can do this are:
(For an extra challenge, think about why these are the only two ways to minimally satisfy a $4 \times 4$ grid.)
So, we've shown that the minimum is at most $4$. A common mistake students made was not showing that the minimum is exactly $4$. Specifically—how can we be sure that $4$ gardens are required? Why can't we satisfy all apartments with $3$ gardens?
One for each of the corners!
One way to show this is by looking at the corners of the $4 \times 4$ square.
Can one garden satisfy more than one corner?
no
noo
No!
No
No
no
no
Right! No garden can satisfy more than one corner, because any two corners have at least two squares between them, and any two squares satisfied by a garden have at most one square between them.
So, we've proven that the minimal number of gardens needed for a $6 \times 6$ apartment complex is $4$.
Do part B first and apply it to part A
Another way to show it is to just use part (b), which we'll prove now.
Problem 3b: Prove that the number of gardens must be at least $\frac{1}{5} (n-2)^2$.
We can start by drawing a $n \times n$ grid:
Let's start by looking at the $(n-2)^2$ term. What might it represent in terms of this problem?
the inner square
The central $n-2$ by $n-2$ grid.
the number of non-boundary spaces to be covered
the squares not on the borders
the number of squares in the non-border lots
all the squares without the bordering squares
The squares excluding the borders
Right. As in part (a), the outer squares already have a scenic view. So it suffices to place gardens to satisfy the inner squares—and there are $(n-2)^2$ of them.
Now, where could the $\frac{1}{5}$ come from?
each garden makes at most 5 squares happy
the center of a "cross"
There are multiple houses that get a scenic view from one garden
Each garden "takes care of" at most 5 squares.
4 spaces for 1 apartment 1/5
The number of lots a garden provides scenes to (5)
when you place a tree it can at most take care of itself and the edges
The garden satisfies 5 spaces, including the square it is in
Yes. Each garden satisfies at most $4$ apartments. So, for every $4$ apartments we need at least $1$ garden. So at least $\frac{1}{5}$ of the squares in the inside must be gardens.
This proves that any valid $n \times n$ apartment complex must have at least $\tfrac{1}{5} (n-2)^2$ gardens.
On to part c!
Problem 3c: Prove that it is possible to construct an apartment complex with no more than $\frac{1}{5} n^2$ gardens for any n.
This question is asking for a construction. Since we just proved in part (b) that any construction must use at least $\frac{1}{5} (n-2)^2$ gardens, we will have to place the gardens efficiently in order to use only $\frac{1}{5} n^2$ of them.
In particular, we want to avoid situations where a single apartment has a view of multiple gardens, since those gardens are being “wasted”.
For example, here is an inefficient construction which uses approximately $\frac{1}{4} n^2$ gardens. The apartment highlighted green has a view of $2$ gardens, which is more than strictly necessary.
So, how can we lay out the gardens efficiently?
tetrate the crosses.
tesselate the crosses that they affect
Connect them like jigsaw pieces
"knight's move away"
tesselate
Great! We can think about a garden satisfying 4 apartments neighboring it:
To design a valid apartment complex, all we need to do is place these + shapes so that no square is left uncovered. We can do this by tiling the + shapes in the following way:
Awesome! This is super space efficient. Are we done yet?
no
No
not yet
No, it isnt a square
No, some might go off the edge
Not yet... this shape isn't a square!
no
jagged edges
Right! We want to place gardens and apartments into a $n \times n$ square, not a blocky blob. When we turn the above configuration into a square, we encounter two problems.
(1) Some gardens end up outside the square. We need to be sure that the apartments they used to cover are still OK.
(2) Some gardens are no longer covering $4$ apartments, so we can no longer guarantee the $\frac{1}{5}$ fraction.
How do we deal with (1)?
ones they used to cover are on the sides so they are asfe
safe
nothing, they're already on the edge so they have a view
It's okay because all the outer squares have a view
Yep—we don’t have to do anything! Any garden which gets cut off would have only covered apartments on the border of the square. But, apartments on the border are already satisfied. Crisis averted!
Dealing with (2) is a bit trickier. In our above example, when we took a $6 \times 6$ square chunk from the tiling, we ended up with an apartment complex with $8$ gardens. But $\frac{1}{5} 6^2 = 7.2 < 8,$ so we are using too many gardens.
So, we'll need to somehow adjust the tiling so that every apartment in our $n \times n$ grid is covered, and only $\frac{1}{5} n^2$ squares are used. How might we go about this?
Lots of different ideas!
there is a pattern of the spilled over squares
One way to solve this problem is to study the + tiling pattern carefully and look for patterns in how the inefficient red + signs "spill over".
For each possible remainder of $n \mod 5$, a different pattern emerges. In each case, it can be shown that fewer than $\frac{1}{5} n^2$ gardens are needed.
Some of you suggested a different approach which doesn't require casework:
We can translate the gardens and apply the pigeonhole principle.
shift it over 1
Let's go back to the tiling we had of a $6 \times 6$ square.
Label each part of the + sign with a $1,2,3,4,$ or $5$:
Now back to the full picture:
Notice that if we remove the green outlines, there is nothing special about the number $3.$ We can place gardens on any other number and they would satisfy all of the apartments in the grid.
How can we make a construction with fewer than $\tfrac{1}{5} 6^2$ gardens from here?
5 as the center
the gardens should be on 4s
Choose the least frequently occurring number and put gardens in squares with that number
Right! Here, any number except $3$ appears $7$ times, which is less than $\tfrac{1}{5} 6^2 = 7.2.$ So, for example, we could place gardens on all of the $2$s, which would satisfy all apartments using only $7$ gardens.
(Many of you mentioned that the problem with the previous arrangement of gardens was that we were wasting $4$ gardens on corners. Indeed, this problem goes away when we use a different number as the center of the + sign!)
This works because of the way we numbered the squares—no matter which number we place the gardens on, all apartments except those on the boundary will be covered.
Note that it's sometimes helpful to put a garden on the edge, even if it's not being used efficiently—that garden can cover apartments not on the edge.
Now let's analyze the general $n \times n$ case. Let's label the squares of an $n \times n$ grid with the numbers $1, 2, 3, 4, 5$ in the same pattern.
Suppose there are $x_i$ squares labeled $i$ for $1 \leq i \leq 5.$ What is $x_1 + \dots + x_5$?
$x_1 + x_2 + x_3 + x_4 + x_5 = n^2.$
$n^2$
$n^2$
$n^2$
Good, we have $x_1 + \dots + x_5 = n^2.$ Now, let $x_{min}$ be the smallest $x_1, ..., x_5$. What can we say about $x_{min}$?
It is smaller than n^2/5
By the Pigeonhole, the largest possible smallest $x_i$ is $\tfrac{n^2}{5}.$
Yes! There will exist a number which has $\leq \frac15 n^2 $ squares, and we can place the gardens there.
For example, if $2$ appears less than $\frac{1}{5} n^2$ times, then our construction would look like this:
We have a valid construction which uses $\leq \frac15 n^2 $ gardens. We're done with part (c)!
Problem 3d: Now suppose that gardens have very tall trees that provide a scenic view to up to $20$ nearby lots. Specifically, if a garden is placed at the center of a $5 \times 5$ square, then every apartment in that square except the four $1 \times 1$ corners will have a scenic view of the garden.
In other words, a single garden satisfies the following apartments:
What upper and lower bounds can you prove on the number of gardens needed?
So the question is asking for two things: an upper bound and a lower bound. Let's unpack that a little bit.
What does it mean to prove that some number $L$ is a lower bound for this problem?
not possible for there to be less than $L$ gardens
There will be at least $L$ gardens in any arrangement
There is no setting that has less gardens than $L$, so $L$ is the most optimal setting of gardens
the least numer of gardebs needed
We need at least $L$ gardens to make everyone happy!
Right, you need to prove that at least $L$ gardens are required to make it work. This is like what we did in 3b, where we proved you need at least $\frac{1}{(n-2)^2}$ gardens to make it work.
What does it mean to prove that some number $U$ is an upper bound?
In any case, there will be at least one arrangement that uses no more than $U$ gardens.
We can place a max of U number of gardens to make everybody Happy
We cannot possibly do worse than $U$, so $U$ is the least optimal setting of gardens
Right, you need to show that it's possible to do it using only $U$ gardens. This is like what we did in 3c, where we showed that $\frac{1}{5} n^2$ is an upper bound by showing how to construct a pattern with that many gardens.
So part 3d is asking you to do what you did in 3b and 3c, but for this more complicated situation.
Let's start with the most straightforward lower bound, similar to 3b. How many gardens must we use at least, in order to give every apartment in a $n \times n$ grid a scenic view?
(n-2)^2/21
$\frac{1}{21}(n-2)^2$
$\frac{(n-2)^2}{21}$?
$\frac{1}{21}(n-2)^2$
1/21(n-2)^2
(1/21)(n-2)^2
Excellent!
The solution is similar to what we did in part (b)—try and apply similar ideas to get this bound.
Now, let's take a quick look at the upper bound.
To prove an upper bound on the minimum number of gardens needed, we need to provide an explicit construction using as few gardens as possible.
What makes this problem tricky is that the shapes don't perfectly tile, unlike the + signs we worked with in part (c).
No matter how we place the shapes, if we try to get them not to overlap, "gaps" will occur (marked red in the above picture).
Even if we can't make a perfectly efficient construction, we can get pretty close by letting the shapes slightly overlap:
If we use this tiling to place gardens in an $n \times n$ grid, roughly how many gardens do we need? In other words, which upper bound does this idea give us?
$\frac{n^2}{20}$
$\frac{n^2}{20}$
Right! It's possible to do a lot of detail-oriented work to get this bound very close to $\frac{1}{20} n^2.$
It turns out that the nice pigeonhole argument that we had in 3c doesn't quite work any more—I'll leave it as an exercise for you to figure out why. But you can use it to get a construction with $\frac{1}{20} n^2 + 4n - 12$.
So let's recap where we are so far in Problem 3d.
We showed that you need at least $\frac{1}{21}(n-2)^2$ gardens.
We showed that you can definitely do it with $\frac{1}{20} n^2$ plus something linear in $n$.
That's kind of unsatisfying! It's very different from 3b and 3c, where the leading term in both the upper and the lower bound was $\frac{1}{5}n^2$. So in that case, we basically figured out the true minimum number of gardens required, up to some lower order terms.
But here, all we know is that the true minimum is somewhere between $\frac{n^2}{21}$ and $\frac{n^2}{20}$.
There is no easy way to close the gap here, and before Mathcamp staff published this year's Quiz, we also had no clue what the exact growth rate of $(\text{minimum \# gardens})$ should be for this part.
A few students went above and beyond in making progress toward an improved lower bound.
Ethan from Vienna, VA improved the lower bound to roughly $\frac{n^2}{20.5}$.
And Kevin from North Potomac, MD as well as Edwin from Cupertino, CA managed to improve the lower bound to roughly $\frac{n^2}{20}$, matching the upper bound up to lower-order terms!
Their proofs use different techniques ranging from graph theory and convex geometry to show that there's no way to arrange the tiles perfectly so that each one satisfies more than $20.5$ or $20$ of the apartments.
All three of these proofs are extremely impressive and creative! They're also quite complicated, so unfortunately, we don't have time to go through them now.
That's it for this problem! If you enjoyed it, you might have fun playing around with different ways the garden can satisfy nearby apartments. For example:
And now I’ll pass it on to Laura, who’ll talk about the solution to Problem 4.
Hi everyone! I’m Laura, and I’ll be presenting the solution to problem 4. Part (b) will require using some inversive geometry, but we don’t need to know too much except that inversion can turn circles into lines and vice versa. We won’t need that for part (a), though, so let’s start with that.
Problem 4 Statement:
Let $S$ be a set of $8$ points in the plane. Call a circle abundant if it passes through at least $4$ points of $S$.
Part (a): Show that there can be at most $14$ abundant circles.
Problem 4a Solution:
For this part, we can just use a counting argument. We know that a circle can be defined by $3$ points. So how many ways are there to choose $3$ points out of the $8$ points in $S$ to form a circle?
$\binom83=56$
8c3
8c3=56
$\tbinom{8}{3}=56.$
$\binom{8}{3}=56$
8C3 = 56
Right! Since we don’t care about order, there are a total of $\binom{8}{3}=\frac{8\cdot 7 \cdot 6}{3\cdot 2 \cdot 1} = 56$ ways to choose $3$ points from $S$. So there are at most $56$ circles formed by points in $S$.
However, if some of the circles are abundant, they would be counted multiple times by this, since abundant circles have more than $3$ points on them. How many times has each abundant circle been counted?
4
$4$ because 4 points
4
$4$ times
$\tbinom{4}{3}=4.$
4c3=4
4
4
Yes! Since each abundant circle has at least $4$ points on it, there are at least $\binom{4}{3}=4$ ways to choose $3$ points to define an abundant circle. Thus, we’ve counted each abundant circle at least $4$ times.
Since we counted $56$ circles total and each abundant circle has been counted at least $4$ times, there can be at most $\frac{56}{4} = 14$ abundant circles.
Part (b): Can there be $14$ abundant circles? If not, what is the maximum number possible?
To solve this part, we’ll need to know a little about inversion. For the sake of time we won’t go through the proofs here, but let’s review the definition and some key properties of inversion from the handout. You can read through the handout for the proofs.
Inversion is sort of like “reflection across a circle.” It takes everything inside the circle and swaps it with everything outside the circle. Points on the circle are fixed, and the center of the circle maps to the “point at infinity.” Inversion can distort images in weird ways! Here are some examples of pictures and their inverted images:
More formally, the inverse of a point $P$ about a circle with center $O$ and radius $r$ is the point $P’$ on ray $OP$ such that $OP\cdot OP’ = r^2$.
Thus, the closer a point is to the circle, the closer its image will be to the circle. Some more examples of points and their inverted images are shown below.
The key property of inversion we’ll need here is that it can sometimes swap circles with lines. Let’s be a bit more specific about that, though, because there are a few different cases.
Inversion sends “generalized circles” to “generalized circles,” where “generalized circles” include lines, because they are like circles through the point at infinity.
Let’s say we’re inverting about a circle with center $O$. What should happen to a line through $O$?
it doesn't change
Stays the same
it stays the same?
Right! From the definition of inversion, a line through $O$ will be fixed under inversion, so its inverse is the same line through $O$. Every point on the line just goes to another point on the same line.
$O$ Disappears?
Yes! $O$ gets mapped to the "point at infinity."
But the "point at infinity" gets mapped to $O$.
Okay, what do you think will happen if we invert a circle that doesn’t pass through $O$?
it becomes another circle
becomes another circle
another circle
another circle
Yes! We are starting with a circle that doesn’t pass through either $O$ or the point at infinity, so its inverse also won’t pass through either of those points. Thus, the inverse is still a circle not through point $O$.
Here are some more examples of how this might look. In each case, the solid circles are images of each other under inversion about the dashed circle. If the starting circle is entirely inside the circle of inversion, its inverse will be entirely outside and vice versa. If the starting circle intersects the circle of inversion, so will its inverse.
What do we get if instead we invert a circle through $O$?
a line
Line
line?
a line
a line
a line
Exactly! The inverse of $O$ is the point at infinity, so a circle through $O$ inverts to a “circle” through the point at infinity - that is, a line!
Thus, the inverse of a circle through $O$ is a line not through $O$. Likewise, the inverse of a line not through $O$ is a circle through $O$. This is the way we swap lines with circles!
For example, in the picture below, the blue and orange circles invert to lines, because they pass through $O$, while the red circle inverts to a circle, because it doesn’t pass through $O$.
That covers all our cases! Any questions so far?
in the first scenario, what does o disappear mean
In all the scenarios, $O$ gets mapped to the point at infinity, which isn't really a point.
$O$ goes to infinity
Great! Now, onto the solution to part (b).
Problem 4b Solution:
We claim that in fact the maximum is not $14$ abundant circles, but $12$. To show this, there are two things we need to prove:
1. Upper bound: More than $12$ abundant circles is not possible.
2. Lower bound: At least $12$ abundant circles is possible.
First let’s prove the upper bound. To do so, we’ll think about how many abundant circles can pass through each point in $S$.
Let’s count the number of point/circle pairs consisting of a point in $S$ and an abundant circle through that point. If we assume there are $14$ abundant circles and we know each abundant circle passes through $4$ points, how many such pairs would there be?
56
$14*4=56$?
Right, there would be at least $4\cdot 14 = 56$ pairs. So how many circles must there be through each of the $8$ points, on average?
7
7
$\frac{56}{8}=7$
7?
Yes! Since there are $8$ points in $S$, on average each point should be on at least $\frac{56}{8} = 7$ abundant circles. In particular, some point would have to be on at least $7$ abundant circles.
However, we’ll show that this isn’t possible. Instead, we claim that each point can be on at most $6$ abundant circles. How many point/circle pairs and how many abundant circles would that give us, at most?
$12$
12
48
48
48
48
48
48 and 12
That would be $6*8=48$
Exactly! If each of the $8$ points is on at most $6$ circles, there are at most $6 \cdot 8 = 48$ point-circle pairs. Since each abundant circle contains at least $4$ points, there can be at most $\frac{48}{4} = 12$ abundant circles.
But we need to prove that each point can be on at most $6$ abundant circles
Right!
To prove our claim that each point is on at most $6$ abundant circles, let’s imagine inverting about a circle centered at one of the points. After inversion, what will happen to the abundant circles that don’t pass through that point?
still are circles
They remain circles
another circle
They will become... Circles
becomes another circle?
Right! These are circles not through the center of inversion, so they’ll stay circles.
What about the abundant circles that do pass through our point?
become lines
lines
line
become a line?
line
turns into lines
become a line
they become a straight line
Turns to lines!
Yes! These circles pass through the center of inversion, so they’ll invert to lines, each containing the images of at least $3$ of the $7$ remaining points.
So we can now rephrase our claim as follows:
Claim: Through $7$ points, we can draw at most $6$ lines such that each line contains $3$ of the points.
Proof: First, we claim that there can be at most $7$ such lines.
This uses a similar argument as in part (a). How many lines can there be that pass through at least $2$ of the $7$ points?
$\tbinom{7}{2}=21.$
21
7c2=21
21?
Right, there are $\binom{7}{2}=21$ ways to choose $2$ of the $7$ points, and any $2$ points define a line, giving at most $21$ lines passing through at least $2$ of the points.
If a line passes through $3$ points, it would be counted $\binom{3}{2} = 3$ times by this, so there can be at most $\frac{21}{3} = 7$ such lines.
Note that this actually gives us an alternate proof of part (a)! We’ve shown that each point is on at most $7$ abundant circles, which makes at most $8\cdot 7 = 56$ point-circle pairs, and so at most $\frac{56}{4} = 14$ abundant circles.
One way to complete the proof from here is using the Sylvester-Gallai theorem, which says that for any finite set of points in the plane, there must be a line through all of them or a line through exactly $2$ of them.
The only way we could have gotten $7$ lines above is if every line containing $2$ of the points actually contains $3$ of them. By the Sylvester-Gallai theorem, that can’t happen, so there can be at most $6$ such lines.
*What is the proof of that theorem?
We won't have time to go through that here, but you can look it up after class if you're interested!
However, if you’re not familiar with that theorem, another way to prove this is by drawing a picture and thinking carefully about how the points and lines could be positioned.
If there were $7$ lines, like we saw above, each of them would have to contain exactly $3$ of the points. Likewise, each of the $7$ points would have to lie on exactly $3$ of the lines.
Let’s say $A$ is the leftmost point, $B$ is the farthest point from $A$ on the top line, and $C$ is the farthest from $A$ on the bottom line. Then we will get something like the picture below:
Because of how we defined the points, there must be a third point $E$ in our set on line $AB$ between $A$ and $B$ and a third point $G$ in our set on line $AC$ between $A$ and $C$.
There also has to be a point $F$ in our set where line $BC$ intersects the third line through $A$. Finally, lines $BG$, $CE$, and $AF$ must all meet at the final point $D$ in our set.
But we can now see that there’s no way for the three points $E, G,$ and $F$ to lie on a line, because $F$ is on the wrong side of $D$ from $E$ and $G$! Thus, there’s no way to add a $7$th line to the picture that passes through $3$ of the points.
Another way to prove this is using Ceva’s Theorem and Menelaus's Theorem. Ceva’s Theorem tells us that if the lines $AF$, $BG$, and $CE$ are concurrent, then $$\frac{AG}{GC} \cdot \frac{CF}{FB} \cdot \frac{BE}{EA} = 1.$$
But Menalaus’s Theorem tells us that if the points $G, F,$ and $E$ are collinear, then $$\frac{AG}{GC} \cdot \frac{CF}{FB} \cdot \frac{BE}{EA} = -1.$$ We’re using directed lengths here, which is why the product can be negative.
The same product can’t equal both $1$ and $-1$, so we’re done!
This proves our claim, so we’re done with the upper bound! Any questions so far?
what is the menalaus's theorm
It gives a condition for when three points are collinear, using ratios like above.
what is the Ceva theorm
Ceva's Theorem gives a condition for when three lines are concurrent, using the product of ratios above.
what do you mean by directed lengths
Directed lengths basically mean line segments are assigned a direction and can be positive or negative depending on which direction they're in.
what does concurrent mean?
Concurrent means the lines pass through the same point.
Are there any recommended sources to check up on these theorems and practice with them?
I think there is probably some information about them on the Aops website, or you can try googling them!
so which direction yields negative sign
You get to choose the direction, you just have to be consistent about it.
But all we did is prove there isn't 7 lines, not that there is 6 lines
That's true! All we've shown so far is that there can be at most $12$ abundant circles.
So to finish the problem, we need to show that $12$ abundant circles actually is possible!
For the lower bound, we need an example of how we can have $12$ abundant circles. There are a couple possible constructions here.
For the construction we’ll show here, we’ll use inversion through a point in $S$ again. This means we’ll start with $7$ points, and either lines through $3$ of the points or circles through $4$ of the points will become abundant circles after inversion.
Our construction is to use a triangle, the feet of its altitudes, and its orthocenter as our points.
How many circles are there that pass through at least $4$ of these points?
6 because 6 cyclic quads?
6
6
6?
6
Right! There are $6$ abundant circles in this picture, specifically $ADHF, CFHE, BEHD, AFEB, BDFC,$ and $CEDA$. We can show each of these is cyclic because they all have two right angles that would be inscribed in the either same arc or opposite arcs.
And how many lines are there through at least $3$ of the points?
6
6
Exactly, there are $6$ such lines: the three sides of the triangle and the three altitudes.
So when we invert this picture, we’ll get $6 + 6 = 12$ abundant circles. That proves our lower bound!
Any questions on Problem 4?
Yay, we did it!
i thought it was 8 pts?
Right. The 8th point is the one we're inverting about. So in our original picture with the triangle it's actually at the point at infinity.
No, but what a nice problem!
I think so too
about how many lines are through at least 3 of the points
I meant on the picture with the triangle, how many lines pass through 3 of the 7 labelled points.
Each of those lines also passes through the point at infinity and so will invert to an abundant circle.
Alright! On to Problem 5!
Hi everyone, I’m Mira. I’m going to present the solution to Problem #5.
Before I start, I should say that this Math Jam is taking a bit longer than we were expecting. The problems are long and complicated (as you might have noticed when you tried to solve them for the Mathcamp application).
But we definitely expect to be done by about 7 PT/ 10 ET, hopefully earlier. If you need to go before then, you can find the transcript later at https://www.mathcamp.org/qualifying_quiz/solutions/ (or just on the Math Jams archive: https://artofproblemsolving.com/school/mathjams-transcripts).
OK, on to Problem 5!
Problem 5, Part a (paraphrased for brevity):
A small town of 192 people has exactly 2 people with COVID19. We want to find them. Instead of testing everyone individually, we can test up to 32 people at once in a “pooled sample”. If the test comes out negative, none of the people in the pooled sample are sick. If the test comes out positive, one or more of the people are sick, but you don’t know who or how many. Can you find the town’s 2 sick people using 16 tests?
Let’s show that the answer is “yes” by finding an algorithm that works. What’s our first step?
separate into groups of 32
test 32 people
separate 192 into groups of 32
OK, we’ve divided the 192 people into 6 groups of 32 and tested each group. This used up 6 tests. What are the two possible outcomes?
One test positive, or 2 tests are positive
1 yes, 2 yeses
Same group or different group
both in the same group or 2 groups
only one group is positive or two groups test positive
two positive in one pool or two positive in two pools
two or one groups are positive
2 separate positives or both being in the same group
one group tests positive or two groups
2 positive groups or 1 positive group
Right, we have two cases:
A) two groups test positive, which means they have one sick person each;
B) one group tests positive, which means it has both of the sick people.
Let’s deal with case A first. Time for a lemma!
Lemma 1: If you know that a group of $2^n$ has exactly one sick person, you can find that person with $n$ tests.
Proof: Split your $2^n$ people in half and test the first half. The result of the test tells you whether the sick person is in that half or in the other half.
In either case, you’ve used 1 test to narrow down your list of ``suspects” from $2^n$ to $2^{n-1}$. Keep going this way, and you’ll use $n$ tests to narrow it down to 1 person -- the sick one. QED
I use binary
Exactly!
(For those who know some CS: this is just binary search. For those who know induction: this is basically a proof by induction, but it’s such a simple case that I skipped the formalism. The next proof will be more formal.)
Getting back to our situation: If two of your 6 groups of 32 test positive, you’ll need 5 more tests to find the sick person in each group. So how many tests have you used in total?
16!!!
16, so it works
16 tests
16
Yup, $32 = 2^5$, so we’ll need 5 tests for each group that tested positive. So we can find the two sick people using $6+5+5 = 16$ tests, as required.
Now let’s do case B: both sick people are in the same group of 32.
Time for another lemma! This one is a little more complicated, so I’ll use formal induction:
Lemma 2: If you know that a group of $2^n$ people has 2 sick people, you can always find them using $2n$ tests (or fewer).
Proof: Induction on $n$.
Base case: When $n=1$, you have 2 people and you already know that both are sick, so you need 0 tests. And indeed, $0<2\cdot 1$.
Inductive hypothesis: Suppose the statement is true for $n = k-1$. Now say you have a group of $2^k$ people in which you know that 2 are sick. We need to show that you can find those 2 people with $2k$ tests.
Split your $2^k$ people in half (two groups of $2^{k-1}$) and test both halves. We’ve now used 2 tests. Like before, there are two possibilities:
Case 1: Both halves test positive. [\i] This means each of them contains one sick person. What can we say now?
(oops, sorry about the formatting)
But the question still stands
We can use our first lemma
Do another binary search on each one
this becomes essentially case A once one test has been done per group (so it works!)
It takes $k$ tests to identify a sick person in each group, and since we have two groups, we need $2k$ tests.
By Lemma 1, we’ll need $k-1$ more tests for each of the two groups. So the total number of tests we used for our group of $2^k$ is:
$\quad$* 2 at the beginning, to test each half.
$\quad$* $k-1$ to find the sick person in the first half (by Lemma 1)
$\quad$* $k-1$ to find the sick person in the second half (by Lemma 1)
This adds up to $2k$ tests, as required.
Case 2: Only one of the two groups of $2^{k-1}$ tests positive. Then this group has both sick people in it. What do we do now?
(hey look, better formatting!)
then split it in half and try again!
do another binary test only on the group with both!
Split in half again and repeat the strategy
Keep on binary searching
Repeat, split in half, check again, recursively, until you either chop it into 2 people, or you get 2 people in different groups sick, then proceed with the first lemma.
You guys are all on the right track
But remember, we're in the middle of a proof by induction!
Apply the inductive hypothesis!
Hurrah, apply the inductive hypothesis!
We have a group of $2^{k-1}$ people, so we know we can find the 2 sick people in $2(k-1)$ tests.
(That IS the inductive hypothesis.)
And we already used 2 tests.
$2k-2+2=2k$
yup!
Applying this to Problem 5a, Case B: If one of your 6 groups of 32 tests positive, Lemma 2 says you can find the two sick people in that group in 10 tests. So in total, you will have used 6+10 = 16 tests, as required.
Questions on 5a?
OK, onto 5b.
Part b (paraphrased): Now you need to find 2 people out of 1000, but you have to specify your pooled samples ahead of time. You can’t decide whom to test next based on the results of earlier tests.
Find the minimum number of pooled samples needed to identify who is sick, to within a factor of 2. That is, find an $n$-sample strategy, and prove that $n/2$ samples are not enough, for some value of $n$.
This is a very different problem!
Finding the actual minimal number of tests needed in this case is an open problem! We ourselves don’t know the answer.
When we put this problem on the Quiz, here’s what we knew how to do:
$\quad$* We could prove that you need at least 61 tests (a lower bound) .
$\quad$* We knew a strategy for finding the 2 sick people using 93 tests (an upper bound) .
These two things are enough to answer the specific question that we asked on the Quiz, since $n=93$ works and $n/2=47 < 61$ doesn’t work. That’s why we posed the problem this way!
So we knew that the true minimum was somewhere between 61 and 93, but we didn’t know how to narrow it down within that interval.
We love putting open problems on the Qualifying Quiz, because we’re always hoping that someone will come up with a better partial solution than ours.
Almost always, somebody does -- including this time! You guys (collectively) beat us by a teeny-tiny bit.
No one came up with a strategy with fewer than 93 tests, but an applicant named Andrew, from Boyds, MD proved that you need at least 62 tests.
Andrew used essentially the same approach as us (and as the rest of the people who solved this problem), but he managed to raise the lower bound from 61 to 62 with some clever casework.
I won’t present Andrew’s proof here, because the casework is a little too messy to do in real time. But I did want to give him a shoutout!
OK, to recap: the solution to Problem 5b has two parts -- proving a lower bound and constructing an upper bound. Let’s do the lower bound first:
Claim: You need at least 61 tests to find the two sick people.
Proof: Suppose you can do it with $k$ tests. We want to show that $k \geq 61$.
Here’s the key observation: You can’t have two people in exactly the same set of tests. Why not?
We wouldn't be able to tell which one is sick.
You wouldn't know which one is sick?
Let these people be A and B, choose a third person C, then A and C infected is indistinguishable from B and C infected
if one is infected and the other isn't there'll be no way to differentiate them
Right! If exactly one of those two people is sick, you have no way of knowing from the test results which one it is. I’m going to call this the distinct test principle .
OK, so what’s the maximal number of people that can be in zero tests?
you mean zero-positive test?
No, I mean people who are not tested at all
1
1
one person
1&
Yup, there’s only one set of size 0 (the empty set). So by the distinct test principle, we can leave at most 1 person untested. At least 999 people must be tested.
A lot of people said 0, but it IS OK to leave one person out
if you find only one sick person among the other 999, then you know the one you left out is also sick
What’s the maximal number of people whom we can test just once?
Remember, we are assuming we're doing $k$ tests total.
(We're trying to show $k>=61$)
What happens if we put two people in the same test, and that's the only test for both of them?
Well if its positive, how can we find it out for them?
If positive then how do we know if 1 or 2 of those people are positive?
Right, so that doesn't work: we can't have two once-tested people in the same test.
So: What’s the maximal number of people whom we can test just once?
A lot of people are saying 1, but that's 1 per each test. Remember, we're doing $k$ tests!
k?
k.
k
k
Yes! Since there are $k$ tests, the distinct test principle tells us that there can be at most $k$ people who are tested once (and all of them have to be in different tests). If there were two such people in the same test, then there would be no way for us to tell them apart.
OK, so at most 1 person gets tested zero times and at most $k$ people get tested once.
OK, say there are $a$ people who get tested only once. (We just showed that $a \leq k$.) Then we’re testing at least $999-a$ people two or more times.
That’s a total of at least $a + 2(999 - a) = 1998 - a$ individual samples to put into our tests.
Since $a \leq k$, we conclude that the number of individual samples we’ll need to process in our $k$ pooled tests is at least $1999 - k$.
Now, how can we get an inequality in the other direction?
We can include at most 32 individual samples in one test
So how many individual samples can we process total?
Uh-oh, I think I'm losing you guys.
Forget the $1999-k$.
We're doing $k$ tests.
Each one can process at most 32 individual samples
So what's an upper bound on the number of individual samples?
why 32?
That's in the problem statement
(sorry, it was in Part a)
In our $k$ tests, we can't include more than $32k$ individual samples
$32 \dot k$
Yup!
But we said that we had to test at least $1998-k$ individual samples.
(Sorry, the 1999 before was a typo.)
This gives us the inequality $32k \geq 1998 - k$.
1998 / 33
Right!
Solving for $k$, we get $k \geq 1998/33 \approx 60.5$.
Since $k$ is an integer, we conclude that $k \geq 61$, QED.
How're you guys doing? Any questions?
When you say it, it makes sense, but I can't find it on my own! :o
Confused but alive. Doable.
OK, that was the lower bound: we know we need at least 61 tests.
(People who are confused: the ipper bound proof is easier to follow.)
*upper
So let’s show how we can actually find the 2 sick people using $93$ tests.
Put the people into a 32 by 32 grid
One approach is to imagine our 1000 people arranged in a $32 \times 32$ square. (Some of the places in the square will be unoccupied, since $32 \cdot 32 > 1000$. It doesn’t matter.)
We can now refer to individuals by their location in this square array: for example $(a,b)$ is the individual whose sample is in row $a$ and column $b$.
Now we can do a pooled test on the people in each column and a pooled test on the people in each row, for a total of $32 + 32 = 64$ tests. How many positive row-tests will we get?
Either 1 or 2
2 or 1
Right, it can be 1 or 2, depending on whether the sick people are in the same row or in different rows. And similarly, we’ll get either 1 or 2 positive column tests.
So there are three possibilities:
Outcome 1: Row $a$ tests positive, columns $c$ and $d$ test positive. What does that tell us?
(What are the coordinates of the sick people?)
(a,c) and (a,d) are infected
$(a,c)$ and $(a,d)$
there are two people (a,c) and (a,d) who are sick
$(a,c)$ and $(a,d)$ are sick.
(a, c), (a, d)
(a,c) and (a,d)
Right, the sick people have to be $(a,c)$ and $(a,d)$.
Outcome 2: Rows $a$ and $b$ test positive, column $c$ tests positive. This tells us that the sick people must be $(a,c)$ and $(b,c)$.
Now here’s the interesting case:
Outcome 3: Rows $a$ and $b$ test positive, columns $c$ and $d$ test positive. What does that tell us?
Either (a, d) and (b, c); or (a, c) and (b, d)
Either $(a,c)$ and $(b,d)$ are sick, or $(a,d)$ and $(b,c)$ are sick.
Either (a, c) and (b, d), or (a, d) and (b, c)
(a,c), (b,d) or (a,d), (b,c)
Right, there are two options for who the sick people are. It could be either the pair $(a,c)$ and $(b,d)$ or the pair $(a,d)$ and $(b,c)$.
Here’s a picture ($10\times10$ instead of $32\times32$, but same idea):
We know that the sick people are either the two blue squares or the two green squares. Now we just need some way to tell which pair of squares it is!
But remember, we’re designing all these tests in advance, so we don’t know ahead of time which two pairs we’ll be confused between. What can we do?
test the diagonals
diagonals?
test diagonals?
Yes! I think this is the hardest idea to come up with.
So, maybe like this:
We do a pooled test on the squares through which each diagonal passes. (I drew all my diagonals going from top-left to bottom-right, but you can also have them going in the other direction.)
In our picture, if the people in the bright-blue squares are sick, then the two light-blue diagonals will test positive. If the people in the bright-green squares are sick, then the light-green diagonal will test positive.
why diagonals?
Just because it works!
Notice that the two sick people might be on the same diagonal as each other (like the green squares) or they might not (like the blue squares). It doesn’t matter.
The important thing is that the blue squares can never be on the same diagonal(s) as the green squares! So knowing which diagonal(s) tested positive will tell you which of the two possible pairs of people (green or blue) is sick.
Why do we do it in the first place though? What's the motivation?
Well, we have to distinguish between the two pairs of squares.
Rows and columns don't work any more, but this does.
You could do other things -- e.g. diagonals of other slopes could also work
Anything where the blue and green squares would not be in the same tests
Who came up with the idea to use diagnals?
Apparently many people independently!
But it's a very clever idea that probably took them a while to come up with.
It certainly took me a while!
OK, so we’ve decided to test all the diagonals. How many tests does this take? How many diagonals are there in our picture (the $10\times10$ version)?
19.
10 + 9 = 19
Right, the $10\times10$ square has $10+10-1$ diagonals. And in the same way, the $32\times32$ square has 63 diagonals.
But we already did 64 tests on the rows and columns, so it looks like our strategy requires 127 tests. That’s too many! I promised you 93.
Let’s get it down to 96 tests first, and then we can talk about how to get further down to 93. What can we do to decrease the number of diagonal tests from 63 to 32?
diagonals can wrap around
Yes! Here’s a picture:
We let the diagonals “wrap around” and combine the pieces of the same color into the same pooled test. So all the squares covered by the red diagonal are in one test, all the squares covered by the orange diagonal are in another test, etc.
Another clever idea!
You can think of this as combining all squares $(x,y)$ that satisfy the linear equation
$$
x+y = i mod 32,
$$
for each value $i = 0,1,\dots,31$. (Or, in our picture, mod 10.)
For example, if the bottom left square is $(0,0)$, then what is $i$ for the orange diagonal?
$$
x+y = i \mod 32,
$$
6
If the bottom left is $(0,0)$, then the orange diagonal has points $(0,6)$, $(1,5)$, etc.
The orange diagonal consists of all squares $(x,y)$ satisfying $x+y = 6 \mod 10$.
If you let the diagonals wrap around this way, a $32\times 32$ square will have only $32$ of them, since $i$ ranges from 0 to 31.
So now we’ve used 32 row tests, 32 column tests, and 32 diagonal tests, for a total of 96 tests.
This already solves the problem as posed on the Quiz, since 96 works and $96/2 = 48$ doesn’t work. But just for fun, how can we get it further down to 93?
We're running late, so I'll just give it away.
We can skip one row, one column, and one diagonal.
Then, for example, if only one row tests positive, you don’t know if it’s because both sick people are in that row or because one is in that row and the other is in the row that wasn’t tested. But the columns and the diagonals will always let you disambiguate that.
There are a bunch of cases to check here, but it always works.
(Check it for homework! )
Any questions about this strategy?
Nope! (Except the 93 part... I'll have to take a look at that some other time )
It's kind of like the way we could leave one person out of 1000 untested.
We can leave one row, one column, and one diagonal untested.
The wraparound idea is beautiful! I didn't find it, but managed to get down to 119 tests by moving people away from the corner...
yes, 119 was good enoughg!
Before we wrap up with Problem 5b, I wanted to mention a really nice alternative testing strategy that a lot of people used:
Choose three pairwise relatively prime numbers (e.g. 32, 33, and 35) and do three sets of pooled tests: one set pooling all numbers that are the same mod 32, one for mod 33, and one for mod 35. Then apply the Chinese Remainder Theorem. Do a little modular arithmetic and you’re done.
If you think about it, this is actually very similar to our row-column-diagonal strategy in spirit, even though one uses geometry and one uses number theory.
OK, on to 5c! (Sorry, this is running so late folks!)
(But 5c is very cool, so I hope you'll stay!)
Problem 5c (paraphrased). A city wants to test 10,000 people. You know that each person has a 5% chance of being sick, but you don’t know the exact number of sick people. This time, you don’t have to design the tests in advance (as in Part b); you can base each test on your previous results (as in Part a).
Find a strategy to reduce the number of tests that need to be done; it doesn't need to be provably optimal, but do your best. On average, how many tests will be required?
Problem 5c was probably the most fun problem on the Quiz for us to read, and also the hardest for us to evaluate!
People came up with so many different strategies and really clever ways of computing the expected number of tests for their strategies. But we also had to scrutinize each strategy carefully, because some of them had subtle probability mistakes that were really hard to spot.
And once again, you guys collectively beat us! Three applicants independently came up with a really elegant strategy that was both simpler and more efficient than our own best strategy.
This is the strategy I’ll present today -- the most efficient strategy of all the ones that were submitted by our 540 applicants and ourselves!
It is due to:
$\quad$Sumedh, from San Jose, CA
$\quad$Andrew, from Pennington, NJ
$\quad$Giorgi, from Tbilisi, Republic of Georgia
(There were also many other awesome approaches that were only a tiny bit less efficient. It makes me want to teach a whole Mathcamp class on the different solutions to this problem -- both the correct ones and the very subtly wrong ones!)
wow, 540 applicants!
Indeed.
Ready for the “winning” strategy? Here we go.
First, divide your 10000 people into 625 groups of 16. We’ll call these 625 groups the Big Groups..
(BTW, by “groups” I just mean “sets”; this has nothing to do with group theory. )
tests of 32 people are not necessarily optimal?
Yeah, we'll see why soon.
For each Big Group, do the following:
$\quad$* Let $G_0$ be the entire group.
$\quad$* Do one test on all of $G_0$:
$\qquad$* If it’s negative, you’re done with this Big Group. Go on to the next one.
$\qquad$* If it’s positive, use the approach from Lemma 1 in 5a (binary search) to find just one sick person in $G_0$.
What’s the fastest way to do this?
(to find one sick person)
binary search
$\left \lceil \log_{2} G_0 \right \rceil.$
Right, it’s just like Lemma 1 in 5a: you split your group in half and test the first half. If it tests positive, proceed with that half. If it tests negative, proceed with the other half. Keep splitting in half until you’re done.
So you’ll test 8 people, then 4 people, then 2 people, then 1 person: 4 tests.
( $|G_0| = 4$)
Are we done with this Big Group now? Can we move to the next one?
how about more than 1 sick
No - need to find all infected people in group
That’s right, there may be other sick people in our Big Group $G_0$. So let’s remove the sick person we just, and call the remaining group of 15 people $G_1$.
Now we go back and repeat all the preceding steps with $G_1$:
$\quad$* Do one test on all of $G_1$:
$\qquad$* If it’s negative, you’re done with this Big Group. Go on to the next one.
$\qquad$* If it’s positive, find one sick person in $G_1$.
How many tests does it take to find one sick person in $G_1$?
4?
4
Sort of.
$G_1$ has 15 people, which is not a power of 2. But you can still keep splitting it in (approximately) half, over and over until you reach 1.
Depending on luck, it might take you either 3 or 4 splits (tests) until you find the sick person. But it’s definitely no worse than with 16 people, so we know it’s at most 4 tests.
OK, say we’ve found a sick person in $G_1$. What do you think we should do next?
check g_2 which as 14 people
take them out, then we have G_3 and repeat the process
You guessed it! We form $G_2$ by removing the sick person we just found from $G_1$. And then we just keep going in the same way:
$\quad$* Do a pooled test on all of $G_i$, for whichever $i$ you’re on.
$\qquad$* If it’s negative, you’re done with this Big Group. Go on to the next one.
$\qquad$* If it’s positive:
$\quad\qquad$* Find one sick person in $G_i$.
$\quad\qquad$* Form $G_{i+1}$ by removing this sick person from $G_i$.
$\quad\qquad$* Do it all again for $G_{i+1}$.
So we start with a group of 16. We keep finding sick people one at a time, removing them, and checking whether the remaining group still tests positive.
When does this process end?
when the whole $G_i$ tests negative
Once some $G_i$ tests negative, we can be sure that we’ve found and removed all the sick people from our original big group $G_0$. So we move on to the next Big Group.
We go through all the Big Groups this way and we’re done. That’s the whole strategy.
what about the 5%
Right, the 5% becomes relevant when you try to compute the expected number of tests for this strategy
I don’t know about you, but when I first read this strategy, I thought: “This is SO inefficient!”
It seems so silly to retest all of $G_{i+1}$ each time!
For instance, why are we retesting all 15 people in $G_1$ as if we know nothing about them? Most likely, we’ve already ruled some of them out when we were looking for our first sick person in $G_0$.
E.g., if the first group of 8 that we tested came out negative, why include these 8 people in $G_1$?
So it’s clear that this strategy can be improved, possibly by a lot.
The problem is, adding all that extra complexity makes it more difficult to compute the expected number of tests that will be required. This is why all three people who came up with this strategy chose to keep it as simple as possible.
But it turns out that even this simplistic version, which seems so inefficient at first glance, does better in expectation than any other strategy proposed by our 540 applicants (or ourselves)!
I think in the interest of time, I should stop here?
Computing the expected number is the best part, but we really need to get to problem 6
I'll stay afterwards for anyone who wants to see it, but I feel bad about Misha waiting to do #6.
So let's pass it on to him!
The answer is 3125, as an upper bound
Okay. Hi everyone! I'm Misha, and I will tell you how to solve problem #6.
Problem 6. Infinitely many campers line up in a row for a Competitive Hanabi Set (CHS) tournament. Each camper has an unknown, numerical skill level in playing CHS. A more skilled player will always beat a less skilled one, but the game is very subtle; it is impossible to compare skill levels, except in a one-on-one match. We will write $A \prec B$ if camper $A$ loses to camper $B$ in CHS (which means that $A$'s skill level is a lower number than $B$'s).
A subsequence of $n$ campers is an $n$-tuple of campers $(A_1, A_2, \dots, A_n)$ such that the campers are standing in the infinite row in that order, not necessarily consecutively. ($A_1$ is to the left of $A_2$, who is to the left of $A_3$, and so on.)
(a) You must schedule CHS games between the campers to find a subsequence $(A_1,A_2,A_3)$ of three campers such that either $A_1 \prec A_2 \prec A_3$ or $A_1 \succ A_2 \succ A_3$. You can use the results of each game to schedule the next.
How can you minimize the number of games required in the worst case? Prove that your answer is optimal.
(I'll give you a moment to refresh your memory of this problem. Another optimization problem - like problem #3 and problem #5 were.)
Problem 6(a) solution.
Let's first go over the definitions and how to think about them. Here's a diagram of the outcomes of a few games: an arrow $A \to B$ means $A$ beats $B$, and I colored arrows red if they point right and blue if they point left.
Just to check that everyone's on the same page, where are the subsequences we want?
ACD or BCE
ACD, ECB
ACD and BCE?
We are happy with $(A,C,D)$ because $A \succ C \succ D$, and with $(B,C,E)$ because $B \prec C \prec E$. We are not happy with $(A,C,B)$ or $(E,C,D)$, because they're not subsequences!
Anyway, we can find one of these subsequences in 4 games. Here's one way to do that. Begin with two games $A$ vs. $B$ and $D$ vs. $E$, which can have four outcomes:
What should we do in case (ii) - or case (iii), which is symmetric? Here, one more game is enough.
Schedule a game between B and D
We play B vs D. Either $(A,B,D)$ or $(B,D,E)$ will be the subsequence we want.
(The color varies depending on which case we're in, but the two cases are otherwise identical.)
What about cases (i) and (iv)? Here, we'll need two more games.
B vs C and C vs D
compare C with B and D
We play B vs C and C vs D. In case (i), we either have $A \prec B \prec C$, or $B \succ C \succ D$, or $C \prec D \prec E$. In case (iv), we get the opposite of one of these.
Maybe an easy way to see this case is: we've ended up playing 4 games - A vs B, B vs C, C vs D, and D vs E.
If we don't get the subsequence we wanted, the arrows must alternate red-blue-red-blue
but we're looking at cases (i) and (iv) - where the first and last arrows are the same color!
There's more to the problem: verifying that 3 games are not enough. For reasons of time, I'll skip this; there's clean and less clean ways to do the casework. I'll just say that a common pitfall was to assume that the first two games have a player in common - the strategy above shows that other approaches are valid!
That's part (a) and I want to pause here for a bit for questions, just to make sure everything makes sense before we make everything more complicated. Everyone with me?
Problem 6(b). Now, suppose you want to find either a subsequence $(A_1, A_2, A_3)$ such that $A_1 \prec A_2 \prec A_3$, or a subsequence $(A_1, A_2, \dots, A_n)$ such that $A_1 \succ A_2 \succ \dots \succ A_n$.
Show that you can always do this in less than $10n$ games. (As a challenge, how much can you improve this bound?)
Problem 6(b) solution.
We gave you $10n$ games for flexibility, but there are shorter solutions. I'll present the best bound anyone actually found (though it's possible to do better).
One way to try to find the long subsequence we want is to keep trying to extend a sequence. Have a camper play against other campers to their left until they win a game, then repeat this with the loser of that game. After a while, we get a picture like the following:
(So for example 1 lost to 2, so 1 kept playing and lost to 3, so 1 kept playing and finally beat 4.)
this can go on forever
Okay so we have a problem if lots of people keep losing a lot.
So we need a backup plan to get the long subsequence we want a different way. Can anyone spot some games we can play in the diagram above whose outcomes are more or less forced?
2, 3 and 3, 4
3,4,6?
Campers with a blue arrow out of them must beat everyone to their right, or else we get the three-camper subsequence we wanted. For example, camper 3 must beat camper 5, or else we get $1 \prec 3 \prec 5$.
That's the "almost forced" games I mean: those outcomes are fixed, or else we're done immediately.
With this in mind, what's a long "backup" sequence of red arrows we can force?
(Still in the same diagram if you like, or you can try for a general claim.)
Well, this is tricky, especially late at night - so let me spoil the secret for you. Here's the diagram again:
Take all the "skipped" campers with a blue arrow pointing out of them, and have them play in order! If any of the outcomes give us another blue arrow, we're done. Otherwise, we get a long subsequence with $A_1 \succ A_2 \succ A_3 \succ \dots$: in this case, $2 \succ 3 \succ 5 \succ 8 \succ 9$.
so that's saying "play 2 vs 3, 3 vs 5, 5 vs 8, 8 vs 9" in this case.
they all have to be red
Exactly! If one of those games gives us a blue arrow, we're done immediately.
So we have two ways to win with a long subsequence. How many turns will this take?
(Either we get a long subsequence right away, or we skip lots of campers and get a long "backup" subsequence.)
So I'm seeing a good partial answer here that's missing one step. Think of it this way: we're happy if we get $n-1$ red arrows, or we can execute our backup plan if we get $n-1$ blue arrows. How many games until one of these happens?
2n
$2n-2?$
Around $2n$, if we don't worry about the constants. (Actually, $2n-3$ is enough.)
But that doesn't mean that $2n-3$ games are all we need! What else is missing?
$n$ games for the backup plan
you still need n more games
Well, $n-1$ more games - there are $n-1$ $\succ$ signs between $n$ campers.
But who cares about the constant, anyway.
So this strategy's worst case is $3n-4$ games total!
whats the runtime for the fastest known strategy?
(The best bound I know of is around $\frac83 n$, which involves doing this strategy at the same time as a mirror image of it, and playing them off against each other to take a shortcut. The only lower bound I know is $2n$, so we're not sure what the best possible answer is!)
Alright, now on to the last part of the last problem!
Problem 6(c). Next, the campers come together in an unorganized crowd, and decide to hold a Two-Player Castlefall (TPC) competition. TPC is a highly strategic and complex game, so it cannot be reduced to a single skill level: in TPC, it is possible that $A \prec B$ and $B \prec C$, but $A \succ C$.
You want to find $n$ campers $A_1, A_2, \dots, A_n$ such that whenever $i < j$, $A_i \prec A_j$. How can you minimize the number of TPC games required in the worst case?
Problem 6(c) solution.
I want to point that two things change in this problem. First, skill in TPC is not transitive, so we have to play $\binom n2$ games just to verify that $n$ campers are sorted by skill. Second, no more "subsequences": the campers are in an "unorganized crowd", and we can reorder them however we like!
So forget most things that happened in part (a) and (b)
Here's a kind of setup that will work for us. Take a camper $X$ and have them play LOTS AND LOTS of games. Either $X$ wins many of them, or $X$ loses many of them; these are symmetric, so let's say $X$ wins lots of games. How do we want to use this?
Use the campers losing against X to build a sequence, then add X to it
We want to find $n-1$ campers $A_1, A_2, \dots, A_{n-1}$ among the campers $X$ beats. Then $X$ can be camper $A_n$, because $X$ beats each of $A_1$ through $A_{n-1}$.
(If $X$ lost a bunch of games, we'd find $n-1$ campers $A_2, \dots, A_n$ among the people that beat $X$, and then $X$ could be camper $A_1$. Symmetric!)
To properly analyze this strategy, we have to keep track of two things. Let's say that finding a group of $k$ campers ordered by skill took us $N_k$ games involving $C_k$ campers; we want to find a formula for $N_k$, but we'll need to know $C_k$ as well.
To find $C_k$ campers that either all lose to $X$ or all beat $X$, we should play $2C_k-1$ more games (involving $C_{k+1} = 2C_k$ campers total). The total number of games is $N_{k+1} = N_k + 2C_k - 1$. This gives us a recurrence to solve for $N_k$ and $C_k$. First of all, what's $C_k$?
$2^{k-1}$
We start with $C_1 = 1$ camper necessary to find a group of $k=1$ campers. Then $C_{k+1} = 2C_k$, so it doubles each time: we have $C_k = 2^{k-1}$.
(I saw several $2^k$ answers, but watch out for the base case.)
Where did we find that $C_{k+1}=2C_k?$
That's our strategy for playing lots of games involving $X$ at work!
We need to wait until $X$ either loses to $C_k$ campers or beats $C_k$ campers.
So we should have $X$ play against $2C_k - 1$ other campers, and then the total number of campers together with $X$ is $2C_k$.
Anyway, after all that work, our other recurrence is now $N_{k+1} = N_k + 2^k - 1$.
That's going to tell us the number of games to find a group of $k$ campers ordered by skill.
This one's a bit trickier, and we already got a lot of practice solving recurrence relations way back in Question 2. So I'm just going to tell you that $N_k = 2^k - k - 1$; we can prove this by induction. To find a group of $n$ campers ordered by skill, we'll need $2^n-n-1$ games.
If you're familiar with Ramsey's theorem, you may recognize this argument! A result called "Ramsey's theorem for tournaments" tells us that if you play all possible games between $2^{n-1}$ campers, we find the group we wanted. But just citing this theorem gives a worse bound: there are $\binom{2^{n-1}}{2}$ possible games between those campers, which is closer to $4^n$ than $2^n$.
Anyway, that wraps up 6(c), which I went through a bit quicker because we're 90 minutes late.
Are there any other questions about problem 6?
is $2^n - n - 1$ the best known algorithm for the problem?
It's the best thing I knew writing the problem, and the best thing anyone came up with!
does this problem have a name?
Find the optimal answer and prove it's optimal, and we can name it after you!
There is an exponential lower bound, too, which comes from Ramsey's theorem.
Looks like that's it for questions about problem 6, so it's back over to MiraBernstein to wrap things up!
OK, guys, it's very late!
I promised I'd do 5c if people still wanted, and I'm still game.
But I can also just post the solution in the same place as this Math Jam
If I get 5 yes'es, I'll do it now.
Oh wow!
yes
YES
yes!
yes
Yes
yes
Yes!
yes'es
sure
yes
I eman 5 yes'es from distinct people
OK, let's do this thing!!
We want to find the expected number of tests for our strategy.
But you surely forgot what our strategy is
So let's review quickly
Problem 5c (paraphrased). A city wants to test 10,000 people. You know that each person has a 5% chance of being sick, but you don’t know the exact number of sick people. This time, you don’t have to design the tests in advance (as in Part b); you can base each test on your previous results (as in Part a).
Find a strategy to reduce the number of tests that need to be done; it doesn't need to be provably optimal, but do your best. On average, how many tests will be required?
Best strategy that we know:
First, divide your 10000 people into 625 groups of 16. We’ll call these 625 groups the Big Groups..
(BTW, by “groups” I just mean “sets”; this has nothing to do with group theory. )
For each Big Group, do the following:
$\quad$* Let $G_0$ be the entire group.
$\quad$* Do one test on all of $G_0$:
$\qquad$* If it’s negative, you’re done with this Big Group. Go on to the next one.
$\qquad$* If it’s positive, use the approach from Lemma 1 in 5a (binary search) to find just one sick person in $G_0$.
There may be other sick people in our Big Group $G_0$. So let’s remove the sick person we just, and call the remaining group of 15 people $G_1$.
Now we go back and repeat all the preceding steps with $G_1$:
$\quad$* Do one test on all of $G_1$:
$\qquad$* If it’s negative, you’re done with this Big Group. Go on to the next one.
$\qquad$* If it’s positive, find one sick person in $G_1$.
OK, say we’ve found a sick person in $G_1$. We form $G_2$ by removing the sick person we just found from $G_1$. And then we just keep going in the same way:
$\quad$* Do a pooled test on all of $G_i$, for whichever $i$ you’re on.
$\qquad$* If it’s negative, you’re done with this Big Group. Go on to the next one.
$\qquad$* If it’s positive:
$\quad\qquad$* Find one sick person in $G_i$.
$\quad\qquad$* Form $G_{i+1}$ by removing this sick person from $G_i$.
$\quad\qquad$* Do it all again for $G_{i+1}$.
Once some $G_i$ tests negative, we can be sure that we’ve found and removed all the sick people from our original big group $G_0$. So we move on to the next Big Group.
We go through all the Big Groups this way and we’re done. That’s the strategy.
Any questions about what the strategy is, before we analyze it?
Do we have to keep the inefficiency from the repeated tests so we can guarantee the average required tests?
That's exactly right. This strategy is very easy to make more efficient!
For instance, why are we retesting all 15 people in $G_1$ as if we know nothing about them? Most likely, we’ve already ruled some of them out when we were looking for our first sick person in $G_0$.
For instance, if the first group of 8 that we tested came out negative, why include these 8 people in $G_1$?
So it’s clear that this strategy can be improved, possibly by a lot.
The problem is, adding all that extra complexity makes it more difficult to compute the expected number of tests that will be required. This is why all three people who came up with this strategy chose to keep it as simple as possible.
(Sorry, I'm repeating some stuff I said already, but it's good to have the context)
So let’s compute the expected number of tests we’ll need. Actually, we’ll be computing not the exact expectation, but just an upper bound on it.
From here one, this is new stuff.
So in this strategy, there are two types of tests:
1) Tests that tell you that you’re done with your current Big Group and can move on to the next one.
2) Tests that you use to actually find a sick person.
How many tests of Type 1 are there?
625
625
625
In each Big Group, we keep removing sick people one by one, until we get to some $G_i$ that tests negative. So each Big Group will require one final negative test to clear it, after all the sick people have been removed.
That is the test of Type 1! (If the Big Group had no sick people to begin with, then this will be its only test.)
So there’s exactly one test of Type 1 per Big Group, which makes 625 such tests total.
By the way, what can we say about the results of all these 625 tests?
they're all negative
Right, it’s important to notice that Type 1 tests are all negative -- they’re the ones that finally clear each Big Group. All positive tests are automatically of Type 2.
To count the tests of Type 2, we’ll look at each sick person in turn. Remember, in this strategy we’re always looking for one sick person at a time!
So let’s focus on just one sick person. How many tests did we do that led us to this particular person ?
Depends on its position in the binary tree
True!
up to 4
Almost! You need to count the original test that told you that $G_i$ still had sick people in it.
To find the first sick person in our Big Group $G_0$, we needed 5 tests of Type 2: we tested the whole group (16 people), then 8 people, then 4, then 2, then 1.
(Remember, the original test of the whole group $G_0$ is of Type 2. The 625 tests of Type 1 are the ones that tell us that we’re done with a Big Group, not the ones that tell us we need to look further.)
$G_1$ has 15 people, so we saw that finding the sick person will take either 4 or 5 tests (including the original test of all of $G_1$). So on average, we’ll need a little less than 5 tests of Type 2, but definitely at most 5.
Finding the next sick person, at the $G_2$ stage, will take even fewer tests on average -- and still definitely fewer than 5.
So no matter at what stage we found the particular sick person we’re focusing on right now, it took us at most 5 tests of Type 2 to find them.
Say we have $S$ sick people total (in all the Big Groups together). What is an easy upper bound on the number of Type 2 tests that we’ll need to find them all?
5 * S
5xs
Right: Finding each sick person requires at most 5 tests of Type 2, for a maximum of $5S$ tests of Type 2.
Now what’s the expected number of sick people, if each person out of 10,000 has a 5% chance of being sick?
500
500
$500$
so 2500 Type 2 tests maximum
Yup, on average we expect 500 sick people. Each of these will require at most 5 tests of Type 2. Plus we have the 625 final tests of Type 1, one for each Big Group. So what can we say about the expected total number of tests?
3125
3125
$\le 3125$
To be clear: we don’t know the expected number of tests exactly, but we know it is at most
$625 + 5 \cdot 500 = 3125$.
(Finding the maximum of an expectation may sound a little weird. What we’re saying is: the expected number of tests under this strategy is some number $X$. We don’t know what $X$ is, but we know $X <= 3125$.)
Believe it or not, even this upper bound is better than the expected number of tests in any other strategy that anyone came up with!
Really?
Weird, right? I found it very unintuitive!
Even with all the inefficiency.
One last thing to discuss is why we chose 16 as the size of our Big Groups. If your big Groups are of size $N$, then what’s the upper bound you get using the reasoning above?
(think about where the 625 came from, where the 5 came from, and where the 500 came from)
Maximal power of two!
No, the maximum is 32
??? 16 is the largest power of 2 that divides 10,000
@peach09: Oh, yes, that's true. But you could do groups of size 32 and one group of size 16.
The formula for the upper bound is $\lceil\frac{10000}{N}\rceil + 500 (\lceil\log_2 N\rceil + 1)$.
People got close to this, but not quite
$\lceil\frac{10000}{N}\rceil$ is the Type 1 tests
The rest are the type 2 tests
Here’s the graph of this function for $1\leq N \leq 32$:
As you can see, the upper bound is minimized when $N=16$. (It’s possible that the actual expected value is minimized for a different $N$, but we don’t have a way to compute that at present.)
I want to spend a tiny bit of time talking about other strategies, but first, any questions about this one?
how do you even think this up?
Good question! It was not a common answer.
Other good strategies were much more common.
This one just happened to beat them!
So to recap, this “3125 strategy” is the best one we know of for now. Of course, all the improvements that you guys suggested earlier would make it even better, but we don’t know how to measure the improvement.
In case you’re curious, the next best strategy that anyone came up with requires about 3196 tests on average. It is very complicated: you have to split your groups not in half but into different proportions each time. This was our own best strategy, and one of the applicants came up with it as well.
I wish I could go over more of the strategies that different people came up with -- there were so many good ones! But I’ve been talking for too long already. So I leave you with the following challenge:
If you solved this problem and came up with a strategy that seems to beat 3125, then we promise you made a mistake somewhere. (We checked all the solutions very carefully; many of the errors were very subtle.) See if you can figure out where the error was!
If you came up with a strategy that doesn’t beat 3125, but intuitively feels like it should , figure out why it’s actually less efficient. For some of the best strategies, this is not so easy to understand: I still haven’t completely wrapped my mind around it…
(One last thing to remember as you think about this: the optimal strategy depends on the probability of each person being sick. For 5%, this one happens to be the best strategy we know, but other strategies will certainly beat it for other probabilities.)
Whew, that's it for #5, though there's lots more to tell!
Thanks for sticking it out for so long!
If we didn't get to your question, you can also post in the Mathcamp forum here on AoPS, at http://www.artofproblemsolving.com/community/c135_mathcamp - the Mathcamp staff will post replies, and you'll get student opinions, too.
We hope you enjoyed this year's QQ, and stay tuned for the 2022 Quiz (and the summer 2022 application) coming up in January. I'll stick around for another five minutes and answer non-Quiz questions about the program and the application process in case anybody has questions about Mathcamp itself.
Thanks again, everybody - good night!
But is there any strategy that is better asymptotically? Like something like $O(\log\log n)$ instead of $O(\log n)$
Not that we know of.
But again, it also depends on the probability of being sick
Thank you!
Good night!
Thanks, bye!
Thanks
TYSM!
Thanks
Thank you!
Thank you!!
Thank you!
Thank you guys!
I admire your attention span!
If you have further questions for Mathcamp, you can contact them at https://www.mathcamp.org/contact_us/
Copyright © 2024 AoPS Incorporated. This page is copyrighted material. You can view and print this page for your own use, but you cannot share the contents of this file with others.