AMC answer frequencies/adventures with Python
by BOGTRO, Oct 25, 2015, 2:52 AM
So tonight I decided I should actually figure out how Python works, since the language made no sense to me and yet it's somewhat important anyway. As usual, instead of actually learning how something works, I decided to start with a problem, write some god-awful attempt at solving it, and eventually build up something that works through (heavy) use of Google.
Anyway, the problem I decided to do was to determine the frequency of AMC answers over the last few years (where "few" in this sense turned out to be 14), because
- Someone once said, long ago, that this would be an interesting programming exercise.
- It seemed like something that would actually be a good practical programming exercise, as opposed to the USACO/Codeforces/etc. that I usually do.
- It's apparently extremely difficult to do this exercise in a major language other than Python, so it seemed like it would be well-suited to my purposes.
Anyway, after several adventures messing with the system path and trying to get the standard libraries to actually work, it appeared that "HelloWorld.py" in Notepad++ was actually running properly in the command prompt, so I could finally get started on the problem. But it turns out that Python 3.5 and Python 2.7 are very distinct things, and so
print 'Hello World!'
didn't work. Oops. Ok, so now I actually got HelloWorld.py to run properly, so I could finally actually get started on the problem.
All this is to lead up to a couple of hours later, when results were achieved! Each row below lists how many times A, B, C, D, and E (in that order) were the answer to that problem.
_________AMC 12_________
Problem 1: 6 1 12 4 5
Problem 2: 6 6 6 6 4
Problem 3: 5 7 4 6 6
Problem 4: 6 11 5 2 4
Problem 5: 6 5 6 10 1
Problem 6: 9 6 4 6 3
Problem 7: 3 10 6 6 3
Problem 8: 4 3 11 6 4
Problem 9: 7 8 6 5 2
Problem 10: 6 7 4 6 5
Problem 11: 3 3 9 6 7
Problem 12: 2 8 2 12 4
Problem 13: 1 11 5 8 3
Problem 14: 3 4 6 12 3
Problem 15: 3 5 7 11 2
Problem 16: 3 6 9 5 5
Problem 17: 4 9 4 9 2
Problem 18: 3 8 9 5 3
Problem 19: 4 7 4 5 8
Problem 20: 3 9 8 3 5
Problem 21: 9 2 7 7 3
Problem 22: 3 5 9 6 5
Problem 23: 7 7 7 4 3
Problem 24: 3 6 8 7 4
Problem 25: 3 8 5 7 5
Total: 112 162 163 164 99
_________AMC 10_________
Problem 1: 3 1 13 7 4
Problem 2: 5 7 7 3 6
Problem 3: 6 6 4 7 5
Problem 4: 6 6 8 6 2
Problem 5: 3 9 6 5 5
Problem 6: 6 6 7 7 2
Problem 7: 6 9 7 3 3
Problem 8: 4 9 4 6 5
Problem 9: 6 10 2 8 2
Problem 10: 6 4 8 3 7
Problem 11: 5 7 9 5 2
Problem 12: 8 4 10 3 3
Problem 13: 3 7 6 6 6
Problem 14: 4 7 6 9 2
Problem 15: 3 5 5 8 7
Problem 16: 3 9 4 7 5
Problem 17: 2 9 8 8 1
Problem 18: 2 7 4 8 7
Problem 19: 4 3 11 5 5
Problem 20: 4 5 5 11 3
Problem 21: 5 6 8 6 3
Problem 22: 4 8 7 8 1
Problem 23: 1 8 7 9 3
Problem 24: 5 9 3 5 6
Problem 25: 4 12 6 4 2
Total: 108 173 165 157 97
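To put the two "Total:" rows in proportion, here is a minimal sketch that turns them into percentages. The counts are copied from the output above; `letter_shares` is a hypothetical helper name and is not part of the script that produced the tables.

```python
# Totals copied from the two "Total:" rows above, in A, B, C, D, E order.
totals = [
    ("AMC 12", [112, 162, 163, 164, 99]),
    ("AMC 10", [108, 173, 165, 157, 97]),
]


def letter_shares(counts):
    """Return each letter's share of all answers as a percentage, rounded to 0.1."""
    n = sum(counts)
    return [round(100 * c / n, 1) for c in counts]


for test, counts in totals:
    shares = letter_shares(counts)
    line = "  ".join("{0}: {1}%".format(letter, share)
                     for letter, share in zip("ABCDE", shares))
    print("{0} ({1} answers)  {2}".format(test, sum(counts), line))
```

Each test has 700 answers in total (14 years, two dates, 25 problems), so an even split would be 20% per letter. On the AMC 12, B, C, and D each come out a bit above 23%, with A at 16% and E near 14%.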
Interestingly, B/C/D have almost exactly the same frequency, while A/E are far behind on both tests, though the 10 has a very slightly higher concentration of Bs. Use this extremely important information wisely.
For those interested, the code is below, but if you're planning on C&Ping, be warned that it takes about a minute to run.
Python code
import urllib.request


def print_12(answer_freq_12, letter_freq_12):
    '''
    Prints the AMC 12 data in a more easily readable format
    '''
    print('_________AMC 12_________')
    for i in range(25):
        print('Problem {0}:'.format(i+1), end=" ")
        for j in range(5):
            print(answer_freq_12[i][j], end=" ")
        print()
    print('Total:', end=" ")
    for i in range(5):
        print(letter_freq_12[i], end=" ")
    print()


def print_10(answer_freq_10, letter_freq_10):
    '''
    Prints the AMC 10 data in a more easily readable format
    '''
    print('_________AMC 10_________')
    for i in range(25):
        print('Problem {0}:'.format(i+1), end=" ")
        for j in range(5):
            print(answer_freq_10[i][j], end=" ")
        print()
    print('Total:', end=" ")
    for i in range(5):
        print(letter_freq_10[i], end=" ")
    print()


# Note: this name shadows Python's built-in map(), which the script does not otherwise use.
def map(i):
    '''
    Maps an answer choice (A, B, C, D, or E) to its numerical value.
    Returns 5 (signifying an error) if the input is not a possible answer choice.
    '''
    if i=='A':
        return 0
    if i=='B':
        return 1
    if i=='C':
        return 2
    if i=='D':
        return 3
    if i=='E':
        return 4
    return 5


def update(year, grade, test, prev_ans_freq, prev_let_freq):
    '''
    Updates answer frequencies, both by problem and overall.
    Takes in 5 values as input:
    Year: The year of the test
    Grade: Either 10 or 12, signifying the level of the test (AMC 10 or AMC 12)
    Test: Either 'A' or 'B', signifying the date of the test given (A-date or B-date)
    prev_ans_freq: Previous answer frequencies by problem
    prev_let_freq: Previous answer frequencies overall
    '''
    url='http://artofproblemsolving.com/wiki/index.php/{0}_AMC_{1}{2}_Answer_Key'.format(year, grade, test)
    response=urllib.request.urlopen(url)
    html_str=str(response.read())
    idx=0
    '''
    Answer choices in the html are of the form <li>answer</li> or #. answer;
    account for both possibilities
    '''
    for i in range(len(html_str)-4):
        if(html_str[i:i+4] == '<li>'):
            if map(html_str[i+4]) > 4:
                continue
            if idx >= 25:
                continue
            prev_ans_freq[idx][map(html_str[i+4])] = prev_ans_freq[idx][map(html_str[i+4])] + 1
            prev_let_freq[map(html_str[i+4])] = prev_let_freq[map(html_str[i+4])] + 1
            idx = idx + 1
        size=len('{0}. '.format(idx+1))
        if(html_str[i:i+size] == '{0}. '.format(idx+1)):
            if map(html_str[i+size]) > 4:
                continue
            if idx >= 25:
                continue
            prev_ans_freq[idx][map(html_str[i+size])] = prev_ans_freq[idx][map(html_str[i+size])] + 1
            prev_let_freq[map(html_str[i+size])] = prev_let_freq[map(html_str[i+size])] + 1
            idx = idx + 1
    return (prev_ans_freq, prev_let_freq)


'''
Initializes answer_freq_10/12 to hold answer frequencies by problem,
and letter_freq_10/12 to hold overall answer frequencies
'''
answer_freq_12=[[0]*5 for i in range(25)]
answer_freq_10=[[0]*5 for i in range(25)]
letter_freq_12=[0]*5
letter_freq_10=[0]*5

'''
Updates answer frequencies for tests from 2002 through 2015
'''
for i in range(2002,2016):
    (answer_freq_12, letter_freq_12)=update(i, 12, 'A', answer_freq_12, letter_freq_12)
    (answer_freq_12, letter_freq_12)=update(i, 12, 'B', answer_freq_12, letter_freq_12)
    (answer_freq_10, letter_freq_10)=update(i, 10, 'A', answer_freq_10, letter_freq_10)
    (answer_freq_10, letter_freq_10)=update(i, 10, 'B', answer_freq_10, letter_freq_10)

print_12(answer_freq_12, letter_freq_12)
print_10(answer_freq_10, letter_freq_10)
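The character-by-character scan in `update` could probably also be written with a regular expression. What follows is only a sketch under assumptions: it handles just the `<li>answer</li>` form (not the "#. answer" form), the sample HTML string is made up rather than taken from a real answer key page, and `extract_answers` and `tally` are hypothetical names that do not appear in the script above.

```python
import re

# Matches "<li>", a single answer letter A-E, then "</li>" (optional whitespace allowed).
ANSWER_ITEM = re.compile(r"<li>\s*([A-E])\s*</li>")


def extract_answers(html, limit=25):
    """Return up to `limit` answer letters found in <li> items, in page order."""
    return ANSWER_ITEM.findall(html)[:limit]


def tally(answers, by_problem, overall):
    """Add one test's answers to the per-problem and overall counts, in place."""
    for problem, letter in enumerate(answers):
        col = "ABCDE".index(letter)  # A -> 0, ..., E -> 4
        by_problem[problem][col] += 1
        overall[col] += 1


# Made-up sample for illustration only; NOT a real answer key.
sample_html = "<ol><li>C</li><li>A</li><li>E</li></ol><ul><li>Home</li></ul>"

by_problem = [[0] * 5 for _ in range(25)]
overall = [0] * 5
answers = extract_answers(sample_html)
tally(answers, by_problem, overall)

print(answers)  # ['C', 'A', 'E']
print(overall)  # [1, 0, 1, 0, 1]
```

One difference from the original scan: this pattern requires the closing `</li>` right after the letter, so a list item like `<li>Home</li>` is skipped, whereas the original only looks at the first character after `<li>`. Whether that matters depends on the actual page markup, which this sketch does not check.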