Frequency Analysis

by fortenforge, Jul 16, 2009, 3:19 AM

If you have been following the shouts, you will have noticed that someone mentioned something called "frequency analysis".
Frequency analysis is a technique that makes it easy to break substitution ciphers.

To understand how it works, you first have to understand that in English, or in any language, some letters appear more often than others.

For example, in English the most common letters are E, T, A, O, I, and N. The least common letters are Q, X, Z, and J.

In the table below, you can see the frequency analysis for the first two paragraphs of this entry next to the average frequency analysis for English. The first row is the alphabet, the second is the frequency of each letter in the first two paragraphs, and the third is the average frequency of each letter in English. The numbers are percentages, rounded to whole numbers.

 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
 8  1  2  3 11  2  3  4  7  0  1  6  2  9  8  2  1  5  8  9  4  1  2  0  3  0
 8  2  3  4 13  2  2  6  7  0  1  4  2  7  8  2  0  6  6  9  3  1  2  0  2  0
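A row like the second one can be produced with a few lines of Python. This is only a minimal sketch: the function name `letter_percentages` is made up for illustration, and it assumes that only the letters A to Z are counted and that each value is rounded to a whole percentage.

```python
from collections import Counter
import string

def letter_percentages(text):
    """Return {letter: rounded percentage} for the letters A-Z in text."""
    # Keep only the 26 English letters; spaces, digits and punctuation are ignored.
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    counts = Counter(letters)
    total = len(letters)
    if total == 0:
        return {letter: 0 for letter in string.ascii_uppercase}
    return {letter: round(100 * counts[letter] / total)
            for letter in string.ascii_uppercase}

sample = "Frequency Analysis is an algorithm to easily decrypt substitution ciphers."
table = letter_percentages(sample)
print(" ".join(f"{letter}:{table[letter]}" for letter in string.ascii_uppercase))
```

A short sample like this one will be noisier than a full paragraph, so its numbers will sit further from the English averages.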

The largest difference between this text and the English average is 2, which occurs for the letters E, H, L, N, and S. The same holds for nearly any piece of English text: the letter frequencies will always be roughly the same. Of course, the longer the text, the more closely it will match the averages. It is just like flipping a coin: the more times you flip it, the closer the percentage of heads gets to 50%.
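The comparison between the two rows can be checked mechanically. The sketch below copies both rows from the table above and reports the largest gap together with the letters that have it; the helper name `largest_gaps` is just an illustrative choice.

```python
import string

# Rows copied from the table above (values for A through Z).
sample_row = [8, 1, 2, 3, 11, 2, 3, 4, 7, 0, 1, 6, 2,
              9, 8, 2, 1, 5, 8, 9, 4, 1, 2, 0, 3, 0]
english_row = [8, 2, 3, 4, 13, 2, 2, 6, 7, 0, 1, 4, 2,
               7, 8, 2, 0, 6, 6, 9, 3, 1, 2, 0, 2, 0]

def largest_gaps(observed, expected):
    """Return the biggest per-letter gap and the letters where it occurs."""
    gaps = [abs(o - e) for o, e in zip(observed, expected)]
    biggest = max(gaps)
    letters = [string.ascii_uppercase[i] for i, g in enumerate(gaps) if g == biggest]
    return biggest, letters

print(largest_gaps(sample_row, english_row))
# prints: (2, ['E', 'H', 'L', 'N', 'S'])
```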

If I encrypt the first two paragraphs with a Caesar cipher of key 1 I get:

JGZPV IBWFC FFOMP PLJOH BUUIF TIPVU TZPVX
JMMOP UJDFT PNFPO FNFOU JPOFE TPNFU IJOHD
BMMFE GSFRV FODZB OBMZT JTGSF RVFOD ZBOBM
ZTJTJ TBOBM HPSJU INUPF BTJMZ EFDSZ QUTVC
TUJUV UJPOD JQIFS TUPVO EFSTU BOEIP XJUXP
SLTZP VIBWF UPGJS TUVOE FSTUB OEUIB UJOUI
FFOHM JTIMB OHVBH FPSJO BOZMB OHVBH FTPNF
MFUUF STXJM MBQQF BSNPS FPGUF OUIBO PUIFS
TBOET PNFXJ MMBQQ FBSMF TTPGU FO.
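For anyone who wants to reproduce the ciphertext, here is a rough Python sketch of a Caesar cipher that drops spaces and punctuation and prints the result in blocks of five letters, as above. The function names are my own choices for illustration.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar_encrypt(plaintext, key):
    """Shift each letter forward by key positions, dropping everything else."""
    shifted = []
    for ch in plaintext.upper():
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
    return "".join(shifted)

def group_in_fives(text):
    """Split text into space-separated blocks of five letters."""
    return " ".join(text[i:i + 5] for i in range(0, len(text), 5))

print(group_in_fives(caesar_encrypt("If you have been looking at the shouts", 1)))
# prints: JGZPV IBWFC FFOMP PLJOH BUUIF TIPVU T
```

The output matches the start of the first ciphertext line above.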

Let's do the frequency analysis on this ciphertext:

 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
 0  8  1  2  3 11  2  3  4  7  0  1  6  2  9  8  2  1  5  8  9  4  1  2  0  3
 8  2  3  4 13  2  2  6  7  0  1  4  2  7  8  2  0  6  6  9  3  1  2  0  2  0

It is clear from the frequency analysis that the second row has been shifted to the right by one column. This tells us that the message was encrypted using a Caesar cipher of key 1. If we shift the row back one column, the numbers line up with the averages again, and we can easily decrypt the message.

If I had used a general substitution cipher, it would have been a little more difficult, but still doable. We would immediately notice that whichever letter has a frequency of 11 is most likely E, because E averages 13. From there we could move on to the other letters, gradually decrypting the cipher. In the next post I will show you an actual example.
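For the Caesar case, the "shift the row back until it matches" step can be sketched in Python. This is one possible approach, not the only one: it tries all 26 keys and keeps the one with the smallest total gap between the shifted ciphertext row and the English averages. The rows are copied from the tables in this post (`english_row` is the average row from the first table), and the names `mismatch` and `guess_caesar_key` are made up for illustration.

```python
# Rows copied from the tables in this post (values for A through Z).
cipher_row = [0, 8, 1, 2, 3, 11, 2, 3, 4, 7, 0, 1, 6,
              2, 9, 8, 2, 1, 5, 8, 9, 4, 1, 2, 0, 3]
english_row = [8, 2, 3, 4, 13, 2, 2, 6, 7, 0, 1, 4, 2,
               7, 8, 2, 0, 6, 6, 9, 3, 1, 2, 0, 2, 0]

def mismatch(cipher, english, key):
    """Total gap between the rows after shifting the ciphertext row back by key."""
    # A plaintext letter at position i shows up at position i + key in the ciphertext.
    return sum(abs(cipher[(i + key) % 26] - english[i]) for i in range(26))

def guess_caesar_key(cipher, english):
    """Return the key (0-25) whose shifted row is closest to the English averages."""
    return min(range(26), key=lambda k: mismatch(cipher, english, k))

print(guess_caesar_key(cipher_row, english_row))
# prints: 1
```

Summing absolute differences is a simple way to score a match; it works here because the text is long enough to sit close to the averages, and it would be less reliable on a very short message.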

One note: if the plaintext had been in a language other than English, the average frequencies would have been different, so comparing against the English averages would have made the message difficult to decrypt.

