C code for encrypting with, and breaking, the Vigenère cipher

Here you can find C source code for the attack described in class on Feb 1. I wrote two programs: one encrypts and decrypts using a keyword, and the other recovers the keyword from a ciphertext.

Get the code

The source code resides in these two files, vig.c and crackvig.c. Compile and run them like so:

bingsun2% gcc -ansi -o vig.exe vig.c
bingsun2% gcc -ansi -o crackvig.exe crackvig.c -lm

bingsun2% cat > test_input.txt
When all you have is a hammer, everything looks like a nail.
^D (Ctrl-D)

bingsun2% ./vig.exe BLAH < test_input.txt
You chose the password BLAH
Ytfv cxm gqg iixq ja c tbuoqs, mxqsgvtjvi xpwme mqmq b vcum.

To decrypt, add a -d flag after vig.exe and use the same password. Try it out with different passwords before running the cracking software.
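For readers who want to see the mechanics, here is a minimal sketch of the encryption step. It is not the contents of vig.c, and the names vig_sketch and shift_letter are made up for illustration. The conventions it uses (key letter A shifts by 1, B by 2, and so on; only letters consume a key position; case is preserved) are assumptions inferred from the BLAH sample run above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from vig.c): rotate one letter forward by
   `shift` positions, 0 <= shift <= 25, preserving case.
   Non-letters are returned unchanged. */
static char shift_letter(char c, int shift)
{
    unsigned char u = (unsigned char)c;
    if (isupper(u)) return (char)('A' + (u - 'A' + shift) % 26);
    if (islower(u)) return (char)('a' + (u - 'a' + shift) % 26);
    return c;
}

/* Hypothetical function (not from vig.c): encrypt `text` in place when
   decrypt == 0, decrypt it in place otherwise.
   Assumed convention, inferred from the sample run: key letter A means
   shift 1, B means shift 2, ..., Z means shift 26 (the same as 0).
   The key position advances only when a letter is processed. */
void vig_sketch(char *text, const char *key, int decrypt)
{
    size_t klen = strlen(key);
    size_t k = 0;   /* number of letters processed so far */
    size_t i;

    if (klen == 0) return;   /* nothing to do without a key */

    for (i = 0; text[i] != '\0'; i++) {
        unsigned char u = (unsigned char)text[i];
        int shift;

        if (!isalpha(u)) continue;   /* spaces, punctuation: copy through */

        shift = toupper((unsigned char)key[k % klen]) - 'A' + 1;
        shift = ((shift % 26) + 26) % 26;         /* normalize to 0..25 */
        if (decrypt) shift = (26 - shift) % 26;   /* undo the rotation */

        text[i] = shift_letter(text[i], shift);
        k++;
    }
}
```

Under these assumptions, calling vig_sketch(buf, "BLAH", 0) on the hammer sentence yields the ciphertext shown in the sample run, and calling it again with a nonzero third argument restores the plaintext.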

Running the cracking program

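Take a text file of several hundred characters and encrypt it with a command such as ./vig.exe blah < input.txt > ciphertext.txt. Then feed the ciphertext to the cracking program, passing your guess for the key length as a flag (-1, -2, and so on). The session below uses the password shelf and the files plain.txt and cipher.txt: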

bingsun2% ./vig.exe shelf < plain.txt > cipher.txt

You chose the password shelf

bingsun2% cat cipher.txt | crackvig.exe -1
Guess key length = 1
Guessing key:  L --- confidence -511.052023
bingsun2% cat cipher.txt | crackvig.exe -2
Guess key length = 2
Guessing key:  LL --- confidence -511.052023
bingsun2% cat cipher.txt | crackvig.exe -3
Guess key length = 3
Guessing key:  LSL --- confidence -474.605454
bingsun2% cat cipher.txt | crackvig.exe -4
Guess key length = 4
Guessing key:  LLES --- confidence -508.956592
bingsun2% cat cipher.txt | crackvig.exe -5
Guess key length = 5
Guessing key:  SHELF --- confidence 644.125537
bingsun2% cat cipher.txt | crackvig.exe -6
Guess key length = 6
Guessing key:  LELLSS --- confidence -462.178905
bingsun2% 

Note that the confidence value is negative for every wrong key length and positive only for the correct one (5). Notice also that for every guessed length, the program reports the most likely key of that length.
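This page does not say how crackvig.c picks its key or computes its confidence value, so the following is only a sketch of one common approach, not the program's actual method. For a guessed key length, it splits the ciphertext letters into columns (one per key position) and, for each column, picks the shift whose decryption best matches typical English letter frequencies. The function name guess_key_sketch and the frequency table are assumptions, and the key letters follow the same assumed convention as the sample runs (A means shift 1). The returned score is a plain frequency dot product, so its scale and sign will not match the confidence numbers printed above.

```c
#include <ctype.h>
#include <stddef.h>

/* Approximate relative frequencies of A..Z in English text.
   Assumed table, not taken from crackvig.c. */
static const double english_freq[26] = {
    0.082, 0.015, 0.028, 0.043, 0.127, 0.022, 0.020, 0.061, 0.070,
    0.002, 0.008, 0.040, 0.024, 0.067, 0.075, 0.019, 0.001, 0.060,
    0.063, 0.091, 0.028, 0.010, 0.024, 0.002, 0.020, 0.001
};

/* Hypothetical function: guess a key of length `keylen` for `cipher`.
   Writes keylen uppercase letters plus a terminating '\0' into key_out
   (which must hold keylen + 1 chars) and returns the summed score. */
double guess_key_sketch(const char *cipher, size_t keylen, char *key_out)
{
    double total = 0.0;
    size_t col;

    if (keylen == 0) {
        key_out[0] = '\0';
        return 0.0;
    }

    for (col = 0; col < keylen; col++) {
        long counts[26] = {0};
        double best_score = -1.0;
        int best_shift = 0;
        int s, c;
        size_t i, letter_index = 0;

        /* Count the letters that were encrypted with key position col.
           Only letters advance the key position. */
        for (i = 0; cipher[i] != '\0'; i++) {
            unsigned char u = (unsigned char)cipher[i];
            if (!isalpha(u)) continue;
            if (letter_index % keylen == col) counts[toupper(u) - 'A']++;
            letter_index++;
        }

        /* Try all 26 shifts; keep the one whose decryption of this
           column correlates best with English letter frequencies. */
        for (s = 0; s < 26; s++) {
            double score = 0.0;
            for (c = 0; c < 26; c++)
                score += counts[c] * english_freq[(c - s + 26) % 26];
            if (score > best_score) {
                best_score = score;
                best_shift = s;
            }
        }

        /* Assumed convention: shift 1 is 'A', ..., shift 0 (= 26) is 'Z'. */
        key_out[col] = (char)('A' + (best_shift + 25) % 26);
        total += best_score;
    }

    key_out[keylen] = '\0';
    return total;
}
```

With only a few letters per column the frequency match is unreliable, which is one reason to use a ciphertext of several hundred characters.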

You might notice another phenomenon: when you guess the wrong key length, the computer guesses a meaningless keyword (garbage in, garbage out), but these bad keywords contain letters from the real keyword. Why?