Administrivia Prerequisites, text and syllabus Exams, HW and grading policy Homework sets Homework solutions Handouts Lecture outlines Maple Demos TeX stuff

Topics in Applied Mathematics: Cryptography

 

Administrivia

The course meets at 11:30 MWF in Baker Hall A51 (the Giant Eagle auditorium). Office hours are 10:30-11:30 MWF or by appointment, please send email to jcumming@andrew.cmu.edu to arrange an appointment. Homework will be set most Mondays and will be due on the following Monday. Late homework will not be accepted under any circumstances, but the two lowest homework scores will be dropped.

The grader for the course is Danut Arama. Send him email if you have any questions about grading.

 

Prerequisites, text and syllabus

Prerequisites: A basic knowledge of calculus.

Text: Garrett "Making, breaking codes". Schneier "Applied cryptography" is a useful reference for some of the course material as is Koblitz "A course in number theory and cryptography".

It is quite likely that you have a copy of the first printing of Garrett which is known to have some typos. Here is a list of errata prepared by the author. If you spot any more please pass them on to me and I will forward our collection to the author at the end of term.

The most heinous typo (known to me) is on page 33 in the table of frequencies of letters in English text. I have made a scan of the offending page.

This syllabus is vague and provisional. As the course progresses I will create a detailed record of what was covered in each class.

We will use the computer algebra system Maple to automate some of the tedious computation required in cryptography. We will also learn to use TeX which is the standard system for computerised typesetting of mathematical documents.

If you are using an Andrew Unix workstation then TeX and Maple are already installed. TeX is freely available for most popular platforms, let me know if you are having any trouble finding a copy that works for you.

 

Exams, HW and grading policy

There will be a midterm and a final. Grades will be assigned according to a formula in which homework counts 30 percent, the midterm counts 25 percent and the final counts 45 percent.

 

Homework sets and exams

  1. This homework is in three parts: Part One, Part Two and Part Three. It is due by class time on Monday 27 January. You do not need to hand in anything for part 1 or part 2. You should email solutions to Part 3 to the dedicated address cryptohw@legba.math.cmu.edu.
  2. This homework is in two parts, you only need hand something in for part two. Homework should be emailed to the same address as last week. Part 1 is about getting Maple output into your TeX documents, Part 2 is about Vigenere encryption and related topics.
  3. Homework 3.
  4. Homework 4. Since this one was given out late it is due on Friday 28 February.
  5. The midterm. Due Wednesday 19 March.
  6. Homework 6. Due Wednesday April 9.
  7. Homework 7. Due Wednesday April 23.
  8. The final.

 

Homework solutions

  1. Solutions for HW 1 in TeX. You may find it helpful to read the TeX code as I have sprinkled it with comments about how to do certain things in TeX.
  2. Solutions for HW 2 in TeX.
  3. Solutions for HW 3 in PostScript.
  4. Solutions for HW 4 in PostScript.
  5. Solutions for HW 6 in PostScript and a Maple Worksheet with some related material.
  6. Solutions for HW7.
  7. Solution to Q1 on midterm together with solutions to the reminaing questions.

 

Handouts

 

Lecture outlines

Page references are to Garrett unless otherwise stated.

  1. Introduction to the course. Cryptology = Cryptography (making codes) + Cryptanalysis (breaking codes). Encryption and decryption keys. Symmetric cryptosystems (p. xviii). The Caesar cipher and its trivial variations (p. 1-5). Classifying attacks on cryptosystems by how much data is available (p. xviii). Question: how can we automate the process of recognising English text? (p.31-37).
  2. Trivial known ciphertext attack on the Caesar cipher by brute force (only 25 keys). Permutations. The number of permutations of an n-element set. 26! is quite large (403291461126605635584000000). But permuting letters (cryptograms) does not give a good cryptosystem because of statistical features of English and the "monoalphabetic" property of the cipher (p39-42). Counting letters, digrams and trigrams (p.31-37).
  3. More on the statistics of English. English has strong local correlation (say between successive letters) but at large distances there is little or no correlation between letters: we will use both of these features. The one-time pad (p. 10-13) What is a random sequence and how should we obtain one? Never reuse a one-time pad (explanation later). The Vigenere cipher (p. 58-61). Basic probability theory. Outcomes, events, random variables. Expectation value of a random variable (p. 21-30 and 69-72) .
  4. The Vigenere cipher continued. For a key of length m it looks like m interleaved shift ciphers. It is a polyalphabetic substitution cipher. Weakness: every m characters we come back to the same character in the key, this leaves some structure in the ciphertext which we can exploit. Friedman's index of coincidence (IOC) (p.74) . Typically this is about 0.065 when we compare two unrelated chunks of English text (considered just as streams of lowercase letters) (p. 77) . It is typically about 0.038 when we compare an English stream with a random stream (p. 77). Encrypting both streams with a substitution cipher does not change the IOC. At the end I did some computer demos, when I have polished them up I will make them available on this page.
  5. More computer demos. What should we do once we have found the key length for a Vigenere cipher? A crude statistical model for English text. What is the expected value of the index of coincidence using this crude model?
  6. Computing the expected value of the IOC using the crude statistical model from last time. Adding random variables and multiplying by constants. Linearity of expectation. The space R^n. Inner product (aka scalar product or dot product) of two elements in R^n. Geometrical interpretation when n=2 or n=3.
  7. Why bother with doing geometry in R^n? It can help us decrypt Vigenere, by giving an efficient way of guessing shift. Addition in R^n, multiplying elements of R^n by reals. Properties of the scalar product. Proof of the inequality |a.b| less than or equal to |a| times |b|. Angle between two vectors in R^n. Parallelism. Nonzero a and b are parallel if and only if |a.b|=|a| |b|
  8. Using geometry in R^n to automate decrypting a shift cipher (which is useful because once you determine that the Vigenere key has length m, you need to decrypt m separate shift enciphered sequences)
  9. Unique factorisation in the natural numbers. GCD of integers a and b. GCD is always of the form M a + N b, also the set of numbers of form M a + N b = set of multiples of the GCD of a and b. Euclid's algorithm gives an efficient way of computing gcd of a and b, also a representation in the M a + N b form.
  10. The remainder in Euclid's algorithm goes down by at least a factor of two every two steps. So if we start on a, b with a < b then it finishes in at worst about 2 times log-to-the-base-2(a) many steps. Turing machines.
  11. Turing-computable functions. Not all functions are computable. Church's thesis. Size of an integer = number of binary digits needed to represent it, size of n is roughly log-to-the-base-2(n). Efficient: running time is polynomial in the size of the input. Inefficient: running time is exponential in the size of the input.

 

Maple demos

 

Tex stuff

You need some TeX style files if you want to take the TeX output exported by Maple and include it in your TeX documents. This will be in HW 2 (probably) but for anyone who wants to experiment right now this is a gzipped tar file containing the relevant style files. Unpack it in your TeX directory.

People who use Microsoft Windows at home have encountered a problem previewing their TeX output when they are logged in remotely to Andrew Unix machines. This is because the previewer program xdvi is am X-Windows application and Microsoft Windows does not play nicely with X. There are various options for solving this problem.

  1. Go to a cluster with Unix machines, or install Linux.
  2. Install an X-Windows server on your Microsoft Windows machine, then you will be able to use any X application on any remote machine (if you use SSH, make sure to to turn on X forwarding). You can find a free X-Windows server for Microsoft Windows here.
  3. Install TeX on your Microsoft Windows machine and do all of your TeX activity locally. A free version of TeX for Microsoft Windows can be found here.
  4. Please let me know if you encounter any problems with options 2) or 3).