Banburismus


Banburismus was a cryptanalytic process invented by Alan Turing at Bletchley Park in England during the Second World War. Hut 8 used it to help break German Kriegsmarine (naval) Enigma messages. The procedure used sequential conditional probability, an early application of Bayesian inference, to infer information about the likely settings of the Enigma machine. It gave rise to Turing's unit for the weight of evidence, the "ban", a base-10 logarithmic measure closely related to the information measures later formalised by Shannon.

Hut 8 performed Banburismus continually for two years, stopping in 1943 only because the latest generation of ultra-fast Bombe could run a wheel-order in just two minutes, and it became easier just to brute-force the keys. Hugh Alexander was regarded as the best of the banburists — he and Jack Good considered the process more an intellectual game than a job. "Not too hard as to be impossible, but difficult enough to be an enjoyable challenge," they commented.

History

At least as early as 1939, Alan Turing correctly deduced that the message-settings of Kriegsmarine Enigma signals were enciphered on a common Grundstellung (starting position of the rotors), then were super-enciphered with a bigram lookup table. However, without a copy of the bigram tables, Hut 8 were unable to start attacking the traffic until the summer of 1940. The code "break" happened after an armed trawler called "Polares" was surprised and seized by British forces in the North Sea off Norway in late April that year. The Germans didn't have time to destroy all their cryptographic documents, and amongst the captured material were settings-lists for 22–29 April and some message pads with paired plaintext and enciphered messages.

The bigram tables themselves were not part of the capture, but Bletchley Park were able to use the settings-lists to read (retrospectively) all the Kriegsmarine traffic that had been intercepted from 22 to 29 April. This in turn let them do a partial reconstruction of the bigram tables.

The stage was set for the first attempt to use Banburismus to attack Kriegsmarine traffic from April 30 onwards. Eligible days were those where at least 200 messages could be found for which the partial bigram-tables would decipher the indicators. The first day to be broken was May 8, 1940, thereafter celebrated as "Foss's Day" in honour of Hugh Foss, the cryptanalyst who achieved the feat.

This feat took Foss until November that year. The traffic was by then uselessly out of date, but the break-in proved that Banburismus could work. It also allowed much more of the bigram tables to be reconstructed, and that in turn allowed more days from May and June to be broken retrospectively. However, the Kriegsmarine changed the bigram tables in mid-June, and Hut 8 were reduced to occasional decrypts (mostly due to "kisses", messages also sent in another cipher that could already be read, and "gardening", mine-laying staged to provoke predictable signals) until the next pinch in early 1941.

Banburismus was to become the standard procedure against Kriegsmarine Enigma from early 1941 until mid-1943.

Basic principles

Banburismus is an attack on the indicators (the encrypted message settings) of Kriegsmarine Enigma traffic. It can only be used when an Enigma machine has been used with a fixed setting (the "Grundstellung") to create those indicators. The indicators effectively form a set of three-letter Enigma messages "in depth". The principle of Banburismus is to guess the plaintext of those three-letter messages (the message-settings) by statistical examination of the messages themselves.

The principle is simple. If you take two sentences in any natural language, lay them one above the other and count where letters in one message are the same as the matching letter in the other message, then you will find many more matches than you would have had the sentences been random junk. If both messages were enciphered with an Enigma machine at the same message-setting, then the matches will occur just as they did in the plaintexts. However, if the message-settings were not the same then the two ciphertexts will compare as if they were random junk, and you'd expect about one match every 26 characters.

This principle allows an attacker to take two messages whose indicators differ only in the third character, and slide them against each other looking for the giveaway repeat pattern that shows where they align in depth. This gives a vital clue as to the possible third characters of the plaintexts of those indicators.

This comparison of two messages, looking for the repeats, was made easier by punching the messages onto thin cards about 250 mm high (10") by several metres wide (they had different cards for different lengths of message). A hole at the top of a column on the card represented an 'A' at that position, a hole at the bottom represented a 'Z'. Two message-cards were laid on top of each other on a light-box and where the light shone through, there was a repeat in the messages. This made it much easier to count the repeats.

The cards were printed in Banbury, England. They became known as 'banburies' in BP, and hence the procedure using them was Banburismus.

A quick example

Message with indicator "VFG": GXCYBGDSLVWBDJLKWIPEHVYGQZWDTHRQXIKEESQSSPZXARIXEABQIRUCKHGWUEBPF

Message with indicator "VFX": YNSCFCCPVIPEMSGIZWFLHESCIYSPVRXMCFQAXVXDVUQILBJUABNLKMKDJMENUNQ

Hut 8 would punch these onto banburies and count the repeats for all valid offsets -25 letters to +25 letters. There are two promising positions:

GXCYBGDSLVWBDJLKWIPEHVYGQZWDTHRQXIKEESQSSPZXARIXEABQIRUCKHGWUEBPF
         YNSCFCCPVIPEMSGIZWFLHESCIYSPVRXMCFQAXVXDVUQILBJUABNLKMKDJMENUNQ
                       - --  -   -          -  -   --

This is 9 repeats (including two bigrams) in an overlap of 56 characters.

The other promising setup looks like this:

GXCYBGDSLVWBDJLKWIPEHVYGQZWDTHRQXIKEESQSSPZXARIXEABQIRUCKHGWUEBPF
        YNSCFCCPVIPEMSGIZWFLHESCIYSPVRXMCFQAXVXDVUQILBJUABNLKMKDJMENUNQ
                 ---

This is just a single trigram in an overlap of 57 characters.
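The sliding comparison can be sketched in a few lines of Python. This is only an illustration of the counting step, not a model of the banbury sheets themselves; the function name `repeats_at` and the sign convention for the offset (positive means the second message is slid to the right) are choices made for this sketch, not part of the original procedure.

```python
MSG_VFG = "GXCYBGDSLVWBDJLKWIPEHVYGQZWDTHRQXIKEESQSSPZXARIXEABQIRUCKHGWUEBPF"
MSG_VFX = "YNSCFCCPVIPEMSGIZWFLHESCIYSPVRXMCFQAXVXDVUQILBJUABNLKMKDJMENUNQ"


def repeats_at(a, b, offset):
    """Slide b `offset` places to the right of a (negative = to the left).

    Returns (positions in a where the two letters agree, overlap length).
    """
    start = max(0, offset)
    end = min(len(a), len(b) + offset)
    hits = [i for i in range(start, end) if a[i] == b[i - offset]]
    return hits, max(0, end - start)


if __name__ == "__main__":
    # Try every offset from -25 to +25 and show the three with most repeats.
    scores = {k: repeats_at(MSG_VFG, MSG_VFX, k) for k in range(-25, 26)}
    best = sorted(scores.items(), key=lambda kv: -len(kv[1][0]))[:3]
    for k, (hits, overlap) in best:
        print(f"offset {k:+d}: {len(hits)} repeats in an overlap of {overlap}")
```

For the two messages above, `repeats_at(MSG_VFG, MSG_VFX, 9)` finds 9 agreeing positions in an overlap of 56, and an offset of 8 finds only the trigram, in an overlap of 57.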

Bayesian statistics allows us to judge which of these situations is more likely to represent messages in depth. As you might expect, the former is the winner, with odds of 5:1 on; the latter is only 2:1 on.
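A much simplified version of this scoring can be sketched as a log-likelihood ratio measured in bans. The sketch treats every position independently and assumes a repeat rate of 1 in 17 for messages in depth; that figure is an assumption brought in for illustration and is not taken from this article. It gives no extra credit for bigrams or trigrams and applies no prior odds, so it is not expected to reproduce the 5:1 and 2:1 figures quoted above, only the ordering of the two alignments.

```python
import math

P_RANDOM = 1 / 26  # chance that two unrelated cipher letters agree
P_DEPTH = 1 / 17   # ASSUMED repeat rate for messages in depth (illustrative)


def weight_of_evidence(repeats, overlap, p_depth=P_DEPTH, p_random=P_RANDOM):
    """Log10 likelihood ratio (in bans) for 'in depth' against 'not in depth'.

    Each repeat adds a positive score and each non-repeat a small negative one.
    """
    misses = overlap - repeats
    return (repeats * math.log10(p_depth / p_random)
            + misses * math.log10((1 - p_depth) / (1 - p_random)))


if __name__ == "__main__":
    for label, repeats, overlap in [("offset 9", 9, 56), ("offset 8", 3, 57)]:
        bans = weight_of_evidence(repeats, overlap)
        print(f"{label}: {bans:.2f} bans ({10 * bans:.1f} decibans)")
```

Under these assumptions the first alignment scores roughly 1.2 bans and the second roughly 0.05 bans, so the first is still clearly preferred.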

BP would work on the principle that the plaintext of "VFX" is 9 characters ahead of "VFG", or (in terms of just the third letter of the texts) that "X = G+9".

Scritchmus

Hut 8 might have evidence from other message-pairs (with only the third indicator letter differing) showing that "X = Q-2", "H = X-4" and "B = G+3". They could then construct a "chain" as follows:

G--B-H---X-Q

Now, if you lay this knowledge over the letter-sequence of an Enigma rotor, you find that quite a few possibilities are discounted due to breaking either the "reciprocal" property or the "no-self-ciphering" property of the Enigma machine:

G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible

 G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (G enciphers to B, yet B enciphers to E)

  G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (H apparently enciphers to H)

   G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (G enciphers to D, yet B enciphers to G)

    G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (B enciphers to H, yet H enciphers to J)

     G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (Q apparently enciphers to Q)

      G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (G apparently enciphers to G)

       G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (G enciphers to H, yet H enciphers to M)

        G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible

         G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible

          G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible

           G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (H enciphers to Q, yet Q enciphers to W)

            G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (X enciphers to V, yet Q enciphers to X)

             G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (B enciphers to Q, yet Q enciphers to Y)

              G--B-H---X-Q
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (X enciphers to X)

Q              G--B-H---X->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible

-Q              G--B-H---X->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (Q enciphers to B, yet B enciphers to T)

X-Q              G--B-H--->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible

-X-Q              G--B-H-->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (X enciphers to B, yet B enciphers to V)

--X-Q              G--B-H->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible

---X-Q              G--B-H->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (X enciphers to D, yet B enciphers to X)

H---X-Q              G--B->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (Q enciphers to G, yet G enciphers to V)

-H---X-Q              G--B->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (H enciphers to B, yet Q enciphers to H)

B-H---X-Q              G-->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible (note the G enciphers to X, X enciphers to G property)

-B-H---X-Q              G->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is impossible (B enciphers to B)

--B-H---X-Q              G->
ABCDEFGHIJKLMNOPQRSTUVWXYZ
......... is possible

The so-called "end-wheel alphabet" is already limited to just nine possibilities, merely by establishing a chain of five letters derived from a mere four message-pairs. Hut 8 would now try fitting other letter-chains (ones with no letters in common with the first chain) into these nine candidate end-wheel alphabets.
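The elimination above is mechanical enough to sketch in code. The following is an illustrative reconstruction, not Hut 8's actual procedure: the helper names `build_chain`, `chain_string` and `feasible_offsets` are invented here, and each relation such as "X = G+9" is encoded as the tuple `("X", "G", 9)`.

```python
import string

ALPHABET = string.ascii_uppercase


def build_chain(relations, anchor):
    """Place letters relative to `anchor` from (letter, other, delta) facts,
    each meaning: letter = other + delta."""
    pos = {anchor: 0}
    changed = True
    while changed:
        changed = False
        for letter, other, delta in relations:
            if other in pos and letter not in pos:
                pos[letter] = pos[other] + delta
                changed = True
            elif letter in pos and other not in pos:
                pos[other] = pos[letter] - delta
                changed = True
    low = min(pos.values())
    return {letter: p - low for letter, p in pos.items()}


def chain_string(pos):
    """Render the chain with '-' for unknown positions."""
    row = ["-"] * (max(pos.values()) + 1)
    for letter, p in pos.items():
        row[p] = letter
    return "".join(row)


def feasible_offsets(pos):
    """Offsets (0 = chain starts over A) that break neither Enigma property."""
    good = []
    for k in range(26):
        pairs = {}
        ok = True
        for letter, p in pos.items():
            under = ALPHABET[(p + k) % 26]
            if under == letter:  # no letter may encipher to itself
                ok = False
                break
            for x, y in ((letter, under), (under, letter)):
                if pairs.setdefault(x, y) != y:  # pairing must be reciprocal
                    ok = False
            if not ok:
                break
        if ok:
            good.append(k)
    return good


RELATIONS = [("X", "G", 9), ("X", "Q", -2), ("H", "X", -4), ("B", "G", 3)]
chain = build_chain(RELATIONS, anchor="G")

if __name__ == "__main__":
    print(chain_string(chain))
    print(feasible_offsets(chain))
```

Run on the four relations above, this rebuilds the chain G--B-H---X-Q and leaves nine feasible offsets, matching the nine alignments marked "possible".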

Eventually they would hope to be left with just one candidate, maybe looking like this:

         NUP
F----A--D---O
--X-Q              G--B-H->
ABCDEFGHIJKLMNOPQRSTUVWXYZ

Not only this, but such an end-wheel alphabet forces the conclusion that the end wheel is in fact "Rotor I". This is because "Rotor II" would have caused a mid-wheel turnover as it stepped from "E" to "F", yet that's in the middle of the span of the letter-chain "F----A--D---O". Likewise, all the other possible mid-wheel turnovers are precluded. Rotor I does its turnover between "Q" and "R", and that's the only part of the alphabet not spanned by a chain.
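The turnover argument can also be written as a short check. The sketch below assumes that the chain F----A--D---O starts under A, NUP under J, and G--B-H---X-Q under T; the placement of the first two is inferred here from the reciprocal property rather than stated in the text. The notch letters are the standard turnover points of the eight naval rotors (Rotors VI to VIII each have two), and the function names are hypothetical.

```python
# Notch letters: the neighbouring wheel turns over as the rotor steps
# from this letter to the next one (for example Rotor I: Q -> R).
TURNOVERS = {
    "I": "Q", "II": "E", "III": "V", "IV": "J", "V": "Z",
    "VI": "ZM", "VII": "ZM", "VIII": "ZM",
}


def steps_inside(chain, start_letter):
    """Columns c such that both c and c+1 lie under the chain."""
    start = ord(start_letter) - ord("A")
    return {(start + i) % 26 for i in range(len(chain) - 1)}


def possible_end_wheels(placed_chains):
    """Rotors with no turnover falling in the middle of any chain."""
    covered = set()
    for chain, start_letter in placed_chains:
        covered |= steps_inside(chain, start_letter)
    return [rotor for rotor, notches in TURNOVERS.items()
            if not any(ord(n) - ord("A") in covered for n in notches)]


# ASSUMED placement of the three chains (see the note above).
PLACED = [("F----A--D---O", "A"), ("NUP", "J"), ("G--B-H---X-Q", "T")]

if __name__ == "__main__":
    print(possible_end_wheels(PLACED))
```

With that placement only Rotor I survives, because the step from Q to R is the only turnover not spanned by a chain.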

That the different Enigma wheels were given different turnover points was, presumably, a measure by the designers of the machine to improve its security. However, this very complication allowed Bletchley Park to deduce the identity of the end wheel.

The middle wheel

Once the end wheel is identified, these same principles can be extended to handle the middle rotor, though with the added complexity that you are now looking for overlaps in message-pairs sharing just the first indicator letter, and that the overlaps could therefore occur at up to 650 characters apart.

The workload of doing this is beyond manual labour, so BP punched the messages onto 80-column cards and used Hollerith machines to scan for tetragram repeats or better. That told them which banburies to set up on the light boxes (and with what overlap) to evaluate the whole repeat pattern.
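The kind of search delegated to the Hollerith machines can be imitated with a simple n-gram index. This is a modern sketch of the idea only, with an invented function name, and says nothing about how the punched-card sort was actually organised.

```python
from collections import defaultdict


def ngram_repeats(a, b, n=4):
    """Find every n-gram shared by messages a and b.

    Returns tuples (offset, i, j, gram): the gram starts at a[i] and b[j],
    and offset = i - j is how far b must be slid right to line them up.
    """
    index = defaultdict(list)
    for j in range(len(b) - n + 1):
        index[b[j:j + n]].append(j)
    found = []
    for i in range(len(a) - n + 1):
        gram = a[i:i + n]
        for j in index.get(gram, []):
            found.append((i - j, i, j, gram))
    return found


if __name__ == "__main__":
    # Toy messages sharing the tetragram ABCD.
    print(ngram_repeats("XXABCDYY", "ZABCDZ"))
```

Each hit names a pair of messages and an offset worth setting up on the light box, where the full repeat pattern can then be examined.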

Armed with a set of probable mid-wheel overlaps, Hut 8 could compose letter-chains for the middle wheel much in the same way as was illustrated above for the end wheel. That in turn (after Scritchmus) would give at least a partial middle wheel alphabet, and hopefully at least some of the possible choices of rotor for the middle wheel could be eliminated from turnover knowledge (as was done in identifying the end wheel).

Taken together, the middle wheel alphabet and the end wheel alphabet would yield a menu for the Bombe, and the number of wheel-orders to try on the Bombe would be significantly reduced from the 336 possible for the day.
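The figure of 336 is the number of ways of choosing three different rotors, in order, from a set of eight (8 × 7 × 6). The sketch below counts how that number shrinks as wheels are identified; the function name and the example middle-wheel candidates are illustrative only.

```python
from itertools import permutations

ROTORS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]


def wheel_orders(end_wheel=None, middle_candidates=None):
    """All (left, middle, right) wheel orders consistent with what is known."""
    return [order for order in permutations(ROTORS, 3)
            if (end_wheel is None or order[2] == end_wheel)
            and (middle_candidates is None or order[1] in middle_candidates)]


if __name__ == "__main__":
    print(len(wheel_orders()))                     # nothing known
    print(len(wheel_orders(end_wheel="I")))        # end wheel identified
    print(len(wheel_orders(end_wheel="I",
                           middle_candidates={"II", "IV"})))  # hypothetical
```

Fixing the end wheel alone cuts the 336 orders to 42, and narrowing the middle wheel to two candidates would leave 12.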

References

* David J. C. MacKay. "[http://www.inference.phy.cam.ac.uk/mackay/itila/book.html Information Theory, Inference, and Learning Algorithms]". Cambridge: Cambridge University Press, 2003. ISBN 0-521-64298-1. This [http://www.inference.phy.cam.ac.uk/mackay/itila/ on-line textbook] includes a chapter discussing information theory aspects of Banburismus.

See also

* Ban
* Sequential analysis

External links

* [http://www.codesandciphers.org.uk/documents/cryptdict/page05.htm The 1944 Bletchley Park Cryptographic Dictionary]
* [http://tallyho.bc.nu/~steve/banburismus.html "All You Ever Wanted to Know About Banburismus but were Afraid to Ask"] The whole procedure researched in detail, with a worked example.


Wikimedia Foundation. 2010.
