What would breaking the Enigma look like in 2021? (film, 21 minutes)

So it was in the news that there's a £50 note coming out with Alan Turing on it. Now, Alan Turing has been featured from time to time on our channel, and rightly so. Dave's done some fabulous videos on all the work that he and a lot of others were doing at Bletchley Park, you know, cracking Enigma. And it got me thinking, you know, now that we're in the 21st century and I've got a laptop that's much more powerful than all the Turing bombs put together, how easy is it to break Enigma like today? So I thought, let's code it up and find out. First thing I should say is Dave's done some really interesting videos on the history of Enigma, and do go and watch those. And it'll give you an idea of the sort of things they were doing to try and break Enigma back during the 40s. Not to trivialise it, but it's really difficult, isn't it? It's really, really difficult. You know, this isn't something one does by hand, right? Not quickly. The Enigma machine is not a stupid idea, right? It's well designed. The only thing that we've got now is much, much more processing power. And so things that we couldn't possibly brute force back then, maybe we can start to begin to brute force now. And that's what we're looking at today. So let's look very briefly at what an Enigma machine is, and then we'll have a go at actually breaking one and see whether it holds up. If you recall, it has some lights on some letters and it has, this is my technical description, has some buttons that you press. Keys, are they cool sometimes? So you press an A, right? You press an A and it goes into the machine and it goes through something called a plug board. And the plug board will swap it. So this is the plug board here. Now plugs just swap certain letters in pairs. So maybe this A comes out as A and F or something like that. Then this goes into the first rotor. And so that maybe comes out as a Q or something like that. So, you know, it sort of just wiggles around in here. And then this comes in, this comes out as maybe, you know, a P. I'm making this up. And then this goes into the next rotor. So stick to three rotor Enigmas for today, hey? Comes out as a sort of an S. And then it goes through something called a reflector. So this is getting quite complicated. But this is all mechanical, right? These rotors literally physically turn around and they just have wires soldered from one end to the other that connect up. So the reflector just bounces, S goes to something else. I can't remember which letter it is. And then it goes back through this, like this. And then it comes around through the plug board again. So maybe that D goes to, I mean, A is over here. That was a bit silly of me, wasn't it? For the purposes of this, it's all absolutely fine. And this comes out as a Z, right? So in my weird botched Enigma diagram, the A went in and it came out as a Z. It was encrypted as a Z. Now, what's hard about this problem? Well, you can take these rotors out, right? Each rotor has different wiring. There's usually five or eight rotors to choose from, and you can put them in any position you like. The next one is that every time you press a letter, one of the rotors rotates, and sometimes it rotates the next rotors along. And that means that this mapping, this transformation, changes every time you press a character. So you press A, it's not going to do the same thing again? No, actually, if you press A, the rotor first turns and then lights up a letter, right? So it's not going to do the same thing again. If you press A, A, A, A, A, the only thing you're guaranteed due to one of the quirks of Enigma is you won't get any A's back, but you'll get a lot of random letters. Back during the war, they solved this by trying to find possible rotor configurations that definitely couldn't work or could work, based on some guess plaintext. So maybe they had this idea that the first part of the message contained the word weather report, and so if they put that in, then they could find all the encryptions that couldn't possibly have come out as weather report and so on, and they could start to narrow down what their options were. Now, that's called a known plaintext attack, but during the war, if they hadn't known the plaintext, they wouldn't have been able to crack Enigma. They didn't have the brute force power to do it. Nowadays, we have pretty fast laptops, right? And beyond my laptop, other computers are even faster than that, would you believe? So theoretically, we could start to try out some of these combinations, even if we didn't know any plaintext. What we're going to be looking at, at least to begin with, is a ciphertext-only attack. That's an attack where we've only got the ciphertext. We don't have any idea what the plaintext is, and we want to see if we can guess what some of these settings would be. There's a few interesting weaknesses of Enigma that make it a little bit practical to brute force, but not actually as much as you'd think. We talk a lot about how a letter doesn't encrypt ever to itself, and that's quite relevant for plaintext attacks, known plaintext, because you can then work out where plaintext definitely can't or could be. We're not doing that today, so I'm less worried about that particular property of Enigma. More interesting to me is the fact that if you get some of the settings right, like you get this row to correct but these two wrong, that will often be better at decrypting than if you get them all wrong. So as we start to move slowly towards a solution, we get a little hint that maybe this ciphertext isn't quite as good as it was, and we can start measuring it. So really what we need to do is find some way of putting some ciphertext into an Enigma machine with some configuration, and then reading the output and saying, is that a plausible sentence or not? And preferably we'd like to do it really, really quickly, because otherwise this could take a long time. So we're not going to be doing any deep learning, can't be bothered with that. We're going to use very simple statistical properties of text to try and measure whether one sentence is better as English than another sentence. And there's a few of these, and I've implemented all of them because I thought, well, let's just test them out. So the first one is something called index of coincidence. So let's suppose you have a ciphertext. So, for example, here it is. I'm just looking at random text, right? So, yeah, okay. I'm just going to copy some random ciphertext from here. So why I can't just randomise my own ciphertext, I don't know. But I feel more comfortable copying it than I do just thinking up clever, interesting letters on my own. Isn't there a number of files on the fact that I don't pick as random as I think, and stuff like that? So let's do S, E, B, H, W. This is some actual ciphertext that we'll be breaking later. Does it honestly start with Zeus, as in Conrad's Zeus? Yeah. The reality of random, isn't it? Yeah, yeah, yeah, yeah. You see all kinds of words in the random ciphertext, and they mean absolutely nothing because it hasn't been decrypted yet. I suppose it's pseudo-random. Oh, gee, yeah, well, exactly. E, U. So let's suppose we have some ciphertext like this. We guess some rotor settings, and we put it through our Enigma machine, and that will decrypt it. Now, it will probably decrypt it incorrectly. But where we accidentally stumble upon the right plugs, or we stumble upon the right rotor configuration, even for briefly, we'll find that this decryption is slightly better than completely random. Because actually, mostly, this is completely random. Yes, we have this stipulation that letters can't turn into themselves, but generally speaking, it looks completely random. So this is what we get out. So H, F, my writing's bad today, V, V, F, L, I, N, G. Now, the interesting thing about this is, I mean, it's complete nonsense, right? I'm not cracking any words on this, but I do recognise I, N, G. Now, I, N, G is a fairly common trigram, or set of three characters, in the English language. Now, that doesn't necessarily mean this is correct, but it's slightly more English, shall we say, than this one. So if we were measuring some amount of how English is this sentence, then it's a little bit closer than this. So maybe one of our rotors is in the right position, and the others are wrong. Or our rotor's in the correct position, but our plugs are wrong. Or something like this. And the idea is that we slowly go through different configurations of our Enigma machine. I say slowly, as fast as we can. And we measure statistical properties of these output sentences to find the ones that most closely resemble correct decrypted text. And we can do this without looking at them. We don't have to look at them and say, well, that's a real word. We just measure statistical properties. So what are some of these statistical properties? Well, the first one is called the Index of Coincidence, or IOC. This is the probability that when you pick two letters at random, they'll be the same. So, for example, if we randomly pick P, and then we randomly pick P, those are the same. If we pick P and then L, they're not the same. We won't write the formula out. The formula's not that complicated. But what you have to do is go through and count every single character and how many of each one there are. You produce a histogram, and then you can calculate the Index of Coincidence based on this. Now, for random text, that is, text that's been put through an Enigma machine and not decrypted, the Index of Coincidence is usually something like 0.038, which is basically everything's evenly distributed. There's nothing interesting going on there at all. But for decrypted English text, we usually get a higher Index of Coincidence, about 0.067. And I think it's something like 0.072 for German text. One way of looking at it is it measures the fact that some characters have more higher probabilities than others. If everything's equally likely, you get something like this. If some characters are quite common, and so they tend to come up in pairs, it starts to look a little bit higher. It never goes higher than this. So, well, not really. What we can do is we can work through our rotors, different rotors, different positions, different settings, and we can calculate our Index of Coincidence, and we take the best-scoring text. So where our output has a higher Index of Coincidence, we think maybe we've got the rotor settings correct. And that's basically how it works. There's been a Numberphile where they actually got to use an Enigma machine, and I'm super jealous. They talked about the number of different combinations. So it's all very well saying, OK, we'll just go through all the rotor settings and work out what the best one is. Maybe if I have a super, super, super computer. But actually, Enigma has a nifty weakness in this sense, which is that if you get some of the settings correct, this will improve. If I get the rotor positions correct, even though the rotor's in a slightly wrong position, the result will be better than if I've got the wrong rotors in place. If I get one of the plug board settings right, the result will be better than if I've got none of the plug board settings right. Because basically, fewer characters will be incorrect. Subtly. So you've got three out of, let's say, five rotors, or eight rotors. So that's physically the three that happen to be in the machine at the time. Yeah, and you've got their different positions, right? So if for today we just talk about five rotors, just because I've wanted to be sitting here a little bit longer, then we've got 60 possible positions. One, we've got to choose three out of five, and then they can all go in any slot. You don't tend to have the same rotor twice, right? Because, I mean, they didn't have those duplicates of rotors. They had one set of them. Yeah, yeah, yeah. So we've got that. Then for each of these, we have a start position from 1 to 26, which is, you know, what letter is showing on the top, right? Basically how rotated it is. So you've got the start position, or the indicator setting. So there's, you know, times, let's say, 26 of those, times by three of the different rotors, right? Then you've got the ring setting. Now, the ring setting is essentially rotating the internals of the rotor, right? Now, actually, if we ignore the notch for a minute, which I'll talk about, if you rotate the ring and you rotate the actual whole rotor, they kind of cancel each other out. So it's about the notch position, really. The notch is when the rotor turns, it turns the next one along. And so the combination of your start position and your ring setting will mean where your notch is and then when it turns around. So you've got the ring settings, right, which is going to be 26 times 3 again. Then we've got the plug board, which, as you know, swaps random characters, and that's got something like 10 different pairs out of 13 possible pairs. That's 150 trillion, I think, different combinations. That's, you know, out of reach of my laptop. Certainly, when I'm doing all these decryptions as well, and we're multiplied by all these things, right, the number file goes into much better detail on this. We're looking at 5 today because, again, I don't want to be here all day. It does get harder to solve, and we'll talk about that. So this is a lot of combinations. It's too many combinations for me to go through, even with this nice little index of coincidence thing, right? Even though, when I get this exactly right, I will just get the plaintext out, and it will have a very nice index of coincidence, and I might be able to find it. So what do we do? Well, the weakness of Enigma is that if we get some of these things right, even if the others are wrong, we get a little bit closer to the answer, usually. So, for example, if you get the correct three rotors in the correct positions, and you get their start positions roughly correct, if your ring settings are wrong, all that will do is mess around with the notches. So you'll get bits of your plaintext correct, and then bits of ciphertext, and then bits of plaintext, and you get these kind of pockets of valid characters coming out into the decryption. It will still score better on IOC or any other metric. So that's what we're going to do. And this is the same with the plugs. If we get the rotors and the start positions and the ring settings correct, then we can start to guess plugs, and generally speaking, if we guess one correctly, the output will be better, and we can then move towards a solution. There's a lot of possible variations, but the fact that we can deal with some of them at a time makes this practical, right? If we had to brute force through all the different variations, it wouldn't be possible. That's the idea. So I've written some code to do this. If you want to have a go, I'll make my code available, but also there's a really good online tool called Cryptool, which lets you do this in a visual way. We'll put a link to that in the description. But I've written some pretty simple code here. I implemented an enigma machine because it was fun, and then I implemented a number of different fitness functions, which is how good is our decryption. Index of coincidence is one. I also, maybe we can talk about some others another time. So you were kind enough to send me some ciphertext. I don't know what it is, and it's been encrypted by some enigma configuration with, I think, five plugs. The first thing I do is I go through all the different rotor configurations, and I find the one that has the highest index of coincidence score when it decrypts that message. So this is five different rotors, each one tried in each position and each starting position. So that's 26 for each one. So that's quite a few combinations, about 17,000. But 17,000 for a laptop in 2021, not such a big deal. It takes somewhere around 10 seconds or something like that. So you can see what it's doing now is it's stepping through the different rotors. So 1, 2, 3, 1, 2, 4, 1, 2, 5. And we've done about 10 or 15 configurations already. And for each of these, it's going through all the different starting positions. But we're not looking at ring settings, and we're not looking at plug board, because that will just multiply this astronomically by the number of things we have to do. So we're already on rotor three in the left hand side. We're keeping going. This same thing works exactly the same for eight rotors. It doesn't really change anything. It just takes slightly longer, and I'm a bit lazy. So I have actually coded up the other rotors as well. Interestingly enough, some of the later rotors have two notches on, right, which is not – it doesn't make any difference in terms of the cracking, because – So that just means it turns the next one twice as often, right? Yeah, yeah, twice as often, yeah. Only really affects the first two rotors. The last one doesn't ever really turn that often, and it doesn't have any other rotor to turn. So here what we've got is we've got a list of the top performing rotor configurations. So 2, 5, 3 is the best performing rotor configuration, with start positions of 21, 3, and 25. I'm using zero indexing, right, which is not how you would normally do it in Enigma, but it was easier for my array indexing to do this. And that has an index of coincidence of 0.043, which is a lot higher than 3, 8. I say it's a lot higher. It's a little bit higher. Good enough. So that suggests to me – I listed the top ten here, because sometimes you might not get the one on the top one. You might get the next rotor configuration or something like this. If you were trying to really actually pay attention to this, what you would do is maybe start doing further attacks on the top three rotor configurations, just to keep your options open. So we're going to fix at 2, 5, and 3, because, you know, it saves time, right? So given rotors 2, 5, 3, from left to right, and their starting positions, what we now can do is we can start to brute force through the ring settings. So we can find the best possible ring settings, right? Now this is almost instant, because there's now only 600 or so of those, right? We don't need to try the left-hand rotor, because it doesn't really rotate. And so we do that, and it's already happened. And the best ring positions were 0, 3, 23, right? Now the 0 we ignore, because it's not – the ring setting, remember, affects where the notch is, and the left rotor doesn't turn anything over, so it doesn't have any effect. So given that, this is the ciphertext we've got out. This was our original ciphertext, and this is our slightly better ciphertext. Now it still looks like total nonsense, right? But it has a much higher index of coincidence score than the original, which means in some sense it's less random. So if you look, you might start to see groups of letters that might be a real word, they might not, I don't know, right? But some of the real letters are going to come out here. We might not be able to see what they are. So given this, we can finally start addressing this really problematic plugboard situation. Remember, there's far too many plugboard combinations to realistically just try them all. But, again, we have this wonderful benefit that if we get one of the plugs correct, the result will probably be better than if we got none of them right. So what we do is we go through each of the first 300 or so different possible plugs, just one at a time, and we see which the best one is. And then we fix it, and we do the next for the next one. And that's two plugs. And then we do for the next for the next one. And that's three plugs. And so on. And if we do that, we very quickly come up with a few sets of plugs. And our ciphertext is starting to look a lot better. So this is our ciphertext here. The first letter's a nonsense, but then it's proposed to consider the quest. Consider the quest. Can machines think? Ah, see, now I'm starting to guess what this might be. Oh, I see what you've done here. This is the Alan Turing paper. So some of the letters are wrong, right? So it should be I propose to consider. And it's J propose to consider, right? And we're nearly there. But that's because when we optimise the rotor configuration, we fix the rings at 0, 0, 0. So we're never going to find the exact correct thing. So essentially, the turnover is slightly wrong. Everything's slightly wrong, but it's still pretty good. Now, if we go back to our original question as to how secure is Enigma, the answer is not very secure. And the reason is not because it's trivial to break, right? This took me a little bit of effort. And for short messages where these fitness functions start to break down, you don't have enough information. They're actually very robust, right? For a 50 character message, very, very difficult to break using something like an index of coincidence, because even if some of the letters start to appear right, there's not very many of them. The index is just noise. In the war, they limited or they attempted to limit messages to something like 200, 250 characters for this reason, because index of coincidence was already known. And there's now more powerful metrics like trigram scores and quadram score, which I've also implemented, which often work better, particularly for the plug board. And so if you have a short message, you don't get very much information on the different frequencies of different groups of letters. And so there's really no way to know what's going on at all. And you get very lucky or you don't. And most likely you don't. But the other issue is, of course, the number of plugs. I've only done five plugs here. So I've cheated a bit. Right. For most German messages are sent using 10 plugs. You're going to need 1200 to 1500 characters before fitness functions are going to start to give you something. If you know what the plain text might be, this becomes much, much easier. Right. Because if you can fix these characters have to be exactly this, your fitness function is much less noisy. Right. And actually, I've implemented that as well. And it just starts breaking it. No, no. Nobody's business. So it is crackable. Right. If you know if you can guess what plain text is. And, of course, modern cryptography assumes you know what the plain text is, at least for some of the message. For example, whenever you send an HTTP message to a Web server, the beginning bit always says HTTP get or something similar. Right. There's a very known structure to these things. You can start to guess what they would be. We can't assume that you wouldn't know what any of the plain text was. But even if we don't, you can see that these index of coincidence and trigram scores and things can start to tease out some information. So going back to the beginning, Enigma is actually harder to crack than I thought. Right. People always talk about how hard it was to crack during the war. And that's absolutely fine. But you just kind of assume. But now it's 2021. Laptop should be able to just click. Go for all the settings. Find yourself. The ciphertext doesn't really work that way. You have to try and stumble your way towards it. And often it doesn't work. And there's noise in the output. And so you have to try and work out whether what you're seeing is actually improvement or not. And so on, which I think is quite interesting. Modern ciphers don't have this problem. If you have 128 bit AS key, you can't brute force the first bit because different like zero or one won't give you any better or worse plain text. It will just be nonsense each time. And that's true of any amount. So you can't do like the first half of the key and then the second half of the key, which is kind of what we're doing here. So modern ciphers don't have this problem. And so on. We can't do a lot of interesting things with this image after just one set of convolutions. But we're getting there. So this one is starting to be transformed. Some of them are noisy. There's a lot of paint associated with it showing which was the proper setting with A at the top and A against the dot of paint.

Menu

What would breaking the Enigma look like in 2021? (film, 21 minutes)

Toggle timeline summary

Transcription