Extracting firmware from devices - hardware hacking (film, 19 minutes)

One of the first things you must do when hacking an embedded device is to obtain and analyze the firmware. If you're lucky, you can download it from a website, or if you have a root shell, you can just get all the files from there. But what if none of these options are available? In this video, we will show you different firmware storage systems used by embedded devices, and how you can connect directly to a memory chip to dump a firmware image and find your vulns. Let's get started!

Hey, this is Pedro from the Flashback team. Let's start by looking at a router board. We like to demo using routers, because these devices are cheap, readily available, can be purchased by anyone, and they contain a lot of the technologies found in big and small embedded devices. In the picture, we have three things highlighted. We have the MCU, the SPI, and the flash memory.

The MCU is the microcontroller unit. This is effectively a CPU in a package with some RAM and some input and output peripherals. This MCU is a Realtek chip, as you can see by the logo and by the RTL designation. Why don't you go online and check the datasheet for this? This package does not contain a lot of memory, so the firmware is not stored there. It will be stored in our flash memory chip. Upon boot, the MCU will communicate with the flash memory in order to get the firmware that will be executed. How does this happen? Well, that's the magic of the SPI bus, Serial Peripheral Interface. This is a high-speed, full-duplex bus that allows the MCU to communicate with this NOR flash memory. Don't worry, we'll go into more details later.

Let's briefly talk about flash memory types. On the left-hand side, we have the one that we just saw, which is the NOR flash. This is shown in a SOIC-8 package, but can come in different packages. Package is just a fancy name for how many legs the chip has. In the middle, we've got NAND flash, which is a higher-density flash.
Here it's shown in a TSOP-48 package, but again, it can have more legs, or fewer in some cases. And lastly, on the right-hand side, we have eMMC flash. This is a flash type that has very high density, meaning very large capacities. And here it comes in a ball-grid array package, where the pins are under the chip. There's a fourth type of memory used in high-end devices that we do not mention here, and which is slowly replacing eMMC. It's called UFS.

Your computer BIOS is most likely stored in SPI flash. Go ahead, open it up and have a look. But don't break it. NAND would typically be used in larger devices, such as high-end routers, smart TVs, anything that needs a bigger firmware. Whereas eMMC would be used in high-end devices. This would be more expensive stuff, such as mobile phones, digital cameras, tablets, etc. We will not go into NAND versus eMMC discussions. Just bear in mind that NAND is accessed as raw flash by the operating system, so it requires a few more tricks, let's say. Whereas eMMC is NAND with a built-in controller. Yeah, I know, it sounds a bit complicated. If you want to know more, we might release a video in the future. Or better yet, just come to our training, where we explain all of this and play with all memory types.

Let's talk a bit about NOR flash. It's a storage medium for non-volatile data. This means that the data which is written to it will remain in the chip until it is rewritten. Volatile memory, like RAM, is erased when you reboot your computer or embedded device, or turn the power off. Data can be read or written on a byte-by-byte basis. This is actually a key feature of NOR flash. For example, with NAND flash, you cannot just read 1, 10 or 100 bytes. You have to read the size of the page of that NAND chip, which varies per chip, but is usually 4096 bytes. This means you have a lot more flexibility reading and writing with NOR flash.
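To make the byte-versus-page difference concrete, here is a minimal Python sketch of what reading "just 10 bytes" costs on a page-only device. It follows the simplified model from the talk and assumes the 4096-byte page size mentioned there. The function names and the `read_page` callback are hypothetical, not part of any real driver.

```python
PAGE_SIZE = 4096  # assumption: the "usual" page size quoted in the talk; real chips vary


def pages_for_range(offset, length, page_size=PAGE_SIZE):
    """Return the page numbers needed to cover bytes [offset, offset + length)."""
    if length <= 0:
        return []
    first = offset // page_size
    last = (offset + length - 1) // page_size
    return list(range(first, last + 1))


def read_bytes_via_pages(read_page, offset, length, page_size=PAGE_SIZE):
    """Read an arbitrary byte range from a device that only returns whole pages.

    read_page is a hypothetical callback: page number -> bytes of length page_size.
    """
    pages = pages_for_range(offset, length, page_size)
    data = b"".join(read_page(p) for p in pages)
    start = offset % page_size  # position of the first wanted byte in the first page
    return data[start:start + length]
```

With this model, a 10-byte read that straddles a page boundary pulls two full pages off the chip, while a byte-addressable NOR chip would return only the 10 bytes requested.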
Another key property of NOR flash, when compared to NAND flash and other types, is that it's mostly error-free. What this means is that it does not require any special error-correcting features in the chip to work properly. And this is not the case for NAND. Additionally, it has a very low latency, which means you can execute directly from the flash memory, which, again, is not possible with NAND flash. As we said previously, this means that NOR flash is mostly used for embedded devices that do not require a lot of storage, but need fast execution and fast memory.

So how do we look at the NOR flash chip and identify it as such? Well, they usually look like this, as you're seeing on the left-hand side. So they usually come in a SOIC package, although they can come in bigger or smaller packages. The flash chip usually has a model number written on it. This is true not only for NOR, but also for NAND. What you do is you read that model number, either with your eyes, a magnifying loupe, or one of the things we like to do is take a high-resolution photo with our phones and then just zoom in. And then you go on the Internet and you look for that model number. In this case, MX25L6406E. If you're lucky, your datasheet will pop up in your Google search. And then you can read the datasheet, understand how the NOR flash model works, and understand how you can interface with it. But sometimes the datasheet is not available. In this case, you don't need to worry much. As we will see, the SPI protocol is quite standardized. And as long as you guess where the power is and where the ground is so you don't fry your chip, you're usually okay. And Radek will show us how.

Let's talk about the SPI protocol then. The SPI protocol allows for synchronous serial communication in full-duplex mode using a master-slave architecture. This mumbo-jumbo means that usually there is a master, which controls the communication, and a slave, which follows its orders and where the data is written to or read from.
But data flows both ways. The master provides a clock signal, which will determine the speed of the communication. Multiple slaves can be connected to one master, but a slave cannot connect to a slave. Finally, and no less important, SPI requires a minimum of four wires for communication. SCLK, this is the clock signal we just mentioned. CS, chip select. Think of it like an enable button for the SPI chip, which turns it on. MOSI, master out, slave in. As the name implies, this is where the chip receives data from the master, the output from the master. MISO, master in, slave out. This is where the slave sends the data back to the master. Anyway, I know this is all very abstract. You're bored, tired of my voice. The good news is, Mr. Radek is going to take over from me right now and show exactly how all of this works in practice. This is Radek.

All the content we put on our YouTube channel, all the advisories and exploits we release, that's all free and sponsored by our own training. Do you enjoy watching our videos, learning new hacking techniques, and finding and exploiting vulnerabilities? Then why don't you come to our Embedded Device Hacking course, which we regularly host all over the world, live, in person. There are many hardware, embedded device hacking, IoT exploitation courses out there, but ours is truly unique. Why is that, Rado? We focus only on real vulnerabilities that we or other hackers find. And as you can see from our videos, we have a lot of experience hacking real devices, in Pwn2Own, or our day jobs. The goal of the course is to teach you how to take apart embedded devices by analyzing the hardware, obtaining the firmware, reverse engineering it, finding a vulnerability, and exploiting it. Our mottos are no fake vulns and PoC or GTFO. For more details, check our website, training.flashback.sh, and get your ticket now.

That is a lot of theory. Let's turn that into practice. So that is our target.
We can see the main CPU and some data lines which go to the flash. The flash contains all the information that is needed for the router to work. Firmware, configuration, and so on. So it has to read it at boot time. Can you sniff it? Let's try. For this, we would need some hardware. Hooks, or a SOIC-8 clip. Cables, and a logic analyzer. With hooks, we can connect directly to the legs of the chip. But in this case, I prefer the clip. It's just faster and a little bit more stable. After that, I connected the logic analyzer according to the datasheet pinout. Channel 0 to chip select, channel 1 to the SO data line, channel 2 to clock, and channel 3 to the SI data line. The logic analyzer will allow us to sniff the traffic. Let's hit that power button and see what is happening on the wire.

This is the software that comes with the Saleae. I set the device to capture data with the highest possible settings. Make sure you use the original high-quality cables delivered with the Saleae, or you might not always get the most reliable data. Okay, let's start capturing. Nothing yet. Wow, we see some data. This is literally a router pulling some data from the chip. We are sniffing on the bus. How awesome is that?

Let's have a closer look and see what is happening on the wire. Okay, this looks very interesting and promising. We can see some waves changing. But let's rename the channels to where they are connected physically. Chip select, master in slave out, clock, master out slave in. This software has an amazing feature that lets us add analyzers, and they support SPI out of the box. That means it will try to interpret the data. We assign the channels to the protocol definition. Everything else stays as default. And there we go. It interpreted the waves. But what does it really mean? For that, we have to dig into the datasheet and try to understand a little better how SPI works. We are sniffing on four pins of the flash.
Pedro explained earlier in the video what each of the pins is responsible for, so we know that the flash will receive data on the MOSI line. The SPI flash implements a set of commands. When it receives data, it matches it against the available commands and interprets the following data accordingly. In our case, we see the command 03 hex, which is the read command. Its action is to read n bytes out until chip select goes high. Hmm, okay, let's try to understand it better.

This is the sequence diagram for the read command. On top, the communication is started by, be careful, pulling the CS line down. By default, it idles high. Next, the clock dictates the speed of the communication. Each clock cycle is one bit. And then, 8 bits on the SI line send the hex 03 command, which is read. But it is followed by 3 bytes of data. Those 3 bytes say which memory address to read data from. And after this command, data should be shifted out on the SO line until the CS line goes back to high. Okay, let's see if that is what we saw in the logic analyzer software. Yes, notice it's exactly as presented in the datasheet. We can see the data being returned.

So, does it mean we can sniff the firmware like this? Well, I would advise not to do so, and be careful of this. The data read by the router might not be in sequence, might not be complete, or some other things might happen. So, I would always advise to dump the firmware offline. But how do we do that? We need a microcontroller that is able to speak the SPI protocol. Ladies and gentlemen, let me introduce to you one of my best hardware hacking tools, the HydraBus. It's an open source device that implements tons of protocols. It's like a Swiss army knife tool. You just take a breakout diagram and wire the cables according to what you want to do. Notice, it has two channels for SPI.
There are other tools out there that you could use for SPI dumping, like an FTDI FT232 chip or a Bus Pirate, but this one here is much faster, implements more protocols, and I can extend its functionality as it's open source. Okay, let's see how we could use the HydraBus to dump the firmware from our target. What do we need? A clip, the HydraBus of course, good quality cables. Now, we have to match the pins on the chip to the breakout on the HydraBus. I will be using SPI2 now as it supports serprog, which I will use to dump the firmware shortly. So I have to simply match each of the lines on the chip to the pins on the HydraBus. But this time, I will be providing 3.3V power to the target, so I have to wire the power and ground lines too.

Okay, everything wired. We are ready to connect the HydraBus to our computer, and we can use a magic tool called flashrom in serprog mode. flashrom is a standard tool used to dump firmware via SPI. It supports tons of hardware, so we might be lucky with our target. Let's try. Oh, so it says it recognizes multiple possible chip models, and it's unsure which one to use. Let's help it, as we are able to read the model from the chip package. Wow, it dumps it! Okay, is binwalk going to recognize it? Yes, that's the file system of the target. Awesome. Sometimes flashrom might have problems reading data from the chip, and you might have to lower the clock speed. You can easily do that with the spispeed option. Okay, so it's a true hackerman job.

But what happens if flashrom doesn't support your target chip? You can cry in a corner or speak with the SPI chip on your own terms. And I will show you now how to send any SPI command to the chip. For this, we need the HydraBus. We can connect to the HydraBus via serial console. It welcomes us with a nice menu. We select SPI. In here, we have tons of things that we can set and tune. Not everything is in scope of this video, but I suggest you explore on your own.
I use the SPI1 device now, as this is where my cables are connected on the HydraBus. Okay, let's have a look at the datasheet. One of the very nice commands to send is read ID. Each chip has a hard-coded ID, and if we ask nicely, it will return it to us. We already know how to interpret this diagram. We have to pull the CS line down, send the hex 9F command, and, well, read data from the MISO line. In HydraBus, we can do it very easily. The bracket is used to instruct HydraBus to pull the CS line down. Then we send a command, hex 9F, and we tell HydraBus to read 3 bytes. Okay, let's try it. Yes, it returned data. And yes, this is valid read ID data, as it matches what we expect based on the datasheet. And this is how it looks on the wire in the logic analyzer.

So, what else can we do? Let's send a read command and read from the beginning of the chip. It is hex 03, and as we want to read from the beginning, we send 000000. And read data off the line. Yeah, data is returned. So, is this the way to dump the firmware? Well, I think that would be quite inefficient. But we can script it.

Yes, Radek, we can script it. But guess what? I have already done it back in 2018. You just need to go to the HydraBus git repository, in the contrib directory, and you will see there's an SPI dumping script which works very quickly. Give it a try, and you'll see it works quite well. You can use it to dump any standard SPI chip. And then, once that's done, it's time to analyze the firmware. We win.
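The scripting idea can be sketched in a few lines of Python. This is not the script from the HydraBus contrib directory, only a minimal illustration of the two commands used in the video: hex 9F (read ID) and hex 03 (read, followed by a 3-byte address). The `transfer` callback and the `FakeFlash` class are hypothetical stand-ins for real hardware access, and the ID bytes are arbitrary placeholders rather than values from any datasheet.

```python
READ_ID = 0x9F  # read ID command, as sent in the video
READ = 0x03     # read command, as seen on the logic analyzer


def read_id_frame():
    """Bytes to send after pulling CS low in order to request the chip ID."""
    return bytes([READ_ID])


def read_frame(address):
    """READ command followed by a 3-byte address, most significant byte first."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    return bytes([READ]) + address.to_bytes(3, "big")


def dump(transfer, size, chunk=4096):
    """Read `size` bytes from address 0 in chunks.

    transfer(tx, n) is a hypothetical callback that pulls CS low, sends tx,
    reads n bytes from the MISO line, and then releases CS.
    """
    image = bytearray()
    for address in range(0, size, chunk):
        n = min(chunk, size - address)
        image += transfer(read_frame(address), n)
    return bytes(image)


class FakeFlash:
    """In-memory stand-in for a NOR chip, so the sketch runs without hardware."""

    def __init__(self, contents, chip_id):
        self.contents = contents
        self.chip_id = chip_id  # placeholder ID bytes, not a real chip's ID

    def transfer(self, tx, n):
        if tx[0] == READ_ID:
            return self.chip_id[:n]
        if tx[0] == READ:
            address = int.from_bytes(tx[1:4], "big")
            return self.contents[address:address + n]
        raise ValueError("unsupported command")


# Example run against the fake chip.
chip = FakeFlash(bytes(range(256)) * 40, b"\xAA\xBB\xCC")
chip_id = chip.transfer(read_id_frame(), 3)
image = dump(chip.transfer, len(chip.contents), chunk=1000)
```

On real hardware, `transfer` would be replaced by whatever the adapter provides for a chip-select-framed write followed by a read. Reading in chunks rather than one byte per command keeps the 4-byte command overhead small, which is why a scripted dump is so much faster than typing commands by hand.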
