Google TPU + Raspberry Pi 5 = home AI platform (video, 12m)
The channel Advanced Hobby Lab introduces the integration of Google TPU with Raspberry Pi 5, which opens up new possibilities for Edge AI applications. The Google TPU, or Tensor Processing Unit, is a small, low-powered chip designed to serve as an AI coprocessor for Edge devices. With the new PCIe port in Raspberry Pi 5, it's feasible to connect the M.2 form factor TPU module, greatly simplifying the integration process. While the Pi 5 can execute AI operations without the TPU, the performance gap is significant, with the Raspberry Pi 5 estimated at 25 GFLOPS and the TPU at 4 TOPS, making it about 160 times faster than the Pi. Moreover, at around $25, the TPU offers an excellent price-to-performance ratio when compared to more expensive graphics cards like the NVIDIA RTX 4090, which costs nearly $1800.
The Raspberry Pi 5 is the first model to expose a PCIe port, making it suitable for working with the TPU module. Although it operates at PCIe Gen 2 by default, it can be configured for Gen 3 (the TPU module itself only supports Gen 2). Connecting the TPU requires careful selection of an M.2-to-PCIe adapter, since the module uses an A+E-keyed M.2 interface. The Pineberry AI hat used for the TPU makes hardware assembly straightforward while also saving space.
The initial step in hardware assembly involves installing Raspberry Pi OS on a microSD card. Using the Raspberry Pi Imager tool, one selects the appropriate OS image, which is then written to the card. Once the OS is installed, the TPU module is inserted into the M.2 slot on the hat, which mounts to the Raspberry Pi's underside. Because the assembled hat blocks easy access to the SD card, the OS must be installed before assembly.
In the setup phase, the creator walks through the steps required to configure the TPU: installing the runtime library, building and installing the driver, and enabling the PCIe port. Although Coral AI's documentation isn't written with Raspberry Pi users in mind, the author adapts the setup for the latest Pi OS. After running several terminal commands and installing a compatible Python version, one can run the example applications provided by Coral AI. The TPU's performance advantage becomes evident, with processing times improving significantly compared to running on the CPU alone.
Finally, Advanced Hobby Lab calls for creative project ideas that utilize the TPU, having demonstrated real-time object recognition and hinted at further application development. At the time of writing, the video statistics show a total of 12,625 views and 248 likes, suggesting the video resonates well with the AI and Raspberry Pi community.
Timeline summary
- Introduction to the Google TPU and Raspberry Pi 5.
- Overview of integrating TPU with Raspberry Pi for Edge AI.
- Explanation of TPU as a low-powered AI coprocessor.
- Coral AI's TPU module for easy hardware integration.
- Performance comparison between Raspberry Pi and TPU.
- Cost-effectiveness of TPU compared to high-end GPUs.
- Limitations of TPU size and memory.
- Raspberry Pi 5 introduces PCIe port for TPU compatibility.
- Beginning hardware assembly for TPU integration.
- Installing OS on microSD card for the Raspberry Pi.
- Setting up software with the TPU documentation.
- Installing runtime library and TPU driver.
- Final steps to update boot configuration for PCIe.
- Installation of PyCoral for running example code.
- Running demo for real-time image classification.
- Call for project ideas utilizing TPU.
- Closing remarks and invitation to subscribe.
Transcription
This is the new Google TPU, and this is a Raspberry Pi 5. I'm going to show you how to integrate a Google TPU with a Raspberry Pi to enable new Edge AI capabilities in your Pi projects. We'll see what this new coprocessor can do, and what new features it can unlock in your Pi. So, the Google TPU, or Tensor Processing Unit, is a small, low-powered chip designed to work as an AI coprocessor on Edge devices. What this means is that it will process your AI operations without overtaxing your CPU. Now, Coral AI has added this chip onto a module with an M.2 form factor to make it easier to integrate with existing hardware. With the Raspberry Pi 5's new PCIe port and an adapter, you can connect the M.2 form factor device to a Raspberry Pi. Now, AI operations could just run on a CPU. You could just run your AI workloads directly on the Pi without the TPU. However, let's look at the performance difference. A Raspberry Pi 5 has an estimated throughput of 25 GFLOPS, or 25 billion floating point operations per second, compared to the TPU, which is estimated to support 4 TOPS, or 4 trillion operations per second. That's about 160 times faster. Now, compared to a desktop GPU, such as the NVIDIA RTX 4090, the 4 TOPS of the TPU is about 25 times slower than the nearly 100 TFLOPS of the 4090. But given that the 4090 retails for about $1800, and the TPU costs only around $25 US, you get a lot of bang for your buck. Even compared to lower-end desktop GPUs, such as the RTX 4060, which supports about 15 TFLOPS, the TPU is only about 4 times slower, but nearly 16 times cheaper. However, the smaller size of the TPU does come with limitations. It may not be able to fit the latest and greatest neural networks. It only comes with 8 MB of memory to store the model parameters and input data. This is suitable for simple mobile image detection or classification models. 
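The throughput ratios quoted above can be sanity-checked with simple arithmetic. One caveat the video glosses over: TOPS counts int8 operations while GFLOPS counts floating-point operations, so these are order-of-magnitude comparisons rather than like-for-like benchmarks.

```shell
# Rough ratios from the figures quoted in the video.
# Note: TOPS (int8 ops) and GFLOPS/TFLOPS (floating point) are not
# strictly comparable units; treat these as order-of-magnitude only.
tpu_gops=4000          # Coral TPU: 4 TOPS = 4000 billion ops/s
pi5_gflops=25          # Raspberry Pi 5 CPU estimate: 25 GFLOPS
rtx4090_gflops=100000  # NVIDIA RTX 4090: ~100 TFLOPS

echo "TPU vs Pi 5:  $(( tpu_gops / pi5_gflops ))x faster"      # 160x
echo "4090 vs TPU:  $(( rtx4090_gflops / tpu_gops ))x faster"  # 25x
```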
It would struggle trying to run larger networks such as LLMs, or Large Language Models, the kinds of networks you would use to build something like ChatGPT. Furthermore, desktop GPUs are used for a number of different tasks. They can render graphics, run general computations like physics simulations, as well as train and use AI models. The TPU, on the other hand, was designed only to run AI models. This is why it can have such a great price-to-performance ratio, and why it will never replace desktop GPUs in our systems. OK, so the Raspberry Pi 5 is the first Pi to expose the PCIe port. This is what allows the Pi 5 to work with the M.2 form factor TPU. It supports PCIe Gen 2 out of the box. However, it can be configured to support PCIe Gen 3. Unfortunately, the TPU module only supports PCIe Gen 2. So, even though the Pi 5 can support Gen 3, the best it can do is Gen 2 speed. One lane of PCIe Gen 2 can support a bandwidth of 500 MB/s. Considering that the TPU only has 8 MB of memory to be filled, Gen 2 speed should be plenty. It should be noted that special care needs to be taken when choosing an M.2-to-PCIe adapter. The module's M.2 interface is available in two different keys, A+E and B+M. The one I have uses the A+E key. Fortunately, there is a company that builds adapters specifically designed for this TPU. I managed to pick up one of these Pineberry AI hats. It is what I will be using to integrate the Coral AI Google TPU with the Raspberry Pi 5. OK, let's start the assembly. The first thing you want to do is install the Pi's OS on a microSD card. I will show you why this must be your first step later. The OS can be installed using Raspberry Pi Imager. This tool helps you install the OS onto an SD card. First, select your device, then select your OS. For this project, any OS will work. I will choose the latest version of the 64-bit Pi OS. Finally, select the SD card to be written and click Next.
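The video flashes the card through the Raspberry Pi Imager GUI. For reference, the same result can be achieved from a terminal; this is a generic sketch, not the video's method, and both the image filename and the `/dev/sdX` device below are placeholders you must replace with your actual download and SD card device.

```shell
# Hypothetical command-line alternative to the Raspberry Pi Imager GUI.
# WARNING: the image name and /dev/sdX are placeholders -- writing to
# the wrong device will destroy its contents.
lsblk                                   # identify your SD card device first
xz -dc raspios-bookworm-arm64.img.xz \
  | sudo dd of=/dev/sdX bs=4M status=progress conv=fsync
sync
```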
In a few minutes, the writing will be done and you can eject the SD card from your computer and insert it into your Pi. Let's start assembling the hardware by inserting the TPU module into the M.2 slot on the Pineberry AI hat. Thanks to the keying, it only fits in one way. The AI hat even comes with a special set screw to hold the module down. It's a good idea to insert the PCIe cable into the hat now, before mounting it onto the Pi, and make sure that the arrows are aligned on the same side. This hat is unique in that it mounts to the bottom of the Pi instead of to the top like most other hats. It comes with standoffs for mounting the hat onto the Pi. They screw onto the underside of the Pi through the normal mounting holes. The hat can then be mounted onto the Pi with the TPU module facing up towards the Pi. The final step is to then insert the PCIe cable into the Pi. You can see that there is not a lot of room to remove the SD card. This is why we had to install the OS first. Now, we can start setting up the software. Coral AI has provided documentation for their TPU module. However, their documentation is not written for the Pi. In addition, their documentation seems to work only for Debian 10, but the current version of Pi OS is derived from Debian 12. I'm going to show you the steps that I used to get the TPU set up on a Pi with the most recent Pi OS. There are three main tasks to set up the TPU. First, install the runtime library. Second, install the TPU driver. Third, configure the PCIe port. Installing the runtime library is the easy part. Google provides a package repository with the runtime library. We just have to add the package repository to our source list and update apt. We can then just install the package like so. The driver for the TPU is called the gasket driver. There is a package available that we could install. However, it is not built for the latest Pi OS, so we are going to have to build it from source.
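The runtime-library steps described above correspond to Coral's published apt instructions. A sketch, assuming Coral's repository and package names at the time of the video (on Debian 12 you may prefer a keyring file over the deprecated `apt-key`):

```shell
# Add Google's Coral package repository and install the Edge TPU runtime.
echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" \
  | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update
# libedgetpu1-std clocks the TPU conservatively; libedgetpu1-max is
# faster but runs hotter.
sudo apt-get install -y libedgetpu1-std
```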
First, we need to install the build dependencies. Then we can clone the gasket driver's git repo, cd into the repo directory, and build the package. This will build an installable .deb package. Then we just need to install the newly created package. It is also a good idea to add your user to the apex group so that you can access the TPU from your user account. Finally, we will need to update the boot configuration and the device tree to activate the PCIe port. With newer versions of Pi OS, you can update the device tree using the dtoverlay option, like so. Now you can reboot and it should be ready to go. If you want to run the examples that Coral AI has provided, you will need to install PyCoral. The problem is that the most recent Pi OS only comes with Python 3.11, which is too new to run PyCoral. You will need to install an older version of Python. This can be accomplished with pyenv. pyenv will allow us to install any version of Python on our system. Fortunately, there is a script that will install the pyenv binaries for us. We just have to remember to set the path to use our new bin directory. Then we can install some build dependencies and install our new Python version. I will be using Python version 3.9.16 as that is the latest version that will work with PyCoral. In a few minutes, the new Python version will be installed separately from your system Python version in your home directory. Our Python version can be activated like so. Then we can install PyCoral. I normally use separate virtual environments for installing packages, but since this version of Python is just for running PyCoral, I will install packages directly to the pyenv directory. Okay, so I found that you have to downgrade NumPy for this to work. Pip will attempt to install a newer NumPy version to satisfy the requirements of TFLite. However, TFLite needs a version that is less than version 1.20. I found that NumPy version 1.19.5 works perfectly.
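Pulling the driver, boot-configuration, and Python steps above together, here is a sketch of the sequence the video describes. The exact package names, config.txt lines, and pyenv setup can vary with the Pi OS release, so treat this as a guide rather than the author's exact script (his scripts are linked in the video description):

```shell
# 1. Build and install the gasket (Apex) driver from source.
sudo apt-get install -y git devscripts debhelper dkms
git clone https://github.com/google/gasket-driver.git
cd gasket-driver
debuild -us -uc -tc -b            # produces a .deb one directory up
sudo dpkg -i ../gasket-dkms_*_all.deb

# 2. Let your user account talk to the TPU device nodes.
sudo groupadd -f apex
sudo adduser "$USER" apex

# 3. Enable the external PCIe connector in the boot configuration.
#    (On Bookworm-based Pi OS the file is /boot/firmware/config.txt.)
echo "dtparam=pciex1" | sudo tee -a /boot/firmware/config.txt
# Reboot here before continuing: sudo reboot

# 4. Install an older Python for PyCoral via pyenv.
curl https://pyenv.run | bash
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
sudo apt-get install -y build-essential libssl-dev zlib1g-dev libbz2-dev \
  libreadline-dev libsqlite3-dev libffi-dev liblzma-dev
pyenv install 3.9.16
pyenv global 3.9.16

# 5. Pin NumPy below 1.20, then install PyCoral from Google's repo.
pip install numpy==1.19.5
pip install --extra-index-url https://google-coral.github.io/py-repo/ pycoral~=2.0
```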
So to make the process easier, I've created some scripts that I've linked to in the description. To use my scripts, first clone the repository. Then you can just run tpuinstall.sh. It will install the runtime library, the driver, and configure the PCIe port. When it is done, it will restart your Pi for you. If you want to set up PyCoral, you can run pienvinstall.sh. It will install pyenv, Python 3.9.16, and the dependencies for PyCoral. And then you will be ready to run the demo. To run the demo, we need to clone the PyCoral repo to get the example code. Then we cd into the PyCoral directory and download the test data. And finally, we run the demo. You should get results similar to mine, of about 2.6 to 2.7 milliseconds. This is compared to the over 16 milliseconds when running the same workload on the CPU. So with the TPU, you can do real-time image classification like this. Now, the model wasn't trained on images of 3D printed models, so it's likely going to be less accurate. But I'm still getting some pretty good results. The giraffe is one of my best models. It will get 99 to 100% confidence. This is likely because the color is reasonably accurate and the shape is fairly unique. However, at certain angles, it will predict as a horse. The cat is a pretty good model, with a reasonable confidence, though sometimes it gets misclassified as a bear. The apple also does really well, at least at certain angles. At other angles, it isn't detected at all. Now I need to come up with a good project idea that uses the TPU. If you have any ideas for projects that you want to see me work on, let me know in the comments. If you liked this video or found it helpful, please give me a like. And don't forget to subscribe for future content. Thanks for watching!
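For reference, the clone-and-demo steps described in the transcript correspond to Coral's published PyCoral example. This is a sketch following Coral's getting-started instructions; it needs the TPU attached and the setup above completed to actually run:

```shell
# Clone PyCoral's examples and run the image-classification demo.
git clone https://github.com/google-coral/pycoral.git
cd pycoral
# Download the example model, labels, and test image.
bash examples/install_requirements.sh classify_image.py
# Run inference on the Edge TPU (requires the TPU to be attached).
python3 examples/classify_image.py \
  --model test_data/mobilenet_v2_1.0_224_inat_bird_quant_edgetpu.tflite \
  --labels test_data/inat_bird_labels.txt \
  --input test_data/parrot.jpg
```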