In his latest video, Jeff Geerling discussed the topic of building Raspberry Pi clusters, sparking considerable curiosity among his viewers. Many asked questions about why they should consider building a cluster, especially when cheaper and more efficient PC alternatives exist. Jeff began by clarifying what a cluster actually is. He stated that it isn’t simply about bundling two computers' processing power but rather a set of similar or even different computers that collaborate to perform specific tasks. The key is to split work among the cluster's members, which isn't always effective for certain programs, such as games.
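The point that a cluster does not pool its members' resources can be illustrated with a small placement sketch. It uses the example from the transcript below (two machines with 4 CPU cores and 8 GB of RAM each); the `Node` type, the `place` function, and the node names are hypothetical and exist only for this illustration, not in any real cluster manager.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    cores: int
    ram_gb: int


def place(task_cores: int, task_ram_gb: int, nodes: List[Node]) -> Optional[str]:
    """Return the name of the first node that can hold the whole task, else None."""
    for node in nodes:
        if node.cores >= task_cores and node.ram_gb >= task_ram_gb:
            return node.name
    return None


# Two hypothetical nodes matching the transcript's example: 4 cores, 8 GB each.
cluster = [Node("pi-1", 4, 8), Node("pi-2", 4, 8)]

# The aggregate numbers look like one 8-core, 16 GB machine...
total_cores = sum(n.cores for n in cluster)
total_ram = sum(n.ram_gb for n in cluster)

# ...but a single task still has to fit on a single node.
small_task = place(2, 4, cluster)   # fits on one node
big_task = place(8, 16, cluster)    # larger than any one node, so no placement

print(total_cores, total_ram, small_task, big_task)
```

A task that cannot be split, such as a game, behaves like `big_task` here: adding more nodes raises the totals but never produces a node large enough to run it.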

Jeff elaborated on how he utilizes his Raspberry Pis for various tasks within his home network. For instance, he runs Prometheus and Grafana on one of the Pis to monitor air quality and energy consumption. Other units serve as backup managers for his data or host a website. He emphasized that many applications run great on ARM processors, making Raspberry Pi an attractive choice for a wide array of users. Instead of building a traditional computer, he believes interesting and efficient solutions can be achieved using less expensive yet capable Pis.

Moving on to the general advantages of clustering, Jeff highlighted two main points: uptime and scalability. By implementing clustering, you can enhance the availability of various services, which is crucial during server failures. This feature allows continuous operation even when some components are malfunctioning. Jeff himself is planning to build a cluster using Kubernetes, which will further enhance his redundancy. As a result, in case of failure, everything can be quickly repaired and restored to normal operation.
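Both ideas, spreading work over several workers and carrying on when one of them fails, can be sketched with a simple round-robin dispatcher. This is only an illustration: the `dispatch` function, the worker names, and the health map are assumptions made for this example, and a real setup would typically rely on a load balancer or an orchestrator such as Kubernetes rather than code like this.

```python
from itertools import cycle


def dispatch(requests, workers, healthy):
    """Assign each request to the next healthy worker, round-robin."""
    alive = [w for w in workers if healthy.get(w, False)]
    if not alive:
        raise RuntimeError("no healthy workers left")
    rotation = cycle(alive)
    return {req: next(rotation) for req in requests}


# Hypothetical worker and request names.
workers = ["pi-1", "pi-2", "pi-3"]
requests = ["req-a", "req-b", "req-c", "req-d"]

# Scalability: with every node up, requests are spread across all three.
all_up = dispatch(requests, workers, {"pi-1": True, "pi-2": True, "pi-3": True})

# Uptime: with pi-2 down, the remaining nodes absorb its share.
one_down = dispatch(requests, workers, {"pi-1": True, "pi-2": False, "pi-3": True})

print(all_up)
print(one_down)
```

The service only becomes unavailable when every worker is unhealthy, which is the property a single "snowflake" server cannot offer.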

It’s also worth considering savings and efficiency. Jeff compared the costs of a Raspberry Pi cluster with more expensive alternatives like the AMD EPYC processor. Although a Raspberry Pi cluster may not have the same processing power as expensive solutions, its low power consumption and construction cost make it a competitive choice for many applications. Jeff noted that for lower computing demands, a Pi cluster could prove more economical.
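The comparison can be made concrete with the rough figures quoted in the transcript below: about $3,000 for 16 Pis with all associated hardware versus about $6,000 for a 64-core AMD CPU alone, about 100 W versus at least 120 W at idle, and about 200 W versus 400 W or more under full load. The sketch converts those wattages into yearly energy use; the dictionary layout and the `kwh_per_year` helper are assumptions for illustration, and the numbers are the speaker's estimates, not measurements.

```python
HOURS_PER_YEAR = 24 * 365


def kwh_per_year(watts: float) -> float:
    """Energy used in one year at a constant power draw, in kilowatt-hours."""
    return watts * HOURS_PER_YEAR / 1000


# Figures as quoted in the transcript; the EPYC values are lower bounds
# (the idle figure covers the CPU only, and the load figure is "400-plus").
pi_cluster = {"price_usd": 3000, "idle_w": 100, "load_w": 200}
epyc = {"price_usd": 6000, "idle_w": 120, "load_w": 400}

for label, system in (("16x Raspberry Pi", pi_cluster), ("64-core EPYC", epyc)):
    idle = kwh_per_year(system["idle_w"])
    load = kwh_per_year(system["load_w"])
    print(f"{label}: ${system['price_usd']}, "
          f"idle {idle:.0f} kWh/yr, full load {load:.0f} kWh/yr")

price_ratio = epyc["price_usd"] / pi_cluster["price_usd"]
```

This only compares purchase price and energy drawn. It says nothing about work done per watt, where the video notes the EPYC chip comes out well ahead under sustained load.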

Finally, Jeff encourages his viewers to share their own experiences with building clusters, showcasing the diversity of possibilities that Raspberry Pis offer. With 580,612 views and 18,455 likes at the time of writing this article, the video clearly reflects a growing interest in the topic of Raspberry Pi clusters. Jeff Geerling demonstrates that there are numerous exciting opportunities within technology, and clusters can serve as an engaging and educational project for many IT enthusiasts.

Timeline summary

  • 00:00 Introduction to the Raspberry Pi blade server video and questions about Pi clusters.
  • 00:21 Emphasis that not everyone should build a Pi cluster, but some should.
  • 00:27 Explanation of what a Pi cluster is not, clarifying misconceptions.
  • 01:07 A cluster consists of similar or different computers managed to perform tasks.
  • 01:21 Some software parallelizes well, while other software, such as games, does not.
  • 01:40 Discussion of software that runs efficiently on a cluster, like Prometheus and Grafana.
  • 02:09 Examples of practical applications running on Raspberry Pis.
  • 02:52 Reasons to build clusters including uptime and scalability.
  • 03:56 Uptime and reliability issues with single computers discussed.
  • 04:32 Benefits of clustering for managing server failures.
  • 05:09 Main reasons for clustering over single machines highlighted.
  • 05:32 Debate on the effectiveness of a Pi cluster versus high-end CPUs.
  • 06:58 Personal insights and educational value of building Pi clusters.
  • 08:07 Enterprise use cases for Raspberry Pi clusters.
  • 08:50 Discussion on ECC RAM in Raspberry Pis.
  • 09:45 Summary of thoughts on using Raspberry Pi clusters and viewer engagement.

Transcription

After I posted my Raspberry Pi blade server video last week, lots of people asked what you'd do with a Pi cluster. Many asked out of curiosity, while others seemed to shudder at the very idea of a Pi cluster because obviously a cheap PC would perform better, right? Before we go any further, I'd say probably 90% of you watching shouldn't build a Pi cluster. But some of you should. Why?

Well, the first thing I have to clear up is what a Pi cluster isn't. Some people think when you put together two computers in a cluster, let's say both of them having 4 CPU cores and 8 gigs of RAM, you end up with the ability to use 8 CPU cores and 16 gigs of RAM. Well, that's not really the case. You still have two separate 4-core CPUs and two separately addressable 8-gig portions of RAM. Storage can sometimes be aggregated in a cluster to a degree, but even there you suffer a performance penalty and the complexity is much higher over just having one server with a lot more hard drives. So that's not what a cluster is.

Instead, a cluster is a group of similar computers, or even in some cases wildly different computers, that can be coordinated through some sort of cluster management to perform certain tasks. The key here is that tasks must be split up to work on members of the cluster. Some software will work well in parallel, but there's other software like games that can only address one GPU and one CPU at a time. Throwing Flight Simulator at a giant cluster of computers isn't going to make it run any faster. Software like that simply won't run on any Pi cluster, no matter how big.

Luckily, there is a lot of software that does run well in smaller chunks in parallel. For example, right now in my little home cluster, which I'm still building out, I'm running Prometheus and Grafana on this first Raspberry Pi and monitoring my internet connection, indoor air quality, and household power consumption.
This Pi is also running Pi-hole for custom DNS and to prevent ad tracking on my home network. This next Pi is running Docker and serving up the website Pidramble.com, and the one after that is managing backups for my entire digital life, backing up all my data off-site on Amazon Glacier. I also have another set of Pis that typically runs Kubernetes, but I'm rebuilding that cluster right now.

But there's tons of other software that runs great on Pis. Pretty much any application that can be compiled for ARM processors will run on the Pi. And that includes most things you'd run on servers these days, thanks to Apple's adopting ARM with the new M1 Macs and Amazon using Graviton instances in their cloud. I'm considering hosting Nextcloud and Bitwarden soon to help reduce my dependence on cloud services and for better password management. A lot of people run things like Home Assistant on Pis, and there are thousands of different Pi-based automation solutions for home and industry.

But before we get to specifically why some people build Pi clusters, let's first talk about clusters in general. Why would anyone want to build a cluster of any type of computer? I already mentioned that you don't just get to lump together all the resources. A cluster with 10 AMD CPUs and 10 RTX 3080s can't magically play Crysis at 8K at 500 FPS. Well, there are actually a number of reasons, but the two I'm usually concerned with are uptime and scalability.

For software other than games, you can usually design it so it scales up and down by splitting up tasks into one or more application instances. Take a web server, for instance. If you have one web server, you can scale it up until you can't fit more RAM in the computer or a faster CPU. But if you can run multiple copies, you could have one, ten, or a hundred workers running that handle requests, and each worker could take as much or as little resources as it needs.
So you could, in fact, get the performance of 10 AMD CPUs split up across 10 computers, but in aggregate.

Not everything scales that easily, but even so, another common reason for clustering is uptime or reliability. Computers die. There are two types of people in the world. People who have had a computer die on them, and people who will have a computer die on them. And not just complete failure. Computers sometimes do weird things, like the disk access gets slow, or it starts erroring out a couple times a day. Or the network goes from a gigabit to a hundred megabits for seemingly no reason. If you have just one computer, you're putting all your eggs in one basket. In the clustering world, we call these servers snowflakes. They're precious to you, unique, and irreplaceable. You might even name them. But the problem is, all computers need to be replaced someday. And life is a lot less stressful if you can lose one, two, or even ten servers while your applications still run happy as can be because you're running them on a cluster.

Now I mentioned that I'm running one instance of each of these applications on this cluster right now. I'm planning on splitting up a few of them, though, and probably using Kubernetes on the entire stack so I can have even better redundancy. But having multiple Pis, and having good backups and automation to manage them, means when a micro SD card fails or a Pi blows up, I toss it out and can have a spare running in a few minutes.

Okay, so those aren't all the reasons for clustering, but two of the main reasons most people would consider a cluster over one computer. But that doesn't answer the question why someone would run Raspberry Pis in their cluster. A lot of people questioned whether a 64-core ARM cluster built with Raspberry Pis could compete with a single 64-core AMD CPU. And well, that's not a simple question. First I have to ask, what are you comparing?
If we're talking about price, are we talking about 64-core AMD CPUs that alone cost $6,000? Because that's certainly more expensive than buying 16 Raspberry Pis with all the associated hardware for around $3,000 all in.

If we're talking about power efficiency, that's even more tricky. Are we talking about idle power consumption? Assuming the worst case with PoE+-powered Pis, 16 Pis would total about 100 watts of power consumption all in. According to ServeTheHome's testing, the AMD EPYC 7742 uses a minimum of 120 watts, and that's just the CPU. If you're talking about something like crypto mining, 3D rendering, or some other test that's going to try to use as much CPU and GPU power as possible constantly, that's an entirely different game. The Pis' performance per watt is okay, but it's no match for a 64-core AMD EPYC running full blast. Total energy consumption would be higher, 400-plus watts compared to 200 watts for the entire Pi cluster full-tilt, but you'll get a lot more work out of that EPYC chip on a per-watt basis, meaning you could compute more things faster.

But there are a lot of applications in the world that don't need full throttle 24/7. And for those applications, unless you need frequent bursty performance, it could be more cost-effective to run on lower-power CPUs like the ones in the Pi.

But a lot of people get hung up on performance. It's not the be-all and end-all of computing. I've built at least five versions of my Pi cluster. I've learned a lot. I've learned about Linux networking. I've learned about power over Ethernet. I've learned about the physical layer of the network. I've learned how to compile software. I've learned how to use Ansible for bare-metal configuration and network management. These are things that I may have learned to some degree from other activities or by building virtual machines on one bigger computer, but I wouldn't know them intimately.
And I wouldn't have had as much fun, since building physical computers is so hands-on. So for many people, myself included, I do it mostly for the educational value.

Even still, some people say it's more economical to build a cluster of old laptops or PCs you may have laying around. Well, I don't have any laying around, and even if I did, unless you have pretty new PCs, the performance per watt from a Pi 4 is actually pretty competitive with a 5-10-year-old PC, and they take up a lot less space. And besides, the Pis typically run silent, or nearly so, and don't act like a space heater all day like a pile of older Intel laptops.

But there's one other class of users that might surprise you: enterprise. Some people need ARM servers to integrate into their continuous integration (CI) or testing system so they can build and test software on ARM processors. It's a lot cheaper to do it on a Pi than on a Mac Mini or an expensive Ampere computer if you don't need the raw performance. And some enterprises need an on-premise ARM cluster to run things like they would on AWS Graviton or to test things out for industrial automation where there are tons of Pis and other ARM processors in use. Finally, some companies integrate Pis into larger clusters as small, low-power ARM nodes to run software that doesn't need bleeding-edge performance or needs to be isolated from other servers.

Another sentiment I see a lot is that it's too bad the Pi doesn't have ECC RAM. Well, be ready to be shocked, because the Pi technically does have ECC RAM. Check the product brief. The Micron LPDDR4 RAM the Pi uses technically has on-die ECC. Now, when people say ECC, they mean a lot of different things. And I'd say half the people who complain about a lack of it couldn't explain specifically how it would help their application run better. But it is good in a server setting for a lot of different types of software, and the Pi has it. Or does it? Well, not in the sense that expensive high-end servers do.
The on-die ECC can prevent memory access errors in the RAM itself, but it doesn't seem to be integrated with the Pi's system on a chip, so the error correction is minimal compared to what you'd get if you spent tons of money on a beefy server with ECC integrated through the whole system.

So anyways, those are my thoughts on what you could do with a cluster of Raspberry Pis. What are some other things you've seen people do with them? And have you built your own cluster of computers before, Raspberry Pi or anything else? I'd love to see your examples in the comments. Until next time, I'm Jeff Geerling.

Before we go any further, I should say blah blah blah. That's definitely something I should say. Throwing fights, fight, fight simulator. Some software will work well in parallel. When you talk about fun and you're not having fun, that's just insane. Hands on. Or by building virtual machines. One one bigger computer. But there's one other class of users. The wasberries taste like wasberries.