DeepSeek R1 - free and open-source AI beats OpenAI o1? (video, 5m)
Yesterday, China released a state-of-the-art, free, and open-source chain-of-thought reasoning model, DeepSeek R1, that rivals OpenAI's o1, which many people are paying a hefty $200 a month for. The tech world is currently divided into two camps: pessimists who believe AI is overhyped and plateaued with GPT-3.5, and optimists who think we are on the brink of an artificial superintelligence that will launch humanity into a technological singularity. Nobody knows where things are headed, but as the saying goes, pessimists sound smart while optimists make money. Being an AI optimist can be tough, though, as it requires trusting hype-driven figures like Sam Altman and closed-off companies like OpenAI. Fortunately, on the same day the TikTok ban was lifted, China gave the world DeepSeek R1. In the video, dated January 21st, 2025, viewers learn exactly how to use the model like a senior prompt engineer.
The release of DeepSeek marks a significant development in the AI landscape: a freely available, MIT-licensed model that can be used commercially in profit-making applications. Ironically, the unveiling happened while Sam Altman was attending Trump's inauguration, a perfect moment for the viral meme in which Zuckerberg appears to detect a rack overflow in artificial binary code belonging to Bezos. Altman also rained on the optimists' parade recently by saying the AI hype is out of control and that OpenAI has not achieved AGI internally, a point made all the more obvious by ChatGPT's recurring bugs. For instance, a security researcher recently showed that ChatGPT can be coaxed into DDoSing a website simply by being handed a list of similar URLs that all point to the same site, which it then crawls in parallel, behavior that a genuinely intelligent system would avoid. OpenAI's o1, released a few months earlier, was another stride in the AI race, but open source has caught up quickly, and DeepSeek R1 now gives users a strong alternative. Benchmarks show DeepSeek R1 on par with OpenAI's o1, even exceeding it in areas like math and software engineering. Still, benchmarks should not be blindly trusted: Epoch AI, the company behind a popular math benchmark, only recently disclosed that it has been funded by OpenAI, raising questions about a potential conflict of interest.
DeepSeek offers a web-based UI as well as local downloads through tools like Ollama. The lightweight 7 billion parameter version weighs about 4.7 gigabytes and runs on average hardware, while the full 671 billion parameter model takes over 400 gigabytes and demands serious hardware. For performance comparable to o1-mini, the 32 billion parameter model is the recommended choice. A distinctive feature of DeepSeek is that it skips supervised fine-tuning and relies on direct reinforcement learning instead: the model improves through trial and error, finding solutions on its own much like a human working through a problem. The technical details may look daunting, but the gist is that for each problem the model generates multiple candidate answers, the answers are grouped and assigned reward scores, and the model adjusts its approach toward the answers that score higher.
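To make the reward-scoring idea above concrete, here is a minimal, illustrative Python sketch of grouping several attempts at one problem and scoring each attempt relative to the group. This is not DeepSeek's training code; the reward function, the normalization, and the sample answers are stand-in assumptions, and the real objective described in the paper is considerably more involved.

```python
# Illustrative sketch only: not DeepSeek's actual reinforcement learning code.
import statistics

def score_answer(answer: str, expected: str) -> float:
    # Toy verifiable reward: full credit for the exact answer,
    # partial credit if the expected value at least appears in the text.
    if answer.strip() == expected:
        return 1.0
    return 0.5 if expected in answer else 0.0

def group_relative_advantages(answers: list[str], expected: str) -> list[float]:
    # Score every attempt at the same problem, then normalize within the group,
    # so attempts that beat the group's own average get a positive signal.
    rewards = [score_answer(a, expected) for a in answers]
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # avoid division by zero
    return [(r - mean) / std for r in rewards]

# Example: four sampled attempts at the same problem, expected answer "42".
attempts = ["42", "The answer is 42 because 6 * 7 = 42", "41", "I am not sure"]
print(group_relative_advantages(attempts, expected="42"))
```

The key design point is that each attempt is judged against the group's own average rather than against a human-written step-by-step solution, which is what lets the model reinforce its better attempts on its own.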
In practice, users can watch the model's chain of thought when prompting it through Ollama. Prompts for reasoning models like R1 and o1 should be kept succinct and direct, since these models are designed to do the step-by-step thinking on their own. For example, when given a math problem, the model first lays out all of its reasoning steps and only then presents the final answer (a minimal sketch of this workflow appears below). This raises the question of when to reach for a chain-of-thought model rather than a conventional large language model: chain-of-thought models shine on complex problems that require detailed planning, such as advanced mathematics, puzzles, or intricate programming tasks.

Those who want to build with AI are encouraged to learn it from the ground up, which they can do for free thanks to the video's sponsor, Brilliant. The interactive platform offers hands-on lessons that demystify deep learning; with a few minutes of effort each day, users can understand the math and computer science behind this seemingly magical technology. Recommended starting points are Python and the in-depth How Large Language Models Work course for those who want to look under the hood of ChatGPT. Everything Brilliant offers can be tried free for 30 days at brilliant.org/fireship or via the QR code shown on screen.

At the time of writing, the video has gathered 3,107,336 views and 90,874 likes, reflecting the interest surrounding the release of DeepSeek R1.
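To illustrate the prompting advice above, here is a hedged sketch that sends one short, direct question to a locally downloaded R1 build through the Ollama Python client and separates the visible reasoning from the final answer. The model tag, the <think> delimiter, and the exact response shape are assumptions that may vary by Ollama version, so treat it as a starting point rather than a canonical recipe.

```python
# Hedged sketch: prompting a locally pulled DeepSeek R1 model through the
# Ollama Python client. The model tag and the <think> delimiter are assumptions
# based on how R1 builds are commonly packaged; check `ollama list` on your machine.
import ollama  # pip install ollama; requires a running Ollama server

prompt = "What is the sum of the first 100 positive integers?"  # short and direct

response = ollama.chat(
    model="deepseek-r1:7b",
    messages=[{"role": "user", "content": prompt}],
)
text = response["message"]["content"]

# R1-style builds typically emit their reasoning inside <think>...</think>
# before the final answer; split the two so the reasoning can be logged or hidden.
if "</think>" in text:
    reasoning, answer = text.split("</think>", 1)
    reasoning = reasoning.replace("<think>", "").strip()
else:
    reasoning, answer = "", text

print("Reasoning steps:\n", reasoning)
print("Final answer:\n", answer.strip())
```

Keeping the prompt to a single plain question, as the video suggests, leaves the step-by-step thinking to the model; the split only controls whether that reasoning is shown to the end user or merely logged.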
Timeline summary
- China releases an open-source reasoning model, challenging OpenAI's model.
- Discussion on two camps in the tech world: pessimists vs. optimists.
- Announcement of DeepSeek R1, following TikTok's ban removal.
- DeepSeek released as an MIT-licensed model for commercial use.
- Critique of AI hype and mention of issues with existing models like ChatGPT.
- DeepSeek R1's benchmarks compared to OpenAI's performance.
- DeepSeek R1's model sizes and hardware requirements.
- Explanation of the reinforcement learning approach used by DeepSeek.
- Prompts should be concise for effective responses from reasoning models.
- Promotion of Brilliant for learning about the foundations of AI.
Transcription
Yesterday, China released a state-of-the-art, free and open-source, chain-of-thought reasoning model with performance that rivals OpenAI's o1, which I'm stupidly paying $200 a month for right now. You see, there's two types of people in the tech world right now. In one camp, we have the pessimists, who think that AI is overhyped and plateaued with GPT-3.5. In the other camp, we have the optimists, who think we're about to see the emergence of an artificial superintelligence that will propel humanity into Ray Kurzweil's technological singularity. Nobody truly knows where things are going, but one thing to remember is that pessimists sound smart, while optimists make money. But sometimes it's hard to be an AI optimist because you need to trust hype jedis like Sam Altman and closed AI companies like OpenAI. Well, luckily, on the same day that TikTok's ban was removed, China gave the world a gift in return in the form of DeepSeek R1. And in today's video, you'll learn exactly how to use it like a senior prompt engineer. It is January 21st, 2025, and you're watching The Code Report.

Yesterday, the course of history changed forever. And no, I'm not talking about the return of the king, but rather the release of DeepSeek, which is an MIT-licensed chain-of-thought model that you can use freely and commercially to make money in your own applications. This model came out while Sam Altman was busy at Trump's inauguration, which is a perfect time to use this new meme template, where Zuckerberg appears to detect a rack overflow in this artificial binary code owned by Jeff Bezos. He's gonna have some explaining to do with his wife, but Sam Altman also rained on the AI optimist parade recently when he said that the AI hype was out of control and, no, they have not achieved AGI internally. And that's pretty obvious with how buggy ChatGPT is. Like, recently, a security researcher figured out how to get ChatGPT to DDoS websites for you. All you have to do is provide it with a list of similar URLs that point to the same website, and it will crawl them all in parallel, which is something that no truly intelligent being would do. That being said, the release of o1 a few months ago was another step forward in the AI race, but it didn't take long for open source to catch up, and that's what we have with DeepSeek R1. As you can see from its benchmarks, DeepSeek R1 is on par with OpenAI o1 and even exceeds it in some benchmarks like math and software engineering. But let me remind you once again that you should never trust benchmarks. This company Epoch AI, which provides a popular math benchmark, only recently disclosed that they've been funded by OpenAI, which feels a bit like a conflict of interest.

I don't care about benchmarks anyway and just go off of vibes, so let's go ahead and try out DeepSeek R1 right now. They have a web-based UI, but you can also use it in places like Hugging Face or download it locally with tools like Ollama. And that's what I did for its 7 billion parameter model, which weighs about 4.7 gigabytes. However, if you want to use it in its full glory, it'll take over 400 gigabytes and some pretty heavy-duty hardware to run it with 671 billion parameters. But if you want something that's on par with o1-mini, you want to go with 32 billion parameters. Now, one thing that makes DeepSeek different is that it doesn't use any supervised fine-tuning. Instead, it uses direct reinforcement learning. But what does that even mean?
Well, normally, with supervised fine-tuning, you show the model a bunch of examples and explain how to solve them step-by-step, then evaluate the answers with another model or a human. But R1 doesn't do that and pulls itself up by its own bootstraps using direct or pure reinforcement learning, where you give the model a bunch of examples without showing it the solution first. It then tries a bunch of things on its own and learns, or reinforces itself, by eventually finding the right solution, just like a real human with reasoning capabilities. DeepSeek also released a brief paper that describes the reinforcement learning algorithm. It looks complicated, but basically, for each problem, the AI tries multiple times to generate answers, which are called outputs. The answers are then grouped together and given a reward score, so the AI learns to adjust its approach for answers with a higher score. That's pretty cool, and we can see the model's actual chain of thought if we go ahead and prompt it here with Ollama.

When prompting a chain-of-thought model like R1 or o1, you want to keep the prompt as concise and direct as possible, because unlike other models like GPT-4, the idea is that it does the thinking on its own. Like, if I ask it to solve a math problem, you'll notice that it first shows me all the thinking steps, and then after that thinking process is done, it'll show the actual solution. That's pretty cool, but you might be wondering when to use a chain-of-thought model instead of a regular large language model. Well, basically, the chain-of-thought models are much better when it comes to complex problem solving, things like advanced math problems, puzzles, or coding problems that require detailed planning.

But if you want to build the future with AI, you need to learn it from the ground up. And you can do that today for free thanks to this video's sponsor, Brilliant. Their platform provides interactive hands-on lessons that demystify the complexity of deep learning. With just a few minutes of effort each day, you can understand the math and computer science behind this seemingly magic technology. I'd recommend starting with Python, then check out their full How Large Language Models Work course if you really want to look under the hood of ChatGPT. You can try everything Brilliant has to offer for free for 30 days by going to brilliant.org slash fireship, or use the QR code on screen. This has been The Code Report. Thanks for watching, and I will see you in the next one.