Grok 3 - the new king of AI from Elon Musk? (movie, 4m)
In a recent episode, Fireship discusses Grok 3, the new language model from Elon Musk's xAI, which has just taken the top spot on the LM Arena leaderboard. The model is described as highly capable and mostly uncensored, so it will produce content that is illegal in many parts of the world. The video examines what sets Grok apart, how it was trained, and whether it lives up to its reputation as the best LLM in existence. A noteworthy feature of Grok is its direct access to Twitter's data stream, and xAI developers have optimized it for maximum truth-seeking, even at the expense of political correctness.
As Fireship digs into Grok's performance, he notes that it outperformed models such as Gemini, Claude, DeepSeek, and GPT-4 on the math, science, and coding benchmarks shown. However, he points out that OpenAI's o3 was left out of that comparison, and that including it paints a much different picture. In his own testing, Grok generated valid Svelte 5 code in one shot and helped him build a game in the Godot engine. The video also covers Grok's training on the Colossus supercomputer in Memphis, Tennessee, a cluster of over 200,000 NVIDIA H100 GPUs.
Fireship also looks ahead to the upcoming SuperGrok subscription, expected to cost $30 a month, compared with $200 a month for ChatGPT Pro. That lower price could attract a broad range of users. He closes by arguing that a better strategy is to learn to code from the ground up and recommends Brilliant, the video's sponsor, whose interactive lessons cover the math and computer science behind deep learning.
At the time of writing, the video has 1,578,205 views and 53,460 likes. This level of engagement reflects the widespread interest in AI developments in general and in Grok specifically.
Timeline summary
- Introduction to Grok 3, a new large language model.
- Grok 3 took the top spot on the LM Arena leaderboard.
- Overview of Grok 3's features, including minimal censorship.
- It has a deep-thinking mode and text-to-video capabilities.
- Preview of how Grok 3 was trained and how it compares with other LLMs.
- Elon Musk's recent offer to buy OpenAI was rejected.
- Revelation that Mark Zuckerberg signed off on using pirated books for AI training.
- Grok has unique access to Twitter's data for training.
- Grok's ability to generate content that other LLMs block.
- Currently leading in LM Arena's comparative testing.
- Grok's performance compared to other state-of-the-art models.
- Insight into Grok's training at the Colossus supercomputer.
- Upcoming SuperGrok subscription service for advanced features.
- Mention of various AI tools the speaker uses.
- Promotion of Brilliant's platform for learning coding and deep learning.
- Closing remarks and thanks for watching.
Transcription
Just hours ago, yet another deep-thinking large language model hit the timeline, crushing existing benchmarks and reaching the coveted number one spot on the LM Arena leaderboard. This new model is none other than Elon's based and red-pilled Grok 3. Not only is it smart as hell, but it's also mostly uncensored, and will generate content that's illegal in many parts of the world. It has a deep-thinking mode like DeepSeek R1, it can apparently do text-to-video, and in the near future will have a paid subscription for something even more powerful called SuperGrok. I'm already paying for Twitter Premium Plus just to access Grok 3, so that's a slap in the face, but in today's video, we'll find out what makes Grok special, how they trained it, and if it truly is the best LLM in the world. It is February 18th, 2025, and you're watching The Code Report. Last week, Elon attempted to troll OpenAI by offering to buy it out. Not surprisingly, the OpenAI board promptly rejected this offer, and Sam Altman is still on track to make it for-profit and get his big payday. The AI Game of Thrones is ruthless, and Mark Zuckerberg, one of the big players vying for the throne, took a big loss last week when it was revealed that he signed off on using 82 terabytes of pirated books to train their Llama models, which he obtained through the Library Genesis project, which contains millions of books and paywalled articles. I just can't believe Zuckerberg would do something like that, though, said nobody ever. When it comes to training AI, though, one thing that's special about Grok is that it's the only model that has direct access to the firehose of data from Twitter, and xAI developers have optimized this model for maximum truth-seeking, even if that comes at the expense of being politically correct, and that means you can use it to do things like generate images of celebrities or have it write a profanity-laced poem about racial stereotypes.
In the name of science, I tried this prompt on every LLM, and it was blocked on every single one of them except for Grok. The response it gave me is so offensive that I can't even show it here on YouTube, and if you posted something like this in a country that doesn't have the right to free speech, you could quite literally go to jail. Despite that, Grok 3 should be available in countries like Germany and the UK soon. That's a huge win if you're a professional internet troll, but how good is Grok in reality? Well, currently it's sitting on top of LM Arena, which is basically a blind taste test where humans compare different LLMs side by side, and reaching the top means that it's pretty dang good. In addition, we have another benchmark showing Grok beating Gemini, Claude, DeepSeek, and GPT-4 when it comes to math, science, and coding. However, conveniently, it's missing OpenAI o3, and when you add that model in, it paints a much different picture. It's also missing the Codeforces and ARC-AGI benchmarks, and benchmarks are almost always cherry-picked for obvious reasons. The only thing I care about is my own proprietary vibe check, and it did things like generate valid Svelte 5 code in one shot, and helped me build a game in Godot. Overall, it did a great job and appears to be plateauing at the same level as all the other state-of-the-art models. Recently, the AI grip has shifted from creating bigger, better base models to creating better prompting frameworks like Deep Research and Big Brain mode. Another interesting detail about Grok, though, is that they provided details on how it was actually trained at the Colossus supercomputer in Memphis, Tennessee, which is currently believed to be the world's largest AI supercomputer. It contains a cluster of over 200,000 NVIDIA H100 GPUs, with plans to expand to 1 million GPUs.
The facility uses so much electricity that they can't get it all from the grid, and brought in a bunch of portable diesel generators just to power the thing. And they're going to need all that power when SuperGrok comes out, which is expected to cost $30 per month, which would be a highly competitive price compared to ChatGPT Pro at $200 per month. As a developer, I'm already going broke paying for Claude, Cursor, Gemini, ChatGPT, Copilot, Codium, Midjourney, and WatsonX, but for some reason my code quality is worse than ever. A better strategy to learn how to code is to start from the ground up, and you can do that today for free thanks to this video's sponsor, Brilliant. Their platform provides interactive, hands-on lessons that demystify the complexity of deep learning. With just a few minutes of effort each day, you can understand the math and computer science behind this seemingly magic technology. I'd recommend starting with Python, then check out their full How Large Language Models Work course if you really want to look under the hood of ChatGPT. Try everything Brilliant has to offer for free for 30 days by going to brilliant.org/fireship or use the QR code on screen. This has been The Code Report. Thanks for watching, and I will see you in the next one.
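Editor's note: the transcript describes LM Arena as a blind taste test in which humans compare LLMs side by side. The video does not say how those votes become a ranking, so the following is only a minimal sketch of one common approach, an Elo-style rating update, and not a description of LM Arena's actual method. The model names, the votes, the starting rating of 1000, and the step size `K = 32` are all hypothetical values chosen for illustration.

```python
from collections import defaultdict

K = 32          # assumed update step size (hypothetical choice)
BASE = 1000.0   # assumed starting rating for every model


def expected_score(r_a, r_b):
    """Probability that A beats B under the standard Elo logistic curve."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def rate(votes, k=K, base=BASE):
    """Turn (model_a, model_b, winner) votes into ratings.

    winner is model_a, model_b, or any other value (treated as a tie).
    """
    ratings = defaultdict(lambda: base)
    for model_a, model_b, winner in votes:
        e_a = expected_score(ratings[model_a], ratings[model_b])
        if winner == model_a:
            s_a = 1.0
        elif winner == model_b:
            s_a = 0.0
        else:
            s_a = 0.5
        # Whatever A gains, B loses, so the total rating is conserved.
        delta = k * (s_a - e_a)
        ratings[model_a] += delta
        ratings[model_b] -= delta
    return dict(ratings)


# Hypothetical blind votes: the rater sees two anonymous answers and picks one.
votes = [
    ("model-x", "model-y", "model-x"),
    ("model-y", "model-z", "model-y"),
    ("model-x", "model-z", "model-x"),
    ("model-x", "model-y", "tie"),
]

ratings = rate(votes)
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
for name in leaderboard:
    print(f"{name}: {ratings[name]:.1f}")
```

With these made-up votes, `model-x` ends up first because it wins both of its decisive matchups. The sketch also shows why a leaderboard like this depends on which models are entered: a model that is never compared against a stronger rival cannot lose rating to it.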