AMD’s Radeon RX 7900 XTX now outperforms NVIDIA’s GeForce RTX 4090 in inference benchmarks of the DeepSeek R1 AI model, a notable achievement in the competitive landscape of GPUs powering AI workloads.
AMD moved swiftly to roll out comprehensive support for DeepSeek’s R1 LLMs, showcasing strong performance on its hardware. DeepSeek has recently captured the industry’s attention, leaving many curious about the computational effort required to train such models. Conveniently, AMD’s flagship “RDNA 3” Radeon RX 7900 XTX GPU lets consumers run these models comfortably and efficiently, even outperforming some of NVIDIA’s offerings in several cases.
David McAfee from AMD shared a tweet highlighting the RX 7900 XTX’s impressive performance with DeepSeek’s AI models. He also provided a useful guide for those looking to harness the power of Radeon GPUs and Ryzen AI APUs.
For many enthusiasts and professionals focusing on AI workloads, consumer GPUs present a favourable option given their cost-effectiveness compared to dedicated AI accelerators. The added bonus is the ability to run models locally, which enhances privacy—a growing concern with AI applications like DeepSeek’s.
AMD has published an in-depth guide to help users run DeepSeek R1 models on their GPUs. Here’s a quick rundown of the steps:
1. Ensure you’re on the Adrenalin 25.1.1 Optional driver or newer.
2. Download LM Studio 0.3.8 or later from lmstudio.ai/ryzenai.
3. Install LM Studio and skip the initial onboarding.
4. Navigate to the “Discover” tab.
5. Select your DeepSeek R1 distill. For fast responses, go with a smaller distill like the Qwen 1.5B, which is also recommended for beginners; larger distills offer stronger reasoning capabilities.
6. On the right, select the “Q4_K_M” quantization and hit “Download.”
7. After downloading, return to the “Chat” tab, choose the DeepSeek R1 distill from the drop-down menu, and check the “manually select parameters” box.
8. Adjust the GPU offload layers by moving the slider to the maximum.
9. Click on model load.
10. Begin interacting with the reasoning model fully operating on your local AMD hardware.
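Once the model is loaded, you aren’t limited to the chat window: LM Studio can also expose it through an OpenAI-compatible local server (started from the app’s Developer tab, listening on port 1234 by default), which makes the distill scriptable. Below is a minimal Python sketch under those assumptions; the model identifier is illustrative, so substitute whatever name LM Studio lists for the distill you downloaded.

```python
import requests

# Minimal sketch: query a DeepSeek R1 distill through LM Studio's
# OpenAI-compatible local server. Assumes the server was started from
# LM Studio's Developer tab and listens on its default port, 1234.
URL = "http://localhost:1234/v1/chat/completions"

payload = {
    # Illustrative identifier; use the name LM Studio shows for the
    # distill you actually downloaded.
    "model": "deepseek-r1-distill-qwen-1.5b",
    "messages": [
        {"role": "user", "content": "Explain step by step: what is 17 * 24?"}
    ],
    "temperature": 0.6,
}

response = requests.post(URL, json=payload, timeout=300)
response.raise_for_status()

# R1 distills emit their chain of thought inside <think>...</think>
# tags before the final answer, so expect the reasoning in the output.
print(response.json()["choices"][0]["message"]["content"])
```

Because the server mimics the OpenAI API, existing OpenAI client libraries can generally be pointed at http://localhost:1234/v1 with a placeholder API key, and every token stays on your machine.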
If these steps prove challenging, AMD also offers a video tutorial on YouTube that breaks down each step in detail, a helpful visual guide for setting up and running DeepSeek’s LLMs while keeping your data private. As both NVIDIA and AMD prepare to release new GPUs, we anticipate substantial improvements in inference performance, thanks to integrated AI engines designed to handle these complex workloads more efficiently. In the meantime, these resources make it easy to keep your data local and get the most out of your AMD hardware.