DIY AI Powerhouse: Builder Unveils 64GB VRAM Dual MI50 Rig with Open-Source Cooling Shroud
A passionate hobbyist has constructed a high-performance local AI server using two AMD Instinct MI50 GPUs, achieving 64GB of VRAM for running large language models at near-real-time speeds. The build, costing under €1,500, features a custom 3D-printed cooling shroud designed for silent operation under a desk.

An anonymous AI enthusiast has completed a low-noise local AI server built around dual AMD Instinct MI50 GPUs, a cost-effective way to run large language models (LLMs) without relying on cloud services. The system, detailed in a Reddit post on r/LocalLLaMA, provides 64GB of combined video memory across the two cards (32GB each) and runs models such as GLM 4.7 Flash Q8_0 at approximately 50 tokens per second, a throughput the builder obtained for a fraction of the cost of a typical high-end rig.
The builder, who goes by u/roackim, assembled the rig using a Gigabyte X399 DESIGNARE motherboard paired with a Threadripper 2990WX 32-core CPU and 64GB of DDR4 RAM. Two AMD Instinct MI50 accelerators, each with 32GB of HBM2 memory, form the computational core. Purchased second-hand for around €330 each, the MI50s — originally designed for data center workloads — offer exceptional memory bandwidth and capacity, making them ideal for local LLM inference despite their discontinued status. The total build cost, including a custom case, power supply, and components, came to approximately €1,500, positioning it as an extraordinarily economical alternative to NVIDIA-based AI rigs that often exceed €5,000.
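As a rough illustration of why 2 × 32GB matters for Q8_0 models, the sketch below estimates the weight footprint of a Q8_0 model and checks it against the combined memory. It relies on the GGUF Q8_0 layout (blocks of 32 int8 weights plus one 16-bit scale, about 8.5 bits per weight). The parameter count, the 4 GiB overhead for KV cache and buffers, and the treatment of each card as 32 GiB are assumptions for illustration, not figures from the build; real GGUF files also keep a few tensors in higher precision, so treat the result as an approximation.

```python
# Q8_0 stores weights in blocks of 32: 32 int8 values plus one fp16 scale.
Q8_0_BLOCK_WEIGHTS = 32
Q8_0_BLOCK_BYTES = 32 + 2  # 34 bytes per block, about 8.5 bits per weight

GIB = 1024 ** 3


def q8_0_weight_bytes(n_params: int) -> int:
    """Approximate bytes needed to store n_params weights as Q8_0."""
    n_blocks = -(-n_params // Q8_0_BLOCK_WEIGHTS)  # ceiling division
    return n_blocks * Q8_0_BLOCK_BYTES


def fits_in_vram(n_params: int, cards: int = 2, gib_per_card: int = 32,
                 overhead_gib: float = 4.0) -> bool:
    """True if Q8_0 weights plus an assumed overhead fit across all cards.

    overhead_gib is a hypothetical allowance for KV cache and scratch buffers.
    """
    budget = cards * gib_per_card * GIB
    return q8_0_weight_bytes(n_params) + overhead_gib * GIB <= budget


if __name__ == "__main__":
    hypothetical_params = 30_000_000_000  # assumption, not from the article
    size_gib = q8_0_weight_bytes(hypothetical_params) / GIB
    print(f"Q8_0 weights: {size_gib:.1f} GiB, "
          f"fits in 2 x 32 GiB: {fits_in_vram(hypothetical_params)}")
```

The check treats the two cards as one pool, which only holds when the inference software splits the model across both GPUs; a single tensor still has to fit on one card.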
Perhaps the most innovative aspect of the build is the custom 3D-printed GPU shroud designed to mitigate noise while ensuring adequate cooling. Due to space constraints and the need for quiet operation — the server sits beneath the builder’s desk — the shroud was engineered to house both MI50 cards with a single 92mm Arctic P9 Max fan, maintaining stable thermal performance at idle (18W per card) and under load (155W total). The design is modular, requiring only M2 and M3 screws for assembly, and is fully open-sourced under the MIT license on GitHub. The repository includes all STL files, assembly instructions, and firmware recommendations, enabling other enthusiasts to replicate the solution on small-format 3D printers.
Software-wise, the system runs Ubuntu 24.04 LTS with ROCm 6.3, AMD’s open-source GPU computing platform, and uses llama.cpp for model inference. The builder notes that token throughput drops after initial bursts, a common challenge with memory-bound LLMs, but the system remains stable and usable for extended inference tasks. Because the motherboard lacks fan header control, the cooling fan runs at a fixed 2,700 RPM, a minor compromise in a build designed for quiet operation.
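The post does not list the exact llama.cpp invocation, so the following is only a sketch of how a model might be spread across two GPUs with llama.cpp's `llama-server`. It builds the argument list using the `--n-gpu-layers`, `--split-mode`, and `--tensor-split` options; the model path, the port, the even 1:1 split, and the helper name `llama_server_command` are all assumptions for illustration, not the builder's settings.

```python
import shlex


def llama_server_command(model_path: str, n_gpus: int = 2,
                         port: int = 8080) -> list[str]:
    """Build an argv list for llama-server that splits layers evenly across GPUs.

    model_path and port are placeholders; the even split is an assumption.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return [
        "llama-server",
        "--model", model_path,
        "--n-gpu-layers", "99",  # large value so every layer is offloaded
        "--split-mode", "layer",  # assign whole layers to each GPU
        "--tensor-split", ",".join(["1"] * n_gpus),  # equal share per GPU
        "--port", str(port),
    ]


if __name__ == "__main__":
    # Hypothetical model file name, shown only to print a copyable command.
    cmd = llama_server_command("models/example-q8_0.gguf")
    print(shlex.join(cmd))
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting issues; the proportions given to `--tensor-split` can be adjusted if one card needs more headroom than the other.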
This build underscores a growing trend among AI hobbyists and researchers seeking autonomy from proprietary cloud platforms. As open-source LLMs like GLM, Llama, and Mistral become increasingly accessible, the demand for affordable, quiet, and efficient local hardware is surging. The MI50, once overlooked in favor of NVIDIA’s CUDA ecosystem, is experiencing a renaissance thanks to ROCm’s maturation and community-driven optimizations.
The builder’s open-sourcing of the shroud design is likely to inspire a wave of similar projects, particularly among those constrained by space or noise sensitivity. With AI becoming increasingly embedded in personal workflows — from research to content creation — this DIY rig exemplifies how technical ingenuity and community collaboration can democratize access to powerful computing resources. As one commenter noted: "This isn’t just a server. It’s a manifesto for local AI."
For those interested in replicating the build, the complete project files are available at github.com/roackim/mi50-92mm-shroud.


