Tuesday, August 26, 2025

Andreessen Horowitz Builds An AI Workstation (a16z)

From Andreessen Horowitz, August 22: 

Building a16z’s Personal AI Workstation with four NVIDIA RTX 6000 Pro Blackwell Max-Q GPUs 

In the era of foundation models, multimodal AI, LLMs, and ever-larger datasets, access to raw compute is still one of the biggest bottlenecks for researchers, founders, developers, and engineers. While the cloud offers scalability, building a personal AI Workstation delivers complete control over your environment, lower latency, custom configuration, and the privacy of running all workloads locally.

This post covers our version of a four-GPU workstation powered by the new NVIDIA RTX 6000 Pro Blackwell Max-Q GPUs. This build pushes the limits of desktop AI computing with 384GB of VRAM (96GB per GPU), all in a chassis that fits under your desk.
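Once assembled, the headline numbers are easy to sanity-check from software. A minimal sketch (ours, not from the original post), assuming a CUDA-enabled PyTorch install:

```python
# Enumerate the GPUs and confirm the combined VRAM (assumes PyTorch with CUDA).
import torch

total_bytes = 0
for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_bytes += props.total_memory
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.0f} GiB")

print(f"Total VRAM: {total_bytes / 1024**3:.0f} GiB")  # ~384 GiB on this four-GPU build
```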

Why Build This Workstation? 
Training, fine-tuning, and running inference on modern AI models require massive VRAM capacity and bandwidth, high CPU throughput, and ultra-fast storage. Running these workloads in the cloud can introduce latency, setup overhead, slower data transfers, and privacy tradeoffs.
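To see why capacity matters, here is a back-of-the-envelope estimate (ours, for illustration; the model sizes are hypothetical) of the memory needed just to hold unquantized weights:

```python
# Rough VRAM needed to hold model weights at BF16/FP16 (2 bytes per parameter).
# Excludes KV cache, activations, and optimizer state, so real usage is higher.
def weights_gib(params_billion: float, bytes_per_param: int = 2) -> float:
    return params_billion * 1e9 * bytes_per_param / 1024**3

for size in (8, 70, 120):
    print(f"{size}B params -> ~{weights_gib(size):.0f} GiB of weights")
# 70B params -> ~130 GiB: beyond any single 96GB card, but comfortable across four (384GB).
```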

By building a workstation around enterprise-grade GPUs with full PCIe 5.0 x16 connectivity, we get:

  • Maximum GPU-to-CPU bandwidth: No bottlenecks from PCIe switches or shared lanes.
  • Enterprise-class VRAM: Each RTX 6000 Pro Blackwell Max-Q provides 96GB of VRAM, enabling dense training runs and large-model inference without quantization. Each card consumes only 300W at peak (Max-Q version).
  • 8TB of PCIe 5.0 NVMe storage: four 2TB NVMe PCIe 5.0 x4 modules.
  • 256GB of total ECC DDR5 RAM.
  • Surprising efficiency: Despite its scale, the workstation pulls 1650W at peak, low enough to run on a standard 15-amp / 120V household circuit (which supplies up to 1800W).
  • Next-gen GDS data streaming: While we are still testing this support, the setup should be compatible with NVIDIA GPUDirect Storage (GDS), which streams datasets or models directly from PCIe 5.0 NVMe SSDs into GPU VRAM, bypassing CPU memory to reduce latency and maximize throughput (see the sketch after this list).
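
For a concrete picture of what GDS-style streaming looks like, here is a minimal sketch using NVIDIA's kvikio bindings for cuFile. This is our illustration, not a16z's code: it assumes kvikio and CuPy are installed, the NVMe filesystem is GDS-capable, and "shard.bin" is a hypothetical dataset file.

```python
# Read a file from NVMe directly into GPU VRAM via kvikio (cuFile/GPUDirect Storage).
# Assumes kvikio + CuPy; "shard.bin" is a hypothetical dataset shard.
import cupy
import kvikio

buf = cupy.empty(2**30, dtype=cupy.uint8)  # 1 GiB destination buffer in GPU VRAM

f = kvikio.CuFile("shard.bin", "r")
nbytes = f.read(buf)  # DMA into VRAM; bypasses CPU memory when GDS is active
f.close()

print(f"Read {nbytes} bytes directly into GPU memory")
```

When GDS is unavailable, kvikio falls back to an ordinary POSIX read through host memory, so the same code runs either way, just without the zero-copy path.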

We are planning to test and make a limited number of these custom a16z Founders Edition AI Workstations....

....MUCH MORE