Your Build Benchmarked
WillMyGPURunIt is a free tool that tells you what your PC can really do. Enter your CPU and GPU below and get a straight answer: which local AI models your card can run (and how fast), whether anything bottlenecks, the power supply you actually need, part compatibility, and the games you can play. Prefer a focused page? The full build calculator also lives on its own page.
Benchmark Your Build
What WillMyGPURunIt Checks
One form, one report: everything you need to understand a build, whether you're buying parts, upgrading, or seeing what you already own can handle.
Local AI models + tokens/sec
Every popular LLM (Llama, Qwen, Gemma, DeepSeek): whether it fits your VRAM, the best quant, and an estimated speed in tokens/sec.
AI & gaming bottlenecks
Whether your CPU and GPU are a balanced match, in plain language, with a concrete upgrade suggestion either way.
Power supply wattage
The PSU size your parts actually need, sized for real transient spikes and ~10% headroom, never below the GPU's minimum.
Part compatibility
Socket, RAM type and PSU checks based on your CPU, motherboard chipset and power supply, so you catch a mismatch before you build.
Games you can run
110 popular games scored against your GPU at 1080p high, plus a search box for anything not listed.
1-100 build scores
At-a-glance Local AI and gaming ratings so you can size up a build in a single number each.
New to Local AI? Start Here
Plain-English guides to running AI on your own machine, no jargon assumed.
Why Run AI Locally, and Why Not
Privacy, cost, offline access and control versus the real downsides: setup, hardware cost and speed. An honest look at both sides.
How Much VRAM Do You Need to Run an LLM?
A practical VRAM-by-model-size table, from 7B chat models on 8 GB cards to 70B on 24 GB, plus how quantization and context change the math.
Best GPUs for Local LLMs
Ranked GPU picks for running local AI by budget and VRAM tier, built from real benchmark data and the actual model each card can run.
What Is CUDA? And Why It Matters for Local AI
CUDA is the reason NVIDIA dominates AI. A plain-English explainer on what it is, what it does, and how AMD (ROCm) and Apple (Metal) compare.
Deciding Between Two Builds?
Compare two PCs side by side: every score, bottleneck and number at once.
Frequently Asked Questions
Is WillMyGPURunIt free?
Yes, completely free. Enter your parts and get every result with no account or sign-up.
How much VRAM do I need to run AI locally?
Roughly 0.6 GB of VRAM per billion parameters at 4-bit, plus overhead for context and runtime buffers. In practice an 8B model calls for an 8 GB card, a 14B for a 12-16 GB card, and a 32B for a 24 GB card. The calculator shows exactly what your card runs.
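If you want to check that rule of thumb by hand, here is a minimal Python sketch. The 0.6 GB per billion parameters figure comes from the answer above; the flat 1.5 GB overhead, the list of card sizes, and the function names are assumptions made for illustration, not the calculator's actual method.

```python
# Rule of thumb from the FAQ: ~0.6 GB of VRAM per billion parameters at 4-bit.
GB_PER_BILLION_PARAMS_4BIT = 0.6


def estimate_vram_gb(params_billion, overhead_gb=1.5):
    """Rough VRAM need at 4-bit: weights plus a flat overhead allowance.

    overhead_gb is an assumed placeholder; real overhead grows with
    context length and depends on the runtime.
    """
    weights_gb = params_billion * GB_PER_BILLION_PARAMS_4BIT
    return weights_gb + overhead_gb


def smallest_card_gb(required_gb, tiers=(8, 12, 16, 24)):
    """Return the smallest card size (assumed tiers) that fits, else None."""
    for tier in tiers:
        if required_gb <= tier:
            return tier
    return None


if __name__ == "__main__":
    for size in (8, 14, 32):
        need = estimate_vram_gb(size)
        card = smallest_card_gb(need)
        print(f"{size}B model: ~{need:.1f} GB needed, smallest card: {card} GB")
```

With these assumed numbers an 8B model lands on an 8 GB card, a 14B on a 12 GB card and a 32B on a 24 GB card, which lines up with the figures in the answer. Longer context pushes the overhead up, so treat the result as a floor.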
What is a CPU or GPU bottleneck?
A bottleneck is when one part holds the other back, for example a CPU too slow to keep a fast GPU fed with frames. In gaming, a GPU-limited build is the healthy state; for local AI, the limiter is almost always VRAM capacity.
How accurate are the numbers?
They're estimates built from published benchmark data (PassMark G3D, single-thread ratings) and real GPU specs (VRAM, bandwidth, board power). They're a reliable guide for comparing builds, not a guarantee of exact frame rates or speeds.
What does tokens per second mean?
It's how fast a model writes. A token is about ¾ of a word, so ~40 tokens/sec already outpaces your reading speed. We estimate it from your GPU's memory bandwidth.
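For a feel of how a bandwidth-based estimate can work, here is a rough Python sketch. It uses a common heuristic for memory-bound text generation: each new token requires reading the model's weights once, so speed is roughly bandwidth divided by model size. The efficiency factor, the example card and the function names are assumptions for illustration, not the formula this site uses.

```python
WORDS_PER_TOKEN = 0.75  # "a token is about 3/4 of a word", per the FAQ


def estimate_tokens_per_sec(bandwidth_gb_s, model_size_gb, efficiency=0.6):
    """Heuristic: tokens/sec ~ memory bandwidth / model size in VRAM.

    efficiency is an assumed fudge factor, since real runtimes rarely
    reach a card's peak bandwidth.
    """
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb * efficiency


def words_per_minute(tokens_per_sec):
    """Convert a generation speed into approximate words per minute."""
    return tokens_per_sec * WORDS_PER_TOKEN * 60


if __name__ == "__main__":
    # Hypothetical card with 500 GB/s bandwidth running a 4.8 GB model.
    tps = estimate_tokens_per_sec(500, 4.8)
    print(f"~{tps:.1f} tokens/sec, ~{words_per_minute(tps):.0f} words/min")
```

At 40 tokens/sec the conversion gives about 1,800 words per minute, which is why that speed already feels faster than you can read.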