GPU Calculator

Enter a model size and quantization level to estimate how much VRAM you need and which GPUs can fit the model.

ESTIMATED VRAM NEEDED
Weights
KV cache
Overhead

RECOMMENDED GPUS

VRAM estimates are approximate. Actual usage can differ by 10-30% in either direction, depending on the framework (vLLM, Transformers, llama.cpp), the KV cache implementation, and optimizations such as PagedAttention.
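
For readers who want to reproduce the Weights / KV cache / Overhead breakdown by hand, here is a minimal sketch of one common back-of-the-envelope method. It is not necessarily the formula this calculator uses. The function name, its parameters, the FP16 KV cache default, and the flat 10% overhead are all assumptions made for illustration.

```python
def estimate_vram_gb(
    params_billions: float,
    bits_per_weight: float,
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_len: int,
    batch_size: int = 1,
    kv_bytes_per_value: int = 2,       # assumption: KV cache stored in FP16
    overhead_fraction: float = 0.10,   # assumption: flat 10% for activations and runtime buffers
) -> dict:
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes), split into three parts."""
    # Weights: one value per parameter at the chosen quantization width.
    weights = params_billions * 1e9 * bits_per_weight / 8

    # KV cache: a key and a value vector (hence the factor 2) per layer,
    # per KV head, per token, per sequence in the batch.
    kv_cache = (
        2 * n_layers * n_kv_heads * head_dim
        * context_len * batch_size * kv_bytes_per_value
    )

    # Overhead: modeled here as a fixed fraction of everything else.
    overhead = overhead_fraction * (weights + kv_cache)

    gb = 1e9
    return {
        "weights": weights / gb,
        "kv_cache": kv_cache / gb,
        "overhead": overhead / gb,
        "total": (weights + kv_cache + overhead) / gb,
    }


if __name__ == "__main__":
    # Hypothetical 7B-parameter model shape: 32 layers, 32 KV heads,
    # head dimension 128, 4096-token context, 16-bit weights.
    estimate = estimate_vram_gb(7, 16, 32, 32, 128, 4096)
    for part, size in estimate.items():
        print(f"{part:>9}: {size:.2f} GB")
```

Dropping `bits_per_weight` from 16 to 4 shrinks only the weights term. The KV cache term depends on context length and batch size rather than on weight quantization, which is one reason real usage drifts from a simple estimate.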