LLM Hardware Tool

VRAM Calculator

Select a language model, quantization, and context length to estimate how much VRAM your GPU needs.

Required VRAM

5.19 GB

Breakdown

Model weights         4.19 GB
KV cache (ctx 4,096)  0.50 GB
Runtime overhead      0.50 GB
Total estimated       5.19 GB

Minimum recommended: RTX 4060

Estimation based on Ollama / llama.cpp. Results may vary by framework.
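The breakdown above follows a simple additive model: quantized weights, plus the KV cache for the chosen context length, plus a fixed runtime overhead. A minimal sketch of that estimate, assuming a grouped-query-attention model; the layer count, KV-head count, and head dimension below are illustrative values, not taken from this page:

```python
def estimate_vram_gb(params_b: float, bits_per_weight: float, ctx: int,
                     n_layers: int, n_kv_heads: int, head_dim: int,
                     kv_bytes: int = 2, overhead_gb: float = 0.5) -> float:
    """Additive VRAM estimate: weights + KV cache + fixed runtime overhead."""
    # Weights: params_b is in billions, so params_b * bits / 8 gives GB.
    weights_gb = params_b * bits_per_weight / 8
    # KV cache: key and value tensors per layer, per token, per KV head,
    # stored at kv_bytes per element (2 bytes for F16).
    kv_gb = 2 * n_layers * ctx * n_kv_heads * head_dim * kv_bytes / 1e9
    return weights_gb + kv_gb + overhead_gb

# Hypothetical 7B model at Q4_K_M (4.5 bits/weight), ctx 4096,
# 28 layers with 4 KV heads of dim 128 (an illustrative GQA shape):
print(round(estimate_vram_gb(7.0, 4.5, 4096, 28, 4, 128), 2))  # -> 4.67
```

Real frameworks add compute buffers and fragmentation on top, which is why a flat overhead term is only a rough stand-in.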


New to local LLMs?

Read our full guide on how to run models on your PC with Ollama.

Model

Model               Params   Max context
DeepSeek-R1 7B      7B       131K
DeepSeek-R1 8B      8B       131K
DeepSeek-R1 14B     14B      131K
DeepSeek-R1 32B     32B      131K
DeepSeek-R1 70B     70B      131K
Gemma 2 2B          2.6B     8K
Gemma 2 9B          9.2B     8K
Gemma 2 27B         27B      8K
Llama 3.2 1B        1B       131K
Llama 3.2 3B        3B       131K
Llama 3.1 8B        8B       131K
Llama 3.1 70B       70B      131K
Llama 3.1 405B      405B     131K
Mistral 7B v0.3     7B       33K
Mistral Nemo 12B    12B      128K
Mixtral 8x7B        46.7B    33K
Phi-3 Mini 3.8B     3.8B     128K
Phi-4 14B           14B      16K
Phi-3 Medium 14B    14B      128K
Qwen2.5 7B          7B       131K
Qwen2.5 14B         14B      131K
Qwen2.5 32B         32B      131K
Qwen2.5 72B         72B      131K

Quantization

Quant   Quality   Notes
F32     5/5       Maximum quality. Research only.
F16     5/5       No noticeable loss. Fine-tuning and inference.
Q8      4/5       Near identical to F16, half the VRAM.
Q6      4/5       Minimal loss. Good balance.
Q5      3/5       Recommended if you have enough VRAM.
Q4      3/5       Most used. Best VRAM/quality balance. (Popular)
Q3      2/5       Noticeable loss. Use only if necessary.
Q2      1/5       Severe loss. Last resort.

Selected: Q4_K_M (Popular)
Bits/weight: 4.5
Quality: Acceptable
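The bits/weight figure maps directly to the weights line of the breakdown: size in GB is roughly params (in billions) times bits per weight divided by 8. A quick sketch; only Q4_K_M's 4.5 bits/weight comes from this page, the other bit widths are approximate assumptions about common GGUF formats:

```python
# Approximate effective bits per weight for common GGUF quantizations.
# Only Q4_K_M's 4.5 is taken from this page; the rest are rough assumptions.
QUANT_BITS = {
    "F32": 32.0, "F16": 16.0, "Q8_0": 8.5, "Q6_K": 6.6,
    "Q5_K_M": 5.7, "Q4_K_M": 4.5, "Q3_K_M": 3.9, "Q2_K": 3.35,
}

def weights_gb(params_billion: float, quant: str) -> float:
    """Weight-file size in GB for a given parameter count and quantization."""
    return params_billion * QUANT_BITS[quant] / 8

print(round(weights_gb(7.0, "Q4_K_M"), 2))  # 7B at 4.5 bpw -> 3.94 GB
```

This is why dropping from F16 to Q4 roughly quarters the weight footprint while the KV cache and overhead terms stay unchanged.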

Context

Context length: 4,096 tokens

© 2026 PC Master Studio. Synchronized with the Precision Pulse.

As an Amazon Associate I earn from qualifying purchases.