How much VRAM does DeepSeek V2.5 236B need?

DeepSeek V2.5 236B (236B parameters) requires approximately 175.2 GB of memory with Q4_K_M quantization.

What is the best quantization for DeepSeek V2.5 236B?

The recommended quantization for DeepSeek V2.5 236B is Q4_K_M, which balances quality and memory efficiency.

Can it run?

Can AMD Instinct MI325X 256GB run DeepSeek V2.5 236B?

Q: Can AMD Instinct MI325X 256GB run DeepSeek V2.5 236B?

Yes, AMD Instinct MI325X 256GB can run DeepSeek V2.5 236B with a B grade (Runs well). Expected decode speed: 82.0 tok/s.

BGood

Runs well

Using Q4_K_M in vLLM

Capabilities:

Fit status

Runs well

Decode

82.0 tok/s

TTFT

2362 ms

Safe context

23K

Memory

175.2 GB / 256.0 GB

Memory breakdown

Weights144.0 GB

KV Cache3.3 GB

Runtime2.4 GB

Headroom25.6 GB

Performance by workload

Workload	Grade	Fit	Decode	TTFT	Context
Agentic Coding	B	Runs well	82.0 tok/s	3436 ms	46K
Chat	B	Runs well	82.0 tok/s	1288 ms	12K
Coding	B	Runs well	82.0 tok/s	2362 ms	23K
RAG	B	Runs well	82.0 tok/s	4294 ms	46K
Reasoning	B	Runs well	82.0 tok/s	2791 ms	23K

Quantization options

How DeepSeek V2.5 236B (236B params) fits at each quantization level on AMD Instinct MI325X 256GB (256.0 GB usable).

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	92.0 GB	Low	D37
Q3_K_S	3	115.6 GB	Low	D39
NVFP4	4	132.2 GB	Medium	C40
Q4_K_M	4	144.0 GB	Medium	C41
Q5_K_M	5	169.9 GB	High	C44
Q6_KBest for your GPU	6	193.5 GB	High	C44
Q8_0	8	252.5 GB	Very High	C44
F16	16	483.8 GB	Maximum	F0

Get started

Ollama

ollama run deepseek-v2.5-236b

HuggingFace

huggingface-cli download deepseek-v2.5-236b

See all results for AMD Instinct MI325X 256GB See all hardware for DeepSeek V2.5 236B

Can it run?

Can AMD Instinct MI325X 256GB run DeepSeek V2.5 236B?

BGood

Runs well

Using Q4_K_M in vLLM

Capabilities:

Fit status

Runs well

Decode

82.0 tok/s

TTFT

2362 ms

Safe context

23K

Memory

175.2 GB / 256.0 GB

Memory breakdown

Weights144.0 GB

KV Cache3.3 GB

Runtime2.4 GB

Headroom25.6 GB

Performance by workload

Workload	Grade	Fit	Decode	TTFT	Context
Agentic Coding	B	Runs well	82.0 tok/s	3436 ms	46K
Chat	B	Runs well	82.0 tok/s	1288 ms	12K
Coding	B	Runs well	82.0 tok/s	2362 ms	23K
RAG	B	Runs well	82.0 tok/s	4294 ms	46K
Reasoning	B	Runs well	82.0 tok/s	2791 ms	23K

Quantization options

How DeepSeek V2.5 236B (236B params) fits at each quantization level on AMD Instinct MI325X 256GB (256.0 GB usable).

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	92.0 GB	Low	D37
Q3_K_S	3	115.6 GB	Low	D39
NVFP4	4	132.2 GB	Medium	C40
Q4_K_M	4	144.0 GB	Medium	C41
Q5_K_M	5	169.9 GB	High	C44
Q6_KBest for your GPU	6	193.5 GB	High	C44
Q8_0	8	252.5 GB	Very High	C44
F16	16	483.8 GB	Maximum	F0

Get started

Ollama

ollama run deepseek-v2.5-236b

HuggingFace

huggingface-cli download deepseek-v2.5-236b

See all results for AMD Instinct MI325X 256GB See all hardware for DeepSeek V2.5 236B