How much VRAM does DeepSeek Coder V2 236B need?

DeepSeek Coder V2 236B (236B parameters) requires approximately 178.4 GB of memory with Q4_K_M quantization: 144.0 GB for the weights, with the remainder going to KV cache, runtime overhead, and headroom.

What is the best quantization for DeepSeek Coder V2 236B?

The recommended quantization for DeepSeek Coder V2 236B is Q4_K_M, which balances quality and memory efficiency.

Can it run?

Can AMD Instinct MI350X 288GB run DeepSeek Coder V2 236B?

Yes, AMD Instinct MI350X 288GB can run DeepSeek Coder V2 236B with a B grade (Runs well). Expected decode speed: 109.3 tok/s.

Grade: B (Good)

Runs well

Using Q4_K_M in vLLM

Fit status: Runs well
Decode: 109.3 tok/s
TTFT: 1771 ms
Safe context: 26K
Memory: 178.4 GB / 288.0 GB

Memory breakdown

Weights: 144.0 GB
KV Cache: 3.3 GB
Runtime: 2.4 GB
Headroom: 28.8 GB
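The breakdown can be checked with a few lines of arithmetic. The sketch below is illustrative only: the figures are copied from this page, while the function name `summarize` and the dictionary keys are made up for the example. The components sum to 178.5 GB, 0.1 GB above the reported 178.4 GB, presumably because each figure is rounded for display.

```python
# Figures copied from the memory breakdown on this page (GB).
breakdown_gb = {
    "weights": 144.0,
    "kv_cache": 3.3,
    "runtime": 2.4,
    "headroom": 28.8,
}
USABLE_VRAM_GB = 288.0  # AMD Instinct MI350X, as listed on this page


def summarize(breakdown, capacity_gb):
    """Return (total used, free, utilization %) rounded to one decimal."""
    total = round(sum(breakdown.values()), 1)
    free = round(capacity_gb - total, 1)
    utilization = round(100 * total / capacity_gb, 1)
    return total, free, utilization


total_gb, free_gb, utilization_pct = summarize(breakdown_gb, USABLE_VRAM_GB)
print(f"Total: {total_gb} GB, free: {free_gb} GB, utilization: {utilization_pct}%")
```

Roughly 62% of the card is committed at Q4_K_M, which is consistent with the larger quantizations in the table below also fitting.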

Performance by workload

Workload	Grade	Fit	Decode	TTFT	Context
Agentic Coding	B	Runs well	109.3 tok/s	2577 ms	51K
Chat	B	Runs well	109.3 tok/s	966 ms	13K
Coding	B	Runs well	109.3 tok/s	1771 ms	26K
RAG	B	Runs well	109.3 tok/s	3221 ms	51K
Reasoning	B	Runs well	109.3 tok/s	2094 ms	26K
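To turn these figures into a rough end-to-end response time, add the time to first token (TTFT) to the time spent decoding the output. This is a minimal sketch, assuming the decode rate stays constant for the whole response and ignoring batching and network latency. The function name `estimated_response_seconds` and the 500-token output length are hypothetical.

```python
# Figures copied from the workload table on this page.
DECODE_TOK_PER_S = 109.3
ttft_ms = {
    "Agentic Coding": 2577,
    "Chat": 966,
    "Coding": 1771,
    "RAG": 3221,
    "Reasoning": 2094,
}


def estimated_response_seconds(workload, output_tokens):
    """TTFT plus decode time, assuming a constant decode rate."""
    return ttft_ms[workload] / 1000 + output_tokens / DECODE_TOK_PER_S


# Hypothetical 500-token reply for each workload.
for name in ttft_ms:
    seconds = estimated_response_seconds(name, 500)
    print(f"{name}: {seconds:.1f} s")
```

Because decode speed is the same across workloads, the differences in total time come entirely from TTFT, which grows with the context each workload is assumed to use.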

Quantization options

How DeepSeek Coder V2 236B (236B params) fits at each quantization level on AMD Instinct MI350X 288GB (288.0 GB usable).

Quant	Bits	VRAM	Quality	Fit (grade, score)
Q2_K	2	92.0 GB	Low	D, 36
Q3_K_S	3	115.6 GB	Low	D, 38
NVFP4	4	132.2 GB	Medium	D, 39
Q4_K_M	4	144.0 GB	Medium	C, 40
Q5_K_M	5	169.9 GB	High	C, 42
Q6_K (best for your GPU)	6	193.5 GB	High	C, 44
Q8_0	8	252.5 GB	Very High	C, 44
F16	16	483.8 GB	Maximum	F, 0
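The table can be sanity-checked by comparing each weight size against the 288.0 GB of usable memory and by deriving the effective bits per weight. The sketch below uses only the numbers listed above. The helper names and the assumption that 1 GB means 10^9 bytes are mine, and the check does not reproduce the page's grading or its "best for your GPU" pick, whose criteria are not stated.

```python
PARAMS_BILLION = 236.0   # parameter count listed on this page
USABLE_VRAM_GB = 288.0   # AMD Instinct MI350X, as listed on this page

# Weight sizes copied from the quantization table (GB).
quant_weights_gb = {
    "Q2_K": 92.0,
    "Q3_K_S": 115.6,
    "NVFP4": 132.2,
    "Q4_K_M": 144.0,
    "Q5_K_M": 169.9,
    "Q6_K": 193.5,
    "Q8_0": 252.5,
    "F16": 483.8,
}


def effective_bits_per_weight(weights_gb, params_billion=PARAMS_BILLION):
    """GB per billion parameters is bytes per parameter (assuming 1 GB = 1e9 bytes)."""
    return 8 * weights_gb / params_billion


def fits(weights_gb, capacity_gb=USABLE_VRAM_GB, reserve_gb=0.0):
    """True if the weights plus an optional reserve fit in the given capacity."""
    return weights_gb + reserve_gb <= capacity_gb


fitting = [name for name, gb in quant_weights_gb.items() if fits(gb)]
for name, gb in quant_weights_gb.items():
    bits = effective_bits_per_weight(gb)
    status = "fits" if fits(gb) else "does not fit"
    print(f"{name}: {gb} GB, ~{bits:.2f} bits/weight, {status}")
```

The effective bits per weight come out higher than the nominal "Bits" column (about 4.9 for Q4_K_M), which is expected for quantization formats that store scales and other metadata alongside the weights. Passing a `reserve_gb` value to `fits` lets you leave room for KV cache and runtime overhead.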

Get started

HuggingFace

huggingface-cli download deepseek-ai/DeepSeek-Coder-V2-Instruct

