How much VRAM does EXAONE 3.5 7.8B Instruct need?

EXAONE 3.5 7.8B Instruct (7.8B parameters) requires approximately 8.3 GB of memory with Q4_K_M quantization, of which about 4.8 GB is model weights.

What is the best quantization for EXAONE 3.5 7.8B Instruct?

The recommended quantization for EXAONE 3.5 7.8B Instruct is Q4_K_M, which balances quality and memory efficiency.

Can it run?

Can RTX 2080 Ti 11GB run EXAONE 3.5 7.8B Instruct?

Yes, RTX 2080 Ti 11GB can run EXAONE 3.5 7.8B Instruct with a B grade (Runs well). Expected decode speed: 84.2 tok/s.

Grade: B (Good), runs well using Q4_K_M in Ollama.

Fit status: Runs well
Decode: 84.2 tok/s
TTFT (time to first token): 2301 ms
Safe context: 21K tokens
Memory: 8.3 GB / 11.0 GB

Memory breakdown

Weights: 4.8 GB
KV Cache: 1.2 GB
Runtime: 1.2 GB
Headroom: 1.1 GB
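The breakdown can be checked with a few lines of arithmetic. This is a minimal sketch that only re-adds the figures listed above; the page does not say whether the headroom is counted inside the 8.3 GB total, but the four components do sum to that figure, which suggests it is.

```python
# Memory components for Q4_K_M on the RTX 2080 Ti 11GB, in GB,
# copied from the memory breakdown above.
components_gb = {
    "weights": 4.8,
    "kv_cache": 1.2,
    "runtime": 1.2,
    "headroom": 1.1,
}
usable_vram_gb = 11.0  # usable VRAM reported for this GPU

# Round to one decimal to match the precision of the published figures.
total_gb = round(sum(components_gb.values()), 1)
spare_gb = round(usable_vram_gb - total_gb, 1)

print(f"Total: {total_gb} GB of {usable_vram_gb} GB, {spare_gb} GB unallocated")
```

The remaining unallocated VRAM is what a longer context or a larger quantization would have to fit into.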

Performance by workload

Workload	Grade	Fit	Decode	TTFT	Context
Agentic Coding	C	Tight fit	84.2 tok/s	3346 ms	37K
Chat	B	Runs well	84.2 tok/s	1255 ms	11K
Coding	B	Runs well	84.2 tok/s	2301 ms	21K
RAG	C	Tight fit	84.2 tok/s	4183 ms	37K
Reasoning	B	Runs well	84.2 tok/s	2719 ms	21K
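To turn the table into a rough end-to-end latency, one can add TTFT to the time spent decoding. The sketch below assumes decode speed stays constant at the listed 84.2 tok/s and uses a hypothetical 500-token reply; real throughput usually drops as the context fills, so treat the result as an optimistic estimate.

```python
def response_time_s(ttft_ms: float, decode_tok_s: float, output_tokens: int) -> float:
    """Estimated seconds until the last token: TTFT plus steady-state decode time."""
    return ttft_ms / 1000.0 + output_tokens / decode_tok_s


# TTFT per workload in milliseconds, copied from the table above.
ttft_ms_by_workload = {
    "Agentic Coding": 3346,
    "Chat": 1255,
    "Coding": 2301,
    "RAG": 4183,
    "Reasoning": 2719,
}
DECODE_TOK_S = 84.2   # same decode speed for every workload in the table
OUTPUT_TOKENS = 500   # hypothetical reply length, not from the source

for workload, ttft_ms in ttft_ms_by_workload.items():
    seconds = response_time_s(ttft_ms, DECODE_TOK_S, OUTPUT_TOKENS)
    print(f"{workload}: ~{seconds:.1f} s for {OUTPUT_TOKENS} tokens")
```

Because decode speed is identical across workloads, the differences come entirely from TTFT, which grows with the longer prompts that RAG and agentic coding imply.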

Quantization options

How EXAONE 3.5 7.8B Instruct (7.8B params) fits at each quantization level on RTX 2080 Ti 11GB (11.0 GB usable).

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	3.0 GB	Low	D (35)
Q3_K_S	3	3.8 GB	Low	D (36)
NVFP4	4	4.4 GB	Medium	D (38)
Q4_K_M	4	4.8 GB	Medium	D (38)
Q5_K_M	5	5.6 GB	High	D (40)
Q6_K (best for your GPU)	6	6.4 GB	High	C (41)
Q8_0	8	8.3 GB	Very High	C (44)
F16	16	16.0 GB	Maximum	F (0)
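The VRAM column appears to list weights only, since the Q4_K_M value matches the 4.8 GB weights figure in the memory breakdown. Under that assumption, and taking 1 GB as 10^9 bytes, the sketch below derives the effective bits per weight implied by each row and checks whether the weights alone fit in 11.0 GB. The effective figures come out above the nominal "Bits" value, which is expected because quantized formats also store per-block scaling data.

```python
PARAMS = 7.8e9          # parameter count from the page
USABLE_VRAM_GB = 11.0   # usable VRAM on the RTX 2080 Ti 11GB

# Weight sizes in GB, copied from the quantization table above.
weights_gb = {
    "Q2_K": 3.0, "Q3_K_S": 3.8, "NVFP4": 4.4, "Q4_K_M": 4.8,
    "Q5_K_M": 5.6, "Q6_K": 6.4, "Q8_0": 8.3, "F16": 16.0,
}


def effective_bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Bits per parameter implied by a weight size, assuming 1 GB = 1e9 bytes."""
    return size_gb * 1e9 * 8 / params


for quant, size_gb in weights_gb.items():
    bpw = effective_bits_per_weight(size_gb)
    # This check ignores KV cache and runtime overhead, so it is a lower bound.
    fits = size_gb <= USABLE_VRAM_GB
    print(f"{quant}: {bpw:.2f} bits/weight, weights fit: {fits}")
```

F16 fails this check before any KV cache or runtime overhead is counted, which is consistent with its failing fit in the table.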

Upgrade options

Hardware that runs EXAONE 3.5 7.8B Instruct well

GPU	Pick	Grade	Decode	MSRP
RTX 5070 12GB	Budget pick	B	89 tok/s	~$549
RTX 3080 12GB	Best value	B	145.7 tok/s	~$799
RTX 3080 Ti 12GB	Biggest leap	B	141.8 tok/s	~$1,199
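For comparing the upgrade options against the RTX 2080 Ti baseline of 84.2 tok/s, a quick calculation of relative decode speed and throughput per dollar can help. This sketch uses only the decode speeds and approximate MSRPs listed above; street prices will differ, and decode speed is only one factor in an upgrade decision.

```python
BASELINE_TOK_S = 84.2  # RTX 2080 Ti 11GB decode speed from this page

# (decode tok/s, approximate MSRP in USD), copied from the list above.
upgrades = {
    "RTX 5070 12GB": (89.0, 549),
    "RTX 3080 12GB": (145.7, 799),
    "RTX 3080 Ti 12GB": (141.8, 1199),
}


def speedup(tok_s: float, baseline: float = BASELINE_TOK_S) -> float:
    """Decode speed relative to the baseline GPU."""
    return tok_s / baseline


for gpu, (tok_s, msrp_usd) in upgrades.items():
    per_100_usd = tok_s / msrp_usd * 100
    print(f"{gpu}: {speedup(tok_s):.2f}x decode, {per_100_usd:.1f} tok/s per $100")
```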
