Question 1

Can Intel Arc Pro A40 6GB run All MiniLM L6 v2?

Accepted Answer

Yes, the Intel Arc Pro A40 6GB can run All MiniLM L6 v2 with a C grade (Runs well). Expected decode speed is up to 138.2 tok/s (Coding workload); the other listed workloads decode at 94.1 tok/s.

Question 2

How much VRAM does All MiniLM L6 v2 need?

Accepted Answer

All MiniLM L6 v2 (about 0.023B parameters) requires approximately 2.6 GB of memory with F16 quantization.
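
As a rough cross-check, the memory taken by the weights alone can be estimated as parameter count times bits per weight. The sketch below assumes roughly 0.023B parameters and 1 GB = 1e9 bytes; the function name `weight_memory_gb` is hypothetical. It covers weights only, so it will come out far below the 2.6 GB total quoted above, which presumably also includes runtime overhead that this sketch does not model.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Estimate weights-only memory in GB (assuming 1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumed parameter count: about 0.023B (23 million).
f16_gb = weight_memory_gb(0.023, 16)  # 16 bits per weight
q2_gb = weight_memory_gb(0.023, 2)    # 2 bits per weight

print(f"F16 weights:   {f16_gb:.3f} GB")
print(f"2-bit weights: {q2_gb:.4f} GB")
```

At this model size the low-bit weights take only a few megabytes, which is why a figure rounded to one decimal place can show as 0.0 GB.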

Question 3

What is the best quantization for All MiniLM L6 v2?

Accepted Answer

The recommended quantization for All MiniLM L6 v2 is F16, which balances quality and memory efficiency.

Workload	Grade	Fit	Decode	TTFT	Context
Agentic Coding	C	Runs well	94.1 tok/s	2993 ms	256
Chat	C	Runs well	94.1 tok/s	1122 ms	256
Coding	C	Runs well	138.2 tok/s	1400 ms	256
RAG	C	Runs well	94.1 tok/s	3741 ms	256
Reasoning	C	Runs well	94.1 tok/s	2432 ms	256
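
The TTFT and Decode columns can be combined into a rough end-to-end response time: time to first token plus output tokens divided by decode speed. This is a minimal sketch that assumes a constant decode rate and an illustrative 100-token output; the function name `response_time_s` and the output length are assumptions, not values from the table.

```python
def response_time_s(ttft_ms: float, decode_tok_s: float, output_tokens: int) -> float:
    """Rough response time in seconds: TTFT plus time to decode the output."""
    return ttft_ms / 1000 + output_tokens / decode_tok_s


# TTFT and decode figures taken from the Chat and Coding rows above;
# the 100-token output length is an assumed example value.
chat_s = response_time_s(ttft_ms=1122, decode_tok_s=94.1, output_tokens=100)
coding_s = response_time_s(ttft_ms=1400, decode_tok_s=138.2, output_tokens=100)

print(f"Chat:   {chat_s:.2f} s")
print(f"Coding: {coding_s:.2f} s")
```

For short outputs TTFT dominates the total, so workloads with a longer TTFT (such as RAG) feel slower even at the same decode speed.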

Quant	Bits	VRAM	Quality	Fit
Q2_K	2	0.0 GB	Low	D29
Q3_K_S	3	0.0 GB	Low	D29
NVFP4	4

Memory breakdown