
All estimates are approximations based on mathematical models and public specifications. Actual performance may vary. Do not make purchasing decisions based solely on these estimates.

Data sourced from Hugging Face, Ollama, and official model documentation. Model names and logos are trademarks of their respective owners.

© 2026 Will It Run AI — Fase Consulting Ibiza, S.L. (NIF: B57969656)



Mistral Nemo 12B

Available on HuggingFace and Ollama

Downloads: 105.8K · Likes: 1.7K · Released: Jul 2024 · Context: 128K tokens · License: Apache 2.0 · Quality: 3 (Entry)

Get started

Copy & paste to run locally:

Ollama:

    ollama run mistral-nemo

HuggingFace:

    huggingface-cli download mistralai/Mistral-Nemo-Instruct-2407

Quick specs

Parameters: 12B
Architecture: dense
Context: 128K tokens
Modality: text
Min RAM: 4.7 GB
Rec. RAM: 7.3 GB (Q4_K_M)
License: Apache 2.0
Family: Mistral
✓ Chat

About this model

The Mistral-Nemo-Instruct-2407 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-Nemo-Base-2407. Trained jointly by Mistral AI and NVIDIA, it significantly outperforms existing models of similar or smaller size.

  • Released under the Apache 2.0 license
  • Pre-trained and instruction-tuned versions available
  • Trained with a 128K context window
  • Trained on a large proportion of multilingual and code data
  • Drop-in replacement for Mistral 7B


Quick picks

  • Best budget: Intel Arc B580 12GB (grade C), ~$249, ~30 tok/s
  • Best overall: NVIDIA RTX 5080 16GB (grade B), ~$999, ~85 tok/s
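Those tok/s figures follow a common rule of thumb: single-stream decoding is memory-bandwidth-bound, because every generated token streams the full weight set from VRAM once. A minimal sketch of that estimate, where the bandwidth figures and the efficiency factor are assumptions rather than values from this page:

```python
def decode_tok_s(bandwidth_gb_s: float, model_gb: float,
                 efficiency: float = 0.6) -> float:
    """Rough decode throughput: each token reads all weights once,
    so tok/s ~ effective memory bandwidth / weight size.
    `efficiency` is an assumed kernel-efficiency factor (~0.5-0.7)."""
    return bandwidth_gb_s * efficiency / model_gb

# Assumed spec-sheet bandwidths, with the 7.3 GB Q4_K_M weights:
print(round(decode_tok_s(960, 7.3)))  # RTX 5080-class (~960 GB/s) -> 79
print(round(decode_tok_s(456, 7.3)))  # Arc B580-class (~456 GB/s) -> 37
```

Both land near the page's ~85 and ~30 tok/s picks, which is about as much accuracy as this formula can claim.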

Best hardware

Top picks for Mistral Nemo 12B (all grade B, 16 GB):

  • NVIDIA RTX 5080 Laptop 16GB
  • NVIDIA RTX 5080 16GB
  • NVIDIA RTX 4080 Super 16GB
  • NVIDIA RTX 5070 Ti 16GB
  • NVIDIA RTX 4070 Ti Super 16GB

Quantization options

VRAM estimates by quant level

No hardware detected — fit column shows raw VRAM estimates

Quant    Bits  VRAM     Quality
Q2_K     2     4.7 GB   Low
Q3_K_S   3     5.9 GB   Low
NVFP4    4     6.7 GB   Medium
Q4_K_M   4     7.3 GB   Medium
Q5_K_M   5     8.6 GB   High
Q6_K     6     9.8 GB   High
Q8_0     8     12.8 GB  Very High
F16      16    24.6 GB  Maximum


Memory breakdown

Reference: NVIDIA A10 24GB

Weights: 7.3 GB
KV Cache: 1.9 GB
Runtime: 0.9 GB
Headroom: 2.4 GB