Llama 3.1 70B

Name: Llama 3.1 70B
Author: Meta

Legacy

Downloads: 890.6K · Likes: 897 · Released: Jul 2024 · Context: 128K tokens · License: Community · Entry quality: 5

Get started

Copy and paste to run locally:

Ollama

ollama run llama3.1:70b

HuggingFace

huggingface-cli download meta-llama/Llama-3.1-70B-Instruct
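Once the model is pulled, a local Ollama server can also be queried over HTTP. The following is a minimal sketch, not an official client: it assumes the server is listening on Ollama's default port 11434 and that the model is available under the tag `llama3.1:70b` (check `ollama list` for the tag on your machine). The helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint (assumed)
MODEL_TAG = "llama3.1:70b"  # assumed tag; use whatever `ollama list` reports


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG, timeout: float = 600.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Summarize the Llama 3.1 license in one sentence.")` returns the model's reply as a string. The long default timeout is a guess that allows for slow generation on hardware near the memory limit.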

Quick specs

Parameters: 70B
Architecture: dense
Context: 128K tokens
Modality: text
Min RAM: 27.3 GB
Rec. RAM: 42.7 GB (Q4_K_M)
License: Community
Family: Llama

✓ Chat ✓ Reasoning

About this model

Llama 3.1 70B is Meta's high-capability open model with a 128K-token context window. It is aimed at complex reasoning, multilingual tasks, code generation, and tool use, with quality described as competitive with leading proprietary models.

Quick picks

Best budget (C): MacBook Pro M3 Max 128GB, ~$2,499, 6 tok/s

Best overall (B): NVIDIA H100 80GB, ~$40,000, 66 tok/s

Quantization options

VRAM estimates by quant level

Quant	Bits	VRAM	Quality
Q2_K	2	27.3 GB	Low
Q3_K_S	3	34.3 GB	Low
NVFP4	4	39.2 GB	Medium
Q4_K_M	4	42.7 GB	Medium
Q5_K_M	5	50.4 GB	High
Q6_K	6	57.4 GB	High
Q8_0	8	74.9 GB	Very High
F16	16	143.5 GB	Maximum
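The VRAM figures above can be turned into an implied bits-per-weight value, which shows how far each quant sits from its nominal bit width, and into a simple "largest quant that fits" lookup. This is a rough sketch under stated assumptions: exactly 70 billion parameters, 1 GB = 10^9 bytes, and table figures that cover weights only (the Q4_K_M figure matches the "Weights" entry in the memory breakdown). The function names are hypothetical.

```python
from typing import Optional

# VRAM estimates copied from the table above (GB).
QUANT_VRAM_GB = {
    "Q2_K": 27.3, "Q3_K_S": 34.3, "NVFP4": 39.2, "Q4_K_M": 42.7,
    "Q5_K_M": 50.4, "Q6_K": 57.4, "Q8_0": 74.9, "F16": 143.5,
}
PARAMS = 70e9  # assumption: exactly 70 billion parameters


def implied_bits_per_weight(vram_gb: float, params: float = PARAMS) -> float:
    """Bits per parameter implied by a VRAM figure, assuming 1 GB = 1e9 bytes."""
    return vram_gb * 1e9 * 8 / params


def largest_quant_that_fits(budget_gb: float) -> Optional[str]:
    """Return the largest-footprint quant whose estimate fits the budget, else None."""
    fitting = {q: v for q, v in QUANT_VRAM_GB.items() if v <= budget_gb}
    return max(fitting, key=fitting.get) if fitting else None


for quant, gb in QUANT_VRAM_GB.items():
    print(f"{quant:8s} {gb:6.1f} GB  ~{implied_bits_per_weight(gb):.2f} bits/weight")
```

Under these assumptions Q4_K_M works out to about 4.88 bits per weight rather than a flat 4, and `largest_quant_that_fits(48)` returns `"Q4_K_M"`. The lookup ignores KV cache and runtime overhead, so treat its answer as an upper bound on what will actually run.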

Quality benchmarks

Llama 3.1 70B benchmark scores

Benchmark verified

Coding

SWE-bench Verified: no score listed
HumanEval+: 80.5%
Aider Polyglot: no score listed
LiveCodeBench: no score listed

Reasoning

MMLU-Pro: 66.4%
GPQA Diamond: 46.7%
MATH-500: 68.0%
ARC Challenge: 94.8%

General

Chatbot Arena: no score listed
IFEval: 87.5%

Source: official · 2024-07-23

Hardware compatibility

Memory breakdown

Reference: NVIDIA A10 24GB

Weights: 42.7 GB
KV cache: 10.9 GB
Runtime: 0.9 GB
Headroom: 2.4 GB
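The breakdown can be checked against a device's memory with simple addition. The sketch below assumes that weights, KV cache, and runtime must all sit in device memory at once, and that "Headroom" is a safety margin reserved on the device (2.4 GB is 10% of the 24 GB reference card, though the page does not say how it is derived). The `MemoryBreakdown` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MemoryBreakdown:
    weights_gb: float
    kv_cache_gb: float
    runtime_gb: float
    headroom_gb: float  # assumed to be a margin kept free on the device

    @property
    def required_gb(self) -> float:
        """Memory the model needs: weights + KV cache + runtime overhead."""
        return self.weights_gb + self.kv_cache_gb + self.runtime_gb

    def fits(self, device_gb: float) -> bool:
        """True if the requirement fits in the device after reserving headroom."""
        return self.required_gb <= device_gb - self.headroom_gb


# Figures from the memory breakdown above (Q4_K_M weights).
breakdown = MemoryBreakdown(weights_gb=42.7, kv_cache_gb=10.9,
                            runtime_gb=0.9, headroom_gb=2.4)
print(f"required: {breakdown.required_gb:.1f} GB")
print("fits 24 GB card:", breakdown.fits(24))
print("fits 80 GB card:", breakdown.fits(80))
```

With these numbers the requirement is about 54.5 GB, so the model does not fit on the 24 GB reference card under this reading, while an 80 GB card has room to spare. The check reuses the same 2.4 GB margin for every device, which is a simplification if the margin scales with device size.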
