All estimates are approximations based on mathematical models and public specifications. Actual performance may vary. Do not make purchasing decisions based solely on these estimates.

Data sourced from Hugging Face, Ollama, and official model documentation. Model names and logos are trademarks of their respective owners.

© 2026 Will It Run AI — Fase Consulting Ibiza, S.L. (NIF: B57969656)

AMD

AMD Instinct MI325X 256GB

Instinct · Datacenter · CDNA 4 · OAM · ROCm

VRAM: 256 GB
Bandwidth: 6k GB/s
FP16 Compute: 1.3k TFLOPS
INT8 Inference: 2.6k TOPS

Specifications

Compute

FP16: 1307 TFLOPS
INT8: 2614 TOPS
Architecture: CDNA 4

Memory

VRAM: 256 GB
Bandwidth: 6000 GB/s

General

Family: Instinct
Segment: Datacenter
Interconnect: OAM
Compute Platform: ROCm

Architecture

CDNA 4

CDNA 4 powers the next-generation Instinct MI325X and MI350X accelerators. It is built on TSMC 3nm, with up to 288 GB of HBM3e memory and native FP4 support for maximum inference density.

AI Relevance

With up to 288 GB of HBM3e and FP4 support, CDNA 4 targets the highest-density AI inference deployments. It competes directly with NVIDIA's Blackwell B200 for large-scale model serving.

Process: TSMC 3nm
Platform: ROCm
Precisions: FP64, FP32, TF32, FP16, BF16, FP8, FP4, INT8

Recommendations by Workload

Agentic Coding

C

Devstral 2 123B Instruct

This model is still usable for agentic coding, but it is not the most specialized pick. It belongs to a current frontier family for local AI and fits natively with comfortable headroom. Known channels: Hugging Face, LM Studio.

Decode 58.4 tok/s · 59K ctx · llama.cpp

140.0 GB / 256.0 GB VRAM

Chat

C

Mistral Small 4 119B

This model is a direct match for chat. It belongs to a current frontier family for local AI and fits natively with comfortable headroom. Known channels: Hugging Face, LM Studio.

Decode 175.6 tok/s · 21K ctx · llama.cpp

99.9 GB / 256.0 GB VRAM

Coding

C

Devstral 2 123B Instruct

This model is a direct match for coding. It belongs to a current frontier family for local AI and fits natively with comfortable headroom. Known channels: Hugging Face, LM Studio.

Decode 58.4 tok/s · 34K ctx · llama.cpp

120.7 GB / 256.0 GB VRAM

RAG

C

Command A 111B

This model is a direct match for RAG. It belongs to a current frontier family for local AI and fits natively with comfortable headroom. Known channels: Hugging Face, Ollama, LM Studio.

Decode 64.7 tok/s · 64K ctx · llama.cpp

128.9 GB / 256.0 GB VRAM

Reasoning

B

Qwen 3 235B A22B

This model is a direct match for reasoning. It belongs to a current frontier family for local AI and fits natively with comfortable headroom. Known channels: Hugging Face, LM Studio.

Decode 81.5 tok/s · 24K ctx · llama.cpp

173.3 GB / 256.0 GB VRAM
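The "fits natively with comfortable headroom" statements above come down to comparing a model's memory requirement with the card's 256 GB. The sketch below reproduces that arithmetic using the figures listed in the recommendations. The `check_fit` helper and `FitResult` type are hypothetical names introduced for illustration, and the page does not say what threshold it uses for "comfortable", so none is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FitResult:
    fits: bool
    headroom_gb: float  # positive when the model fits, negative when it falls short
    utilization: float  # required memory as a fraction of available memory


def check_fit(required_gb: float, available_gb: float) -> FitResult:
    """Compare a model's memory requirement with the memory available (hypothetical helper)."""
    if available_gb <= 0:
        raise ValueError("available_gb must be positive")
    headroom = available_gb - required_gb
    return FitResult(
        fits=headroom >= 0,
        headroom_gb=round(headroom, 1),
        utilization=round(required_gb / available_gb, 3),
    )


VRAM_GB = 256.0  # MI325X memory as listed on this page

# Requirements copied from the workload recommendations above.
recommended = [
    ("Devstral 2 123B Instruct (agentic coding)", 140.0),
    ("Mistral Small 4 119B (chat)", 99.9),
    ("Command A 111B (RAG)", 128.9),
    ("Qwen 3 235B A22B (reasoning)", 173.3),
]

for name, required in recommended:
    result = check_fit(required, VRAM_GB)
    print(f"{name}: {result.headroom_gb} GB free, {result.utilization:.0%} used")
```

A negative `headroom_gb` gives the shortfall directly, which is the same comparison that decides whether a model fits on this card at all.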

Full Model Compatibility

Qwen 3 235B A22B
235B · 173.3 GB · 82 tok/s · 24K ctx

141B · 118.6 GB · 98 tok/s · 35K ctx

Devstral 2 123B Instruct
123B · 120.7 GB · 58 tok/s · 34K ctx

Mistral Small 4 119B
119B · 100.1 GB · 176 tok/s · 41K ctx

111B · 111.6 GB · 65 tok/s · 37K ctx

Qwen3.5 122B A10B
122B · 105.3 GB · 68 tok/s · 39K ctx

72B · 81.7 GB · 100 tok/s · 50K ctx

Qwen 2.5 VL 72B
72B · 81.7 GB · 100 tok/s · 33K ctx

70B · 80.1 GB · 103 tok/s · 51K ctx

Qwen3-Coder-Next
80B · 76.1 GB · 272 tok/s · 54K ctx

Qwen3.5 35B A3B
35B · 53.3 GB · 205 tok/s · 77K ctx

Qwen 2.5 Coder 32B
32B · 51.0 GB · 224 tok/s · 80K ctx

27B · 47.2 GB · 266 tok/s · 87K ctx

Qwen3-Coder 30B A3B Instruct
30.5B · 45.9 GB · 609 tok/s · 89K ctx

27B · 47.2 GB · 266 tok/s · 87K ctx

Qwen3-VL 30B A3B Instruct
30B · 45.6 GB · 630 tok/s · 90K ctx

Devstral Small 2 24B Instruct
24B · 44.9 GB · 299 tok/s · 91K ctx

Devstral Small 1.1
24B · 44.9 GB · 299 tok/s · 91K ctx

Codestral 2 25.08
22B · 43.4 GB · 326 tok/s · 94K ctx

9B · 33.4 GB · 798 tok/s · 123K ctx

Qwen3.5 9B Uncensored HauhauCS Aggressive
9B · 33.4 GB · 798 tok/s · 123K ctx

Meta Llama 3.1 8B Instruct
8B · 32.6 GB · 898 tok/s · 126K ctx

llava llama 3 8b v1 1
8B · 32.6 GB · 898 tok/s · 126K ctx

9B · 33.4 GB · 798 tok/s · 123K ctx

DeepSeek R1 0528 Qwen3 8B
8B · 32.6 GB · 898 tok/s · 126K ctx

Llama 2 7B Chat
7B · 31.9 GB · 1026 tok/s · 129K ctx

Meta Llama 3 8B Instruct
8B · 32.6 GB · 898 tok/s · 126K ctx

Mistral 7B Instruct v0.2
7B · 31.9 GB · 1026 tok/s · 129K ctx

Mistral 7B Instruct v0.3
7B · 31.9 GB · 1026 tok/s · 129K ctx

Llama 3.2 3B Instruct
3B · 29.5 GB · 2068 tok/s · 139K ctx

4B · 29.7 GB · 1795 tok/s · 138K ctx

2B · 28.9 GB · 2804 tok/s · 142K ctx

2B · 28.5 GB · 3590 tok/s · 144K ctx

Qwen2.5 3B Instruct
3B · 29.1 GB · 2393 tok/s · 141K ctx

4B · 29.7 GB · 1795 tok/s · 138K ctx

Llama 3.2 1B Instruct Q8 0
1B · 28.1 GB · 4599 tok/s · 146K ctx

Qwen2.5 1.5B Instruct
1.5B · 28.2 GB · 4380 tok/s · 145K ctx

Gemmasutra Mini 2B v1
2B · 28.5 GB · 3590 tok/s · 144K ctx

TinyLlama 1.1B Chat v1.0
1.1B · 28.0 GB · 4380 tok/s · 146K ctx

SmolVLM 500M Instruct
0.5B · 27.7 GB · 4599 tok/s · 148K ctx

embeddinggemma 300M
0.3B · 27.5 GB · 4599 tok/s · 149K ctx

Llama 4 Maverick 17B 128E
400B · 273.2 GB · 51 tok/s · 15K ctx

DeepSeek R1 671B
671B · 441.6 GB · 31 tok/s · 9K ctx

744B · 486.6 GB · 28 tok/s · 8K ctx

1000B · 641.5 GB · 22 tok/s · 6K ctx

Mistral Large 3
675B · 444.7 GB · 31 tok/s · 9K ctx

Qwen3-Coder 480B A35B Instruct
480B · 324.8 GB · 42 tok/s · 13K ctx

DeepSeek V3 671B
671B · 441.6 GB · 31 tok/s · 9K ctx

Qwen3.5 397B A17B
397B · 330.7 GB · 18 tok/s · 12K ctx
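In the list above, mixture-of-experts models with few active parameters (for example the A3B entries) decode much faster than dense models of similar total size. One common way to reason about that pattern is a memory-bandwidth ceiling: each generated token requires reading the active weights once, so throughput cannot exceed bandwidth divided by bytes read per token. The sketch below illustrates that idea only. It is not the formula this page uses, the function name is hypothetical, and the 4-bit weight size is an assumption, so the printed values are ceilings and are not expected to match the listed tok/s figures.

```python
def decode_ceiling_tok_s(
    bandwidth_gb_s: float, active_params_b: float, bytes_per_param: float
) -> float:
    """Upper bound on decode tokens/s if every active weight is read once per token."""
    if min(bandwidth_gb_s, active_params_b, bytes_per_param) <= 0:
        raise ValueError("all inputs must be positive")
    # Billions of parameters times bytes per parameter gives GB read per token.
    gb_read_per_token = active_params_b * bytes_per_param
    return bandwidth_gb_s / gb_read_per_token


BANDWIDTH_GB_S = 6000.0  # MI325X bandwidth as listed on this page
BYTES_PER_PARAM = 0.5    # assumption: roughly 4-bit quantized weights

# A dense 32B model reads all 32B weights per token; an "A3B" MoE reads about 3B.
dense_ceiling = decode_ceiling_tok_s(BANDWIDTH_GB_S, 32.0, BYTES_PER_PARAM)
moe_ceiling = decode_ceiling_tok_s(BANDWIDTH_GB_S, 3.0, BYTES_PER_PARAM)

print(f"dense 32B ceiling: {dense_ceiling:.0f} tok/s")
print(f"MoE with 3B active ceiling: {moe_ceiling:.0f} tok/s")
```

Real throughput falls below this ceiling because of KV-cache reads, kernel overheads, and imperfect bandwidth utilization, which is consistent with the caveat that all estimates on this page are approximations.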

Just out of reach

Models you could run with an upgrade

High-quality models that need a bit more memory

DeepSeek R1 671B
671B · Tier 5 · Needs ~447.4 GB

744B · Tier 5 · Needs ~492.8 GB

1000B · Tier 5 · Needs ~646.5 GB

Mistral Large 3
675B · Tier 5 · Needs ~451.1 GB

Qwen3-Coder 480B A35B Instruct
480B · Tier 5 · Needs ~330.2 GB

Upgrade paths

Upgrade from AMD Instinct MI325X 256GB

See what you unlock with more powerful hardware

Upgrade options

AMD Instinct MI350X 288GB (Next step up)

288 GB VRAM (+32) · 8000 GB/s (+2000)

Unlocks Qwen3-Coder 480B A35B Instruct · +33% faster avg

~$8,000 MSRP
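The "+33% faster avg" figure matches the ratio of the two bandwidth numbers listed for these cards. The sketch below shows that arithmetic under the assumption that decode speed scales linearly with memory bandwidth. The page does not state how it derives the figure, and `bandwidth_speedup` is a hypothetical helper name.

```python
def bandwidth_speedup(current_gb_s: float, upgraded_gb_s: float) -> float:
    """Fractional speedup, assuming throughput scales linearly with memory bandwidth."""
    if current_gb_s <= 0 or upgraded_gb_s <= 0:
        raise ValueError("bandwidth values must be positive")
    return upgraded_gb_s / current_gb_s - 1.0


MI325X_GB_S = 6000.0  # from the specifications on this page
MI350X_GB_S = 8000.0  # from the upgrade option on this page

speedup = bandwidth_speedup(MI325X_GB_S, MI350X_GB_S)
print(f"+{speedup:.0%} estimated decode speedup")
```

Under that assumption the 2000 GB/s increase works out to one third more throughput, which rounds to the 33% shown.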
