

Browse AI Models

328 models available
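The download sizes listed on these cards track the total parameter count closely: each entry works out to roughly 0.56 GB per billion parameters (22B ≈ 12.3 GB, 70B ≈ 39.2 GB, 124B ≈ 69.4 GB), which is consistent with a roughly 4.5-bit, Q4-class quantization. The snippet below is a minimal sketch of that apparent relationship, assuming ~4.48 bits per weight and decimal gigabytes; it is inferred from the listing, not the site's actual sizing formula.

# Minimal sketch (assumption, not the site's formula): the listed download
# sizes match total parameters stored at ~4.48 bits each (Q4-class
# quantization), reported in decimal gigabytes.
BITS_PER_PARAM = 4.48  # assumed average bits per weight

def estimated_size_gb(total_params_billion: float) -> float:
    """Approximate on-disk size in GB from the total parameter count."""
    return total_params_billion * (BITS_PER_PARAM / 8)  # ~0.56 GB per billion params

# Examples drawn from this page:
print(round(estimated_size_gb(22), 1))    # ~12.3 GB (Codestral 2 25.08)
print(round(estimated_size_gb(70), 1))    # ~39.2 GB (Llama 3.1 70B)
print(round(estimated_size_gb(236), 1))   # ~132.2 GB (DeepSeek Coder V2, total params)

Note that for the Mixture-of-Experts entries the total parameter count, not the active count, drives the download size: 236B total ≈ 132.2 GB even though only 21B parameters are active per token.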

Mistral · Codestral 2 25.08
22B · 256K ctx · 12.3 GB · frontier
dense · Legacy

Codestral 2 is Mistral AI's latest code-focused model with enhanced performance on code generation, refactoring, and documentation across dozens of programming languages.

Mistral · Devstral Small 1.1
24B · 131K ctx · 13.4 GB · current
dense · Legacy

Devstral is an agentic LLM for software engineering tasks, built in collaboration between Mistral AI and All Hands AI 🙌. Devstral excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench, which positions it as the #1 open-source model on this benchmark.

Meta · Llama 3.1 70B
70B · 128K ctx · 39.2 GB · legacy
dense · Legacy

Llama 3.1 70B is Meta's high-capability open model with 128K context window. Excels at complex reasoning, multilingual tasks, code generation, and tool use with quality competitive with leading proprietary models.

NVIDIA · Nemotron 70B
70B · 131K ctx · 39.2 GB · current
dense · Legacy

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA to improve the helpfulness of LLM generated responses to user queries.

Mistral AI · Pixtral Large 124B
124B · 131K ctx · 69.4 GB · frontier
dense · Legacy

Pixtral-Large-Instruct-2411 is a 124B multimodal model built on top of Mistral Large 2 (Mistral-Large-Instruct-2407). Pixtral Large is the second model in our multimodal family and demonstrates frontier-level image understanding. In particular, the model is able to understand documents, charts, and natural images, while maintaining the leading text-only understanding of Mistral Large 2.

Alibaba · Qwen 2.5 Math 72B
72B · 4K ctx · 40.3 GB · frontier
dense · Legacy

🚨 Warning: Qwen2.5-Math mainly supports solving English and Chinese math problems through CoT and TIR. We do not recommend using this series of models for other tasks.

Unsloth · DeepSeek R1 Distill Llama 8B
8B · 0K ctx · 4.5 GB
dense · Legacy

 

Dphn · Dolphin3.0 Llama3.1 8B
8B · 0K ctx · 4.5 GB
dense · Legacy

 

MaziyarPanahi · Llama 3 8B Instruct 32k v0.1
8B · 0K ctx · 4.5 GB
dense · Legacy

 

Unsloth · DeepSeek R1 Distill Qwen 1.5B
1.5B · 0K ctx · 0.8 GB
dense · Legacy

 

MaziyarPanahi · Meta Llama 3.1 8B Instruct
8B · 0K ctx · 4.5 GB
dense · Legacy

 

Cohere · Command R+ 104B
104B · 131K ctx · 58.2 GB · current
dense · Legacy

Command R+ is Cohere's most capable open-weight model for enterprise RAG workloads. Offers superior long-context reasoning, multi-step tool use, and grounded generation with citations across 10 languages.

DeepSeek · DeepSeek Coder V2 236B
236B (21B active) · 131K ctx · 132.2 GB · current
moe · Legacy

We present DeepSeek-Coder-V2, an open-source Mixture-of-Experts (MoE) code language model that achieves performance comparable to GPT-4 Turbo in code-specific tasks. Specifically, DeepSeek-Coder-V2 is further pre-trained from an intermediate checkpoint of DeepSeek-V2 with an additional 6 trillion tokens. Through this continued pre-training, DeepSeek-Coder-V2 substantially enhances the coding and mathematical reasoning capabilities of DeepSeek-V2 while maintaining comparable performance in general language tasks.

DeepSeek · DeepSeek R1 Distill 32B
32B · 33K ctx · 17.9 GB · frontier
dense · Legacy

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing.

Meta · Llama 4 Scout 17B 16E
109B (17B active) · 10.5M ctx · 61 GB · frontier
moe · Legacy

Llama 4 Scout is Meta's efficient Mixture-of-Experts model with 17B active parameters across 16 experts. Supports a 10M token context window and natively handles text, images, and video inputs.

Alibaba · Qwen 3 30B A3B
30.5B (3.3B active) · 131K ctx · 17.1 GB · frontier
moe · Legacy

We introduce the updated version of the Qwen3-30B-A3B non-thinking mode, named Qwen3-30B-A3B-Instruct-2507, featuring a number of key enhancements.

BigCode · StarCoder 15B
15B · 8K ctx · 8.4 GB · legacy
dense · Legacy

StarCoder 15B is BigCode's flagship code generation model trained on 1 trillion tokens from The Stack. Supports 80+ programming languages with 8K context and strong code completion capabilities.

Lmg-anon · vntl llama3 8b v2
8B · 0K ctx · 4.5 GB
dense · Legacy

 

Unsloth · Mistral Small 3.2 24B Instruct 2506
24B · 0K ctx · 13.4 GB
dense · Legacy

 

Lmstudio-community · DeepSeek R1 0528 Qwen3 8B
8B · 0K ctx · 4.5 GB
dense · Legacy

 

Unsloth · DeepSeek R1 Distill Qwen 14B
14B · 0K ctx · 7.8 GB
dense · Legacy

 

MaziyarPanahi · gemma 3 4b it
4B · 0K ctx · 2.2 GB
dense · Legacy

 

MaziyarPanahi · Llama 3.3 70B Instruct
70B · 0K ctx · 39.2 GB
dense · Legacy

 

DeepSeek · DeepSeek V2.5 236B
236B (21B active) · 131K ctx · 132.2 GB · current
moe · Legacy

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions. For model details, please visit the DeepSeek-V2 page.

Page 3 of 14