Browse AI Models

283 models available

MaziyarPanahi Meta Llama 3.1 8B Instruct

8B | 0K ctx | 4.5 GB

dense | Legacy

Cohere Command R+ 104B

104B | 131K ctx | 58.2 GB | current

dense | Legacy

Command R+ is Cohere's most capable open-weight model for enterprise RAG workloads. It offers superior long-context reasoning, multi-step tool use, and grounded generation with citations across 10 languages.

DeepSeek DeepSeek R1 Distill 32B

32B | 33K ctx | 17.9 GB | frontier

dense | Legacy

We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning. With RL, DeepSeek-R1-Zero naturally emerged with numerous powerful and interesting reasoning behaviors. However, DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing.

Meta Llama 4 Scout 17B 16E

109B (17B active) | 10.5M ctx | 61 GB | frontier

moe | Legacy

Llama 4 Scout is Meta's efficient Mixture-of-Experts model with 17B active parameters across 16 experts. It supports a 10M-token context window and natively handles text, image, and video inputs.

Alibaba Qwen 3 30B A3B

30.5B (3.3B active) | 131K ctx | 17.1 GB | frontier

moe | Legacy

We introduce Qwen3-30B-A3B-Instruct-2507, the updated version of the Qwen3-30B-A3B non-thinking mode.

Lmg-anon vntl llama3 8b v2

8B | 0K ctx | 4.5 GB

dense | Legacy

Unsloth Mistral Small 3.2 24B Instruct 2506

24B | 0K ctx | 13.4 GB

dense | Legacy

Lmstudio-community DeepSeek R1 0528 Qwen3 8B

8B | 0K ctx | 4.5 GB

dense | Legacy

MaziyarPanahi gemma 3 4b it

4B | 0K ctx | 2.2 GB

dense | Legacy

MaziyarPanahi Llama 3.3 70B Instruct

70B | 0K ctx | 39.2 GB

dense | Legacy

DeepSeek DeepSeek V2.5 236B

236B (21B active) | 131K ctx | 132.2 GB | current

moe | Legacy

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct, integrating the general and coding abilities of the two previous versions. For model details, see the DeepSeek-V2 page.

Microsoft Phi-4-reasoning-plus 14B

14.7B | 33K ctx | 8.2 GB | frontier

dense | Legacy

Important: To take full advantage of the model's capabilities, inference must use `temperature=0.8`, `top_k=50`, `top_p=0.95`, and `do_sample=True`. For more complex queries, set `max_new_tokens=32768` to allow for longer chain-of-thought (CoT).
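
The sampling settings quoted above can be collected into a single keyword dictionary. The sketch below is only illustrative: the helper name `phi4_generation_kwargs` and its `complex_query` flag are hypothetical, and it assumes a runtime that accepts Hugging Face `generate`-style keyword names (`do_sample`, `temperature`, `top_k`, `top_p`, `max_new_tokens`).

```python
# Illustrative sketch only: the helper name and the complex_query flag are
# hypothetical. The values are the ones quoted in the model description.
def phi4_generation_kwargs(complex_query: bool = False) -> dict:
    """Return the recommended sampling settings as keyword arguments."""
    kwargs = {
        "do_sample": True,    # sampling must be enabled for the other settings to apply
        "temperature": 0.8,
        "top_k": 50,
        "top_p": 0.95,
    }
    if complex_query:
        # Allow a longer chain-of-thought for more complex queries.
        kwargs["max_new_tokens"] = 32768
    return kwargs


if __name__ == "__main__":
    print(phi4_generation_kwargs(complex_query=True))
```

With a loaded model, the result would typically be unpacked into the generation call, for example `model.generate(**inputs, **phi4_generation_kwargs())`, assuming the runtime follows that interface.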

Alibaba Qwen 3 32B

32B | 131K ctx | 17.9 GB | frontier

dense | Legacy

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers advancements in reasoning, instruction-following, agent capabilities, and multilingual support.

MaziyarPanahi Yi Coder 1.5B Chat

1.5B | 0K ctx | 0.8 GB

dense | Legacy

MaziyarPanahi gemma 3 12b it

12B | 0K ctx | 6.7 GB

dense | Legacy

MaziyarPanahi Llama 3.2 1B Instruct

1B | 0K ctx | 0.6 GB

dense | Legacy

MaziyarPanahi gemma 2 2b it

2B | 0K ctx | 1.1 GB

dense | Legacy

MaziyarPanahi Llama 3.2 3B Instruct

3B | 0K ctx | 1.7 GB

dense | Legacy

MaziyarPanahi gemma 3 1b it

1B | 0K ctx | 0.6 GB

dense | Legacy

Bartowski cognitivecomputations Dolphin Mistral 24B Venice Edition

24B | 0K ctx | 13.4 GB

dense | Legacy

MaziyarPanahi DeepSeek R1 0528 Qwen3 8B

8B | 0K ctx | 4.5 GB

dense | Legacy

MaziyarPanahi Mistral Small 24B Instruct 2501

24B | 0K ctx | 13.4 GB

dense | Legacy

MaziyarPanahi Yi 1.5 6B Chat

6B | 0K ctx | 3.4 GB

dense | Legacy

Mistralai Ministral 3 3B Instruct 2512

3B | 0K ctx | 1.7 GB

dense | Legacy
