21 models available
Devstral is an agentic LLM for software engineering tasks. Devstral 2 excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. The model achieves remarkable performance on SWE-bench.
Kimi K2.5 is Moonshot AI's advanced reasoning model with strong performance in math, coding, and multilingual tasks. Features long-context understanding and agentic capabilities for complex multi-step problem solving.
Mistral-Large-Instruct-2411 is an advanced dense large language model (LLM) with 123B parameters and state-of-the-art reasoning, knowledge, and coding capabilities. It extends Mistral-Large-Instruct-2407 with improved long-context handling, function calling, and system-prompt support; see the usage sketch after this list.
Mistral Small 4 is a powerful hybrid model capable of acting as both a general instruction model and a reasoning model. It unifies the capabilities of three different model families, namely Instruct, Reasoning (previously called Magistral), and Devstral, into a single model.
Meet Qwen3-VL — the most powerful vision-language model in the Qwen series to date.
Llama 4 Maverick is Meta's large MoE model with 17B active parameters and 128 experts (400B total). Delivers frontier-class performance on reasoning and coding while remaining deployable on a single node.
Qwen2.5-VL-72B-Instruct is a multimodal image-text-to-text model released under the Qwen license (https://huggingface.co/Qwen/Qwen2.5-VL-72B-Instruct/blob/main/LICENSE).
Pixtral-Large-Instruct-2411 is a 124B multimodal model built on top of Mistral Large 2 (Mistral-Large-Instruct-2407). Pixtral Large is the second model in our multimodal family and demonstrates frontier-level image understanding. In particular, the model can understand documents, charts, and natural images while maintaining the leading text-only understanding of Mistral Large 2.
Llama 4 Scout is Meta's efficient Mixture-of-Experts model with 17B active parameters across 16 experts. Supports a 10M token context window and natively handles text, images, and video inputs.
Llama 3.2 11B Vision is Meta's multimodal model that processes both text and images. Supports visual question answering, image captioning, and document understanding alongside standard text generation.
Mistral-Small-3.2-24B-Instruct-2506 is a minor update of Mistral-Small-3.1-24B-Instruct-2503.
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart, making it a powerful and efficient language model with vision capabilities.
We are excited to announce the release of InternVL 2.0, the latest addition to the InternVL series of multimodal large language models. InternVL 2.0 features a variety of instruction-tuned models, ranging from 1 billion to 108 billion parameters. This repository contains the instruction-tuned InternVL2-8B model.
Pixtral-12B-2409 is a multimodal model with 12B parameters plus a 400M-parameter vision encoder.
MiniCPM-V 2.6 is OpenBMB's compact multimodal model supporting image and video understanding alongside text. Delivers strong visual reasoning and OCR capabilities at 8B parameter scale.
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful and efficient tiny language model with vision capabilities.
Model type: LLaVA is an open-source chatbot trained by fine-tuning an LLM on multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture. Base LLM: mistralai/Mistral-7B-Instruct-v0.2
An open multimodal image-text-to-text model released under the Apache 2.0 license.
Model type: LLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. It is an auto-regressive language model based on the transformer architecture.
The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful and efficient tiny language model with vision capabilities.
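Most entries above are chat models that accept optional image input, and several (such as Devstral and Mistral-Large-Instruct-2411) support tool use. As a minimal sketch, assuming these models are served through an OpenAI-compatible chat-completions endpoint (the base URL, API key variable, model ID, image URL, and tool name below are all hypothetical placeholders, not confirmed by this listing), a multimodal request with one function-calling tool might look like this:

```python
# Minimal sketch: querying a catalog model through an OpenAI-compatible
# chat-completions API. base_url, the API key env var, the model ID, and
# the tool are hypothetical; substitute the values your deployment uses.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://example.com/v1",      # hypothetical endpoint
    api_key=os.environ["EXAMPLE_API_KEY"],  # hypothetical env var
)

response = client.chat.completions.create(
    model="pixtral-large-instruct-2411",    # any multimodal model ID from the list
    # Vision-language entries (Pixtral, Qwen-VL, LLaVA, ...) accept image
    # URLs alongside text within a single user message.
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize the chart in one sentence."},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
            ],
        }
    ],
    # Function calling: tool-capable models may respond with a tool call
    # instead of plain text; the schema below is a hypothetical example.
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_cell_value",
                "description": "Read one value out of the chart's source table.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "row": {"type": "string"},
                        "column": {"type": "string"},
                    },
                    "required": ["row", "column"],
                },
            },
        }
    ],
)

choice = response.choices[0].message
if choice.tool_calls:  # the model chose to call the tool
    print(choice.tool_calls[0].function.name, choice.tool_calls[0].function.arguments)
else:
    print(choice.content)
```

The same request shape works for the text-only entries: drop the image part of the message, and omit `tools` for models without function calling.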