10 models available
Llama 3.3 70B is Meta's most capable single-GPU-class model, offering improved reasoning and instruction following over Llama 3.1 70B. Supports 128K context with enhanced multilingual and code capabilities.
Llama 4 Maverick is Meta's large MoE model with 17B active parameters and 128 experts (400B total). Delivers frontier-class performance on reasoning and coding while remaining deployable on a single node.
Llama 3.1 70B is Meta's high-capability open model with 128K context window. Excels at complex reasoning, multilingual tasks, code generation, and tool use with quality competitive with leading proprietary models.
Llama 4 Scout is Meta's efficient Mixture-of-Experts model with 17B active parameters across 16 experts. Supports a 10M token context window and natively handles text and image inputs.
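The context windows quoted in this list (128K for the Llama 3.x models, 10M for Llama 4 Scout) are measured in tokens, not characters. A minimal sketch for rough capacity planning, under stated assumptions: the ~4 characters-per-token ratio is a common rule of thumb for English text (not an exact tokenizer count), and the model keys and output reserve below are illustrative, not part of the catalog.

```python
# Rough sketch: estimate whether a document fits a model's context window.
# ASSUMPTION: ~4 characters per token is a rule of thumb for English text;
# use the model's actual tokenizer for precise budgeting.

CONTEXT_WINDOWS = {              # token limits taken from the catalog above
    "llama-3.1-70b": 128_000,    # 128K context
    "llama-4-scout": 10_000_000, # 10M context
}

def estimated_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude token estimate from character count."""
    return max(1, round(len(text) / chars_per_token))

def fits(model: str, text: str, reserve_for_output: int = 1024) -> bool:
    """True if the estimated prompt plus reserved output tokens fit the window."""
    return estimated_tokens(text) + reserve_for_output <= CONTEXT_WINDOWS[model]
```

For example, a ~600,000-character document (~150K estimated tokens) would overflow a 128K window but fit comfortably within Scout's 10M-token window.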
Llama 3.2 11B Vision is Meta's multimodal model that processes both text and images. Supports visual question answering, image captioning, and document understanding alongside standard text generation.
Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 34 billion parameters. This is the 13B instruct-tuned version in the Hugging Face Transformers format. This model is designed for general code synthesis and understanding.
Code Llama is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 34 billion parameters. This is the 7B instruct-tuned version in the Hugging Face Transformers format. This model is designed for general code synthesis and understanding.
Llama 3.1 8B is Meta's efficient general-purpose model supporting 128K context and multilingual text generation. Optimized for dialogue, summarization, reasoning, and code generation tasks.
Llama 3.2 3B is Meta's compact multilingual text model optimized for edge and mobile deployment. Supports summarization, instruction following, and text generation with strong performance for its size class.
Llama 3.2 1B is Meta's smallest text model designed for on-device inference. Optimized for multilingual text generation, summarization, and instruction following on resource-constrained hardware.