LLaMA
Website: https://www.llama.com/
Detailed pricing plans are not available yet for this tool.
Build on your own terms
Optimized models for easy deployment, cost efficiency, and performance that scales to billions of users.

MODELS: Latest Llama models
The latest models feature native multimodality, advanced reasoning, and industry-leading context windows.

Llama 4
Native multimodality that uses early fusion to pre-train on unlabeled text and vision data, enabling a step change in intelligence from separate, frozen multimodal weights.

Llama 4 Maverick
- Natively multimodal for image and text understanding
- 10M-token context for long-form work
- For use cases around memory, personalization, and multimodal applications

Llama 4 Scout
- Natively multimodal, offering text and visual intelligence
- Offers single H100 GPU efficiency
- 10M context window
- For use cases around long document analysis

Llama 3
The open-source AI models you can fine-tune, distill, and deploy anywhere. Choose from our collection of models: Llama 3.1, Llama 3.2, Llama 3.3.

Llama 3.3
- Multilingual open source large language model, available in 70B
- Experience 405B performance and quality at a fraction of the cost
- Built for text-based use cases such as synthetic data generation

Llama 3.2
Flexible, cost-effective, and built for edge use cases.
- 1B and 3B are lightweight and cost-efficient, allowing you to run them anywhere
- 11B and 90B are flexible multimodal models that can reason on high-resolution images and output text

Llama 3.1
Open-foundation model built for flexibility and control.
- Available in 8B, 70B, and 405B sizes
- Capabilities in general knowledge, steerability, math, tool use, and multilingual translation
- Text summarization, multilingual agents, and coding use cases

Model optimization
- Prompt Engineering: Used in natural language processing to improve the performance of LLMs.
- Fine-tuning: Adapting pre-trained models to perform better for a specific use case.
- Vision capabilities: Letting the model understand and reason over images and text together.
- Quantization: Used to reduce the computational and memory requirements of models.
- Distillation: Teaching a smaller model to match a larger model's performance.
- Evaluations: Automated and manual tests to systematically measure model performance.

Llama 4 capabilities
Natively multimodal, unparalleled long context, expert image grounding.

Natively multimodal
All Llama 4 models are designed with native multimodality, leveraging early fusion that allows us to pre-train the model with large amounts of unlabeled text and vision tokens, a step change in intelligence from separate, frozen multimodal weights.
- How-to guides: Video capabilities
- Video tutorials: Build a Multimodal WhatsApp Chatbot

Llama 4 benchmark

| Task | Metric | Llama 4 Maverick | Llama 4 Scout |
| --- | --- | --- | --- |
| Reasoning | MMLU Pro | 80.5 | 74.3 |
| Reasoning | GPQA Diamond | 69.8 | 57.2 |
| Coding | LiveCodeBench | 43.4 | 32.8 |
| Multimodal (Image) | MMMU | 73.4 | 69.4 |
| Multimodal (Image) | MathVista | 73.7 | 70.7 |
| Multimodal (Image) | ChartQA | 90.0 | 88.8 |
| Multimodal (Image) | DocVQA | 94.4 | 94.4 |
| Multilingual | MMLU Multi | 84.6 | 74.3 |
| Long Context | MTOB Half Book | 54.0 / 46.4 | 42.2 / 36.6 |
| Long Context | MTOB Full Book | 50.8 / 46.7 | 39.7 / 36.3 |
| Efficiency | Cost per 1M tokens | $0.19–$0.49 | $0.19–$0.49 |

Methodology & Notes
1. For Llama model results, we report 0-shot evaluation with temperature = 0 and no majority voting or parallel test-time compute. For high-variance benchmarks (GPQA Diamond, LiveCodeBench), we average over multiple generations to reduce uncertainty.
2. Specialized long-context evals are not traditionally reported for generalist models, so we share internal runs to showcase Llama's frontier performance.
3. $0.19/Mtok (3:1 blended) is our cost estimate for Llama 4 Maverick assuming distributed inference. On a single host, we project the model can be served at $0.30 to $0.49/Mtok (3:1 blended).
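The per-million-token cost estimates in the methodology notes can be turned into a rough budget with simple arithmetic. The sketch below is only an illustration: the monthly token volume is a made-up figure, the helper names are hypothetical, and it assumes "3:1 blended" means three input tokens for every output token, which the page does not spell out.

```python
from typing import Dict, Tuple

# Blended $/Mtok figures quoted in the methodology notes for Llama 4 Maverick.
BLENDED_USD_PER_MTOK: Dict[str, float] = {
    "distributed inference": 0.19,
    "single host (low)": 0.30,
    "single host (high)": 0.49,
}


def workload_cost_usd(total_tokens: int, usd_per_mtok: float) -> float:
    """Cost of processing total_tokens at a flat blended rate per 1M tokens."""
    return total_tokens / 1_000_000 * usd_per_mtok


def split_3_to_1(total_tokens: int) -> Tuple[int, int]:
    """Split a token budget into (input, output) tokens.

    ASSUMPTION: "3:1 blended" is read as 3 input tokens per 1 output token.
    """
    input_tokens = total_tokens * 3 // 4
    return input_tokens, total_tokens - input_tokens


# Hypothetical workload size, chosen only for illustration.
MONTHLY_TOKENS = 250_000_000

input_tokens, output_tokens = split_3_to_1(MONTHLY_TOKENS)
print(f"input tokens: {input_tokens:,}  output tokens: {output_tokens:,}")

for label, rate in BLENDED_USD_PER_MTOK.items():
    cost = workload_cost_usd(MONTHLY_TOKENS, rate)
    print(f"{label}: ${cost:,.2f} per month")
```

Because the quoted figures are already blended, the total cost depends only on the total token count; the input/output split is shown just to make the assumed ratio concrete.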
Featured case studies

Tech: How Stoque is using Llama
Transforming internal intelligence with Llama. Stoque enabled teams to find insights faster, reduce friction, and work more efficiently at scale.
- 50% reduction in repetitive queries for technical support
- 30% more administrative and support tasks completed
- 11% increase in internal user satisfaction

Consumer, Business: How Shopify is using Llama
Shopify uses Llama to generate product pages, localize content, and automate support, helping developers scale workflows and save time.
- 76% higher token throughput than the previous model
- 97.7% Macro-F1 score on intent detection
- 33% compute cost savings with JSON output

More case study categories: Tech, Consumer, Business, Education, Healthcare.

SAFETY: Protections in the era of generative AI
Comprehensive system-level protections proactively identify and mitigate potential risks, empowering developers to more easily deploy generative AI responsibly.
- Protection tools accessible to everyone
- Enabling AI Defenders

Llama latest
- Getting Started with PDO (Prompt Duel Optimizer) with prompt-ops: Discover how our next-generation prompt-optimization methods unlock new performance levels in Llama applications.
- How to build your own multi-modal WhatsApp chatbot with Llama 4: Build an intelligent chatbot for your business on WhatsApp.
- Oxide AI: Finding trustworthy signals in a sea of financial data
---
DOCUMENTATION: Get started with Llama
This guide provides information and resources to help you set up Llama, including how to access the model, hosting, and how-to and integration guides. You will also find supplemental materials to assist you while building with Llama.

What's new
- Llama 4 Scout: Natively multimodal model that offers single H100 GPU efficiency and a 10M context window
- Llama 4 Maverick: Natively multimodal model for image and text understanding and fast responses at a low cost
- Llama Guard 4: Updated protection models with higher performance and support for Llama 4

Get the models
You can obtain the models directly from Meta or from one of our partners: Hugging Face, Kaggle, or our Edge partners or Cloud partners.

Llama 4 Scout
Natively multimodal model that offers single H100 GPU efficiency and a 10M context window
- Model Card: Model attributes in an easy-to-consume, standard format (view on GitHub).
- Prompt Format: How to construct effective prompts.
- Get the model: Direct or through a partner.

Llama 4 Maverick
Natively multimodal model for image and text understanding and fast responses at a low cost
- Model Card: Model attributes in an easy-to-consume, standard format (view on GitHub).
- Prompt Format: How to construct effective prompts.
- Get the model: Direct or through a partner.

Llama Guard 4
Updated protection models with higher performance and support for Llama 4
- Model Card: Model attributes in an easy-to-consume, standard format (view on GitHub).
- Prompt Format: How to construct effective prompts.
- Get the model: Direct or through a partner.

Find us on GitHub
- Llama cookbook: Notebooks and demos for learning Llama. Scripts for fine-tuning Llama 3 with single/multi-node GPUs.
- Llama Stack: Defines and standardizes the building blocks needed to bring generative AI applications to market.

Other topics in this guide
- Getting Llama models: Try out Llama models today.
- How-to guides: An overview of the processes for developing any LLM: fine-tuning, prompt engineering, and model validation.
- Integration guides: Develop solutions based on Code Llama, LangChain, and LlamaIndex.
- Community support: Compilation of resources available from the community.

---
Llama API
Offering the easiest and fastest way for companies to harness the power of Llama.

BUILD: Start building with Llama API
Llama API provides easy one-click API key creation and interactive playgrounds to explore different Llama models.
- Complete control: Break free from vendor lock-in. Maintain complete control over your AI strategy with our state-of-the-art open source models, giving you the flexibility to switch or adapt as your needs change.
- Build fast: Use the latest Llama models. With Llama API, you can effortlessly kickstart your generative AI projects, enabling rapid prototyping and testing of your next big idea.
- Innovative capabilities: Lightweight SDKs, easy integration. Access the latest Llama models and unlock novel and differentiated capabilities, allowing you to push the boundaries.
- Private and secure: Your data stays private. Our API is designed with a focus on data privacy and security.
  Meta ensures that your API inputs and outputs are not used to train our models.

Model availability: Build with the latest Llama models
Llama API offers you the opportunity to build with the latest Llama models, including Llama 4 Maverick, Scout, the previously unreleased Llama 3.3 8B, and more.

Trust and security at Meta
- No training: Your API inputs and outputs are not used to train or improve our models.
- Strict access control: We use strict role-based access controls to limit access to your data.
- Data not used for ads: Your data is not used for advertising or ad targeting.
- Separation in storage: Data processed by the Llama API is stored separately from other Meta product data.
- Encryption at rest and in transit: Your data is encrypted in transit and at rest using industry-standard security.
- Compliance & vulnerability management: We follow industry compliance standards and actively manage security vulnerabilities.

Llama API resources
Need help getting started building on Llama API?
Check out our documentation. The Llama Cookbook GitHub repo has what you need to get started, from recipes to notebooks.

---

Llama Stack
A streamlined developer experience enabling seamless AI application development. Build once, deploy anywhere.

Capabilities
Llama Stack SDKs have differentiating features and capabilities that empower developers to create AI applications with ease, from RAG to innovative applications tailored for mobile use cases.
- RAG: Easily add RAG to your apps with our mobile framework and customize tailored models with user-specific context. Available in local and remote inference modes.
  Check out more on GitHub.
- Multi-image inference: Utilize multi-image inference to build experiences that process and analyze multiple images at once. Available in remote inference mode only. Learn more on GitHub.
- Custom tool calling: The mobile framework supports unique tool calls, like creating a calendar event, all within your device's app. Available in local and remote inference modes. Learn more on GitHub.

Llama Stack: a streamlined developer experience
Build faster, deploy anywhere, and get the most out of the latest Llama models on day one.

Llama Stack provides
- Standardized APIs: Provides consistent interfaces for building and deploying AI applications.
- Flexible deployment options: Supports local development, cloud, on-premises, and mobile environments.
- Pre-built tools: Your data is not used for advertising or ad targeting.
- Scalable infrastructure: Facilitates easy scaling of AI applications.
- Strong partner network: Collaborates with various providers to offer specialized services.
- Telemetry and monitoring: Includes built-in support for tracing requests and evaluating model outputs.

Llama Stack resources
Need help getting started building on Llama Stack?
Check out our documentation. The Llama Cookbook GitHub repo has what you need to get started, from recipes to notebooks.
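
The custom tool calling capability listed under Llama Stack (for example, creating a calendar event inside the app) can be pictured as a small dispatch loop: the model emits a structured tool call, and the app routes it to a local function. The sketch below is not the Llama Stack SDK. The JSON shape, the `create_calendar_event` function, and the `TOOLS` registry are all hypothetical names chosen for illustration.

```python
import json
from typing import Any, Callable, Dict


def create_calendar_event(title: str, start: str, duration_minutes: int = 30) -> Dict[str, Any]:
    """Hypothetical stand-in for a device calendar API; it only echoes the event."""
    return {
        "status": "created",
        "title": title,
        "start": start,
        "duration_minutes": duration_minutes,
    }


# Registry mapping tool names the model may emit to local handlers (illustrative).
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "create_calendar_event": create_calendar_event,
}


def dispatch_tool_call(raw: str) -> Dict[str, Any]:
    """Parse a JSON tool call and run the matching local handler.

    ASSUMPTION: the model output is a JSON object of the form
    {"name": ..., "arguments": {...}}; real SDKs define their own schema.
    """
    call = json.loads(raw)
    name = call.get("name")
    handler = TOOLS.get(name)
    if handler is None:
        return {"status": "error", "message": f"unknown tool: {name}"}
    try:
        return handler(**call.get("arguments", {}))
    except TypeError as exc:
        # Raised when the arguments do not match the handler's signature.
        return {"status": "error", "message": str(exc)}


# Example model output, hand-written here rather than produced by a model.
model_output = (
    '{"name": "create_calendar_event", '
    '"arguments": {"title": "Design review", "start": "2025-06-03T10:00"}}'
)
result = dispatch_tool_call(model_output)
print(result)
```

Keeping the registry explicit means the app only ever runs functions it has deliberately exposed, and unknown or malformed calls come back as error results instead of exceptions.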