modal.com (AI tool)

Modal

Site: https://modal.com/

AI API Development, AI Platform

Pricing plans

No detailed pricing plans are available for this tool yet.

Detailed overview

AI infrastructure that developers love

Run inference, training, and batch processing with sub-second cold starts, instant autoscaling, and a developer experience that feels local.

Why Modal

Designed to help AI teams deploy faster.

Programmable infra: Define everything in code, no YAML or config files. Keep environment and hardware requirements in sync.
Built for performance: Launch and scale containers in seconds to keep feedback loops tight and latency low.
Elastic GPU scaling: Elastic GPU capacity and access to thousands of GPUs across clouds. No quotas or reservations. Scale back to zero when not in use.
Unified observability: Integrated logging and full visibility into every function, container, and workload.

Products: Powering any ML workload

Inference: Deploy and scale inference for LLMs, audio, image/video generation.
Training: Fine-tune open-source models on single or multi-node clusters instantly.
Sandboxes: Programmatically scale secure, ephemeral environments for running untrusted code.
Batch: Scale to thousands of containers for batch workloads on-demand.
Notebooks: Collaborate on code and data in real-time with shareable notebooks.

Platform: Build on a powerful foundation

From filesystem to runtime, every layer of Modal’s platform is engineered to give you the tools to build robust, scalable data applications.

AI-native runtime: Engineered from the ground up for heavy AI workloads, built for super-fast autoscaling and model initialization and 100x faster than Docker.
Built-in storage layer: A globally distributed storage system built for high throughput and low latency. Designed for fast model loading, training data, or other datasets.
First-party integrations: Mount your existing cloud buckets, connect to MLOps tools, and send data to your existing telemetry vendors.
Multi-cloud capacity pool: Deep multi-cloud capacity with intelligent scheduling ensures you always have the CPUs and GPUs you need without managing input orchestration.

Built with Modal

Example categories: Audio Transcription, LLM Inference, Coding Agents, Computational Biology, Image and Video Inference

Transcribe speech in batches with Whisper: Turn audio bytes into text at scale
Voice chat with LLMs: Build an interactive voice chat app
Transcribe speech with Kyutai STT: Stream transcripts at the speed of speech
Make music: Turn prompts into music with ACE-Step
Fine-tune Whisper on domain vocab: Improve Whisper transcription accuracy on specialized vocabularies with fine-tuning
Deploy a TTS API with Chatterbox: Serve text-to-speech with Chatterbox to generate natural audio from text

Security and governance

Team controls
Battle-tested isolation
SOC2 & HIPAA
Data residency controls

“We use Modal to run edge inference with <10ms overhead and batch jobs at large scale. Our team loves the platform for the power and flexibility it gives us.”
Brian Ichter, Co-founder

“Modal makes it easy to write code that runs on 100s of GPUs in parallel, transcribing podcasts in a fraction of the time.”
Mike Cohen, Head of Data

“Everyone here loves Modal because it helps us move so much faster. We rely on it to handle massive spikes in volume for evals, RL environments, and MCP servers. Whenever a team asks about compute, we tell them to use Modal.”
Aakash Sabharwal, VP of Engineering

“We've previously managed to break services like GitHub because of our load, so Modal handling our massive scale so smoothly means a lot. We trust Modal to keep up with our growth, and we're excited to build together in the long term.”
Anton Osika, CEO & Founder

Join Modal's developer community

Modal Community Slack

Igor Kotua (Engineer, The Linux Foundation): If you building AI stuff with Python and haven't tried @modal you are missing out big time

Caleb (ML Engineer, Hugging Face): Bullish on @modal - Great Docs + Examples - Healthy Free Plan (30$ free compute / month) - Never have to worry about infra / just Python

Daniel Rothenberg (Co-founder, Brightband): @modal continues to be magical... 10 minutes of effort and the `joblib`-based parallelism I use to test on my local machine can trivially scale out on the cloud. Makes life so easy!

@mattzcarey.com on blsky (AI Engineer, StackOne): @modal has got a bunch of stuff just worked out this should be how you deploy python apps. wow

Erin Boyle (ML Engineer, Tesla): This tool is awesome. So empowering to have your infra needs met with just a couple decorators. Good people, too!

Aman Kishore (Research Engineer, Harvey): If you are still using AWS Lambda instead of @modal you're not moving fast enough

Jai Chopra (Product, LanceDB): Recently built an app on Lambda and just started to use @modal, the difference is insane! Modal is amazing, virtually no cold start time, onboarding experience is great 🚀

Izzy Miller (DevRel, Hex): special shout out to @modal and @_hex_tech for providing the crucial infrastructure to run this! Modal is the coolest tool I’ve tried in a really long time, cannot say enough good things.

Diego Fernandes (Co-founder & CTO, RocketSeat): Probably one of the best piece of software I'm using this year: modal.com

Mark Tenenholtz (Head of AI, PredeloHQ): I use @modal because it brings me joy. There isn't much more to it.
Adam Azzam (Product, Prefect): feels weird at this point to use anything else than @modal for this, absolutely the GOAT of dynamic sandboxes

Mark Neumann (Head of ML, Orbital Materials): Used @modal for the first time today - immediate "oh, this is how backends should work" moment, similar to using Vercel for the first time for frontend deployments.

Rémi 📎 (Co-founder & CEO, .txt): Nothing beats @modal when it comes to deploying a quick POC

Moin Nadeem (Co-founder, Phonic): I've realized @modal is actually a great fit for ML training pipelines. If you're running model-based evals, why not just call a serverless Modal function and have it evaluate your model on a separate worker GPU? This makes evaluation during training really easy.

Matt Holden (Founder): Late to the party, but finally playing with @modal to run some backend jobs. DX is sooo nice (compared to Docker, Cloud Run, Lambda, etc). Just decorate a Python function and deploy. And it's fast! Love it.

Ship your first app in minutes.
Get started with $30 / month free compute.

© Modal 2026

Products: Modal Inference, Modal Sandboxes, Modal Training, Modal Notebooks, Modal Batch, Modal Core Platform
Resources: Documentation, Pricing, Slack Community, Articles, GPU Glossary, LLM Engine Advisor, Model Library
Popular Examples: Serve your own LLM API, Create custom art of your pet, Analyze Parquet files from S3 with DuckDB, Run hundreds of LoRAs from one app, Finetune an LLM to replace your CEO
Company: About, Blog, Careers, Events, Privacy Policy, Security & Privacy, Terms

---

Pricing as magical as our product

With Modal, you always pay for what you use and nothing more. You never pay for idle resources; you pay only for actual compute time, by the CPU cycle.

Compute costs (per second)

GPU Tasks
Nvidia B200: $0.001736 / sec
Nvidia H200: $0.001261 / sec
Nvidia H100: $0.001097 / sec
Nvidia RTX PRO 6000: $0.000842 / sec
Nvidia A100, 80 GB: $0.000694 / sec
Nvidia A100, 40 GB: $0.000583 / sec
Nvidia L40S: $0.000542 / sec
Nvidia A10: $0.000306 / sec
Nvidia L4: $0.000222 / sec
Nvidia T4: $0.000164 / sec

CPU
Physical core (2 vCPU equivalent): $0.0000131 / core / sec (minimum of 0.125 cores per container)

Memory
$0.00000222 / GiB / sec

Pricing plans

Starter: $0 + compute / month
Built for small teams and independent developers looking to level up. Get started with $30 / month free credit.
$30 / month free credits
3 workspace seats included
100 containers + 10 GPU concurrency
Crons and web endpoints (limited)
Real-time metrics and logs
Region selection

Team: $250 + compute / month
Built for startups and larger organizations looking to scale quickly.
$100 / month free credits
Unlimited seats
1000 containers + 50 GPU concurrency
Unlimited crons and web endpoints
Custom domains
Static IP proxy
Deployment rollbacks

Enterprise: Custom
For organizations prioritizing security, support, and everlasting confidence.
Volume-based discounts
Unlimited seats
Higher GPU concurrency
Embedded ML engineering services
Support via private Slack
Audit logs, Okta SSO, and HIPAA

Credit grants for startups: Early-stage startups can get up to $25k free compute credits on Modal.

Credit grants for academics: Graduate students, labs, and researchers can get up to $10k free compute credits on Modal.

Use committed spend on Modal: Transact through the AWS and GCP marketplace to use committed spend on Modal.

Modal Sandbox + Notebooks Pricing

Only pay for what you use, by the CPU cycle. For GPU Sandboxes or Notebooks, refer to our standard GPU prices.

Compute costs (per second)

CPU
Physical core (2 vCPU equivalent): $0.00003942 / core / sec (minimum of 0.125 cores per container)

Memory
$0.00000672 / GiB / sec

Why serverless?

Serverless pricing vs. traditional cloud pricing

Modal is serverless, which means that we instantly autoscale up and down for you based on request volume. For spiky or unpredictable workloads, we are more cost-effective than fixed on-demand/reserved compute.
Traditional cloud: $5,400 (75 GPUs * 24 hrs * $3 / GPU-hr)
Modal serverless cloud: $4,740 (avg 50 GPUs * 24 hrs * $3.95 / GPU-hr)

Compare features

Plans: Starter ($0 + compute / month), Team ($250 + compute / month), Enterprise (custom)

Feature | Starter | Team | Enterprise

Workspace
Number of seats | Up to 3 | Unlimited | Unlimited

Credits and variable costs
Included compute | $30 / month | $100 / month | Custom

Features
Cron jobs | 5 deployed crons | Unlimited | Unlimited
Webhooks | 8 deployed endpoints | Unlimited | Unlimited
Log retention | 1 day | 30 days | Custom
Containers | 100 | 1000 | Custom
GPU concurrency | 10 | 50 | Custom
Deployed apps | 200 | 1000 | 1000
Secrets | Unlimited | Unlimited | Unlimited
Custom images | Unlimited | Unlimited | Unlimited
Instant deploys | Unlimited | Unlimited | Unlimited
Distributed queue | Unlimited | Unlimited | Unlimited
Distributed dict | Unlimited | Unlimited | Unlimited
Region selection | 1.25 - 2.5x base prices | 1.25 - 2.5x base prices | 1.25 - 2.5x base prices
Non-preemptible execution | 3x base prices | 3x base prices | 3x base prices
Deployment rollbacks: 3 versions; Custom
Also listed: Custom domains, Real-time metrics, Sharing and collaboration

Cloud Marketplaces
Use committed spend on AWS and GCP

Support
Access to Modal Community Slack; Access to Modal Community Slack + Support via private Slack + Embedded ML engineering services

Security
SOC 2 compliance, HIPAA compatibility, Audit logs, Static IP proxy, RBAC (Coming Soon), SSO

Frequently asked questions

How does serverless pricing differ from traditional on-demand pricing?
What counts as billable time?
How are CPU and memory usage metered?
What's an example of real-world CPU billing?
What's an example of real-world GPU billing?
Do you charge for storage?
Can I add more than three team members on the starter plan?
What kinds of applications can I deploy using Modal?
Can I use my AWS, GCP, or Azure credits on Modal?

Ship your first app in minutes.
---

Empowering teams of every size

Modal is built for the fastest-growing teams in the world. We help companies from startups to enterprises bring cutting-edge AI applications to the market.

Case studies

“We've previously managed to break services like GitHub because of our load, so when Modal was able to handle the massive scale of our AI weekend event so smoothly, that meant a lot.”

“There would be a lot of edge cases and unknowns if we built code sandboxes ourselves. We offloaded this to Modal and are actively saving 2 engineers' worth of ongoing engineering time.”

“Modal makes it easy to write code that runs on 100s of GPUs in parallel, transcribing podcasts in a fraction of the time.”

“Modal lets us move fast while keeping full control over our models and serving stack. The flexibility meant we could train high-accuracy models and hit the real-time performance our product demands.”

“Tasks that would have taken days to complete take minutes instead. We’ve also saved thousands of dollars deploying open-source LLMs on Modal.”

“The beauty of Modal is that all you need to know is that you can scale your function calls in the cloud with a few lines of Python.”

Customers use Modal for

Language Models
Image, Video, 3D
Audio Processing
Fine-Tuning
Batch Processing
Sandboxed Code
Computational Bio
Coding Agents

“With the old in-house systems, we'd have to tune number of workers, instance size, parallelization strategy, all this stuff, which was very time-consuming and not directly generating business value. Modal magically handled all that.”
Samarth Goel, ML Engineer

“Modal Sandboxes enable us to execute generated code securely and flexibly. With Modal's support, we expedited the development of our code interpreter feature and successfully integrated it into our chat platform, Le Chat, to better assist our users.”
Wendy Shang, AI Scientist

“Modal is the easiest way to experiment as we develop new fine-tuning techniques. We've been able to validate new features faster and beat competitors because of how quickly we can try new ideas.”
Kyle Corbitt, CTO

“Using Modal, Codegen has been able to move at lightning speed with full-stack AI development. The product is designed with developer experience front and center, and my team is incredibly happy having it as part of our arsenal.”
Jay Hack, Founder & CEO

“Modal has been great for iterating quickly on our data pipelines. It enables us to process a large batch of logs in minutes! The infrastructure is amazing for experimentation.”
Steven Hao, Co-founder

“Our org runs on Modal. We use it for AI agent environments, scalable deployment of AI agents, hosting of deep learning models, and visualization. It dramatically simplified our engineering infrastructure and completely changed the scope of projects we can do.”
Andrew White, Co-Founder & Head of Science

“Within a few hours I had CI on GPUs running on Modal. I'm shocked at how easy it was for us to go from manually running tests on GPUs to automating these in parallel across our growing team.”
Stas Bekman, ML Engineer

“Modal is the only platform that supported our custom infrastructure needs and had a simple developer experience. We deploy hundreds of predictive models for our core model routing functionality, and Modal helps us scale our product cleanly.”
Shriyash Upadhyay, Founder

“Switched to Modal for our LLM inference instead of Azure. 1/4 the price for GPUs and so much simpler to set up/scale. Big fan.”
Alex Reichenbach, CEO

“We are constantly shipping the most cutting-edge creative AI machine learning techniques so our customers have access to the best creative models. Modal has helped us streamline the process from idea to deployed pipeline, allowing us to both deploy quickly & scale rapidly.”
Weber Wong, Founder

“Modal makes it unbelievably quick to deploy our models onto scalable infrastructure. We’ve been able to move faster on our last few model launches, including Olmo and Tülu, thanks to the platform.”
Michael Schmitz, Engineering

“Our platform leverages Modal's infrastructure for the heavy lifting, handling ridiculous scale and concurrency behind the scenes. This lets us focus on what we do best: gathering and analyzing unstructured textual data with precision.”
Charles Zhou, Founding Engineer

“Adopting Modal allows my team to focus on our product, not infrastructure. We save thousands in CPU/GPU costs, but more importantly engineering-hours. Everything we run on Modal just works: we scale from 0.5gb of ram to 95gb of ram without a second thought. Almost everything we do runs on Modal.”
Ben Epstein, Co-founder & CTO

“We used Modal to build an inference server for our model, Chai-1, which allows people to predict molecular structures via a web app. Modal allowed us to build and launch the server in days: our engineers didn't have to worry about maintaining infrastructure, delivering the product in record time.”
Jack Dent, Co-Founder

“At Phonic, we train our own proprietary models for audio generation. We moved all our large-scale audio processing batch jobs to Modal. Our engineers are ecstatic with the result – we can run at a much larger scale than before, no longer have to babysit our batch jobs, and we can ship much faster.”
Moin Nadeem, Co-Founder

“Using Modal for inference is like having an extra infra team - it’s reliable, scalable, and fast - meaning I can get back to training models”
Vik Paruchari, Founder

“We use Modal to securely run LLM-augmented code on a large scale. Modal’s powerful primitives like sandboxes and file systems have allowed us to focus on our core competencies without having to waste time on our own infra.”
Matt Harpe, Co-Founder

“Processing external quantum mechanical datasets comes with unique challenges. Jobs can fail in numerous ways. Modal's retry mechanism and batching primitives have made our data pipeline much more robust.”
Liz Decolvenaere, Quantum Chemical Engineer

Ship your first app in minutes.
---

Modal Documentation

Modal provides a serverless cloud for engineers and researchers who want to build compute-intensive applications without thinking about infrastructure.

Run generative AI models, large-scale batch workflows, job queues, and more, all faster than ever before.

Guide: Everything you need to know to run code on Modal. Dive deep into all of our features and best practices.
Examples: Powerful applications built with Modal. Explore guided starting points for your use case.
Reference: Technical information about the Modal API. Quickly refer to basic descriptions of various programming functionalities.
Playground: Interactive tutorials to learn how to start using Modal. Run serverless cloud functions from your browser.

Featured Examples

Deploy an OpenAI-compatible LLM service: Run large language models with a drop-in replacement for the OpenAI API
Custom pet art from Flux with Hugging Face and Gradio: Fine-tune an image generation model on pictures of your pet
Optimize tokens per second: Maximize throughput in batch LLM processing
Deploy OpenCode agents: Run coding agents at scale in secure Sandboxes
Transcribe speech in batches with Whisper: Turn audio bytes into text at scale
Voice chat with LLMs: Build an interactive voice chat app
Deploy vibe coding at scale: Build an AI coding platform for thousands of users
Deploy really big language models: Serve models with hundreds of billions of parameters
Edit images with Flux Kontext: Transform images with SotA diffusion models
Fold proteins with Boltz-2: Predict molecular structures and binding affinities from sequences with SotA open source models
Serverless WebRTC: Stream YOLO detections on webcam footage in real time
Sandbox a LangGraph agent's code: Run an LLM coding agent that runs its own language models
Serve diffusion models: Serve Flux on Modal with optimizations for blazingly fast inference
Low latency SGLang: Run interactive language model applications
Transcribe speech with Kyutai STT: Stream transcripts at the speed of speech
Star in custom music videos: Fine-tune a Wan2.1 video model on your face and run it in parallel
Make music: Turn prompts into music with ACE-Step
RAG Chat with PDFs: Use ColBERT-style, multimodal embeddings with a Vision-Language Model to answer questions about documents
Bring images to life: Prompt a generative video model to animate an image
Build a protein folding dashboard: Serve a web UI for a protein model with ESM3, Molstar, and Gradio
Deploy a Hacker News Slackbot: Periodically post new Hacker News posts to Slack
Fold proteins with Chai-1: Predict molecular structures from sequences with SotA open source models
Retrieval-Augmented Generation (RAG) for Q&A: Build a question-answering web endpoint that can cite its sources
Document OCR job queue: Use Modal as an infinitely scalable job queue that can service async tasks from a web app
Parallel processing of Parquet files on S3: Analyze data from the Taxi and Limousine Commission of NYC in parallel

Tools in the same category