Modal Labs in Advanced Talks for $2.5B Funding Round Led by General Catalyst
AI inference startup Modal Labs is reportedly in advanced negotiations to raise capital at a $2.5 billion valuation, with General Catalyst poised to lead the round. The four-year-old company has emerged as a key player in optimizing AI model deployment for enterprise clients.

Modal Labs, a San Francisco-based AI infrastructure startup founded in 2021, is reportedly in advanced discussions to secure a new funding round that would value the company at $2.5 billion, according to multiple industry sources. General Catalyst is expected to lead the investment, marking one of the largest venture rounds in the AI inference sector this year. The move underscores the accelerating demand for scalable, cost-efficient solutions to run large language models (LLMs) in production environments.
Modal Labs has carved out a niche by providing developers with a cloud-native platform that simplifies the deployment and scaling of AI inference workloads. Whereas traditional cloud providers bill for idle compute, Modal allocates resources in real time and scales capacity up and down with demand, significantly reducing operational costs for companies running LLMs such as GPT, Llama, and Claude. This efficiency has attracted major clients across finance, healthcare, and media, where latency and cost are critical factors in AI adoption.
The term "inference"—often confused with "prediction"—refers to the process of using a trained machine learning model to generate outputs from new input data. While "prediction" is a broader term that can apply to any statistical model output, inference specifically denotes the computational phase after training, where models are deployed at scale. As noted in technical discussions on platforms like Zhihu, inference demands high-throughput, low-latency infrastructure, making Modal’s platform particularly valuable for enterprises seeking to deploy AI without the overhead of managing proprietary hardware or complex Kubernetes clusters.
Since its inception, Modal Labs has raised approximately $150 million in seed and Series A funding from investors including a16z, Y Combinator, and Matrix Partners. The upcoming round, if finalized, would value the company at roughly 17 times the total capital it has raised to date. Sources indicate the company is targeting a close by the end of Q4 2025, with proceeds earmarked for expanding its engineering team, enhancing its GPU optimization stack, and entering international markets, particularly in Europe and Southeast Asia.
The AI inference market is projected to surpass $40 billion by 2028, according to Gartner, driven by the proliferation of generative AI applications. Modal’s competitive edge lies in its serverless inference model, which allows developers to deploy models with a single line of code—similar to how AWS Lambda simplified serverless computing. This developer-centric approach has earned praise from open-source communities and enterprise CTOs alike.
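To give a sense of what that developer experience looks like, here is a minimal sketch written against Modal's publicly documented Python SDK. Treat the specific names as assumptions that may differ across SDK versions: modal.App (older releases exposed modal.Stub), the gpu="A10G" option, local_entrypoint, and the gpt2 model used purely for illustration.

```python
# A minimal sketch of serverless GPU inference with Modal's Python SDK.
# Assumes the documented modal.App / @app.function interface; names such
# as gpu="A10G" and local_entrypoint may vary between SDK versions.
import modal

app = modal.App("llm-inference-demo")

# Container image with the inference dependencies baked in.
image = modal.Image.debian_slim().pip_install("transformers", "torch")

@app.function(image=image, gpu="A10G")
def generate(prompt: str) -> str:
    # Runs inside Modal's cloud; the container is spun up on demand
    # and billed only while this function is executing.
    from transformers import pipeline
    pipe = pipeline("text-generation", model="gpt2")
    return pipe(prompt, max_new_tokens=40)[0]["generated_text"]

@app.local_entrypoint()
def main():
    # `modal run this_file.py` executes this locally and dispatches
    # the decorated function to remote GPU workers.
    print(generate.remote("Serverless inference lets teams"))
```

The appeal of this decorator-based model is that the deployment boundary is an annotation on an ordinary Python function rather than a Kubernetes manifest, which is what the "single line of code" comparison to AWS Lambda is getting at.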
General Catalyst, known for early investments in companies like Stripe and Snap, sees Modal as a foundational piece of the next-generation AI infrastructure stack. "Inference is no longer a footnote in AI development—it’s the bottleneck," said a partner at General Catalyst familiar with the deal. "Modal has built the most elegant solution we’ve seen to unlock real-time AI at scale."
Competitors such as RunPod, Hugging Face, and Replicate are also vying for market share, but Modal’s integration with Python-based ML frameworks and its transparent pricing model have given it an edge in developer adoption. Industry analysts suggest that the $2.5 billion valuation reflects not just current revenue, but the potential for Modal to become the de facto standard for AI inference, much like Snowflake did for cloud data warehousing.
As AI models grow larger and more complex, the ability to efficiently execute inference will determine which companies can deliver real-time AI experiences—and which will be left behind. With this funding round, Modal Labs is positioning itself not just as a vendor, but as a critical enabler of the next wave of AI innovation.


