08-05-2025
Mistral announces new AI model Medium 3 at 8x lower cost
French AI startup Mistral has introduced a frontier-level AI model, Mistral Medium 3. The new model from the Paris-based AI company is said to have outperformed models like Claude Sonnet 3.7 and GPT-4o on numerous benchmarks. The new model reportedly costs less than DeepSeek V3.
The company has said that organisations can use the new model through its new AI assistant called Le Chat Enterprise that features an agent builder and allows full integration with a variety of apps. Mistral has also teased a more powerful model which will be introduced in the coming weeks. Mistral Medium 3 is said to be pushing efficiency and usability of language models even further.
Mistral claims that Medium 3 introduces a new class of models that combines state-of-the-art performance, 8x lower cost, and simple deployability to accelerate enterprise usage. The company also says the model leads in professional use cases such as coding and multimodal understanding. On the enterprise side, Medium 3 supports hybrid, on-premises, and in-VPC deployment, custom post-training, and integration into enterprise tools and systems.
According to the company, the model performs at or above 90 per cent of Claude Sonnet 3.7's level on benchmarks across the board at a considerably lower cost: $0.40 per million input tokens and $2 per million output tokens. Medium 3 is also said to surpass open models such as Llama 4 Maverick and enterprise models like Cohere Command A. On pricing, for both API and self-deployed systems, Mistral says the model beats DeepSeek V3. It can be deployed on any cloud, including self-hosted environments with four or more GPUs.
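To put the quoted per-token prices in concrete terms, here is a rough back-of-the-envelope sketch in Python. The two prices come from the figures above; the function name and the example workload are hypothetical and chosen only for illustration, and actual billing may differ.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_M = 0.40
OUTPUT_USD_PER_M = 2.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in US dollars for a given token volume.

    Hypothetical helper: it simply scales the quoted per-million prices.
    """
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical workload: 5 million input tokens and 1 million output tokens.
cost = estimate_cost_usd(5_000_000, 1_000_000)
print(f"${cost:.2f}")  # prints "$4.00"
```

Under these assumptions, output tokens cost five times as much as input tokens, so workloads that generate long responses will be dominated by the output price.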
The company claims the model is designed to be frontier-class, particularly in categories of professional use. When it comes to benchmarks, Mistral Medium 3 delivers top performance in instruction following (ArenaHard: 97.1%) and math (Math500: 91%), with strong results in long context tasks (RULER 32K: 96%). In terms of human evaluations, Medium 3 outperforms competitors, especially in coding. The model beats Claude Sonnet 3.7, DeepSeek 3.1, and GPT-4o in several cases.