From Nvidia's blog, December 2:
The new Mistral 3 family, spanning frontier-level to compact models, is optimized for NVIDIA platforms, enabling Mistral AI’s vision of distributed intelligence from the cloud to the edge.
Today, Mistral AI announced the Mistral 3 family of open-source, multilingual, multimodal models, optimized across NVIDIA supercomputing and edge platforms.
Mistral Large 3 is a mixture-of-experts (MoE) model: instead of firing up every neuron for every token, it activates only the expert subnetworks most relevant to that token. The result is efficiency that delivers scale without waste and accuracy without compromise, making enterprise AI not just possible but practical.
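To make the routing idea concrete, here is a minimal sketch of top-k MoE gating in Python with NumPy. It illustrates the general technique only; every name, shape, and the expert count are illustrative assumptions, not Mistral Large 3's actual implementation.

```python
import numpy as np

def moe_forward(x, gate_w, experts, k=2):
    """Route one token through only the top-k of n experts.

    x: (d,) token activation; gate_w: (d, n) router weights;
    experts: list of n callables, each mapping (d,) -> (d,).
    Illustrative sketch only, not Mistral's actual code.
    """
    logits = x @ gate_w                # router scores, one per expert
    top = np.argsort(logits)[-k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the selected experts only
    # Only k expert networks run; the rest stay idle for this token.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy usage: 8 experts, 2 active per token -- the same sparsity idea,
# at vastly smaller scale than Mistral Large 3.
rng = np.random.default_rng(0)
d, n = 16, 8
gate_w = rng.standard_normal((d, n))
experts = [(lambda W: (lambda x: np.tanh(x @ W)))(rng.standard_normal((d, d)))
           for _ in range(n)]
y = moe_forward(rng.standard_normal(d), gate_w, experts)
```

Because only the selected experts execute, the per-token compute tracks the active-parameter count rather than the total, which is where the "scale without waste" framing comes from.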
Mistral AI’s new models deliver industry-leading accuracy and efficiency for enterprise AI. They will be available everywhere, from the cloud to the data center to the edge, starting Tuesday, Dec. 2.
With 41B active parameters, 675B total parameters and a 256K-token context window, Mistral Large 3 delivers scalability, efficiency and adaptability for enterprise AI workloads.
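Those two parameter counts imply that only about 6% of the model's weights do work on any given token. A quick back-of-the-envelope check, where the one-byte-per-parameter (FP8-style) footprint is an illustrative assumption rather than a vendor-confirmed deployment spec:

```python
# Arithmetic from the announced figures; byte/param assumption is illustrative.
active_b, total_b = 41, 675                           # billions of parameters
print(f"active per token: {active_b / total_b:.1%}")  # -> 6.1%
print(f"weights at 1 byte/param: ~{total_b} GB held, "
      f"~{active_b} GB read per token")
```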
By combining NVIDIA GB200 NVL72 systems and Mistral AI’s MoE architecture, enterprises can efficiently deploy and scale massive AI models, benefiting from advanced parallelism and hardware optimizations.
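As a rough illustration of what that parallelism can look like for an MoE model, the sketch below shards a pool of experts across the GPUs of a single system. The expert count and the round-robin placement are hypothetical assumptions for illustration, not a description of Mistral's or NVIDIA's actual deployment; only the 72-GPU figure comes from the GB200 NVL72 configuration itself.

```python
def expert_device(expert_idx: int, n_gpus: int) -> int:
    """Round-robin expert placement: expert i lives on GPU i % n_gpus.

    Illustrative only -- production MoE serving stacks use more elaborate
    placement plus all-to-all token exchange between devices.
    """
    return expert_idx % n_gpus

# Toy example: 128 hypothetical experts spread over the 72 GPUs of one
# GB200 NVL72 rack; a token routed to expert 93 is served by GPU 21.
print(expert_device(93, n_gpus=72))  # -> 21
```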
This combination makes the announcement a step toward the era of what Mistral AI calls ‘distributed intelligence,’ bridging the gap between research breakthroughs and real-world applications....
....MUCH MORE