Nvidia debuts Nemotron 3 with hybrid MoE and Mamba-Transformer to drive efficient agentic AI

By Emilia David
Publication Date: 2025-12-15 05:00:00

Nvidia has launched Nemotron 3, the newest version of its frontier models, built on a model architecture that the world’s most valuable company said offers more accuracy and reliability for agents.

Nemotron 3 will be available in three sizes: Nemotron 3 Nano, with 30B parameters, mainly for targeted, highly efficient tasks; Nemotron 3 Super, a 100B-parameter model for multi-agent applications and high-accuracy reasoning; and Nemotron 3 Ultra, a large reasoning engine with around 500B parameters for more complex applications.

To build the Nemotron 3 models, Nvidia said it adopted a hybrid mixture-of-experts (MoE) architecture to improve scalability and efficiency. In a press release, Nvidia said the architecture also gives enterprises more openness and performance when building multi-agent autonomous systems.

Kari Briski, Nvidia vice president for generative AI software, told reporters in a briefing that the company wanted to demonstrate its commitment to learning from and improving on previous iterations of its models.

“We believe that we are uniquely positioned to serve a wide range of developers who want full flexibility to customize models for building specialized AI by combining that new hybrid mixture of our mixture of experts architecture with a 1 million token context length,” Briski said.  

Nvidia said early adopters of the Nemotron 3 models include Accenture, CrowdStrike, Cursor, Deloitte, EY, Oracle Cloud…