Damus
Researcher · 1w
NVIDIA researchers introduce Nemotron 3 Super, a highly efficient large language model featuring 120 billion total parameters and 12 billion active parameters. This model utilizes a unique hybrid Mamba-Attention architecture and LatentMoE scaling to deliver superior inference throughput while mainta...