星际流动

Mamba-3: Improved Sequence Modeling using State Space Principles

Published
Score: 7.6. A significant architectural improvement and new progress in sub-quadratic models; valuable to researchers.

Mamba-3 introduces improved sequence modeling using state space principles. The model achieves comparable perplexity to Mamba-2 despite using half of its predecessor’s state size, advancing the performance-efficiency Pareto frontier.

Read more at arXiv

