星际流动

Temporally Extended Mixture-of-Experts Models

Industry News · Score 7.0: Proposes extending MoE expert lifespan using the options framework. Addresses a real MoE memory/churn problem at scale.
Original: arXiv

Score: 7.0

Scoring rationale: Proposes extending MoE expert lifespan using the options framework. Addresses a real MoE memory/churn problem at scale.