Tag: efficient-LLM

All the articles with the tag "efficient-LLM".

6.0
Efficient Mixture-of-Experts LLM Inference with Apple Silicon NPUs
April 22, 2026
· cs.LG updates on arXiv.org · collected 04/22 14:31
Addresses three core challenges of MoE LLM inference on the Apple Neural Engine: dynamic tensor shapes, irregular operators, and memory fragmentation