星际流动

LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs

Academic Frontier · Score 6.5. Practical efficiency approach: layer-aware attention reduction avoids uniform degradation. Useful for inference optimization.
Original: arxiv.org

Score 6.5 · Source: · Published 2026-04-27

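The rating text above contrasts layer-aware attention reduction with uniform reduction across layers. A minimal Python sketch of that general idea, assuming an importance-scaled allocation of attention heads per layer; the function names, the allocation rule, and the example importance scores are illustrative assumptions, not details taken from the LayerBoost paper:

```python
# Hypothetical sketch: instead of pruning the same fraction of attention
# heads in every layer (uniform reduction), allocate a per-layer keep ratio
# from a layer-importance score, so sensitive layers keep more heads.

def layer_aware_keep_ratios(importance, global_keep=0.5, floor=0.2):
    """Scale each layer's keep ratio by its relative importance,
    clamped to [floor, 1.0]."""
    mean_imp = sum(importance) / len(importance)
    return [min(1.0, max(floor, global_keep * imp / mean_imp))
            for imp in importance]

def heads_to_keep(ratios, num_heads=16):
    """Convert keep ratios to head counts, keeping at least one head."""
    return [max(1, round(r * num_heads)) for r in ratios]

if __name__ == "__main__":
    # Toy importance profile: early and late layers matter more than middle.
    importance = [1.0, 0.9, 0.4, 0.3, 0.8, 1.0]
    ratios = layer_aware_keep_ratios(importance)
    print(heads_to_keep(ratios))
```

Under this toy rule, middle layers with low importance are pruned more aggressively while the global average keep ratio stays near `global_keep`, which is one way a layer-aware scheme could avoid degrading every layer uniformly.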