Scoring rationale: a practical efficiency approach in which layer-aware attention reduction avoids uniform degradation. Useful for inference optimization.
LayerBoost: Layer-Aware Attention Reduction for Efficient LLMs
Academic Frontier: 6.5 points
Source: arxiv.org