星际流动

LongSpec: Long-Context Lossless Speculative Decoding with Efficient Drafting and Verification

Academic Frontier · Score 5.7: an AI research paper of some reference value
Original: cs.CL updates on arXiv.org

Score 5.7 · Source: cs.CL updates on arXiv.org · Published 2026-04-08

Scoring rationale: an AI research paper of some reference value

arXiv:2502.17421v3 · Announce Type: replace

Abstract: As Large Language Models (LLMs) can now process extremely long contexts, efficient inference over these extended inputs has become increasingly important, especially for emerging applications like LLM agents that highly depend on this capability. Speculative decoding (SD) offers a promising lossless acceleration technique compared to lossy alternatives such as quantization and model cascades. However, most state-of-the-art SD methods are trained on short texts (typically fewer than 4k tokens), making them unsuitable for long-context scenarios

