LongAct: Harnessing Intrinsic Activation Patterns for Long-Context Reinforcement Learning

Published

April 17, 2026

Collected April 17, 2026, 04:31

Academic Frontier, 6.5 points: Exploits high-magnitude Q/K activations in long contexts to guide RL; a novel intrinsic signal for long-context training

Score 6.5 · Source: cs.LG updates on arXiv.org · Published 2026-04-17


arXiv:2604.14922v1 · Announce Type: new

Abstract: Reinforcement Learning (RL) has emerged as a critical driver for enhancing the reasoning capabilities of Large Language Models (LLMs). While recent advancements have focused on reward engineering or data synthesis, few studies exploit the model’s intrinsic representation characteristics to guide the training process. In this paper, we first observe the presence of high-magnitude activations within the query and key vectors when processing long contexts.