Rating basis: KV cache compression for reasoning models addresses a critical deployment-cost issue.
LongFlow: Efficient KV Cache Compression for Reasoning Models
Academic Frontier: 8.0 points
Original: arxiv.org