Score 5.5 · Source: cs.CL updates on arXiv.org · Published 2026-04-14
Rating rationale: above average: offers some informational gain and reference value
SEPTQ: A Simple and Effective Post-Training Quantization Paradigm for Large Language Models
arXiv:2604.10091v1 Announce Type: new
Abstract: Large language models (LLMs) have shown remarkable performance across various domains, but they are constrained by massive computational and storage costs. Quantization, an effective technique for compressing models to fit resource-limited devices while preserving generative quality, encompasses two primary approaches: quantization-aware training (QAT) and post-training quantization (PTQ). QAT involves additional retraining or fine-tuning, thus…
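The abstract is truncated before it describes SEPTQ itself, so as background only, here is a minimal sketch of the simplest PTQ baseline that methods like SEPTQ improve upon: round-to-nearest weight quantization with per-channel scales. The function name, bit width, and tensor shapes are illustrative assumptions, not details from the paper.

```python
import numpy as np

def quantize_weights_rtn(w: np.ndarray, n_bits: int = 4):
    """Per-output-channel symmetric round-to-nearest (RTN) quantization.

    NOTE: a generic PTQ baseline for illustration, not SEPTQ.
    w: 2-D weight matrix of shape (out_features, in_features).
    Returns integer codes and per-channel scales so that w ~ q * scale.
    """
    qmax = 2 ** (n_bits - 1) - 1              # e.g. 7 for signed 4-bit
    # One scale per output channel, set by that channel's max |w|.
    scale = np.abs(w).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard all-zero channels
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

# Usage: quantize a random weight matrix and check reconstruction error.
w = np.random.randn(256, 512).astype(np.float32)
q, scale = quantize_weights_rtn(w, n_bits=4)
w_hat = q * scale
print("mean abs error:", np.abs(w - w_hat).mean())
```

Because no retraining is involved, this whole procedure runs in seconds on frozen weights; that calibration-free, post-hoc character is what distinguishes PTQ from QAT's additional fine-tuning passes.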