Rating basis: Mixed-precision quantization across output features, paired with a practical system design for efficient LLM deployment.
MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design
Industry news, score 6.5
Source: arXiv