星际流动

MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design

Industry News · Score: 6.5. Mixed-precision quantization across output features. Practical system design for efficient LLM deployment.
Original: arXiv

Score: 6.5

Rating rationale: Mixed-precision quantization across output features. Practical system design for efficient LLM deployment.