Tag: 训练 (Training)

All articles tagged "训练" (Training).

6.3
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU
2026-04-08
· cs.CL updates on arXiv.org · collected 04/08 12:31
arXiv:2604.05091v1 Announce Type: new Abstract: We present MegaTrain, a memory-centric system that efficiently trains 100B+ parameter large languag...
6.3
FreakOut-LLM: The Effect of Emotional Stimuli on Safety Alignment
2026-04-08
· cs.AI updates on arXiv.org · collected 04/08 12:31
arXiv:2604.04992v1 Announce Type: cross Abstract: Safety-aligned LLMs go through refusal training to reject harmful requests, but whether these mec...
7.0
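Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes
2026-03-27
· cs.CL updates on arXiv.org · collected 03/27 12:31
Revisits the fragility of OPD in long-horizon settings, revealing a systematic problem in which the sampled-token variant reduces distribution matching to a single-token signal.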
