Rating rationale: Enables continuous multi-objective control in diffusion model post-training, which makes reward fine-tuning more flexible.
ParetoSlider: Diffusion Models Post-Training for Continuous Reward Control
Industry news, score: 6.0
Source: arXiv