Tag: policy-optimization
All the articles with the tag "policy-optimization".
- 7.5
Frictive Policy Optimization for LLMs: Epistemic Intervention, Risk-Sensitive Control, and Reflective Alignment
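The FPO framework treats clarification, verification, challenge, redirection, and refusal as explicit control actions for managing epistemic and normative risk.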