Fix NaN issues and assertion errors in the PPO-Lagrange algorithm #385

Open
Mi-729 wants to merge 2 commits into PKU-Alignment:main from
Mi-729:fix/ppo-lag-nan-cost-assertion

Commits

Commits on Jan 22, 2026