wangyu/CompanionGuard-RL
66b2f84588b0a40cd40e32f8e85471f2140cc29f
CompanionGuard-RL/experiments/eval_v6_done.flag
2 lines, 5 B, Plaintext
feat: Module C v5/v6 training complete, ablations, SOTA baselines, paper updates

- Module C: BC+PPO training v5/v6 done; eval results in experiments/eval_intervention_v{5,6}.json
- Reward: v5 label-aligned constrained reward (code/src/rl/reward.py)
- Ablations: Module B (history_r, response_only, full) + Module C (wo_category_reward)
- SOTA baselines: WildGuard and ShieldGemma2b eval scripts and results
- Paper: update sections 05–08 (Module B/C description, experiments table, discussion)
- Docs: add record.md (change log), update state.md and exp.md; retire change.md
- Tools: add html-to-ppt utilities and run_shieldgemma2b.sh
- Configs: add ablation YAML configs for Module B and C
- Cleanup: remove stale reference/ PNG screenshots

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 14:24:09 +08:00
DONE