CompanionGuard-RL

wangyu/CompanionGuard-RL

Commit Graph

Author

SHA1

Message

Date

wangyu

de3272b222

paper: fill RQ3 ablation summary and IRB ethics statement

- 07_experiments.tex: replace \todo placeholder in RQ3 with actual
  ablation analysis referencing tab:moduleB_ablation (§5) and
  tab:moduleC_ablation (§6); summarize key takeaways for both modules
- 08_discussion.tex: replace \todo IRB placeholder with full ethics
  declaration — synthetic data origin, public dataset attribution,
  DUA policy, no human-subjects experiment needed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-20 15:07:09 +08:00

zhangsiyuan

52ba43f08d

feat: Module C v5/v6 training complete, ablations, SOTA baselines, paper updates

- Module C: BC+PPO training v5/v6 done; eval results in experiments/eval_intervention_v{5,6}.json
- Reward: v5 label-aligned constrained reward (code/src/rl/reward.py)
- Ablations: Module B (history_r, response_only, full) + Module C (wo_category_reward)
- SOTA baselines: WildGuard and ShieldGemma2b eval scripts and results
- Paper: update sections 05–08 (Module B/C description, experiments table, discussion)
- Docs: add record.md (change log), update state.md and exp.md; retire change.md
- Tools: add html-to-ppt utilities and run_shieldgemma2b.sh
- Configs: add ablation YAML configs for Module B and C
- Cleanup: remove stale reference/ PNG screenshots

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-20 14:24:09 +08:00

zhangsiyuan

804ebd2f77

feat: add paper/ LaTeX draft, English data scripts, update progress docs

- paper/: 22-page LaTeX framework (7/10 sections complete, compiles cleanly)
  main.tex + 10 section files + refs.bib + compiled PDF (329KB)
- code/scripts/: three English dataset generation & merging scripts
  generate_english.py / generate_english_targeted.py / merge_v5.py
- CLAUDE.md: update paper writing status, add paper/ file map entry
- state.md: add section 8 paper writing progress (2026-05-15)
- .gitignore: add LaTeX build artifact exclusion rules

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-18 11:19:39 +08:00

3 Commits