Commit Graph

3 Commits

Author SHA1 Message Date
52ba43f08d feat: Module C v5/v6 training complete, ablations, SOTA baselines, paper updates
- Module C: BC+PPO training v5/v6 done; eval results in experiments/eval_intervention_v{5,6}.json
- Reward: v5 label-aligned constrained reward (code/src/rl/reward.py)
- Ablations: Module B (history_r, response_only, full) + Module C (wo_category_reward)
- SOTA baselines: WildGuard and ShieldGemma2b eval scripts and results
- Paper: update sections 05–08 (Module B/C description, experiments table, discussion)
- Docs: add record.md (change log), update state.md and exp.md; retire change.md
- Tools: add html-to-ppt utilities and run_shieldgemma2b.sh
- Configs: add ablation YAML configs for Module B and C
- Cleanup: remove stale reference/ PNG screenshots

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-20 14:24:09 +08:00
b50cf395ab refactor: move README/CLAUDE to root; rewrite CLAUDE.md as project constitution
- git mv code/README.md → README.md (project-level)
- Rewrite CLAUDE.md: accurate Module C status (v5 pending),
  Red Lines table (6 rules from real incidents), file map,
  server quick-reference, updated SCP commands
- Merge code/.gitignore into root .gitignore (dist/, build/,
  wandb/, *.jsonl, *.json.gz); delete code/.gitignore
- code/ now contains only: src/ scripts/ configs/ tests/
  checkpoints/ data/ requirements.txt

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-15 08:52:40 +08:00
bd1f51c496 chore: initial commit — unified project repo
Merged code repo (CompanionGuard-RL) into a single project-level git repo.
Reorganized root: docs/, reference/, experiments/, tmp/active|archives/.
Gitignored: data/, checkpoints/, .venv, experiment logs, tmp/archives.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-14 11:28:42 +08:00