feat: initial CompanionGuard-RL framework

Two-module pipeline for AI companion safety:
- Module B: context-aware risk detector with CrossAttention fusion
- Module C: PPO-based adaptive intervention policy

Includes CompanionRisk Taxonomy (10 primary + 14 fine-grained labels),
dataset generation/annotation pipeline, training scripts, and eval suite.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-09 17:21:11 +08:00
commit 7d4345c29d
29 changed files with 3317 additions and 0 deletions

.gitignore

@@ -0,0 +1,35 @@
__pycache__/
*.py[cod]
*.egg-info/
dist/
build/
.eggs/
# Virtual environments
.venv/
venv/
env/
# Data (raw and processed — do not commit large datasets)
data/raw/
data/processed/
# Model checkpoints
checkpoints/
# Experiment outputs
experiments/eval_results.json
wandb/
# Editor
.idea/
.vscode/
*.swp
# OS
.DS_Store
Thumbs.db
# API keys
.env
*.env