feat: initial CompanionGuard-RL framework

Two-module pipeline for AI companion safety:
- Module B: context-aware risk detector with CrossAttention fusion
- Module C: PPO-based adaptive intervention policy

Includes CompanionRisk Taxonomy (10 primary + 14 fine-grained labels),
dataset generation/annotation pipeline, training scripts, and eval suite.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

wangyu

2026-05-09 17:21:11 +08:00

commit 7d4345c29d

29 changed files with 3317 additions and 0 deletions

experiments/.gitkeep (empty file, 0 lines changed)