2026-05-14 11:32:02 +08:00
# CompanionGuard-RL: Project Reference

> This file is read automatically by Claude Code. All training is complete; the current phase is **paper writing**.

---
## Project Status (2026-05-12)

| Module | Status | Key metrics |
|--------|--------|-------------|
| Dataset CompanionRisk-Bench v4 | ✅ Done | 9,896 samples, all 14 labels covered |
| Module B: detector (MacBERT-large) | ✅ Done | binary_f1=0.9995, level_weighted_f1=0.559 |
| Module C: RL intervention policy (PPO) | ✅ Done | safety_recall=1.0, over_refusal=0.004 |
| Paper writing | 🔄 In progress | - |

Detailed results are in `../state.md` at the project root; pitfalls and lessons learned are in `exp.md`, and the change log is in `change.md`.

---
## Local Directory Structure

```
D:\Myresearch\CompanionGuard-RL\
├── code/                              ← this directory (source code)
│   ├── src/                           ← 18 core .py files (models/ rl/ utils/)
│   ├── scripts/                       ← training / evaluation / data-generation scripts
│   ├── configs/                       ← 4 YAML configs
│   ├── checkpoints/                   ← model weights (gitignored)
│   │   ├── detector/best.pt           ← Module B weights used in the paper (1.35 GB)
│   │   └── intervention/final_v2.pt   ← Module C weights used in the paper
│   ├── experiments/                   ← evaluation result JSON files
│   │   ├── eval_intervention_v3.json  ← Module C results used in the paper
│   │   └── eval_intervention_v4.json  ← confirmation rerun of v3 (identical numbers)
│   └── data/                          ← processed data (gitignored)
├── data/                              ← raw datasets (gitignored)
├── docs/                              ← research documents
├── state.md                           ← project progress snapshot (latest)
└── experiments/                       ← root-level backup of evaluation results
```
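The tree notes that `eval_intervention_v4.json` is a rerun whose numbers match `eval_intervention_v3.json`. A minimal sketch of how such a match could be checked follows. It makes no assumption about the JSON schema and simply walks both structures; the helper names `diff_results` and `compare_files` and the tolerance are illustrative, not part of the project.

```python
import json
import math
from pathlib import Path


def _is_number(x):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(x, (int, float)) and not isinstance(x, bool)


def diff_results(a, b, path="", tol=1e-9):
    """Return the list of paths at which two parsed JSON values differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs = []
        for key in sorted(set(a) | set(b)):
            sub = f"{path}/{key}"
            if key not in a or key not in b:
                diffs.append(sub)  # key present on one side only
            else:
                diffs.extend(diff_results(a[key], b[key], sub, tol))
        return diffs
    if isinstance(a, list) and isinstance(b, list):
        if len(a) != len(b):
            return [path or "/"]
        diffs = []
        for i, (x, y) in enumerate(zip(a, b)):
            diffs.extend(diff_results(x, y, f"{path}/{i}", tol))
        return diffs
    if _is_number(a) and _is_number(b):
        same = math.isclose(a, b, rel_tol=0.0, abs_tol=tol)
        return [] if same else [path or "/"]
    return [] if a == b else [path or "/"]


def compare_files(path_a, path_b, tol=1e-9):
    """Load two JSON files and return the paths where they differ."""
    a = json.loads(Path(path_a).read_text(encoding="utf-8"))
    b = json.loads(Path(path_b).read_text(encoding="utf-8"))
    return diff_results(a, b, tol=tol)


# Usage (an empty list means the rerun reproduced the numbers):
# compare_files("experiments/eval_intervention_v3.json",
#               "experiments/eval_intervention_v4.json")
```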
---
## Server Information

### Server 1 (main training machine)

| Item | Value |
|------|-------|
| SSH | `ssh -p 20083 root@10.82.3.180` |
| Password | `m2dGcwyrhI` |
| Project directory | `/root/siton-data-2849d4ce327c4ccfb233ce33868fe7fe/zsy/CompanionGuard-RL` |
| MacBERT | `/root/siton-data-2849d4ce327c4ccfb233ce33868fe7fe/zsy/macbert-large` |
| Environment | `/opt/conda/envs/dlapo-py310-cu128` (torch 2.7.1+cu128) |
| GPU | 4 × RTX 5090 32GB |

### Server 2 (currently in use)

| Item | Value |
|------|-------|
| SSH | `ssh -p 20060 root@10.82.3.180` |
| Password | `zwfn65xjTY` |
| Project directory | `/root/siton-data-740d234e02d749f08fe5347b0c74c49f/zsy/my-reasearch/companionguard-rl` |
| MacBERT | `/root/siton-data-740d234e02d749f08fe5347b0c74c49f/zsy/macbert-large` |
| Environment | `/root/siton-data-740d234e02d749f08fe5347b0c74c49f/zsy/env/dlapo-py310-cu128` |
| GPU | 2 × RTX 5090 32GB |

> Both servers run on the same host, `10.82.3.180`, in different Docker containers.

---
## SCP Sync Commands (local ↔ server)

```powershell
# ===== Local → Server 1 (upload code) =====
$S1="root@10.82.3.180"
$PROJ1="/root/siton-data-2849d4ce327c4ccfb233ce33868fe7fe/zsy/CompanionGuard-RL"

scp -P 20083 -r `
  D:\Myresearch\CompanionGuard-RL\code\src `
  D:\Myresearch\CompanionGuard-RL\code\scripts `
  D:\Myresearch\CompanionGuard-RL\code\configs `
  D:\Myresearch\CompanionGuard-RL\code\requirements.txt `
  ${S1}:${PROJ1}/

# Upload the processed data
scp -P 20083 -r `
  D:\Myresearch\CompanionGuard-RL\code\data `
  ${S1}:${PROJ1}/

# ===== Server 1 → Local (fetch results) =====
scp -P 20083 -r `
  ${S1}:${PROJ1}/checkpoints `
  D:\Myresearch\CompanionGuard-RL\code\

scp -P 20083 -r `
  ${S1}:${PROJ1}/experiments `
  D:\Myresearch\CompanionGuard-RL\code\
```
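For scripted syncs, the same upload can be driven from Python. This is a minimal sketch that assumes an OpenSSH `scp` binary is on `PATH`; `build_scp_upload` and `upload` are hypothetical helpers, not part of the project, and the host, port, and paths would be the ones listed in this section.

```python
import subprocess
from typing import List, Sequence


def build_scp_upload(sources: Sequence[str], host: str,
                     remote_dir: str, port: int) -> List[str]:
    """Build the argv for a recursive upload, mirroring `scp -P <port> -r ...`."""
    if not sources:
        raise ValueError("at least one source path is required")
    # Normalise the remote directory so it ends with exactly one slash.
    target = f"{host}:{remote_dir.rstrip('/')}/"
    return ["scp", "-P", str(port), "-r", *sources, target]


def upload(sources: Sequence[str], host: str,
           remote_dir: str, port: int) -> None:
    """Run the upload; raises CalledProcessError if scp exits non-zero."""
    subprocess.run(build_scp_upload(sources, host, remote_dir, port), check=True)


# Usage (paths are placeholders for the directories listed above):
# upload([r"D:\Myresearch\CompanionGuard-RL\code\src"],
#        "root@10.82.3.180", "/path/to/CompanionGuard-RL", 20083)
```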
---
## Core Script Usage

```bash
# Re-evaluate the detector (Module B)
python scripts/evaluate.py \
  --detector-ckpt checkpoints/detector/best.pt \
  --config configs/detector_config_server.yaml \
  --test-data data/processed/CompanionRisk-Bench/test.jsonl \
  --source-filter all \
  --output experiments/eval_all.json

# Re-evaluate the intervention policy (Module C)
python scripts/evaluate.py \
  --detector-ckpt checkpoints/detector/best.pt \
  --agent-ckpt checkpoints/intervention/final_v2.pt \
  --test-data data/processed/CompanionRisk-Bench/test.jsonl \
  --config configs/detector_config_server.yaml \
  --intervention-config configs/intervention_config.yaml \
  --output experiments/eval_intervention_v3.json
```
---
## Key Results (for the paper)

### Module B: Detector v4

| Metric | Value |
|--------|-------|
| binary_f1 | **0.9995** |
| high_risk_recall | **1.0000** |
| FNR | **0.00%** |
| level_weighted_f1 | **0.559** |
| fine_macro_f1 (public, 10 classes) | **0.484** |

### Module C: RL Intervention Policy v3 (used in the paper, `eval_intervention_v3.json`)
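The `safety_ux_fscore` values reported for Module C are numerically consistent with the harmonic mean of `safety_recall` and `1 - over_refusal`. That definition is inferred from the reported numbers, not taken from the evaluation code, so the following is only a sanity-check sketch; the Python function `safety_ux_fscore` is a hypothetical helper.

```python
def safety_ux_fscore(safety_recall: float, over_refusal: float) -> float:
    """Harmonic mean of safety_recall and (1 - over_refusal).

    Assumed definition: it reproduces the reported values, but the
    project's evaluation code may define the metric differently.
    """
    ux = 1.0 - over_refusal
    if safety_recall + ux == 0.0:
        return 0.0
    return 2.0 * safety_recall * ux / (safety_recall + ux)


# Rule-based / Threshold baselines: recall 0.908, over-refusal 0.000
print(round(safety_ux_fscore(0.908, 0.000), 3))  # 0.952
# RL policy: recall 1.000, over-refusal 0.004
print(round(safety_ux_fscore(1.000, 0.004), 3))  # 0.998
```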
| Method | safety_recall | over_refusal | action_accuracy | safety_ux_fscore |
|--------|---------------|--------------|-----------------|------------------|
| Rule-based | 0.908 | 0.000 | - | 0.952 |
| Threshold | 0.908 | 0.000 | - | 0.952 |
| **Ours (RL)** | **1.000** | **0.004** | **0.575** | **0.998** |

**Weights used**: `checkpoints/intervention/final_v2.pt` (retrained with `det_l_risk`)

---
## Important Notes

- **PyYAML 6.x pitfall**: write lr values as `0.001`, not `1e-3` (the latter is parsed as a string).
- **RTX 5090 NCCL**: multi-GPU training requires `NCCL_SHM_DISABLE=1 NCCL_P2P_DISABLE=1`; the PPO stage runs on a single GPU to sidestep the barrier issue.
- **det_l_risk vs l_risk**: both evaluation and training must use the detector-predicted `det_l_risk`, never the ground-truth `l_risk`.
- **obs_dim = 2065**: the state vector is laid out as `[d_score(1)|l_risk_onehot(5)|c_primary_probs(10)|e_H_pool(1024)|e_P_pool(1024)|t_norm(1)]`.
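The `obs_dim` figure follows directly from the segment sizes: 1 + 5 + 10 + 1024 + 1024 + 1 = 2065. A minimal sketch of that layout in plain Python follows. The segment names and sizes come from the note above; `segment_slices` and `build_state` are hypothetical helpers, and treating `det_l_risk` as an integer level index in `[0, 5)` is an assumption made for the one-hot encoding.

```python
from typing import Dict, List, Sequence, Tuple

# Segment sizes in the order given in the note above.
SEGMENTS: List[Tuple[str, int]] = [
    ("d_score", 1),
    ("l_risk_onehot", 5),
    ("c_primary_probs", 10),
    ("e_H_pool", 1024),
    ("e_P_pool", 1024),
    ("t_norm", 1),
]
OBS_DIM = sum(size for _, size in SEGMENTS)  # 2065


def segment_slices() -> Dict[str, slice]:
    """Map each segment name to its slice in the flat state vector."""
    slices: Dict[str, slice] = {}
    start = 0
    for name, size in SEGMENTS:
        slices[name] = slice(start, start + size)
        start += size
    return slices


def build_state(d_score: float, det_l_risk: int,
                c_primary_probs: Sequence[float],
                e_h_pool: Sequence[float], e_p_pool: Sequence[float],
                t_norm: float) -> List[float]:
    """Concatenate the parts into one flat vector.

    The risk level is one-hot encoded from the detector prediction
    (det_l_risk), not from the ground-truth label.
    """
    if not 0 <= det_l_risk < 5:
        raise ValueError("det_l_risk must be a level index in [0, 5)")
    onehot = [1.0 if i == det_l_risk else 0.0 for i in range(5)]
    state = [float(d_score), *onehot, *c_primary_probs,
             *e_h_pool, *e_p_pool, float(t_norm)]
    if len(state) != OBS_DIM:
        raise ValueError(f"expected {OBS_DIM} values, got {len(state)}")
    return state
```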
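As a guard against the PyYAML pitfall noted above, a loader can coerce scientific-notation strings after parsing. PyYAML's float resolver expects a decimal point, so `1e-3` loads as a string while `0.001` or `1.0e-3` load as floats. This is a minimal sketch using only the standard library; `coerce_numeric_strings` is a hypothetical helper, and the config keys in the usage comment are made up for illustration.

```python
import re
from typing import Any

# Matches strings such as "1e-3", "5E4", or "-2.5e-4"; plain words and
# ordinary integers are deliberately left alone.
_SCI_NOTATION = re.compile(r"^[-+]?\d+(\.\d*)?[eE][-+]?\d+$")


def coerce_numeric_strings(value: Any) -> Any:
    """Recursively convert scientific-notation strings to floats."""
    if isinstance(value, dict):
        return {k: coerce_numeric_strings(v) for k, v in value.items()}
    if isinstance(value, list):
        return [coerce_numeric_strings(v) for v in value]
    if isinstance(value, str) and _SCI_NOTATION.match(value.strip()):
        return float(value)
    return value


# Usage: apply to the dict returned by the YAML loader, e.g.
# coerce_numeric_strings({"optimizer": {"lr": "1e-3", "name": "adamw"}})
# gives {"optimizer": {"lr": 0.001, "name": "adamw"}}
```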