feat: add paper/ LaTeX draft, English data scripts, update progress docs
- paper/: 22-page LaTeX framework (7/10 sections complete, compiles cleanly)
  main.tex + 10 section files + refs.bib + compiled PDF (329KB)
- code/scripts/: three English dataset generation & merging scripts
  generate_english.py / generate_english_targeted.py / merge_v5.py
- CLAUDE.md: update paper writing status, add paper/ file map entry
- state.md: add section 8 paper writing progress (2026-05-15)
- .gitignore: add LaTeX build artifact exclusion rules

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -41,9 +41,10 @@
| Module B generalization check | ✅ | human subset binary_f1=0.9848, no overfitting to same-source data |
| Module C v3 (current) | ⚠️ | safety_recall=1.0 ✅, over_refusal=0.004 ✅, action_accuracy=**0.575** ❌, crisis_precision=**0.421** ❌ |
| Module C v5 (next step) | 🔄 | reward rewrite + environment fixes, **see `change.md` for the full roadmap** |
| Paper writing | 🔄 | Starts once Module C v5 is complete |
| Paper writing | 🔄 | LaTeX framework set up (`paper/`); methods section complete; results section waiting on v5 + SOTA baselines |

> **Module C is not yet complete.** v3's action_accuracy and crisis_precision both miss their targets; v5 must be carried out as laid out in `change.md`.
> **Experiments required before submission**: ① Llama Guard v2 / WildGuard evaluation (SOTA comparison for Module B); ② LLM-as-judge baseline (Module C); ③ ablation studies (BC-only / without CrossAttention).

---

@@ -71,6 +72,7 @@
| `experiments/eval_intervention_v3.json` | Current best Module C results (reference baseline for the paper) |
| `experiments/eval_intervention_v4.json` | Re-run of v3 for confirmation (identical numbers, verifying reproducibility) |
| `docs/` | Research documents (research framework, dataset design, preliminary reports) |
| `paper/` | **Paper LaTeX source** (main framework in place; see state.md §8) |

### Code level (code/)
| Path | Purpose |
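The file map above describes `eval_intervention_v4.json` as a re-run of v3 that produced identical numbers. A minimal sketch of how such a reproducibility check could be scripted is shown below. It assumes each result file is a JSON object whose top-level numeric entries are the metrics; the actual file layout is not shown in this commit. The helper names `load_metrics` and `diff_metrics` are hypothetical and are not part of the repository.

```python
import json
import math
from pathlib import Path


def load_metrics(path):
    """Load a JSON object and keep only its top-level numeric entries.

    Assumption: the result file is a flat JSON object (metric name -> number).
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        key: float(value)
        for key, value in data.items()
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }


def diff_metrics(reference, rerun, tol=1e-9):
    """Return {metric: (reference_value, rerun_value)} for every mismatch.

    A metric that is missing from one side is reported with None on that side.
    """
    mismatches = {}
    for key in sorted(set(reference) | set(rerun)):
        a, b = reference.get(key), rerun.get(key)
        if a is None or b is None or not math.isclose(a, b, rel_tol=0.0, abs_tol=tol):
            mismatches[key] = (a, b)
    return mismatches
```

Calling `diff_metrics(load_metrics("experiments/eval_intervention_v3.json"), load_metrics("experiments/eval_intervention_v4.json"))` would return an empty dict if the two runs agree within the tolerance. The absolute tolerance is a design choice for this sketch; an exact comparison may be preferable if the runs are expected to be bit-identical.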