feat: port wangyu data pipeline and scripts into code/ structure

- code/src/data/: data_generator, dataset, llm_judge, __init__
  (multi-turn LLM dialogue generator, JSONL loader, LLM auto-annotator)
- code/scripts/: generate_siliconflow.py (SiliconFlow async generator, 701 lines),
  run_detector.sh / run_intervention.sh / run_full_pipeline.sh (launch scripts)
- code/configs/intervention_config.yaml: add reward.w1-w5 reference block
  (NOTE: v5 reward.py uses hardcoded constants; these fields are reference-only)
- .gitignore: fix data/ pattern to /data/ to avoid matching code/src/data/

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
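
The .gitignore change relies on a standard gitignore rule: a pattern with no leading or middle slash (such as data/) may match a directory of that name at any depth, while a leading slash (/data/) anchors it to the directory holding the .gitignore file. The sketch below is a simplified model of that rule for literal, directory-only patterns. The helper name `dir_pattern_ignores` and the example file paths are hypothetical, and this is not Git's actual matcher (no wildcards, no negation).

```python
from pathlib import PurePosixPath


def dir_pattern_ignores(pattern: str, path: str) -> bool:
    """Simplified model of a literal directory-only gitignore pattern.

    Supports only patterns like 'data/' or '/data/' (hypothetical helper,
    no wildcard or negation handling).
    """
    name = pattern.strip("/")
    anchored = pattern.startswith("/")
    # Directories leading to the file, excluding the file name itself.
    dirs = PurePosixPath(path).parts[:-1]
    if anchored:
        # Anchored pattern: only a top-level directory can match.
        return len(dirs) > 0 and dirs[0] == name
    # Unanchored pattern: a directory with this name at any depth matches.
    return name in dirs


# Hypothetical paths for illustration.
for pattern in ("data/", "/data/"):
    for path in ("code/src/data/dataset.py", "data/raw/train.jsonl"):
        print(pattern, path, dir_pattern_ignores(pattern, path))
```

Under this model the old pattern would also ignore files under code/src/data/, which is the behavior the commit message says it avoids.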
@@ -33,6 +33,17 @@ ppo:
environment:
  max_turns: 20

# reward weights — NOTE: v5 reward.py uses hardcoded constants (EXACT_ALIGN_BONUS,
# PASS_HIGH_PENALTY, etc.), the `weights` param of compute_reward() is unused.
# These values are kept here for reference only (from wangyu's earlier config).
# DEFAULT_WEIGHTS alias in reward.py: {"w1":2.0, "w2":3.0, "w3":4.0, "w4":1.5, "w5":0.5}
reward:
  w1: 2.0   # safety_recall (reference only — not read by reward.py v5)
  w2: 3.0   # crisis_precision (reference only)
  w3: 4.0   # action_accuracy (reference only)
  w4: 1.5   # over_refusal penalty (reference only)
  w5: 0.5   # fluency (reference only)

evaluation:
  binary_threshold: 0.5
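
Because the reward block is documented as reference-only, it can silently drift from the DEFAULT_WEIGHTS alias quoted in the comment. A minimal sketch of a consistency check follows, assuming the YAML has already been parsed into a plain dict. The function name `reference_weight_drift` is hypothetical and is not part of reward.py; the weight values are copied from the comment in the diff.

```python
# Values copied from the DEFAULT_WEIGHTS comment in the config diff.
DEFAULT_WEIGHTS = {"w1": 2.0, "w2": 3.0, "w3": 4.0, "w4": 1.5, "w5": 0.5}


def reference_weight_drift(config: dict) -> dict:
    """Return {key: (config_value, default_value)} for each weight that differs.

    Hypothetical helper: `config` is the parsed intervention_config.yaml
    as a dict. A missing key is reported with a config_value of None.
    """
    reward = config.get("reward") or {}
    drift = {}
    for key, default in DEFAULT_WEIGHTS.items():
        value = reward.get(key)
        if value != default:
            drift[key] = (value, default)
    return drift


# The reference block as added in this commit: no drift expected.
config = {"reward": {"w1": 2.0, "w2": 3.0, "w3": 4.0, "w4": 1.5, "w5": 0.5}}
print(reference_weight_drift(config))

# An edited copy where w3 was changed: the mismatch is reported.
edited = {"reward": {"w1": 2.0, "w2": 3.0, "w3": 5.0, "w4": 1.5, "w5": 0.5}}
print(reference_weight_drift(edited))
```

A check like this only compares the reference values. It does not change the reward, since the commit notes that v5 reward.py ignores these fields.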