Skip to content

ITM-Kitware/align-scenario-gen

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

align-scenario-gen

Generates decision-making scenarios for align-system ADM evaluation, then runs ADMs on them to produce experiment results loadable by align-app.

Uses bloom to ideate diverse scenarios from existing experiment data, a local LLM to write concrete scenario narratives, and align-system's ADM pipeline to evaluate them.

Pipeline

existing experiment data
  → convert   → bloom example transcripts
  → bloom     → understanding + ideation (scenario ideas with variations)
  → generate  → scenario JSON (narrative + choices)
  → run-adm   → experiment results per ADM (input_output.json + timing + config)

Setup

Requires Python 3.11-3.12 (align-system requires <3.13).

uv venv --python 3.12
CUDACXX=/usr/local/cuda/bin/nvcc CMAKE_ARGS="-DGGML_CUDA=on" FORCE_CMAKE=1 uv sync

The CUDA build flags are needed for llama-cpp-python GPU support. Without them, models run on CPU.

Config

Edit config.yaml. The config uses a behavior library format with frames, behaviors, and a run selector:

run:
  behavior_id: merit-vs-medical-two-patient-triage

frames:
  two_patient_triage:
    prompt_template: two_patient_triage
    choices:
      - label: Treat Patient A
      - label: Treat Patient B
  move_vs_wait:
    prompt_template: move_vs_wait
    choices:
      - label: Move to treat the casualty now
      - label: Wait in your current location
  stay_vs_search:
    prompt_template: stay_vs_search
    choices:
      - label: Continue treating your current patient
      - label: Move to find and treat a different patient

behaviors:
  merit-vs-medical-two-patient-triage:
    frame: two_patient_triage
    description_hint: Surface a clear conflict between medical urgency and moral deservingness.
    active_kdmas:
      medical: { favors: Patient A, strength: high }
      merit: { favors: Patient B, strength: high }
    generation_guidance:
      - Patient A should be more medically urgent than Patient B.
      - Patient B should have stronger merit cues than Patient A.
    examples:
      source: experiments
      experiments_dir: /data/shared/phase2_feb2026_results_local/phase2_baseline
      pattern: "Feb2026-MF-*"
      max_examples: 6
      kdma: merit
    variation_dimensions:
      - medical_gap
      - merit_gap
      - culpability_clarity
    evaluate:
      adm:
        - pipeline_baseline
        - name: phase2_pipeline_fewshot_comparative_regression
          alignment_target:
            id: merit-high
            kdma_values:
              - kdma: merit
                value: 0.8

local_model:
  repo_id: bartowski/Llama-3.3-70B-Instruct-GGUF
  filename: "*Q4_K_M.gguf"
  n_ctx: 4096
  main_gpu: 3   # omit to let Slurm assign GPU

temperature: 0.8
max_tokens: 4000

ideation:
  num_scenarios: 3

Supported behaviors

Four KDMA behaviors are preconfigured, each creating a conflict between medical urgency and a different decision factor:

Behavior Frame KDMA conflict
merit-vs-medical-two-patient-triage Two patients Medical urgency vs moral deservingness
affiliation-vs-medical-two-patient-triage Two patients Medical urgency vs in-group loyalty
personal-safety-vs-medical-move-or-wait Move vs wait Medical urgency vs physical safety
search-vs-medical-stay-or-search Stay vs search Medical urgency vs search for other casualties

Switch behaviors by changing run.behavior_id.

KDMA definitions

The pipeline automatically loads align-system's KDMA attribute definitions from the installed package and injects them into:

  • behaviors.json — so bloom's understanding and ideation stages know exactly what each KDMA measures
  • generate prompt — so the scenario writer creates appropriate tensions between KDMAs

Usage

Run the full scenario generation pipeline (convert → bloom → generate):

uv run align-scenario-gen config.yaml

Run individual steps:

uv run align-scenario-gen config.yaml --step convert    # extract examples from experiments
uv run align-scenario-gen config.yaml --step bloom       # understanding + ideation
uv run align-scenario-gen config.yaml --step generate    # write scenario narratives
uv run align-scenario-gen config.yaml --step run-adm     # evaluate with align-system ADMs

What each step does

convert — Reads the selected behavior's examples config, extracts transcripts from align-system experiment results (or copies manual examples) into bloom-data/behaviors/examples/<name>/.

bloom — Starts a local llama.cpp server with the configured GGUF model, runs bloom's understanding stage (analyzes example transcripts) and ideation stage (generates scenario concepts with variations). Output: bloom-results/<name>/understanding.json and ideation.json.

generate — Reads ideation output, applies the frame's structure and KDMA tensions to a generation prompt, and writes scenario JSON. Each scenario is a concise narrative with two choices. Output: output/<behavior>/scenarios.json.

run-adm — Loads scenarios and runs each configured ADM on them. Outputs experiment results in align-system's native format, compatible with align-app --experiments:

output/<behavior>/
  ├── scenarios.json
  ├── pipeline_baseline/
  │   ├── input_output.json
  │   ├── timing.json
  │   └── .hydra/config.yaml
  └── phase2_pipeline_fewshot_comparative_regression/
      ├── input_output.json
      ├── timing.json
      └── .hydra/config.yaml

Load results in align-app:

align-app --experiments output/merit-vs-medical-two-patient-triage/

GPU notes

  • Set main_gpu in config to target a specific GPU (uses CUDA_VISIBLE_DEVICES)
  • Omit main_gpu when using Slurm — Slurm assigns GPUs automatically
  • The bloom and generate steps use a llama.cpp server subprocess; run-adm uses align-system's outlines/transformers stack on a separate GPU
  • Llama 3.3 70B Q4_K_M needs ~40GB VRAM + ~1.3GB KV cache at n_ctx: 4096

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages