Fixes for upstream by Johann3141592 · Pull Request #1 · clg-admin/osemosys-rdm

Johann3141592 · 2026-06-10T16:44:16Z

Description

Five bugfixes discovered while running the RDM + PRIM pipeline on a simple test model. It's possible some fixes are not necessary but rather stem from me misunderstanding how the codebase is expected to be run (e.g. is it convention to always call models "Scenario1", or is it a genuine bug that the code has some hardcoded references to that name? The same question applies to regions). The fixes were made with the help of Opencode & Deepseek v4 pro.

Motivation

I needed to make the code work with the simple model since that's my starting point for a dissertation project (extending RDM with a GSA).

Changes

- Fix pyDOE import (case sensitivity on some filesystems)
- Fix hardcoded region name in preprocessing -> now config-driven
- Replace hardcoded scenario/period keys with dynamic lookup
- Fix YEAR dtype in parquet output and auto-detect CSV delimiter
- Deduplicate overlapping driver/outcome columns; drive PRIM outcome directions from YAML config instead of hardcoded dicts
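
To illustrate the delimiter auto-detection mentioned above, here is a minimal sketch of one way it could work, not the code in this PR. It assumes the header line is enough to decide between `;` and `,`; the function names and the column names in the usage note are hypothetical.

```python
import csv
import io


def detect_delimiter(header_line, candidates=(";", ","), default=","):
    """Pick the candidate that occurs most often in the header line.

    Falls back to `default` when no candidate appears at all
    (for example, a single-column file).
    """
    counts = {d: header_line.count(d) for d in candidates}
    best = max(candidates, key=lambda d: counts[d])
    return best if counts[best] > 0 else default


def read_rows(text):
    """Parse CSV text using the delimiter guessed from its first line."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    delimiter = detect_delimiter(first_line)
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))
```

For example, `detect_delimiter("Fut_ID;YEAR;Value")` picks `;`, while the same header written with commas picks `,`. A header whose quoted fields contain the other separator could fool this simple count, so a real implementation may want a stricter check.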

Note on PRIM_t3f2.yaml

The YAML includes model-specific outcome names for the simple model as an illustration of the new outcome_directions schema. Upstream should substitute their own. Pushing the definition of risky/beneficial into the YAML was also a choice on my part; doing it in the Excel setup might be even more appropriate. It was just the model-specific hardcoding of these outcomes that caused errors, so I needed to fix it.
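
For anyone adapting the schema, the lookup could work roughly as follows. This is a sketch only: the mapping layout, the `desirable`/`risk` labels used as values, and the outcome names are hypothetical and are not copied from PRIM_t3f2.yaml. The dict stands in for whatever the YAML loader returns.

```python
def split_outcome_directions(outcome_directions):
    """Split {outcome_name: direction} into (desirable, risk) name lists.

    Raises ValueError for an unknown direction so that a typo in the
    config fails loudly instead of silently dropping an outcome.
    """
    allowed = {"desirable", "risk"}
    desirable, risk = [], []
    for name, direction in outcome_directions.items():
        if direction not in allowed:
            raise ValueError(
                f"outcome {name!r} has unknown direction {direction!r}"
            )
        if direction == "desirable":
            desirable.append(name)
        else:
            risk.append(name)
    return desirable, risk


# Hypothetical config content, shaped as a YAML loader might return it.
example_config = {
    "outcome_directions": {
        "TotalCost": "risk",
        "RenewableShare": "desirable",
    }
}
```

Calling `split_outcome_directions(example_config["outcome_directions"])` keeps the model-specific names out of the code, which is the point of moving them into the config.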

Testing

The full pipeline ran fine with my simple model and my respective RDM and PRIM setup files. I also tried to run the pipeline with the current upstream model (Botswana?) and config files: RDM went fine, but PRIM raised an error stemming from a line which I didn't edit. I didn't investigate this further; it may also be because I only included one model run, since my PC is not very capable.

Style

Did not run Black because it changed every file in the upstream codebase and
reformatting would thus obscure the actual bugfixes in the diff.

- z_auxiliar_code: use pd.api.types.is_string_dtype() to detect string columns in parquet writer, ensuring YEAR stays int64
- t3f2_prim_files_creator: auto-detect CSV delimiter (; or ,) to handle both separator formats in experiment data files

- t3f3_prim_manager: skip driver columns that overlap with outcome columns to prevent duplicate key errors in dict_large_table
- t3f4_range_finder_mapping: replace hardcoded desirable/risk outcome dictionaries with YAML-driven config
- PRIM_t3f2.yaml: add outcome_directions mapping section
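
The overlap handling for driver and outcome columns can be pictured with a small pure-Python sketch. It is an illustration under assumptions rather than the code in t3f3_prim_manager: the function name and the column names are hypothetical, and plain lists stand in for the real column collections.

```python
def drop_overlapping_drivers(driver_columns, outcome_columns):
    """Return driver columns that are neither outcomes nor repeats.

    Order of first appearance is preserved, so downstream code that
    builds a dict keyed by column name sees each key exactly once.
    """
    outcomes = set(outcome_columns)
    seen = set()
    kept = []
    for column in driver_columns:
        if column in outcomes or column in seen:
            continue  # skip outcome overlaps and duplicate drivers
        seen.add(column)
        kept.append(column)
    return kept
```

With drivers `["demand", "cost", "demand", "capex"]` and outcomes `["cost", "emissions"]`, only `demand` and `capex` remain as drivers, and `cost` is treated purely as an outcome.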

Johann3141592 added 5 commits June 10, 2026 16:51

fix(0_experiment_manager) pyDOE would resolve because of case sensitivity

8c769c0

fix(preprocessing) replace region with config driven value

5c1771c

fix(prim): replace hardcoded scenario/period keys with dynamic lookup

93c4bc3

Replace hardcoded 'Scenario1' and period_list with dynamic key discovery from the actual data, and deduplicate DataFrame columns to prevent Series-vs-scalar errors in scaling functions.
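
As a rough picture of what dynamic key discovery means here, the sketch below reads scenario and period keys from the data instead of assuming `'Scenario1'`. The nested `{scenario: {period: value}}` layout and all names are assumptions made for illustration; the real structures in the PRIM scripts may differ.

```python
def discover_keys(results):
    """Return (scenario_names, period_keys) found in a nested results dict.

    `results` is assumed to map scenario name -> {period key -> data}.
    Scenario order follows the dict; period keys are sorted and unique.
    """
    scenarios = list(results)
    if not scenarios:
        raise ValueError("results contain no scenarios")
    periods = sorted({p for s in scenarios for p in results[s]})
    return scenarios, periods


# Hypothetical data: the scenario is not called "Scenario1".
example_results = {"SimpleModel": {"2030": 1.0, "2020": 2.0}}
scenario_names, period_keys = discover_keys(example_results)
```

Here `scenario_names` is `["SimpleModel"]` and `period_keys` is `["2020", "2030"]`, so the rest of the pipeline can iterate over whatever names the model actually uses.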

Johann3141592 wants to merge 5 commits into clg-admin:main from Johann3141592:fixes-for-upstream
