enh: Improve Config File Integration and Unify Hardcoded Parameters#68

Open
cbueth wants to merge 4 commits into main from feat/improve-config

Conversation

@cbueth cbueth commented Feb 25, 2026

Summary

This PR substantially improves configuration management and code maintainability. It introduces a centralized, flexible configuration system and replaces hardcoded literals across the codebase with unified settings.

Key Changes

Configuration System

preprocessing/config.py introduces a dynamic Settings class that replaces the static constants.py. It is exposed at the package level as the singleton settings.

Settings can be flexibly loaded with a fixed precedence/priority order.

  1. Explicit Path: via CLI --config_path or programmatic settings.load(path).
  2. Environment Variable: MULTIPLEYE_CONFIG.
  3. Local Default: multipleye_settings_preprocessing.yaml in the CWD.
  4. Legacy Path: Repository root (supported with a DeprecationWarning). Before this PR, this was the only option.
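
The precedence order above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `resolve_config_path`, its parameters, and the warning text are hypothetical; only the file name, the environment variable, and the order of the four steps come from the description.

```python
import os
import warnings
from pathlib import Path

CONFIG_FILENAME = "multipleye_settings_preprocessing.yaml"
ENV_VAR = "MULTIPLEYE_CONFIG"


def resolve_config_path(explicit_path=None, cwd=None, repo_root=None, environ=None):
    """Return the first matching config path, or None if nothing is found.

    Hypothetical helper: name and signature are illustrative only.
    """
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)

    # 1. Explicit path (e.g. from --config_path or settings.load(path)).
    if explicit_path is not None:
        return Path(explicit_path)

    # 2. Environment variable.
    env_value = environ.get(ENV_VAR)
    if env_value:
        return Path(env_value)

    # 3. Local default in the current working directory.
    local_default = cwd / CONFIG_FILENAME
    if local_default.is_file():
        return local_default

    # 4. Legacy location at the repository root (deprecated).
    if repo_root is not None:
        legacy = Path(repo_root) / CONFIG_FILENAME
        if legacy.is_file():
            warnings.warn(
                "Loading the configuration from the repository root is deprecated.",
                DeprecationWarning,
                stacklevel=2,
            )
            return legacy

    return None
```

Passing `cwd`, `repo_root`, and `environ` explicitly is a choice made here so the order can be exercised without touching the real environment.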

Configuration is loaded only on first access to a setting, which avoids side effects during package import. Configuration changes are written to the text log file.
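
Loading on first access can be illustrated with a small stand-in class. This is a sketch under assumptions, not the Settings class from this PR: `LazySettings`, the `loader` callable, and the placeholder value for NUM_TRIALS are hypothetical.

```python
class LazySettings:
    """Illustrative stand-in for a settings object that loads on first access."""

    def __init__(self, loader):
        # `loader` is a hypothetical callable returning a dict of settings.
        self._loader = loader
        self._values = None  # nothing is read at construction/import time

    def _ensure_loaded(self):
        # Run the loader once and cache the result.
        if self._values is None:
            self._values = dict(self._loader())

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. for settings.
        if name.startswith("_"):
            raise AttributeError(name)
        self._ensure_loaded()
        try:
            return self._values[name]
        except KeyError:
            raise AttributeError(name) from None


load_calls = []


def fake_loader():
    # Stands in for reading the YAML file; the value is a placeholder.
    load_calls.append("loaded")
    return {"NUM_TRIALS": 12}


demo_settings = LazySettings(fake_loader)  # no load has happened yet
```

In this sketch, creating `demo_settings` does not call the loader; the first attribute access such as `demo_settings.NUM_TRIALS` does, and later accesses reuse the cached values.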

Hardcoded Parameters

Several hardcoded strings, regex patterns, and numeric thresholds are now centralised in config.py. This includes:

  • Column Names: TRIAL_COL, STIMULUS_COL, PAGE_COL, ACTIVITY_COL, WORD_IDX_COL, etc.
  • Regex Patterns.
  • Thresholds: sanity check bounds for calibrations, validations, and data loss.
  • Experiment Logic: NUM_TRIALS and NUM_QUESTIONS_EXPERIMENT.

Usage and Documentation

To use these settings, import the singleton with from preprocessing import settings and access or modify any parameter through it. The notebooks are refactored (preprocessing.ipynb) to demonstrate the new configuration workflow.
docs/guide/configuration.md has been updated.
run_multipleye_preprocessing.py can now also use external configuration files.
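
As a rough illustration of how a script might accept an external configuration file, the sketch below parses a `--config_path` argument with argparse. The flag name and `settings.load(path)` come from the description above; the parser layout, help text, and the example file name are assumptions and may differ from run_multipleye_preprocessing.py.

```python
import argparse
from pathlib import Path


def build_parser():
    """Build a parser with an optional --config_path argument (illustrative)."""
    parser = argparse.ArgumentParser(description="Preprocessing entry point (sketch).")
    parser.add_argument(
        "--config_path",
        type=Path,
        default=None,
        help="Path to a YAML configuration file; if omitted, other sources are tried.",
    )
    return parser


# Parse an example argument list instead of sys.argv so the sketch is reproducible.
args = build_parser().parse_args(["--config_path", "my_settings.yaml"])

# With the real package, one would presumably continue along these lines:
#   from preprocessing import settings
#   if args.config_path is not None:
#       settings.load(args.config_path)
```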

Testing

Added tests/unit/preprocessing/test_config.py.

Migration

  • The legacy multipleye_settings_preprocessing.yaml at the repository root is still supported but triggers a deprecation warning!
  • Users are encouraged to move their configuration to the current working directory or use the --config_path argument.

Closes #59

- replace `constants` module with new `settings` configuration class
- update imports and references across preprocessing components

Signed-off-by: Carlson Büth <commit@cbueth.de>
- replace hardcoded column names, patterns, and constants with `settings` attributes across preprocessing components

Signed-off-by: Carlson Büth <commit@cbueth.de>
@cbueth cbueth requested a review from theDebbister February 25, 2026 10:40
@cbueth cbueth self-assigned this Feb 25, 2026
@cbueth cbueth added documentation Improvements or additions to documentation enhancement New feature or request labels Feb 25, 2026
Signed-off-by: Carlson Büth <commit@cbueth.de>
Signed-off-by: Carlson Büth <commit@cbueth.de>
Comment on lines 117 to 120
"# get the data collection name from the settings and create the path to the data folder\n",
"this_repo = Path().resolve()\n",
"data_collection_name = settings.DATA_COLLECTION_NAME\n",
"data_folder_path = settings.DATASET_DIR"
"print(f\"Active Data Collection: {settings.DATA_COLLECTION_NAME}\")\n",
"print(f\"Dataset Directory: {settings.DATASET_DIR}\")"
]

Changed the preprocessing notebook to use settings directly. See here.

Development

Successfully merging this pull request may close these issues.

enh: Improve config file integration

1 participant