units v1#163
Pull request overview
Introduces an SQLite “unit registry” to record quantity types and per-column unit conventions, and enforces immutability/validation via triggers so units metadata can be reliably attached to physical columns and selected attributes.
Changes:
- Adds registry tables (system_metadata, quantity_types, unit_conventions) and new unit/quantity_type fields on attributes.
- Adds seed script to populate and “seal” the registry with a checksum, plus triggers to enforce immutability and attribute unit validation.
- Adds a column_units view and updates just tasks to create/populate the registry during DB creation.
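As a rough illustration of the "seal, then enforce immutability" idea described above, the sketch below runs SQLite through Python's sqlite3 module. The key/value layout of system_metadata, the 'registry_checksum' key, the trigger names, the seed rows, and the SHA-256 checksum computed in Python are all assumptions made for this example; the PR's seed script may compute and store the checksum differently.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, reduced tables: column layouts are assumptions.
CREATE TABLE system_metadata (key TEXT PRIMARY KEY, value TEXT NOT NULL);
CREATE TABLE quantity_types (name TEXT PRIMARY KEY);

-- Once a checksum row exists, the registry counts as sealed
-- and row changes are rejected.
CREATE TRIGGER quantity_types_sealed_update
BEFORE UPDATE ON quantity_types
WHEN EXISTS (SELECT 1 FROM system_metadata WHERE key = 'registry_checksum')
BEGIN
    SELECT RAISE(ABORT, 'quantity_types is sealed and cannot be modified.');
END;

CREATE TRIGGER quantity_types_sealed_delete
BEFORE DELETE ON quantity_types
WHEN EXISTS (SELECT 1 FROM system_metadata WHERE key = 'registry_checksum')
BEGIN
    SELECT RAISE(ABORT, 'quantity_types is sealed and cannot be modified.');
END;

INSERT INTO quantity_types (name) VALUES ('power'), ('energy');
""")

# Seal: hash the seeded rows in a stable order and store the digest.
names = [row[0] for row in conn.execute("SELECT name FROM quantity_types ORDER BY name")]
checksum = hashlib.sha256("\n".join(names).encode("utf-8")).hexdigest()
conn.execute(
    "INSERT INTO system_metadata (key, value) VALUES ('registry_checksum', ?)",
    (checksum,),
)

# After sealing, a delete is aborted by the trigger.
try:
    conn.execute("DELETE FROM quantity_types WHERE name = 'power'")
    sealed_error = None
except sqlite3.DatabaseError as exc:
    sealed_error = str(exc)

remaining = conn.execute("SELECT COUNT(*) FROM quantity_types").fetchone()[0]
```

In this sketch the triggers are inert until the checksum row is written, which lets the seed step populate the tables before sealing them.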
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| schema/schema.sql | Adds registry tables and extends attributes with unit/quantity_type. |
| schema/triggers.sql | Adds immutability triggers for registry tables and validation triggers for attribute unit metadata. |
| schema/unit_registry.sql | Seeds quantity types and per-column conventions; attempts to compute a sealing checksum. |
| schema/views.sql | Adds column_units view over the registry. |
| .justfile | Inserts registry seeding step into DB build pipeline. |
Update schema with new investment tables
Pull request overview
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

        );
    END;

The schema introduces new entity-backed tables (e.g., storage_technologies, demand_technologies), but there are no corresponding check_*_entity_exists triggers like the other entity tables have. This breaks the invariant that every row in an entity table must have a pre-existing entities row (and can lead to orphan rows / inconsistent deletes). Add check_storage_technologies_entity_exists and check_demand_technologies_entity_exists triggers (plus equivalents for any other new entity tables), consistent with the existing pattern.

    CREATE TRIGGER IF NOT EXISTS check_storage_technologies_entity_exists BEFORE
    INSERT
        ON storage_technologies
        WHEN NOT EXISTS (
            SELECT
                1
            FROM
                entities
            WHERE
                id = NEW.id
                AND entity_table = 'storage_technologies'
        )
    BEGIN
    SELECT
        RAISE(
            ABORT,
            'Entity ID must exist in entities table with entity_table storage_technologies before insertion'
        );
    END;
    CREATE TRIGGER IF NOT EXISTS check_demand_technologies_entity_exists BEFORE
    INSERT
        ON demand_technologies
        WHEN NOT EXISTS (
            SELECT
                1
            FROM
                entities
            WHERE
                id = NEW.id
                AND entity_table = 'demand_technologies'
        )
    BEGIN
    SELECT
        RAISE(
            ABORT,
            'Entity ID must exist in entities table with entity_table demand_technologies before insertion'
        );
    END;


    -- unique policy requires a companion column
    CREATE TRIGGER IF NOT EXISTS validate_unit_convention_companion BEFORE
    INSERT
        ON unit_conventions
        WHEN NEW.unit = 'unique'
        AND NEW.companion_column IS NULL
    BEGIN
    SELECT
        RAISE(
            ABORT,
            'unit_policy unique requires companion_column.'
        );

validate_unit_convention_companion raises an error mentioning unit_policy, but the table column is named unit (and there is no unit_policy column). This makes the error message misleading for users debugging failed inserts. Update the message (and/or naming) to reflect the actual column being validated (e.g., clarify that unit='unique' is a sentinel policy value).
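One possible rewording is sketched below, run through Python's sqlite3 module so the resulting error text can be inspected. The reduced unit_conventions table and the exact wording of the new message are assumptions for illustration, not the author's final choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, reduced unit_conventions table.
CREATE TABLE unit_conventions (
    table_name TEXT NOT NULL,
    column_name TEXT NOT NULL,
    unit TEXT NOT NULL,
    companion_column TEXT NULL
);
-- The message names the real column (unit) instead of unit_policy.
CREATE TRIGGER IF NOT EXISTS validate_unit_convention_companion
BEFORE INSERT ON unit_conventions
WHEN NEW.unit = 'unique' AND NEW.companion_column IS NULL
BEGIN
    SELECT RAISE(
        ABORT,
        'unit_conventions.unit = ''unique'' requires a non-NULL companion_column.'
    );
END;
""")

# An insert with the sentinel unit but no companion column is rejected.
try:
    conn.execute(
        "INSERT INTO unit_conventions (table_name, column_name, unit) "
        "VALUES ('some_table', 'some_column', 'unique')"
    )
    message = None
except sqlite3.DatabaseError as exc:
    message = str(exc)
```

Here message carries the text passed to RAISE, so a user debugging a failed insert sees the actual column name.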

        unit TEXT NULL,
        quantity_type TEXT NULL REFERENCES quantity_types (name)

static_time_series.unit and static_time_series.quantity_type are declared nullable, but validate_time_series_unit_insert/update triggers abort whenever either is NULL. This makes the table definition misleading for API consumers and tooling (the schema says NULL allowed, but runtime rejects it). Consider making both columns NOT NULL (and possibly providing defaults for dimensionless series), or relax the trigger if NULL is intended to be allowed in some cases.

    - unit TEXT NULL,
    - quantity_type TEXT NULL REFERENCES quantity_types (name)
    + unit TEXT NOT NULL,
    + quantity_type TEXT NOT NULL REFERENCES quantity_types (name)

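With the columns declared NOT NULL, the schema itself rejects missing unit metadata, independent of any trigger. The sketch below checks this through Python's sqlite3 module; the reduced static_time_series table and the 'power'/'MW' sample values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Hypothetical, reduced tables; only the unit columns are kept.
CREATE TABLE quantity_types (name TEXT PRIMARY KEY);
INSERT INTO quantity_types (name) VALUES ('power');
CREATE TABLE static_time_series (
    id INTEGER PRIMARY KEY,
    unit TEXT NOT NULL,
    quantity_type TEXT NOT NULL REFERENCES quantity_types (name)
);
""")

# A NULL unit is rejected by the column constraint.
try:
    conn.execute(
        "INSERT INTO static_time_series (unit, quantity_type) VALUES (?, ?)",
        (None, "power"),
    )
    null_error = None
except sqlite3.IntegrityError as exc:
    null_error = str(exc)

# A fully specified row is accepted.
conn.execute(
    "INSERT INTO static_time_series (unit, quantity_type) VALUES (?, ?)",
    ("MW", "power"),
)
row_count = conn.execute("SELECT COUNT(*) FROM static_time_series").fetchone()[0]
```

Tooling that reads the table definition then sees the same rule the runtime enforces.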

        JOIN quantity_types qt ON uc.quantity_type = qt.name
    ORDER BY
        uc.table_name,
        uc.column_name;

The new column_units view includes an ORDER BY. In SQLite, ordering is not guaranteed when selecting from a view unless the outer query specifies ORDER BY, and embedding it here can also introduce unnecessary sorting work. Prefer removing the ORDER BY from the view and ordering in the consuming query when needed.

    - JOIN quantity_types qt ON uc.quantity_type = qt.name
    - ORDER BY
    - uc.table_name,
    - uc.column_name;
    + JOIN quantity_types qt ON uc.quantity_type = qt.name;

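A small sketch of the consuming side, run through Python's sqlite3 module: the view only joins, and the query that needs a stable order asks for it. The reduced tables, the view's column list, and the sample rows (generation_units, storage_units, and so on) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, reduced registry tables.
CREATE TABLE quantity_types (name TEXT PRIMARY KEY);
CREATE TABLE unit_conventions (
    table_name TEXT NOT NULL,
    column_name TEXT NOT NULL,
    unit TEXT NOT NULL,
    quantity_type TEXT NOT NULL REFERENCES quantity_types (name)
);

-- The view joins the registry tables and does not sort.
CREATE VIEW column_units AS
SELECT uc.table_name, uc.column_name, uc.unit, qt.name AS quantity_type
FROM unit_conventions uc
JOIN quantity_types qt ON uc.quantity_type = qt.name;

INSERT INTO quantity_types (name) VALUES ('power'), ('energy');
INSERT INTO unit_conventions VALUES
    ('storage_units', 'energy_capacity', 'MWh', 'energy'),
    ('generation_units', 'rating', 'MW', 'power'),
    ('generation_units', 'base_power', 'MW', 'power');
""")

# The consumer states the ordering it needs in the outer query.
rows = conn.execute(
    "SELECT table_name, column_name FROM column_units "
    "ORDER BY table_name, column_name"
).fetchall()
```

Consumers that do not care about order skip the ORDER BY and avoid the sort entirely.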

    -- investment for expansion problems.
    -- Investment technology options for expansion problems
    CREATE TABLE supply_technologies (
        id INTEGER PRIMARY KEY REFERENCES entities (id) ON DELETE CASCADE,
        name TEXT NOT NULL UNIQUE,
        prime_mover_type TEXT NOT NULL REFERENCES prime_mover_types(name),
        fuel TEXT NULL REFERENCES fuels(name),
        area INTEGER NULL REFERENCES planning_regions (id) ON DELETE SET NULL,
        balancing_topology INTEGER NULL REFERENCES balancing_topologies (id) ON DELETE SET NULL,
        scenario TEXT NULL
        region JSON NOT NULL,
        power_systems_type TEXT NOT NULL,
        lifetime INTEGER NULL,
        unit_size REAL NULL,
        -- Capacity limits (JSON: {"min": ..., "max": ...}, MW):
        capacity_limits JSON NULL,
        -- Fuel information:
        fuel TEXT NOT NULL DEFAULT '["OTHER"]',
        start_fuel_mmbtu_per_mwh REAL NULL,
        -- Fuel cofire limits (JSON: {"fuel1": {"min": ..., "max": ...}, "fuel2": {"min": ..., "max": ...}}):
        cofire_level_limits JSON NULL,
        -- Fuel cofire start limits (JSON: {"fuel1": ..., "fuel2": ...}):
        cofire_start_limits JSON NULL,
        -- CO2 emissions (JSON: {"fuel1": ..., "fuel2": ...}, tons per MMBTU):
        co2 JSON NULL,
        -- Operational information:
        available BOOLEAN NOT NULL DEFAULT TRUE,
        -- Ramp limits (JSON: {"up": ..., "down": ...}, MW/min):
        ramp_limits JSON NULL,
        -- Time limits (JSON: {"up": ..., "down": ...}, hours):
        time_limits JSON NULL,
        outage_factor REAL NULL,
        min_generation_fraction REAL NULL,
        -- Financial data:
        -- Capital cost (complex structure, stored as JSON):
        capital_costs JSON NOT NULL DEFAULT '{"curve_type": "INPUT_OUTPUT", "function_data": {"function_type": "LINEAR", "proportional_term": 0, "constant_term": 0}}',
        -- Cost (complex structure, stored as JSON):
        operation_costs JSON NOT NULL DEFAULT '{"cost_type": "THERMAL", "fixed": 0, "shut_down": 0, "start_up": 0, "variable": {"variable_cost_type": "COST", "power_units": "NATURAL_UNITS", "value_curve": {"curve_type": "INPUT_OUTPUT", "function_data": {"function_type": "LINEAR", "proportional_term": 0, "constant_term": 0}}, "vom_cost": {"curve_type": "INPUT_OUTPUT", "function_data": {"function_type": "LINEAR", "proportional_term": 0, "constant_term": 0}}}}',
        -- Other financial parameters (complex structure, stored as JSON):
        financial_data JSON NOT NULL
    );

    CREATE UNIQUE INDEX uq_supply_tech_all
        ON supply_technologies(prime_mover_type, fuel, scenario)
        WHERE fuel IS NOT NULL AND scenario IS NOT NULL;
    CREATE UNIQUE INDEX uq_supply_tech_no_fuel
        ON supply_technologies(prime_mover_type, scenario)
        WHERE fuel IS NULL AND scenario IS NOT NULL;
    CREATE UNIQUE INDEX uq_supply_tech_no_scenario
        ON supply_technologies(prime_mover_type, fuel)
        WHERE fuel IS NOT NULL AND scenario IS NULL;
    CREATE UNIQUE INDEX uq_supply_tech_no_fuel_no_scenario
        ON supply_technologies(prime_mover_type)
        WHERE fuel IS NULL AND scenario IS NULL;

    CREATE TABLE storage_technologies (
        id INTEGER PRIMARY KEY REFERENCES entities (id) ON DELETE CASCADE,
        name TEXT NOT NULL UNIQUE,
        prime_mover_type TEXT NOT NULL REFERENCES prime_mover_types(name),
        storage_tech TEXT NOT NULL DEFAULT '["OTHER"]',
        region JSON NOT NULL,
        power_systems_type TEXT NOT NULL,
        lifetime INTEGER NULL,
        unit_size_charge REAL NULL,
        unit_size_discharge REAL NULL,
        unit_size_energy REAL NULL,
        -- Capacity limits (JSON: {"min": ..., "max": ...}, MW):
        capacity_limits_charge JSON NULL,
        capacity_limits_discharge JSON NULL,
        capacity_limits_energy JSON NULL,
        -- Operational information:
        available BOOLEAN NOT NULL DEFAULT TRUE,
        -- Duration limits (JSON: {"min": ..., "max": ...}, hours):
        duration_limits JSON NULL,
        -- Efficiency (JSON: {"in": ..., "out": ...}, fraction):
        efficiency JSON NULL,
        min_discharge_fraction REAL NULL,
        losses REAL NULL,
        -- Financial data:
        -- Capital cost (complex structure, stored as JSON):
        capital_costs_charge JSON NULL,
        capital_costs_discharge JSON NOT NULL DEFAULT '{"curve_type": "INPUT_OUTPUT", "function_data": {"function_type": "LINEAR", "proportional_term": 0, "constant_term": 0}}',
        capital_costs_energy JSON NOT NULL DEFAULT '{"curve_type": "INPUT_OUTPUT", "function_data": {"function_type": "LINEAR", "proportional_term": 0, "constant_term": 0}}',
        -- Cost (complex structure, stored as JSON):
        operation_costs JSON NOT NULL DEFAULT '{"cost_type": "THERMAL", "fixed": 0, "shut_down": 0, "start_up": 0, "variable": {"variable_cost_type": "COST", "power_units": "NATURAL_UNITS", "value_curve": {"curve_type": "INPUT_OUTPUT", "function_data": {"function_type": "LINEAR", "proportional_term": 0, "constant_term": 0}}, "vom_cost": {"curve_type": "INPUT_OUTPUT", "function_data": {"function_type": "LINEAR", "proportional_term": 0, "constant_term": 0}}}}',
        -- Other financial parameters (complex structure, stored as JSON):
        financial_data JSON NOT NULL
    );

    CREATE TABLE transport_technologies (
        id INTEGER PRIMARY KEY REFERENCES entities (id) ON DELETE CASCADE,
        arc_id INTEGER NULL REFERENCES arcs(id) ON DELETE SET NULL,
        scenario TEXT NULL
        name TEXT NOT NULL UNIQUE,
        power_systems_type TEXT NOT NULL,
        available BOOLEAN NOT NULL DEFAULT TRUE,
        capital_costs JSON NOT NULL DEFAULT '{"curve_type": "INPUT_OUTPUT", "function_data": {"function_type": "LINEAR", "proportional_term": 0, "constant_term": 0}}',
        financial_data JSON NOT NULL,
        unit_size REAL NULL
    );

    CREATE TABLE demand_technologies (
        id INTEGER PRIMARY KEY REFERENCES entities (id) ON DELETE CASCADE,
        name TEXT NOT NULL UNIQUE,
        available BOOLEAN NOT NULL DEFAULT TRUE,
        region TEXT NOT NULL,
        power_systems_type TEXT NOT NULL
    );

PR description/title are focused on adding unit metadata, but this diff also significantly reshapes the investment technology tables (e.g., redefines supply_technologies and adds storage_technologies/demand_technologies). If these changes are intentional, they should be called out in the PR description (and ideally split into a separate PR if unrelated), since they materially change the schema beyond units.
This is my first attempt at adding the units information. Adding units to the JSON blobs is kinda trickier.