
Feat/db optimizations (STIT-502)#135

Draft
AlexAxthelm wants to merge 9 commits into main from feat/db-optimizations

Conversation

@AlexAxthelm AlexAxthelm commented Jun 17, 2026

Collaborator

Major refactor to the DB: rather than storing source data values in "wide" format in oil_gas_field_sources, they are now stored in "long" format in oil_gas_field_source_values, allowing for more idiomatic SQL queries.

Python data model untouched, but there are some changes to the SQLAlchemy translation layer between the two data models.
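
To illustrate the wide-to-long change described above, here is a minimal sketch using the standard-library sqlite3 module. The table layouts are simplified and hypothetical: `wide_sources`, `long_source_values`, the `name`/`country`/`production` attributes, and the `value_float` column are assumptions for illustration (only `colname` and `value_text` appear in the review notes). It is not the actual schema or the SQLAlchemy translation layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical "wide" layout: one column per attribute.
    CREATE TABLE wide_sources (
        id INTEGER PRIMARY KEY,
        name TEXT,
        country TEXT,
        production REAL
    );
    -- Hypothetical "long" layout: one row per (source, attribute).
    CREATE TABLE long_source_values (
        source_id INTEGER NOT NULL,
        colname TEXT NOT NULL,
        value_text TEXT,
        value_float REAL,
        PRIMARY KEY (source_id, colname)
    );
    INSERT INTO wide_sources VALUES (1, 'Field A', 'NO', 12.5);
    INSERT INTO wide_sources VALUES (2, 'Field B', NULL, NULL);
""")

# Unpivot: each non-NULL wide cell becomes one long row, routed to a typed column.
# Column names come from this fixed mapping, never from user input.
typed_columns = {"name": "value_text", "country": "value_text", "production": "value_float"}
for col, target in typed_columns.items():
    conn.execute(
        f"INSERT INTO long_source_values (source_id, colname, {target}) "
        f"SELECT id, ?, {col} FROM wide_sources WHERE {col} IS NOT NULL",
        (col,),
    )

long_row_count = conn.execute("SELECT COUNT(*) FROM long_source_values").fetchone()[0]

# Pivot back with conditional aggregation to recover the wide shape.
pivoted = conn.execute("""
    SELECT source_id,
           MAX(CASE WHEN colname = 'name' THEN value_text END) AS name,
           MAX(CASE WHEN colname = 'country' THEN value_text END) AS country,
           MAX(CASE WHEN colname = 'production' THEN value_float END) AS production
    FROM long_source_values
    GROUP BY source_id
    ORDER BY source_id
""").fetchall()
```

In this sketch NULL cells are simply absent from the long table, so a source with no stored values at all would not appear in the pivot unless it is joined back to a header table.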

@AlexAxthelm AlexAxthelm self-assigned this Jun 17, 2026
Copilot AI review requested due to automatic review settings June 17, 2026 19:18
@AlexAxthelm AlexAxthelm marked this pull request as draft June 17, 2026 19:18
Comment thread deployments/api/src/stitch/api/db/model/oil_gas_field_source_value.py Dismissed
@github-actions

CD summary 583dbaa

Frontend: https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net

Deployments (4)
service url fqdn
api open pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io
entity-linkage open pr-135-entity-linkage.purplegrass-c07d0a94.westus2.azurecontainerapps.io
frontend https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net
stitch-llm open pr-135-stitch-llm.purplegrass-c07d0a94.westus2.azurecontainerapps.io
Database (1)
db_name postgres_host postgres_port postgres_db
pr_135 stitch-dev.postgres.database.azure.com 5432 pr_135
Jobs (2)
job image postgres_db api_url auth_mode
db-migrations ghcr.io/rmi/stitch-api:pr-135@sha256:543d5f19d402e28a2d67307a32e722da8810c1a42596d4ce998707a64232baac pr_135
seed ghcr.io/rmi/stitch-seed:pr-135@sha256:a21a5196ac0892d25fc2fd3772b1946049c406e63f06238f429a27698c84ce7b https://pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io/api/v1 stitch-client-bearer-token
Images (4)
build_time commit_time git_sha image image_digest
2026-06-17T19:18:59Z 2026-06-17T19:18:27Z 11e6640 ghcr.io/rmi/stitch-api:pr-135 ghcr.io/rmi/stitch-api:pr-135@sha256:543d5f19d402e28a2d67307a32e722da8810c1a42596d4ce998707a64232baac
2026-06-17T19:19:01Z 2026-06-17T19:18:27Z 11e6640 ghcr.io/rmi/stitch-entity-linkage:pr-135 ghcr.io/rmi/stitch-entity-linkage:pr-135@sha256:51b763538153a35aa6631424bb7b291d08fc3027b3a017b3afa723662199d831
2026-06-17T19:18:53Z 2026-06-17T19:18:27Z 11e6640 ghcr.io/rmi/stitch-seed:pr-135 ghcr.io/rmi/stitch-seed:pr-135@sha256:a21a5196ac0892d25fc2fd3772b1946049c406e63f06238f429a27698c84ce7b
2026-06-17T19:18:57Z 2026-06-17T19:18:27Z 11e6640 ghcr.io/rmi/stitch-stitch-llm:pr-135 ghcr.io/rmi/stitch-stitch-llm:pr-135@sha256:d3dd65b48be406ab4ff124044aae862806b5bd524780c4f39d0ea10f0932edc7

Copilot AI left a comment

Contributor

Pull request overview

This PR introduces a long-form (EAV) storage model for OG field source attributes and shifts coalescing/query work into SQL to improve performance, including support for per-resource source-priority overrides.

Changes:

  • Added oil_gas_field_source_values long table + ORM model and updated OilGasFieldSourceModel to store only header + raw payload.
  • Implemented SQL-based coalescing (coalesce_sql.py) and updated resource list/detail paths to use it.
  • Refactored source-record querying (OGFieldQueryMixin) to pivot long values for filtering/sorting, and updated integration tests/fixtures accordingly.
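
As a rough illustration of SQL-side coalescing with per-resource priority overrides, the sketch below picks, for each (resource, attribute) pair, the value from the highest-priority source that has one. Everything here is hypothetical: the table names, the `prio` column (lower wins), and the `source_a`/`source_b` labels are assumptions, and the query is not taken from coalesce_sql.py. It also assumes priorities are unique within a resource, so ties do not occur.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sources (id INTEGER PRIMARY KEY, resource_id INTEGER, source TEXT);
    CREATE TABLE source_values (source_id INTEGER, colname TEXT, value_text TEXT);
    CREATE TABLE default_priority (source TEXT PRIMARY KEY, prio INTEGER);
    CREATE TABLE resource_priority (resource_id INTEGER, source TEXT, prio INTEGER);

    INSERT INTO default_priority VALUES ('source_a', 1), ('source_b', 2);
    INSERT INTO sources VALUES
        (10, 1, 'source_a'), (11, 1, 'source_b'),
        (12, 2, 'source_a'), (13, 2, 'source_b');
    INSERT INTO source_values VALUES
        (10, 'name', 'Alpha (a)'),
        (11, 'name', 'Alpha (b)'),
        (11, 'country', 'NO'),
        (12, 'name', 'Beta (a)'),
        (13, 'name', 'Beta (b)');
    -- Resource 2 overrides the default order and prefers source_b.
    INSERT INTO resource_priority VALUES (2, 'source_b', 0);
""")

coalesced = conn.execute("""
    WITH ranked AS (
        SELECT s.resource_id, v.colname, v.value_text,
               -- A per-resource override, when present, replaces the default priority.
               COALESCE(rp.prio, dp.prio) AS prio
        FROM source_values v
        JOIN sources s ON s.id = v.source_id
        JOIN default_priority dp ON dp.source = s.source
        LEFT JOIN resource_priority rp
               ON rp.resource_id = s.resource_id AND rp.source = s.source
        WHERE v.value_text IS NOT NULL
    )
    SELECT r.resource_id, r.colname, r.value_text
    FROM ranked r
    WHERE r.prio = (
        SELECT MIN(r2.prio) FROM ranked r2
        WHERE r2.resource_id = r.resource_id AND r2.colname = r.colname
    )
    ORDER BY r.resource_id, r.colname
""").fetchall()
```

Resource 1 takes `name` from source_a and falls through to source_b for `country`, which source_a does not provide; resource 2 takes `name` from source_b because of its override.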

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
deployments/api/tests/utils.py Adds a helper to build long-form source ORM models for tests.
deployments/api/tests/routers/test_licensed_sources_routes.py Updates seeding to use the new long-form source model helper.
deployments/api/tests/db/test_resource_actions.py Updates resource/source seeding and adapts to new list CTE builder.
deployments/api/tests/db/test_base_query.py Adjusts query tests to the new membership-gated, long-aware source query.
deployments/api/src/stitch/api/db/utils.py Switches resource coalescing to SQL (coalesce_resource).
deployments/api/src/stitch/api/db/og_field_resource_actions.py Uses SQL coalesced list CTE and deserializes JSON fields emitted as text.
deployments/api/src/stitch/api/db/model/types.py Adds portable float + JSON(NULL) types for the long values table.
deployments/api/src/stitch/api/db/model/oil_gas_field_source.py Converts source model to header+relationship, builds/reads entities via long values.
deployments/api/src/stitch/api/db/model/oil_gas_field_source_value.py New EAV/long storage model with constraints + typed value routing.
deployments/api/src/stitch/api/db/model/og_field_resource_source_priority.py Adds per-resource source priority override table/model.
deployments/api/src/stitch/api/db/model/og_field_query_mixin.py Reworks filtering/sorting/pagination to pivot long values.
deployments/api/src/stitch/api/db/model/__init__.py Exports new models.
deployments/api/src/stitch/api/db/coalesce_sql.py New SQL-side coalescing + list CTE builder + detail coalesce helper.
deployments/api/alembic/versions/f3fb36006ce6_baseline.py Updates baseline schema for long values + priority overrides.
Comments suppressed due to low confidence (2)

deployments/api/alembic/versions/f3fb36006ce6_baseline.py:121

  • This migration depends on stitch.api.db.model.types.StitchJson(), which requires importing application code at migration runtime. That can make migrations brittle if the app module path/type changes later. Prefer an Alembic-local SQLAlchemy type definition (e.g., sa.JSON() with a Postgres JSONB variant).

deployments/api/alembic/versions/f3fb36006ce6_baseline.py:14

  • After inlining the JSON/JSONB type for source_record, this import becomes unnecessary and (more importantly) pulls application code into the migration environment. Removing it helps keep migrations self-contained and stable over time.

Comment thread deployments/api/src/stitch/api/db/coalesce_sql.py Outdated
Comment thread deployments/api/src/stitch/api/db/model/og_field_query_mixin.py
Comment thread deployments/api/src/stitch/api/db/coalesce_sql.py
Comment thread deployments/api/tests/utils.py Outdated
@github-actions

CD summary eddeb80

Frontend: https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net

Deployments (4)
service url fqdn
api open pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io
entity-linkage open pr-135-entity-linkage.purplegrass-c07d0a94.westus2.azurecontainerapps.io
frontend https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net
stitch-llm open pr-135-stitch-llm.purplegrass-c07d0a94.westus2.azurecontainerapps.io
Database (1)
db_name postgres_host postgres_port postgres_db
pr_135 stitch-dev.postgres.database.azure.com 5432 pr_135
Jobs (1)
job image postgres_db
db-migrations ghcr.io/rmi/stitch-api:pr-135@sha256:ba0d3c9d7e85000fb0e3abf1b6dcda3c28133c24077be050ced31fc847c42ec5 pr_135
Images (4)
build_time commit_time git_sha image image_digest
2026-06-18T13:56:50Z 2026-06-18T13:56:18Z 3e6fbf4 ghcr.io/rmi/stitch-api:pr-135 ghcr.io/rmi/stitch-api:pr-135@sha256:ba0d3c9d7e85000fb0e3abf1b6dcda3c28133c24077be050ced31fc847c42ec5
2026-06-18T13:56:45Z 2026-06-18T13:56:18Z 3e6fbf4 ghcr.io/rmi/stitch-entity-linkage:pr-135 ghcr.io/rmi/stitch-entity-linkage:pr-135@sha256:572856b806ba6ca6e02f25152eb67013d743daf542c1a16eea6c3e1c32b38e68
2026-06-18T13:56:39Z 2026-06-18T13:56:18Z 3e6fbf4 ghcr.io/rmi/stitch-seed:pr-135 ghcr.io/rmi/stitch-seed:pr-135@sha256:340f315800796fcc3478db9421b5ec615adcf7a128fe58866650c8920f6217bc
2026-06-18T13:56:47Z 2026-06-18T13:56:18Z 3e6fbf4 ghcr.io/rmi/stitch-stitch-llm:pr-135 ghcr.io/rmi/stitch-stitch-llm:pr-135@sha256:0f7089fc357ed2fe385f8fb4ccd34fec8ef52423c429e4a4bd88801eb4eb7eba

@AlexAxthelm AlexAxthelm changed the title from "Feat/db optimizations" to "Feat/db optimizations (STIT-502)" Jun 18, 2026
@github-actions

CD summary ea47e94

Frontend: https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net

Deployments (4)
service url fqdn
api open pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io
entity-linkage open pr-135-entity-linkage.purplegrass-c07d0a94.westus2.azurecontainerapps.io
frontend https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net
stitch-llm open pr-135-stitch-llm.purplegrass-c07d0a94.westus2.azurecontainerapps.io
Database (1)
db_name postgres_host postgres_port postgres_db
pr_135 stitch-dev.postgres.database.azure.com 5432 pr_135
Jobs (1)
job image postgres_db
db-migrations ghcr.io/rmi/stitch-api:pr-135@sha256:ef0d4fad2b07ed4a78fc7675a2b279f63a3de0489461e4276dbf5d936cbfb353 pr_135
Images (4)
build_time commit_time git_sha image image_digest
2026-06-18T16:22:19Z 2026-06-18T16:21:59Z 1d91184 ghcr.io/rmi/stitch-api:pr-135 ghcr.io/rmi/stitch-api:pr-135@sha256:ef0d4fad2b07ed4a78fc7675a2b279f63a3de0489461e4276dbf5d936cbfb353
2026-06-18T16:22:18Z 2026-06-18T16:21:59Z 1d91184 ghcr.io/rmi/stitch-entity-linkage:pr-135 ghcr.io/rmi/stitch-entity-linkage:pr-135@sha256:3329be67464cc68b289dbc0af0932d8791c433966e45b1140134c0bc9d5aaaee
2026-06-18T16:22:16Z 2026-06-18T16:21:59Z 1d91184 ghcr.io/rmi/stitch-seed:pr-135 ghcr.io/rmi/stitch-seed:pr-135@sha256:19c7993a0cdb070f786d36ff7e4ae476fa6601ce84bbe7a509aaf4a0e7de9818
2026-06-18T16:22:15Z 2026-06-18T16:21:59Z 1d91184 ghcr.io/rmi/stitch-stitch-llm:pr-135 ghcr.io/rmi/stitch-stitch-llm:pr-135@sha256:5bec43526e30dd5e17fbf358dbe3312f5b8d49178d6b56b664db845da40ef68f

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

deployments/api/alembic/versions/f3fb36006ce6_baseline.py:380

  • The NOTE claims ILIKE '%term%' substring search is "backed by" the (colname, value_text) B-tree index. A standard B-tree cannot accelerate a leading-wildcard pattern; at best it can help pre-filter by colname before scanning. This comment should be clarified to avoid overestimating performance.
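
The point about leading wildcards can be sketched without a database. Below, a sorted Python list stands in for the ordered keys of a B-tree on value_text. This is an analogy under simplifying assumptions (case-sensitive matching, made-up sample values, a single colname), not a model of Postgres internals.

```python
from bisect import bisect_left

# A sorted list stands in for the ordered keys of a B-tree index.
index = sorted(["alpha north", "alpha south", "beta north", "gamma", "north alpha"])


def prefix_search(keys, prefix):
    """Like 'term%': binary-search to the first candidate, then read a contiguous range."""
    start = bisect_left(keys, prefix)
    matches = []
    for key in keys[start:]:
        if not key.startswith(prefix):
            break  # ordering guarantees no later key can match
        matches.append(key)
    return matches


def substring_search(keys, term):
    """Like '%term%': a match can sit anywhere in a key, so every key is inspected."""
    return [key for key in keys if term in key]


prefix_hits = prefix_search(index, "alpha")
substring_hits = substring_search(index, "north")
```

The prefix lookup touches only a contiguous slice of the ordered keys, while the substring lookup has no ordering to exploit. This is why the index can narrow by colname but cannot by itself speed up the '%term%' match.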

Comment thread deployments/api/src/stitch/api/db/coalesce_sql.py
Comment thread deployments/api/src/stitch/api/db/model/oil_gas_field_source.py Outdated
Comment thread deployments/api/src/stitch/api/db/model/oil_gas_field_source_value.py Outdated
@github-actions

CD summary 903f66c

Frontend: https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net

Deployments (4)
service url fqdn
api open pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io
entity-linkage open pr-135-entity-linkage.purplegrass-c07d0a94.westus2.azurecontainerapps.io
frontend https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net
stitch-llm open pr-135-stitch-llm.purplegrass-c07d0a94.westus2.azurecontainerapps.io
Database (1)
db_name postgres_host postgres_port postgres_db
pr_135 stitch-dev.postgres.database.azure.com 5432 pr_135
Jobs (1)
job image postgres_db
db-migrations ghcr.io/rmi/stitch-api:pr-135@sha256:88335d4f96be3784cf7c2cd2aef4447e1e557878617562fd337d74183a18dbd8 pr_135
Images (4)
build_time commit_time git_sha image image_digest
2026-06-18T16:38:17Z 2026-06-18T16:37:59Z ce6ea53 ghcr.io/rmi/stitch-api:pr-135 ghcr.io/rmi/stitch-api:pr-135@sha256:88335d4f96be3784cf7c2cd2aef4447e1e557878617562fd337d74183a18dbd8
2026-06-18T16:38:18Z 2026-06-18T16:37:59Z ce6ea53 ghcr.io/rmi/stitch-entity-linkage:pr-135 ghcr.io/rmi/stitch-entity-linkage:pr-135@sha256:da61d488065cbc88d64895eaa3ee45beca8f792d09996d8e3b26311089d62c3f
2026-06-18T16:38:14Z 2026-06-18T16:37:59Z ce6ea53 ghcr.io/rmi/stitch-seed:pr-135 ghcr.io/rmi/stitch-seed:pr-135@sha256:223a0f3d3f721b9f327e95bfd4303ce1ae975a62d24a4280d249bbea48bba5e7
2026-06-18T16:38:14Z 2026-06-18T16:37:59Z ce6ea53 ghcr.io/rmi/stitch-stitch-llm:pr-135 ghcr.io/rmi/stitch-stitch-llm:pr-135@sha256:5d8104b71eee353d036d367d8d7cf189dc3835fa83b76229f8cd7294d9826352
