Feat/db optimizations (STIT-502) #135
CD summary
| service | url | fqdn |
|---|---|---|
| api | open | pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
| entity-linkage | open | pr-135-entity-linkage.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
| frontend | https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net | |
| stitch-llm | open | pr-135-stitch-llm.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
Database (1)
| db_name | postgres_host | postgres_port | postgres_db |
|---|---|---|---|
| pr_135 | stitch-dev.postgres.database.azure.com | 5432 | pr_135 |
Jobs (2)
| job | image | postgres_db | api_url | auth_mode |
|---|---|---|---|---|
| db-migrations | ghcr.io/rmi/stitch-api:pr-135@sha256:543d5f19d402e28a2d67307a32e722da8810c1a42596d4ce998707a64232baac | pr_135 | | |
| seed | ghcr.io/rmi/stitch-seed:pr-135@sha256:a21a5196ac0892d25fc2fd3772b1946049c406e63f06238f429a27698c84ce7b | | https://pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io/api/v1 | stitch-client-bearer-token |
Images (4)
| build_time | commit_time | git_sha | image | image_digest |
|---|---|---|---|---|
| 2026-06-17T19:18:59Z | 2026-06-17T19:18:27Z | 11e6640 | ghcr.io/rmi/stitch-api:pr-135 | ghcr.io/rmi/stitch-api:pr-135@sha256:543d5f19d402e28a2d67307a32e722da8810c1a42596d4ce998707a64232baac |
| 2026-06-17T19:19:01Z | 2026-06-17T19:18:27Z | 11e6640 | ghcr.io/rmi/stitch-entity-linkage:pr-135 | ghcr.io/rmi/stitch-entity-linkage:pr-135@sha256:51b763538153a35aa6631424bb7b291d08fc3027b3a017b3afa723662199d831 |
| 2026-06-17T19:18:53Z | 2026-06-17T19:18:27Z | 11e6640 | ghcr.io/rmi/stitch-seed:pr-135 | ghcr.io/rmi/stitch-seed:pr-135@sha256:a21a5196ac0892d25fc2fd3772b1946049c406e63f06238f429a27698c84ce7b |
| 2026-06-17T19:18:57Z | 2026-06-17T19:18:27Z | 11e6640 | ghcr.io/rmi/stitch-stitch-llm:pr-135 | ghcr.io/rmi/stitch-stitch-llm:pr-135@sha256:d3dd65b48be406ab4ff124044aae862806b5bd524780c4f39d0ea10f0932edc7 |
Pull request overview
This PR introduces a long-form (EAV) storage model for OG field source attributes and shifts coalescing/query work into SQL to improve performance, including support for per-resource source-priority overrides.
Changes:
- Added `oil_gas_field_source_values` long table + ORM model and updated `OilGasFieldSourceModel` to store only header + raw payload.
- Implemented SQL-based coalescing (`coalesce_sql.py`) and updated resource list/detail paths to use it.
- Refactored source-record querying (`OGFieldQueryMixin`) to pivot long values for filtering/sorting, and updated integration tests/fixtures accordingly.
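To make the SQL-side coalescing concrete, here is a minimal sketch of how a long table can be coalesced per resource and column with a per-resource priority override. It is not the code in `coalesce_sql.py`: the table names `source_values`, `default_priority`, and `resource_priority`, the `prio` column, and the sample rows are all assumptions made for illustration, and SQLite (3.25+ for window functions) stands in for Postgres.

```python
import sqlite3

# Hypothetical schema: a long (EAV) values table, a default source ranking,
# and a per-resource override table. Lower prio wins.
SCHEMA = """
CREATE TABLE source_values (
    resource_id INTEGER, source TEXT, colname TEXT, value_text TEXT
);
CREATE TABLE default_priority (source TEXT PRIMARY KEY, prio INTEGER);
CREATE TABLE resource_priority (
    resource_id INTEGER, source TEXT, prio INTEGER,
    PRIMARY KEY (resource_id, source)
);
INSERT INTO default_priority VALUES ('a', 1), ('b', 2);
INSERT INTO source_values VALUES
    (1, 'a', 'name', 'Alpha'),
    (1, 'b', 'name', 'Alpha-B'),
    (1, 'b', 'country', 'NO'),
    (2, 'a', 'name', 'Beta'),
    (2, 'b', 'name', 'Beta-B');
-- Resource 2 prefers source 'b' over the default ordering.
INSERT INTO resource_priority VALUES (2, 'b', 0);
"""

# For each (resource, column), keep the non-null value from the
# highest-priority source; an override replaces the default rank if present.
COALESCE_SQL = """
WITH ranked AS (
    SELECT v.resource_id, v.colname, v.value_text,
           ROW_NUMBER() OVER (
               PARTITION BY v.resource_id, v.colname
               ORDER BY COALESCE(o.prio, d.prio)
           ) AS rn
    FROM source_values v
    JOIN default_priority d ON d.source = v.source
    LEFT JOIN resource_priority o
           ON o.resource_id = v.resource_id AND o.source = v.source
    WHERE v.value_text IS NOT NULL
)
SELECT resource_id, colname, value_text FROM ranked WHERE rn = 1
"""


def coalesce_all(conn):
    """Return {(resource_id, colname): winning value} for every resource."""
    return {(rid, col): val for rid, col, val in conn.execute(COALESCE_SQL)}


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
coalesced = coalesce_all(conn)
```

In this toy data, resource 1 takes `name` from source `a` and falls back to source `b` for `country`, while resource 2 takes `name` from `b` because of its override row.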
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| deployments/api/tests/utils.py | Adds a helper to build long-form source ORM models for tests. |
| deployments/api/tests/routers/test_licensed_sources_routes.py | Updates seeding to use the new long-form source model helper. |
| deployments/api/tests/db/test_resource_actions.py | Updates resource/source seeding and adapts to new list CTE builder. |
| deployments/api/tests/db/test_base_query.py | Adjusts query tests to the new membership-gated, long-aware source query. |
| deployments/api/src/stitch/api/db/utils.py | Switches resource coalescing to SQL (coalesce_resource). |
| deployments/api/src/stitch/api/db/og_field_resource_actions.py | Uses SQL coalesced list CTE and deserializes JSON fields emitted as text. |
| deployments/api/src/stitch/api/db/model/types.py | Adds portable float + JSON(NULL) types for the long values table. |
| deployments/api/src/stitch/api/db/model/oil_gas_field_source.py | Converts source model to header+relationship, builds/reads entities via long values. |
| deployments/api/src/stitch/api/db/model/oil_gas_field_source_value.py | New EAV/long storage model with constraints + typed value routing. |
| deployments/api/src/stitch/api/db/model/og_field_resource_source_priority.py | Adds per-resource source priority override table/model. |
| deployments/api/src/stitch/api/db/model/og_field_query_mixin.py | Reworks filtering/sorting/pagination to pivot long values. |
| deployments/api/src/stitch/api/db/model/__init__.py | Exports new models. |
| deployments/api/src/stitch/api/db/coalesce_sql.py | New SQL-side coalescing + list CTE builder + detail coalesce helper. |
| deployments/api/alembic/versions/f3fb36006ce6_baseline.py | Updates baseline schema for long values + priority overrides. |
Comments suppressed due to low confidence (2)
deployments/api/alembic/versions/f3fb36006ce6_baseline.py:121
- This migration depends on `stitch.api.db.model.types.StitchJson()`, which requires importing application code at migration runtime. That can make migrations brittle if the app module path/type changes later. Prefer an Alembic-local SQLAlchemy type definition (e.g., `sa.JSON()` with a Postgres JSONB variant).

deployments/api/alembic/versions/f3fb36006ce6_baseline.py:14
- After inlining the JSON/JSONB type for `source_record`, this import becomes unnecessary and (more importantly) pulls application code into the migration environment. Removing it helps keep migrations self-contained and stable over time.
CD summary
| service | url | fqdn |
|---|---|---|
| api | open | pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
| entity-linkage | open | pr-135-entity-linkage.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
| frontend | https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net | |
| stitch-llm | open | pr-135-stitch-llm.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
Database (1)
| db_name | postgres_host | postgres_port | postgres_db |
|---|---|---|---|
| pr_135 | stitch-dev.postgres.database.azure.com | 5432 | pr_135 |
Jobs (1)
| job | image | postgres_db |
|---|---|---|
| db-migrations | ghcr.io/rmi/stitch-api:pr-135@sha256:ba0d3c9d7e85000fb0e3abf1b6dcda3c28133c24077be050ced31fc847c42ec5 | pr_135 |
Images (4)
| build_time | commit_time | git_sha | image | image_digest |
|---|---|---|---|---|
| 2026-06-18T13:56:50Z | 2026-06-18T13:56:18Z | 3e6fbf4 | ghcr.io/rmi/stitch-api:pr-135 | ghcr.io/rmi/stitch-api:pr-135@sha256:ba0d3c9d7e85000fb0e3abf1b6dcda3c28133c24077be050ced31fc847c42ec5 |
| 2026-06-18T13:56:45Z | 2026-06-18T13:56:18Z | 3e6fbf4 | ghcr.io/rmi/stitch-entity-linkage:pr-135 | ghcr.io/rmi/stitch-entity-linkage:pr-135@sha256:572856b806ba6ca6e02f25152eb67013d743daf542c1a16eea6c3e1c32b38e68 |
| 2026-06-18T13:56:39Z | 2026-06-18T13:56:18Z | 3e6fbf4 | ghcr.io/rmi/stitch-seed:pr-135 | ghcr.io/rmi/stitch-seed:pr-135@sha256:340f315800796fcc3478db9421b5ec615adcf7a128fe58866650c8920f6217bc |
| 2026-06-18T13:56:47Z | 2026-06-18T13:56:18Z | 3e6fbf4 | ghcr.io/rmi/stitch-stitch-llm:pr-135 | ghcr.io/rmi/stitch-stitch-llm:pr-135@sha256:0f7089fc357ed2fe385f8fb4ccd34fec8ef52423c429e4a4bd88801eb4eb7eba |
CD summary
| service | url | fqdn |
|---|---|---|
| api | open | pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
| entity-linkage | open | pr-135-entity-linkage.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
| frontend | https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net | |
| stitch-llm | open | pr-135-stitch-llm.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
Database (1)
| db_name | postgres_host | postgres_port | postgres_db |
|---|---|---|---|
| pr_135 | stitch-dev.postgres.database.azure.com | 5432 | pr_135 |
Jobs (1)
| job | image | postgres_db |
|---|---|---|
| db-migrations | ghcr.io/rmi/stitch-api:pr-135@sha256:ef0d4fad2b07ed4a78fc7675a2b279f63a3de0489461e4276dbf5d936cbfb353 | pr_135 |
Images (4)
| build_time | commit_time | git_sha | image | image_digest |
|---|---|---|---|---|
| 2026-06-18T16:22:19Z | 2026-06-18T16:21:59Z | 1d91184 | ghcr.io/rmi/stitch-api:pr-135 | ghcr.io/rmi/stitch-api:pr-135@sha256:ef0d4fad2b07ed4a78fc7675a2b279f63a3de0489461e4276dbf5d936cbfb353 |
| 2026-06-18T16:22:18Z | 2026-06-18T16:21:59Z | 1d91184 | ghcr.io/rmi/stitch-entity-linkage:pr-135 | ghcr.io/rmi/stitch-entity-linkage:pr-135@sha256:3329be67464cc68b289dbc0af0932d8791c433966e45b1140134c0bc9d5aaaee |
| 2026-06-18T16:22:16Z | 2026-06-18T16:21:59Z | 1d91184 | ghcr.io/rmi/stitch-seed:pr-135 | ghcr.io/rmi/stitch-seed:pr-135@sha256:19c7993a0cdb070f786d36ff7e4ae476fa6601ce84bbe7a509aaf4a0e7de9818 |
| 2026-06-18T16:22:15Z | 2026-06-18T16:21:59Z | 1d91184 | ghcr.io/rmi/stitch-stitch-llm:pr-135 | ghcr.io/rmi/stitch-stitch-llm:pr-135@sha256:5bec43526e30dd5e17fbf358dbe3312f5b8d49178d6b56b664db845da40ef68f |
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.
Comments suppressed due to low confidence (1)
deployments/api/alembic/versions/f3fb36006ce6_baseline.py:380
- The NOTE claims `ILIKE '%term%'` substring search is "backed by" the `(colname, value_text)` B-tree index. A standard B-tree cannot accelerate a leading-wildcard pattern; at best it can help pre-filter by `colname` before scanning. This comment should be clarified to avoid overestimating performance.
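The point about leading wildcards can be illustrated without a database. In the sketch below, a sorted list of `(colname, value_text)` pairs stands in for the composite B-tree index; the sample rows and function names are made up, and prefix matching is case-sensitive for simplicity (unlike `ILIKE`). A prefix pattern maps to one contiguous range of the sorted order, while a substring pattern can only narrow to the `colname` range and must then scan every entry in it.

```python
from bisect import bisect_left

# Sorted (colname, value_text) pairs: a stand-in for the composite index.
index = sorted([
    ("country", "Norway"),
    ("name", "Ekofisk"),
    ("name", "Johan Sverdrup"),
    ("name", "Troll"),
    ("operator", "Equinor"),
])


def colname_range(colname):
    """Locate the contiguous slice of entries for one colname via binary search."""
    lo = bisect_left(index, (colname,))
    hi = bisect_left(index, (colname + "\x00",))
    return lo, hi


def prefix_search(colname, prefix):
    """LIKE 'prefix%': both bounds come from binary search, no scan needed."""
    lo = bisect_left(index, (colname, prefix))
    hi = bisect_left(index, (colname, prefix + "\uffff"))
    return [value for _, value in index[lo:hi]]


def substring_search(colname, term):
    """ILIKE '%term%': the sort order only helps narrow to the colname slice;
    every value in that slice still has to be inspected."""
    lo, hi = colname_range(colname)
    scanned = index[lo:hi]
    hits = [value for _, value in scanned if term.lower() in value.lower()]
    return hits, len(scanned)


prefix_hits = prefix_search("name", "Jo")
substring_hits, rows_scanned = substring_search("name", "ro")
```

Here the substring search inspects all three `name` entries to find one match, which is the "pre-filter by `colname`, then scan" behaviour the comment describes.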
CD summary
| service | url | fqdn |
|---|---|---|
| api | open | pr-135-api.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
| entity-linkage | open | pr-135-entity-linkage.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
| frontend | https://witty-mushroom-017a3dc1e-135.westus2.1.azurestaticapps.net | |
| stitch-llm | open | pr-135-stitch-llm.purplegrass-c07d0a94.westus2.azurecontainerapps.io |
Database (1)
| db_name | postgres_host | postgres_port | postgres_db |
|---|---|---|---|
| pr_135 | stitch-dev.postgres.database.azure.com | 5432 | pr_135 |
Jobs (1)
| job | image | postgres_db |
|---|---|---|
| db-migrations | ghcr.io/rmi/stitch-api:pr-135@sha256:88335d4f96be3784cf7c2cd2aef4447e1e557878617562fd337d74183a18dbd8 | pr_135 |
Images (4)
| build_time | commit_time | git_sha | image | image_digest |
|---|---|---|---|---|
| 2026-06-18T16:38:17Z | 2026-06-18T16:37:59Z | ce6ea53 | ghcr.io/rmi/stitch-api:pr-135 | ghcr.io/rmi/stitch-api:pr-135@sha256:88335d4f96be3784cf7c2cd2aef4447e1e557878617562fd337d74183a18dbd8 |
| 2026-06-18T16:38:18Z | 2026-06-18T16:37:59Z | ce6ea53 | ghcr.io/rmi/stitch-entity-linkage:pr-135 | ghcr.io/rmi/stitch-entity-linkage:pr-135@sha256:da61d488065cbc88d64895eaa3ee45beca8f792d09996d8e3b26311089d62c3f |
| 2026-06-18T16:38:14Z | 2026-06-18T16:37:59Z | ce6ea53 | ghcr.io/rmi/stitch-seed:pr-135 | ghcr.io/rmi/stitch-seed:pr-135@sha256:223a0f3d3f721b9f327e95bfd4303ce1ae975a62d24a4280d249bbea48bba5e7 |
| 2026-06-18T16:38:14Z | 2026-06-18T16:37:59Z | ce6ea53 | ghcr.io/rmi/stitch-stitch-llm:pr-135 | ghcr.io/rmi/stitch-stitch-llm:pr-135@sha256:5d8104b71eee353d036d367d8d7cf189dc3835fa83b76229f8cd7294d9826352 |
Major refactor of the DB: source data values are no longer stored in "wide" format in `oil_gas_field_sources`; they are now stored "long" in `oil_gas_field_source_values`, which allows for more idiomatic SQL queries.

The Python data model is untouched, but there are some changes to the SQLAlchemy translation layer between the two data models.
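For readers unfamiliar with the wide/long distinction, a rough sketch of the translation in both directions follows. It is an illustration only, not the PR's SQLAlchemy layer: `colname` and `value_text` appear in the review comments, but `source_id`, `value_num`, the routing rule, and the sample record are assumptions.

```python
def to_long(source_id, wide):
    """Explode one wide record into long rows, one row per non-null attribute."""
    rows = []
    for colname, value in wide.items():
        if value is None:
            continue  # a missing attribute is simply the absence of a row
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        rows.append({
            "source_id": source_id,
            "colname": colname,
            # Route the value to a typed column (value_num is hypothetical).
            "value_num": float(value) if is_number else None,
            "value_text": None if is_number else str(value),
        })
    return rows


def to_wide(rows):
    """Pivot long rows for a single source back into one wide record."""
    wide = {}
    for row in rows:
        if row["value_num"] is not None:
            wide[row["colname"]] = row["value_num"]
        else:
            wide[row["colname"]] = row["value_text"]
    return wide


# Made-up sample record; attribute names are illustrative.
wide_record = {"name": "Troll", "country": "NO", "production": 12.5, "operator": None}
long_rows = to_long(7, wide_record)
round_trip = to_wide(long_rows)
```

The round trip drops attributes that were `None` in the wide record, since the long form represents them by having no row at all.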