fix(query): re-probe Parquet schema when cache changes underneath a running engine #422
The DuckDB engine probed each Parquet table's optional columns once at construction and trusted that snapshot for the process lifetime. A long-running mcp-http server therefore went stale when build-cache/sync rewrote the analytics cache with a different column set: the cached "column present" verdict put a now-absent column into a SELECT * REPLACE list, which DuckDB rejects with

    Binder Error: Column "message_type" in REPLACE list not found in FROM clause

crashing every search_messages and aggregate call until restart.

Fingerprint the cache (per-table file count + size + mtime) and re-probe optional columns (and re-register views) on demand when it changes. Guard optionalCols with an RWMutex since refresh now happens concurrently with reads. Add a regression test that rebuilds the cache underneath a live engine and asserts the query recovers instead of crashing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
SearchFastWithStats kept a materialized temp table alive for pagination and keyed it only by search predicates. When build-cache or sync rewrote the Parquet analytics files under a long-running engine, the optional-column probe could refresh while the search path still served rows, counts, and stats from the pre-rebuild temp table.

Include the observed Parquet cache fingerprint in the search cache key and refresh that fingerprint before considering a cache hit, so repeated searches rematerialize against current analytics data after an out-of-band cache rebuild. The regression rewrites Parquet beneath a live engine and repeats the same fast search to prove counts, stats, and row ordering come from the rebuilt cache.

Generated with Codex (GPT-5)
Co-authored-by: Codex <codex@openai.com>
Search cache invalidation now depends on the same Parquet directory set required for a complete analytics cache. Without the auxiliary tables in the fingerprint, attachment, label, recipient, or relationship rebuilds could leave SearchFastWithStats serving stale cached stats or rows for repeated searches.

Schema probing now also records optional columns only from a stable fingerprint window. If build-cache rewrites files while a long-running engine is probing schemas, the probe retries instead of storing columns from one cache generation with the fingerprint from another.

Generated with Codex (GPT-5)
Co-authored-by: Codex <codex@openai.com>
CI runs the custom testify-helper-check analyzer as part of lint-ci, while the local lint target only runs golangci-lint. The cache drift regression tests crossed the direct-package-call threshold, so the CI-only analyzer rejected them even though the tests themselves passed.

Use local assert and require helpers in the assertion-heavy tests so the drift coverage follows the repository's testify style rule.

Generated with Codex (GPT-5)
Co-authored-by: Codex <codex@openai.com>
What
The DuckDB query engine probes each Parquet table's optional columns once at construction and trusts that snapshot for the rest of the process lifetime. A long-running process (the `serve` daemon, the MCP server) goes stale when `build-cache`/`sync` rewrites the analytics cache with a different column set.

When that happens, the cached "column present" verdict puts a now-absent column into a `SELECT * REPLACE (... AS message_type)` list, which DuckDB rejects:

    Binder Error: Column "message_type" in REPLACE list not found in FROM clause

Every `Search`/`Aggregate` call then fails until the process is restarted.

Why
Mixed-version writers (or full rebuilds with an older builder) can drop optional columns from the cache. The engine assumed the schema was immutable for its lifetime; it isn't for a server whose cache is rebuilt out of band.
How
Guard `optionalCols` with an `RWMutex` since refresh can now race with reads.

🤖 Generated with Claude Code