
Expanded support for Mutation Tables#1922

Open
mbroecheler wants to merge 10 commits into `main` from `feature/add_mutation_database`
Conversation


mbroecheler (Contributor) commented Mar 5, 2026

Improves support for mutation tables, i.e. user-defined CREATE TABLE statements that do not have a connector, which means DataSQRL manages those tables and supports writing to and reading from them.

This PR extends support by:

  1. Collecting all mutation tables and producing a "database" schema for those tables, which is written to the build directory. The goal is to have an authoritative source of all mutation tables in a project/script. This file can then be checked into version control and used to ensure that future additions/evolutions are backwards compatible. For this, the file can be configured under the `script.database` option in `package.json`; if such a file is present, we run a backwards-compatibility check and fail the compilation in case of incompatibility.
  2. Generalizing mutations and MutationEngine so that we can support Apache Iceberg for mutation table storage. Note that we do not yet support MutationCoords on the server side for Iceberg, which means that Vert.x cannot yet write to those tables.
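To illustrate point 1, a minimal sketch of what the `script.database` option in `package.json` might look like. The surrounding keys and the schema file name are illustrative assumptions, not the exact DataSQRL configuration schema:

```json
{
  "script": {
    "main": "mypipeline.sqrl",
    "database": "mutations-schema.json"
  }
}
```

With this in place, the compiler would compare the freshly generated mutation-table schema against the checked-in `mutations-schema.json` and fail on incompatible changes.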

As we generalize mutation support across engines, we now require that a mutation table MUST have an /*+engine(...)*/ hint naming the engine that the table is stored in (e.g. kafka, iceberg). This removes ambiguity about which engine is used. It also improves compatibility with Flink, since newer versions of Flink allow tables without any connector configuration to be used as "schema" tables that are extended with LIKE; this is currently not possible with SQRL.
Note that this breaks backwards compatibility: existing SQRL implementations must add /*+engine(kafka)*/ to their mutation tables.
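For example, an existing connector-less mutation table would now be written with the engine hint attached. The table and column names below are illustrative, and the exact hint placement follows the hint syntax shown above; treat this as a sketch rather than canonical SQRL:

```sql
/*+engine(kafka)*/
CREATE TABLE UserEvent (
  userId BIGINT NOT NULL,
  eventType STRING,
  event_time TIMESTAMP_LTZ(3) NOT NULL
);
```

Without the hint, compilation would fail under this PR instead of silently picking a default engine.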

This PR also introduces support for external catalogs and makes the import mechanism more robust.
It moves to Iceberg v3 format so we can use deletion vectors for efficiency.

Along the way, I encountered a number of bugs in how we handle mutation tables. Some are fixed in this PR; others have separate tickets: #1921.
Another improvement is in the Iceberg connector configuration: moving to the v3 format and adding upsert.enabled for state tables.
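As a rough sketch of what that connector configuration amounts to in Flink DDL terms, assuming the standard Apache Iceberg Flink table properties (`format-version`, `write.upsert.enabled`); the table definition and any catalog options are illustrative and omitted here, and the exact property names DataSQRL emits may differ:

```sql
CREATE TABLE CustomerState (
  customerId BIGINT,
  status STRING,
  PRIMARY KEY (customerId) NOT ENFORCED
) WITH (
  'connector' = 'iceberg',
  -- v3 format enables deletion vectors for efficient row-level deletes
  'format-version' = '3',
  -- upsert mode for state tables keyed on the primary key
  'write.upsert.enabled' = 'true'
);
```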

This PR also implements a condensed version of the Flink compiled plan, which is easier to inspect than the comprehensive version. It is written to flink-compiled-plan-summary.json and should be added as an asset captured by cloud-backend once this PR lands /cc @velo.

@mbroecheler mbroecheler requested a review from ferenc-csaky March 5, 2026 08:59

mbroecheler commented Mar 8, 2026

@ferenc-csaky Apologies, this got a little beefy, but since I was already refactoring how we handle CREATE TABLE statements, I decided to also fix #1924.

Need some help with the seedshop-avro test. How did that pass before with that ${sqrl:topic} var in there?

This requires a 0.10.0 release.
