@rfratto rfratto commented Nov 1, 2025

This adds two new packages, expressionpb and physicalpb, which are serializable representations of physical.Expression and physical.Plan, respectively.

These packages include utility functions to convert between the protobuf representations and the planner types.

A translation layer is used due to the complexity of integrating protobuf throughout the engine, as well as difficulties with finding a clean pattern to construct node types. #19638 took an initial attempt at fully integrating the protobuf types, but revealed that it is very challenging.

While helping with #19638, I observed that it's very clunky to work with the protobuf types, especially with how often we rely on interface values; these do not work as smoothly with protobuf's oneofs, resulting in quite painful code.
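To illustrate the friction between interface values and oneofs, here is a minimal sketch of what a translation layer has to do. All type and function names below (`Expression`, `LiteralExpr`, `ColumnExpr`, `PBExpression`, `ToProto`, `FromProto`) are hypothetical stand-ins, not the actual `physical` or `expressionpb` APIs; the protobuf-side types are hand-written in the shape Go protobuf generators typically use for a oneof (an interface-typed field plus one wrapper struct per case).

```go
package main

import "fmt"

// Planner-side types: an interface with concrete implementations.
// Hypothetical stand-ins for the planner's expression types.
type Expression interface{ isExpression() }

type LiteralExpr struct{ Value string }
type ColumnExpr struct{ Name string }

func (*LiteralExpr) isExpression() {}
func (*ColumnExpr) isExpression()  {}

// Protobuf-side types, hand-written to mimic generated oneof code:
// the message holds an interface-typed field, and each oneof case
// is a separate wrapper struct around the real payload.
type PBLiteral struct{ Value string }
type PBColumn struct{ Name string }

type isPBExpression_Kind interface{ isPBExpression_Kind() }

type PBExpression struct {
	Kind isPBExpression_Kind
}

type PBExpression_Literal struct{ Literal *PBLiteral }
type PBExpression_Column struct{ Column *PBColumn }

func (*PBExpression_Literal) isPBExpression_Kind() {}
func (*PBExpression_Column) isPBExpression_Kind()  {}

// ToProto wraps a planner expression in the matching oneof case.
func ToProto(e Expression) (*PBExpression, error) {
	switch e := e.(type) {
	case *LiteralExpr:
		return &PBExpression{Kind: &PBExpression_Literal{Literal: &PBLiteral{Value: e.Value}}}, nil
	case *ColumnExpr:
		return &PBExpression{Kind: &PBExpression_Column{Column: &PBColumn{Name: e.Name}}}, nil
	default:
		return nil, fmt.Errorf("unsupported expression type %T", e)
	}
}

// FromProto unwraps the oneof case back into a planner expression.
func FromProto(p *PBExpression) (Expression, error) {
	if p == nil {
		return nil, fmt.Errorf("nil expression")
	}
	switch k := p.Kind.(type) {
	case *PBExpression_Literal:
		if k.Literal == nil {
			return nil, fmt.Errorf("literal case has no payload")
		}
		return &LiteralExpr{Value: k.Literal.Value}, nil
	case *PBExpression_Column:
		if k.Column == nil {
			return nil, fmt.Errorf("column case has no payload")
		}
		return &ColumnExpr{Name: k.Column.Name}, nil
	default:
		return nil, fmt.Errorf("unsupported oneof case %T", k)
	}
}

func main() {
	pb, err := ToProto(&ColumnExpr{Name: "level"})
	if err != nil {
		panic(err)
	}
	back, err := FromProto(pb)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%T %+v\n", back, back)
}
```

Every value passes through two layers (the wrapper struct and the payload message) in each direction, which is the kind of ceremony that becomes painful when repeated across every node and expression type.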

It's clear to me that we will want to eventually remove the translation layer, but we need more time to figure out how we should interact with the protobuf types cleanly throughout the codebase. Skipping straight to using the protobuf types now carries too much risk of needing another massive PR. Given this, it's a much safer bet to start with a translation layer, find the right abstraction for constructing the protobuf, and then migrate once we have confidence in the pattern.

Updates all physical nodes to use a ULID as their ID, and makes the
field public for explicit node construction (which will be used for
protobuf conversion).

Unit tests which previously explicitly set the ULID have been updated to
leave the ID as the empty ULID.

Currently this field is never set (but will be in the following commit).
When creating a physical plan, each plan node will now have a unique
ULID. The Clone method has been updated to generate a new ULID for the
resulting cloned node.
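
As a rough sketch of the ID behavior described above: a node exposes a public ID field, a zero-valued ID stands for "not set", and Clone assigns a fresh ID instead of copying the original. The `ID` type, `NewID`, and `Filter` below are hypothetical stand-ins; `ID` follows the ULID layout (48-bit millisecond timestamp followed by 80 random bits) but is not the ULID library the engine uses.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"time"
)

// ID is a 128-bit stand-in for a ULID: a 48-bit millisecond
// timestamp followed by 80 bits of randomness.
type ID [16]byte

// NewID returns a new ID built from the current time and random bytes.
func NewID() ID {
	var id ID
	ms := uint64(time.Now().UnixMilli())
	// Big-endian 48-bit timestamp in the first six bytes.
	id[0] = byte(ms >> 40)
	id[1] = byte(ms >> 32)
	id[2] = byte(ms >> 24)
	id[3] = byte(ms >> 16)
	id[4] = byte(ms >> 8)
	id[5] = byte(ms)
	if _, err := rand.Read(id[6:]); err != nil {
		panic(err)
	}
	return id
}

// Filter is a hypothetical plan node with a public ID field, so that
// callers (for example a protobuf conversion) can set it explicitly.
type Filter struct {
	ID         ID
	Predicates []string
}

// Clone copies the node's contents but assigns a new ID, so the clone
// is distinguishable from the original within a plan.
func (f *Filter) Clone() *Filter {
	return &Filter{
		ID:         NewID(),
		Predicates: append([]string(nil), f.Predicates...),
	}
}

func main() {
	orig := &Filter{ID: NewID(), Predicates: []string{"level=error"}}
	clone := orig.Clone()
	fmt.Println("IDs differ:", orig.ID != clone.ID)
}
```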

Workflows, for the time being, will reuse some node ULIDs when a node is
found across multiple sharded tasks.
@rfratto rfratto requested a review from a team as a code owner November 1, 2025 03:07
@rfratto rfratto force-pushed the physical-plan-proto branch from 2704b2e to 7ac0a91 Compare November 1, 2025 03:23
This adds two new packages, expressionpb and physicalpb, which are
serializable representations of physical.Expression and physical.Plan,
respectively.

These packages include utility functions to convert between the protobuf
representations and the planner types.

A translation layer is used due to the complexity of integrating
protobuf throughout the engine, as well as difficulties with finding a
clean pattern to construct node types. #19638 took an
initial attempt at fully integrating the protobuf types, but revealed
that it is very challenging.

While investigating the code, I observed that it's very clunky to work
with the protobuf types, especially with how often we rely on interface
values. It's clear to me that we will want to eventually remove our
translation layer, but doing it too soon means needing to update the
entire engine code path twice. It is a much safer bet to start with a
translation layer, find the right abstraction for constructing the
protobuf, and then migrate once we have confidence in the pattern.

Co-authored-by: Sophie Waldman <[email protected]>
@rfratto rfratto force-pushed the physical-plan-proto branch from 7ac0a91 to ff047e3 Compare November 1, 2025 03:27