Design doc for replacing materialized views #34106
Signed-off-by: Moritz Hoffmann <[email protected]>
|
Overall looks great! Things I'm wondering:
I can somewhat reverse engineer these from the draft PR etc. but it'd be nice to have them documented here. |
|
I have a question that explores the design space a bit: Instead of a "replacement materialized view" being its own separate first-class concept in the catalog, could we just make the replacement (almost) the same thing as a normal materialized view, and let the interesting action happen only at the This would have the advantage that fewer new commands would be needed:
Also, regarding #34032 (comment), the user could just use the already existing introspection to know "(a) when the replacement is hydrated and caught up and (b) how many resources it roughly requires compared to the old version". This means less implementation work, and maybe also fewer concepts for the user to keep in mind. Importantly, this could also eliminate a lot of the code duplication that is currently in #34032. However, one reason for not treating the replacement as a completely normal materialized view is that we may want to do some schema compatibility validation already when creating it, so that the user is not surprised by a schema incompatibility when running the we'd have which would create |
|
Two more thoughts: Can the user SELECT from a replacement materialized view? With the above suggestion of making the replacement be an (almost) normal materialized view, we'd get this for free. With the above suggestion of modeling the replacement as a normal materialized view, we'd have the danger that the user creates some further objects that depend on the replacement, before doing the |
Unfortunately, this doesn't work. I'll update the design to include a description why, but the gist is that a materialized view names a persist shard, which downstream objects read. If we create a new materialized view, we create a new shard. Then we have no logic that would take two shards, apply the updates from the other to the first, and cut over to the new writer. The MVP currently uses the read-only mode for replacement MVs, so they read the shard, but do not write any updates. Only after applying the replacement, they start writing. This solves the problem of "writing to the same shard", but, as you point out, it's now not possible to query the replacement MV. |
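The read-only behaviour described in this comment can be illustrated with a toy model. This is only a sketch under assumptions: `Shard`, `MvWriter`, `correction`, and `step` are hypothetical names invented for illustration and do not correspond to the persist or compute APIs. A shard is modelled as a multiset of rows, and a writer in read-only mode computes the updates it would write without appending them.

```python
from collections import Counter


class Shard:
    """Toy stand-in for a persist shard: a multiset of rows (hypothetical)."""

    def __init__(self, rows=()):
        self.rows = Counter(rows)

    def append(self, diffs):
        # Apply (row -> diff) updates and drop rows whose count reaches zero.
        for row, diff in diffs.items():
            self.rows[row] += diff
            if self.rows[row] == 0:
                del self.rows[row]


class MvWriter:
    """Hypothetical sink that drives a shard towards a view's desired contents."""

    def __init__(self, shard, read_only):
        self.shard = shard
        self.read_only = read_only

    def correction(self, desired):
        # Difference between what the view should contain and what the shard holds.
        desired = Counter(desired)
        rows = set(desired) | set(self.shard.rows)
        return {
            row: desired[row] - self.shard.rows[row]
            for row in rows
            if desired[row] != self.shard.rows[row]
        }

    def step(self, desired):
        # A read-only writer reads the shard and computes updates, but never appends.
        diffs = self.correction(desired)
        if not self.read_only:
            self.shard.append(diffs)
        return diffs


# The replacement starts read-only against the shard of the original view.
shard = Shard(["a", "b"])
replacement = MvWriter(shard, read_only=True)
pending = replacement.step(["b", "c"])  # computed, not written

# "Applying" the replacement is modelled as flipping it to read-write.
replacement.read_only = False
replacement.step(["b", "c"])
```

In this model, downstream readers keep reading the same `Shard` object throughout, which mirrors why the design reuses the existing shard rather than creating a new one.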
How about we don't reconcile just at cutover time, but continuously channel data from the new shard to the old shard, with the same read-only trick as the current PR? That is, we'd render a normal MV dataflow plus a small extra dataflow fragment that reads the replacement MV's normal output shard and writes into the old MV's shard, with the read-only trick at first, and then for real after the replacement happens. This would allow SELECTing from the replacement MV before the cutover (from the new shard), but would also allow downstream consumers of the old MV to keep consuming the old shard (even after cutover). A downside of this approach would be that we'd have two shards associated with the replacement MV after the cutover happens. But maybe this is fine: any new reference to the MV could use the new shard, and only old dependent dataflows would keep consuming the old shard. So, most code that looks up the shard for an MV would not need to be modified. Also, we'd have somewhat larger resource requirements than just writing to one shard, because we'd have two MV sinks instead of one. But do MV sinks have significant resource requirements only at system restarts? If yes, that should be fine, because all these extra dataflow fragments could disappear at system restart, since all downstream consumers could then switch to consuming from the latest shard. (One more complication with this approach would be that if an MV m1 is replaced by m2, and then m2 is later replaced by m3, then m3's dataflow needs to have the above-mentioned extra dataflow fragment two times: once for writing into m2's shard, and once for writing into m1's shard.) The advantages of this approach (if it's feasible) could be
|
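The chained-replacement complication from the comment above (m3 needing one forwarding fragment per older shard) can be sketched as a walk over the replacement history. This is a hedged illustration only: the `replaces` mapping and the `forwarding_targets` function are hypothetical and not part of the catalog or of #34032.

```python
def forwarding_targets(replaces, mv):
    """Return the older MVs whose shards `mv` would need to keep writing to.

    `replaces` maps a replacement MV to the MV it replaced. This bookkeeping
    is an assumption made for illustration, not an existing catalog structure.
    """
    targets = []
    seen = {mv}
    current = mv
    while current in replaces:
        current = replaces[current]
        if current in seen:
            raise ValueError("cycle in replacement chain")
        seen.add(current)
        targets.append(current)
    return targets


# m2 replaced m1, and m3 later replaced m2.
history = {"m2": "m1", "m3": "m2"}
fragments_for_m3 = forwarding_targets(history, "m3")
```

Under this proposal, the number of extra fragments grows with the length of the chain until a restart lets downstream consumers move to the latest shard.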
|
Can a user create multiple replacements at the same time? Could one create a replacement for sources or tables? I'm assuming indexes wouldn't work, as one would ideally create a replacement on a separate cluster. |
|
Re: the above... I generally agree that it would be too painful to create a whole new shard and orchestrate a migration to it - the main value of alter-MV in my view is that it minimizes that sort of thing - but:
|
|
Thanks for the feedback, this is super helpful. Keep it coming!
As the MVP stands right now, yes. I'm not sure what the right call here is, but there's no requirement from a correctness point of view to only have one replacement in flight.
Indexes wouldn't work because they're even more tightly coupled to their definition than materialized views. For example, we might pick an index because of some optimizer decision that's not trivial for the user to understand. Changing the index definition would be very surprising to all parties, so that's not something I'd consider a path worth pursuing (plus a whole lot of decidability problems along the way, such as determining nullability; see Rice's theorem). A crucial property of a replacement is that it can self-correct to appear consistent with its new definition. This is possible for materialized views. I think the only other objects for which this would be true are upsert sources, since they need to remember the current state of the world, which we could diff against the expected state. However, it might be easier to slot in materialized views if one wants to be able to pivot from one source to another. For example, to switch from one Postgres source to another, the user could create a materialized view on the table, and then use the replacement to switch to another table.
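The "diff the current state against the expected state" idea for upsert-style state can be sketched as follows. This is a minimal illustration, assuming keyed state held in plain dictionaries with sortable keys; `upsert_correction` is a hypothetical helper, not a function from the codebase.

```python
_MISSING = object()  # Sentinel so that a stored value of None is not confused with absence.


def upsert_correction(current, expected):
    """Return (key, value, diff) updates that turn `current` into `expected`.

    A retraction has diff -1 and an addition has diff +1. Keys are visited
    in sorted order so the output is deterministic.
    """
    updates = []
    for key in sorted(set(current) | set(expected)):
        old = current.get(key, _MISSING)
        new = expected.get(key, _MISSING)
        if old == new:
            continue  # Already consistent, nothing to correct.
        if old is not _MISSING:
            updates.append((key, old, -1))
        if new is not _MISSING:
            updates.append((key, new, 1))
    return updates


# State remembered by the old definition versus state implied by the new one.
current_state = {"k1": "a", "k2": "b"}
expected_state = {"k2": "c", "k3": "d"}
corrections = upsert_correction(current_state, expected_state)
```

Emitting only these corrections is what lets the object appear consistent with its new definition without rewriting unchanged keys.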
I disagree with the complexity argument. I tried this, and while it may work from a syntax perspective, we want the distinction in code. I found it hard to implement replacements as extensions of materialized views without their own catalog entry. Replacements need to behave differently in many cases, so a lot of code that matches on the item enum needs to distinguish behavior based on the context. This doesn't mean we couldn't unify the syntax, for example by creating a replacement as part of a Also, all state of a catalog item needs to be stored in its These aren't great reasons, but I feel even on a conceptual level it's nice to have a distinction between a materialized view and its replacement.
Totally, we could provide this feature in compute. A complication is that the primitives to do this don't exist. We can select from indexes, persist, or build a dataflow to read from either of them, but we can't surface the contents of the MV correction buffer. I'm open to suggestions here!
Yup, that could be interesting to explore. We'd need to add a transition from read-write to read-only, which seems doable. A problem might be that the new materialized view might overwhelm the original one, which has the potential to take down the replica it's running on. From this perspective, I'm not convinced it'd add a lot of operational confidence compared to decommissioning it immediately. For example, when the diff caused by the replacement took down downstream replicas, they'd restart and likely hydrate successfully, so switching back to the old MV would cause more stress for them. |
Pull Request Overview
This PR introduces a design document for replacing materialized views without requiring drop-and-recreate operations. The design proposes a stage-and-apply approach where users can create replacement definitions, validate them, and then apply them to existing materialized views while preserving dependencies.
Key changes:
- Introduces a new "replacement" concept that decouples materialized view definitions from their storage shards
- Proposes new SQL commands: CREATE REPLACEMENT, ALTER MATERIALIZED VIEW APPLY REPLACEMENT, and DROP REPLACEMENT
- Defines compatibility rules and timestamp selection mechanisms for applying replacements
doc/developer/design/20251111_replacement_materialized_views.md
Can you say a bit more about the places where we need to treat them differently? In my naive mind, a replacement MV is just a normal MV in read-only mode that points to a shard shared with another MV. I see some annoyance around dropping the replacement: you need to be careful not to drop the original MV's shard as well. But this seems like something the storage controller should handle transparently, since both would use different GlobalIds to refer to the shard. |
|
I hacked something together trying out modeling the replacements as MVs as well: #34177. It doesn't feel too bad. Note that this is missing the crucial "apply" step and there are strange bugs. But on the other hand, all the EXPLAIN/SHOW stuff just works. |
Yes! I think the following should be true:
|
Design for replacing materialized views, as implemented in #34032.
Rendered: https://github.com/antiguru/materialize/blob/alter_mv_design/doc/developer/design/20251111_replacement_materialized_views.md