Implements the skeleton to support replacing materialized views.
Specifically, the change introduces the following SQL syntax:
```
CREATE REPLACEMENT <replacement_name>
FOR MATERIALIZED VIEW <view_name>
AS ...
```
This creates a new dataflow targeting the same shard. The dataflow
selects an as-of of its inputs. The dataflow is read-only.
```
ALTER MATERIALIZED VIEW <view_name>
APPLY REPLACEMENT <replacement_name>
```
Replaces the old materialized view with `<replacement_name>`. Enables
writes for the replacement.
The change adds convenience SQL syntax (`SHOW REPLACEMENTS`, `SHOW
CREATE REPLACEMENT`), and relations to query the state of replacements.
The syntax to create replacements is guarded by a feature flag that is
off by default: `enable_replacement_materialized_views`.
Signed-off-by: Moritz Hoffmann <[email protected]>
- https://github.com/MaterializeInc/materialize/pull/34039 (Per-object read only mode)

## Problem

At the moment, a materialized view names a _view definition_ and the _output columns_ derived from that definition.
We cannot change one or the other independently, which means that any change to a materialized view requires dropping and recreating it.
This is inconvenient for users as changes need to cascade through the dependency graph.

We call this strong coupling between a materialized view and its definition and output columns.
This design explores an alternative in which we can decouple these concepts, allowing users to change one without needing to drop and recreate the other.
This would move us closer to a loosely coupled system, which is easier to maintain and evolve over time.

## Success criteria

We allow users to change the definition and output columns of a materialized view without needing to drop and recreate it.
We preserve existing dependencies on the materialized view when doing so, and we ensure that the system remains consistent.

Changing a running materialized view can cause additional work for downstream consumers.
While we cannot avoid this work, we aim to provide tools to quantify the amount of changed data.

## Solution proposal

We introduce the notion of a "replacement" for maintained SQL objects, starting with materialized views.
A replacement allows users to stage the change of definition and output columns of a materialized view.
The user can then inspect the replacement, and decide to apply or discard it.

We add the following SQL commands:
* `CREATE REPLACEMENT replacement_name FOR MATERIALIZED VIEW mv_name AS SELECT ...`
  Creates a replacement for the specified materialized view with the new definition.
  The usual properties for materialized views apply, such as the cluster and its options.
* `ALTER MATERIALIZED VIEW mv_name APPLY REPLACEMENT replacement_name`
  Applies the specified replacement to the materialized view.
  This updates the definition and output columns of the materialized view to match those of the replacement.
  Existing dependencies on the materialized view are preserved.
* `DROP REPLACEMENT replacement_name`
  Discards the specified replacement without applying it.

When a replacement is created, we validate that the new definition is compatible with the existing materialized view.
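
As an illustration, the staged workflow might look like the following sketch. The object names (`orders`, `order_totals`, `order_totals_v2`) and the `status` column are hypothetical, and the statements assume the `enable_replacement_materialized_views` feature flag is turned on.

```sql
-- Hypothetical existing view whose definition we want to change.
CREATE MATERIALIZED VIEW order_totals AS
    SELECT customer_id, sum(amount) AS total
    FROM orders
    GROUP BY customer_id;

-- Stage a new definition. The output columns are unchanged.
CREATE REPLACEMENT order_totals_v2
    FOR MATERIALIZED VIEW order_totals
    AS SELECT customer_id, sum(amount) AS total
       FROM orders
       WHERE status = 'completed'
       GROUP BY customer_id;

-- Either apply the staged definition to the view...
ALTER MATERIALIZED VIEW order_totals
    APPLY REPLACEMENT order_totals_v2;

-- ...or discard it without applying:
-- DROP REPLACEMENT order_totals_v2;
```

Objects that depend on `order_totals` keep referring to it throughout, since the view is never dropped.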

### Schema evolution

When applying a replacement, we need to ensure that the new schema is compatible with the existing schema.
The new schema is compatible if either of the following holds:
1. It is the same as the original schema.
2. It is a superset of the original schema (i.e., it can add new columns but cannot remove existing ones).
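
The two rules can be illustrated with a hedged sketch. It assumes a hypothetical view `order_totals` with columns `(customer_id, total)` over a hypothetical `orders` table. Whether column order and types must also match is not specified here, so the example varies only which columns are present.

```sql
-- Rule 1: identical schema. Only the filter changes.
CREATE REPLACEMENT same_columns
    FOR MATERIALIZED VIEW order_totals
    AS SELECT customer_id, sum(amount) AS total
       FROM orders
       WHERE status = 'completed'
       GROUP BY customer_id;

-- Rule 2: superset. Keeps both original columns and adds `order_count`.
CREATE REPLACEMENT extra_column
    FOR MATERIALIZED VIEW order_totals
    AS SELECT customer_id, sum(amount) AS total, count(*) AS order_count
       FROM orders
       GROUP BY customer_id;

-- Incompatible under either rule: the original column `total` is missing,
-- so validation would be expected to reject this statement.
CREATE REPLACEMENT missing_column
    FOR MATERIALIZED VIEW order_totals
    AS SELECT customer_id
       FROM orders
       GROUP BY customer_id;
```

Since the prototype excludes schema evolution, only the first form would be accepted there.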

## Minimal Viable Prototype

* Update the parser to support the above syntax.
* Implement planning and sequencing for the new commands.
* Support the `SHOW REPLACEMENTS` and `SHOW CREATE REPLACEMENT` commands.
* Add a catalog relation for replacements: `mz_replacement_materialized_views`.
* Treat a replacement as a first-class object in the catalog.
* Record replacements and state transitions in the audit log.
* Do not support schema evolution in the MVP.
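
A sketch of how the introspection pieces could be used together. It assumes the feature flag is a system variable that can be toggled with `ALTER SYSTEM SET`, that `SHOW CREATE REPLACEMENT` takes the replacement name as its argument, and that the catalog relation is reachable without a schema qualifier. The name `order_totals_v2` is hypothetical.

```sql
-- Assumption: the flag is toggled like other system variables.
ALTER SYSTEM SET enable_replacement_materialized_views = true;

-- List the replacements that are currently staged.
SHOW REPLACEMENTS;

-- Print the statement that created a given replacement.
SHOW CREATE REPLACEMENT order_totals_v2;

-- Query the catalog relation directly. Its columns are not specified
-- in this document, so the sketch selects all of them.
SELECT * FROM mz_replacement_materialized_views;
```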

The syntax allows users to create multiple replacements for the same target.
This has some interesting implications:
* Creating two replacements and applying them in reverse order creates versions where the more recent version has a smaller global ID than an older version.
  We do not rely on the partial ordering of global IDs for correctness, so this is acceptable.
* We need to check the schema twice: once when creating the replacement, and once when applying it.
  The second check ensures that the replacement is still valid at the time of application.
* Alternatively, we could restrict to a single replacement per target at any time.
  This would simplify the implementation, but would also limit the user's ability to stage multiple changes.
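
The reverse-order case can be sketched as follows. The names `r1`, `r2`, `order_totals`, `orders`, and `region` are hypothetical. The sketch also assumes that `r1` stays valid after `r2` has been applied, which the schema check at apply time would have to confirm.

```sql
-- r1 is created first, so it is assumed to receive the smaller global ID.
CREATE REPLACEMENT r1
    FOR MATERIALIZED VIEW order_totals
    AS SELECT customer_id, sum(amount) AS total
       FROM orders
       WHERE region = 'eu'
       GROUP BY customer_id;

-- r2 is created second and is assumed to receive the larger global ID.
CREATE REPLACEMENT r2
    FOR MATERIALIZED VIEW order_totals
    AS SELECT customer_id, sum(amount) AS total
       FROM orders
       WHERE region = 'us'
       GROUP BY customer_id;

-- Apply the newer replacement first. The schema is checked again here.
ALTER MATERIALIZED VIEW order_totals APPLY REPLACEMENT r2;

-- Then apply the older one. The current version of the view now has a
-- smaller global ID than the version it replaced.
ALTER MATERIALIZED VIEW order_totals APPLY REPLACEMENT r1;
```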

## Future work

* Provide better introspection data for replacements, such as the ability to see the differences between the current and replacement definitions.
* Surface metadata about the amount of staged changes (records, bytes) between the current and replacement definitions.
* Introspect the actual changes.
  For example, which rows would be added or removed.
* Automate applying a replacement once the new definition is hydrated.

## Alternatives

<!--
What other solutions were considered, and why weren't they chosen?

This is your chance to demonstrate that you've fully discovered the problem.
Alternative solutions can come from many places, like: you or your Materialize
team members, our customers, our prospects, academic research, prior art, or
competitive research. One of our company values is to "do the reading" and
to "write things down." This is your opportunity to demonstrate both!
-->

## Open questions

<!--
What is left unaddressed by this design document that needs to be
closed out?

When a design document is authored and shared, there might still be
open questions that need to be explored. Through the design document
process, you are responsible for getting answers to these open
questions. All open questions should be answered by the time a design