MSC4354: Sticky Events #4354
Conversation
It wasn't particularly useful for clients, and doesn't help equivocation much.
Co-authored-by: Johannes Marbach <[email protected]>
Co-authored-by: Travis Ralston <[email protected]>
@mscbot resolve Unclear if addendum is normative for spec process purposes
Have split the comments into threads (#4354 (comment))
proposals/4354-sticky-events.md (Outdated)
To implement these properties, servers MUST:
* Attempt to send their own[^origin] sticky events to all joined servers, whilst respecting per-server backoff times.
Moving from #4354 (comment)
The lack of atomicity in `/send` means clients may flicker RTC member state (update to old values, then immediately to newer values). This happens today too with state events, but less often.
In Synapse this will be especially slow as when we process each sticky event we go and fetch the previous 10 events and then query the state (assuming a large enough gap). This doesn't happen for state, as we'll get the last event and calculate the state for that chunk and atomically persist it. State flickering can happen if the server receives a chunk of events that contain a bunch of state changes, though empirically this is fairly rare.
This doesn't happen for state, as we'll get the last event and calculate the state for that chunk and atomically persist it.
I don't follow this. If I send 50 PDUs all in the same room, a nice linear chain with no forks, we:
- treat all 50 PDUs as live (so will send them down /sync)
- calculate the state before each PDU (only the earliest incurring a state query hit)
- process each PDU atomically, but not the batch of 50.
So you will see flickering?
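To make that concrete, here is a toy Python model of the 50-PDU chain (the helper logic is illustrative, not Synapse's actual code): every PDU is treated as live and emitted individually, so a syncing client observes all 50 intermediate values even though only one state query is incurred.

```python
# Toy model of the 50-PDU linear chain above (not Synapse internals).
pdus = [{"event_id": f"${i}", "prev_events": [f"${i - 1}"]} for i in range(1, 51)]

sync_buffer = []       # what a syncing client would receive
state_query_hits = 0   # only the earliest PDU needs a state query

for i, pdu in enumerate(pdus):
    if i == 0:
        state_query_hits += 1  # fetch the state before the first PDU
    # state before every later PDU is derived from the previous one,
    # but each PDU is still persisted and emitted on its own
    sync_buffer.append(pdu["event_id"])  # live => sent down /sync

print(state_query_hits)  # 1
print(len(sync_buffer))  # 50: the client repaints through every value
```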
I think flickering of ephemeral per-user state is inevitable if we wish to hide the key we're modifying in the map from the server. It's definitely a security / UX tradeoff to make, though we've increasingly leant on the side of security for quite some time now. What would the implications be for flickering live-location shares or flickering RTC members? The former likely means the location is updated gradually as the server/client catch up. I think RTC members are reasonably static (they don't change mid-call), so flickering RTC members could make it appear that older members are joined to the call who then leave the call a few milliseconds later? Is this a problem for the call state machine? cc @toger5
Obviously if someone sends 50 sticky events in short succession then that will cause "flickering" as things come down live, but that is reflecting the reality that that state is flickering. That's totally fine.
However, if those 50 events happened over the course of an hour and you see them flicker through state changes, then that is a different thing. We have previously made efforts to avoid this sort of flickering on clients.
I think flickering of ephemeral per-user state is inevitable if we wish to hide the key we're modifying in the map from the server
Don't some of the encrypted state proposals allow encrypting the state key as well? Or potentially you could have a pointer to previous sticky events that get superseded, and these are pulled in automatically (and if the server pulls them in then it knows not to treat them as "live")?
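As a sketch of that second idea (the `supersedes` field below is hypothetical; the MSC defines no such pointer): the server follows the links so superseded events are pulled in but never classified as live.

```python
# Hypothetical "pointer to superseded events" sketch, not MSC behaviour.
events_by_id = {
    "$a": {"event_id": "$a", "content": {}},
    "$b": {"event_id": "$b", "content": {"supersedes": "$a"}},
    "$c": {"event_id": "$c", "content": {"supersedes": "$b"}},
}

def split_live_and_stale(incoming_ids):
    """Anything reachable through a supersedes pointer is stale, not live."""
    stale = set()
    for ev_id in incoming_ids:
        prev = events_by_id[ev_id]["content"].get("supersedes")
        while prev is not None and prev not in stale:
            stale.add(prev)
            prev = events_by_id[prev]["content"].get("supersedes")
    live = [e for e in incoming_ids if e not in stale]
    return live, sorted(stale)

print(split_live_and_stale(["$c"]))  # (['$c'], ['$a', '$b'])
```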
Obviously if someone sends 50 sticky events in short succession then that will cause "flickering" as things come down live, but that is reflecting the reality that that state is flickering. That's totally fine. However, if those 50 events happened over the course of an hour and you see them flicker through state changes, then that is a different thing.
Okay, but we have no notion of time in a decentralised system, so how are you in practice determining what is "over the course of an hour": by the sender timestamp or the receiver timestamp? Assuming the receiver timestamp, because it isn't gameable, then by that measure the fact that we sent a bunch of new sticky events over federation on catchup falls into the category of:
that is reflecting the reality that that state is flickering. That's totally fine.
The real concern then is that these events will flicker down to clients because we deliver sticky events down `/sync` in strict chronological-by-receiver-timestamp order. This then begins to overlap with @MadLittleMods's concern about the lack of a pagination endpoint.
Two birds can be killed with one stone by adding such a pagination endpoint to ensure that we actually deliver the newest stuff first in the common case. The net observed behaviour then for clients will be:
- if they are actively syncing when the remote server sends catch up traffic, then it flickers. This is fine.
- if they are NOT actively syncing when the remote server sends catch up traffic, then they will get the most recent events first, reducing the chances of flickering occurring. This assumes the server sends catch up sticky events in chronological order, which we already mandate:
Servers MUST send old sticky events in the order they were created on the server (stream ordering / based on origin_server_ts). This ensures that sticky events appear in roughly the right place in the timeline as servers use the arrival ordering to determine an event's position in the timeline.
This should resolve your concern here?
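As a sketch of the ordering argument (the pagination endpoint itself is hypothetical and not defined by this MSC as written), newest-first pagination would look roughly like:

```python
# Sketch only: the pagination endpoint is hypothetical. The point is purely
# about ordering: a returning client should see the newest values first.
from dataclasses import dataclass

@dataclass
class StickyEvent:
    event_id: str
    stream_order: int  # receiver-side arrival ordering

backlog = [StickyEvent(f"$ev{i}", i) for i in range(5)]

def paginate_newest_first(events, limit=2):
    """Yield pages newest-first, so the client applies the most recent
    value before any stale ones, reducing the chance of flicker."""
    ordered = sorted(events, key=lambda e: e.stream_order, reverse=True)
    for start in range(0, len(ordered), limit):
        yield ordered[start:start + limit]

for page in paginate_newest_first(backlog):
    print([e.event_id for e in page])
# ['$ev4', '$ev3'], then ['$ev2', '$ev1'], then ['$ev0']
```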
Okay, but we have no notion of time in a decentralised system, so how are you in practice determining what is "over the course of an hour": by the sender timestamp or the receiver timestamp? Assuming the receiver timestamp, because it isn't gameable, then by that measure the fact that we sent a bunch of new sticky events over federation on catchup falls into the category of:
I'm not suggesting we do things based on time, I'm suggesting that we use the same behaviour as for state. I.e. when synchronising you send the current/latest state, rather than sending each delta in order. That doesn't require time or anything, and is perfectly possible, as evidenced by the fact we have it today.
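A minimal sketch of that "latest state, not deltas" behaviour, assuming each event exposes a resolvable key (which sticky events may not, if the key is hidden inside encrypted content; that is exactly the tradeoff above):

```python
# Collapse an ordered list of deltas to the latest event per key, as a
# state sync does today: later writes simply win, so no flicker.
deltas = [
    {"type": "m.rtc.member", "state_key": "@alice:s1", "body": "v1"},
    {"type": "m.rtc.member", "state_key": "@alice:s1", "body": "v2"},
    {"type": "m.rtc.member", "state_key": "@bob:s1",   "body": "v1"},
]

def current_state(events):
    state = {}
    for ev in events:
        state[(ev["type"], ev["state_key"])] = ev
    return list(state.values())

print([e["body"] for e in current_state(deltas)])  # ['v2', 'v1']
```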
proposals/4354-sticky-events.md (Outdated)
To implement these properties, servers MUST:
* Attempt to send their own[^origin] sticky events to all joined servers, whilst respecting per-server backoff times.
Moving from #4354 (comment)
how does MatrixRTC handle push notifications for incoming calls? (tangential to this MSC but whatever)
The question is: do we want to use sticky events for MatrixRTC notifications, and if so will that make the flickering problem much more noticeable/problematic?
Naively to me it feels odd to not use sticky events for call notifications, e.g. I'd have thought you would want to be notified for all calls in a DM. If you don't use sticky events you could end up in the situation where you see the call in the UI but not be notified about it.
I don't know. @toger5 will have more context on the tradeoffs and what goes wrong if you do / do not.
The primary concern from the original comment thread was:
Say you have two servers S1 and S2 which share a room, with S2 down. A call is started between multiple people on S1, they chat a bit and then end the call. S2 then comes back online and receives the sticky events for all the call participants in order, and S2 processes them in order (including backfilling and fetching /state etc). On syncing clients I think this would result in the call sticky events trickling down, and e.g. triggering the client to ring and then stop ringing when the call-end is finished processing? This would also presumably trigger push notifications as well?
This scenario is somewhat diverged from reality as there is no "call-end" event iirc, but the general idea is valid:
- Your server is offline.
- A call is placed and ended in a room you're in.
- Your server comes back online and gets all these events.
- Does it cause the client to ring then stop ringing?
My understanding is that "ringing" is just an @ mention, but I lack the context to know how the client knows when to stop "ringing", hence cc @toger5
proposals/4354-sticky-events.md (Outdated)
To implement these properties, servers MUST:
* Attempt to send their own[^origin] sticky events to all joined servers, whilst respecting per-server backoff times.
Moving from #4354 (comment)
we will accumulate more forward extremities when catching up as we are now including sticky events in the initial batch of events when catching up. This is a concern, but having to deal with lots of forward extremities isn't a new concern.
One potential security concern here is that it makes it easier for users on one server to generate lots of extremities on another server, which can lead to performance issues in very large rooms. This only works when the connection between the two servers is down (e.g. the remote server is down).
it makes it easier for users on one server to generate lots of extremities on another server
This is true today via message events surely? Like, I can trivially make lots of events and trickle every Nth message to cause forward extremity accumulation?
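To illustrate the mechanism with a toy model (not Synapse internals): a newly persisted event replaces its prev_events in the forward-extremity set, so withholding the intermediate events means nothing ever gets replaced and the set just grows.

```python
# Toy model of forward-extremity accumulation.
def persist(extremities, event_id, prev_events):
    extremities.difference_update(prev_events)  # replace known parents
    extremities.add(event_id)

extremities = {"$root"}

# The sender creates a linear chain but only delivers every 10th event,
# so each delivered event references a parent this server has never seen.
for i in range(10, 101, 10):
    persist(extremities, f"${i}", prev_events={f"${i - 1}"})

print(len(extremities))  # 11: $root plus one per trickled event
```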
You can't as a user on the server, but yes the server can.
The impact here is it's a DoS risk right?
A user can't unilaterally do that unless they can manipulate the connection between two servers, which feels a bit of a stretch. We lack any kind of threat model here to know if this is something we should care about 🤷🏼 but at least in my view, if you can control the connection then you're likely a server admin of one of the servers, in which case this opens up no additional weakness. If you're just a user, then this attack can't be done reliably: if you can DoS one of the servers to take them offline then, well, that's just an easier way of doing this, since the attack is also a DoS :S In other words, I just don't see how this concretely affects anything which isn't affected already?
Nothing major, but I do have some minor questions, and a bunch of suggestions for making this easier to read (and hence likelier to land)
thus leak metadata. As a result, the key now falls within the encrypted `content` payload, and clients are expected to implement the map-like semantics should they wish to.
[^ttl]: Earlier designs had servers inject a new `unsigned.ttl_ms` field into the PDU to say how many milliseconds were left. This was problematic because it would have to be modified every time the server attempted delivery of the event to another server.
This was problematic because it would have to be modified every time the server attempted delivery of the event to another server.
Doesn't the spec require that today with the `age` field?
Yeah but not over federation. I mostly added this because Erik seemed to think this was a downside in his earlier proposal:
Also having a short expiry makes retries over federation annoying (as they are for events with `age`), since you need to mutate the contents before retrying a request.
Do you want me to add anything to this?
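To spell out the annoyance (field names below are made up for the sketch, not taken from the MSC): a relative TTL goes stale the moment it is written and must be rewritten per delivery attempt, while an absolute expiry can be derived from immutable, signed-once fields at read time.

```python
# Illustrative only: contrasting a mutable relative TTL with an
# absolute expiry derived from origin_server_ts.
import time

def remaining_ms_relative(pdu):
    # Relative design: only correct at the moment the server injected it,
    # so it must be mutated before every federation retry.
    return pdu["unsigned"]["ttl_ms"]

def remaining_ms_absolute(pdu, now_ms):
    # Absolute design: compute what's left from fields that never change.
    expires_at = pdu["origin_server_ts"] + pdu["content"]["duration_ms"]
    return max(0, expires_at - now_ms)

pdu = {
    "origin_server_ts": int(time.time() * 1000) - 10_000,  # sent 10s ago
    "content": {"duration_ms": 60_000},   # hypothetical field name
    "unsigned": {"ttl_ms": 60_000},       # already stale: 10s elapsed
}
print(remaining_ms_relative(pdu))                           # 60000 (wrong)
print(remaining_ms_absolute(pdu, int(time.time() * 1000)))  # ~50000
```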
Co-authored-by: Richard van der Hoff <[email protected]>
Co-authored-by: Timo <[email protected]>
Rendered
SCT Stuff:
- FCP tickyboxes
- MSC checklist