
Premature 'SUBSCRIBED' status for postgres_changes signals incorrect readiness #1599

@motivated2die

Bug report

  • I confirm this is a bug with Supabase, not with my own application.
  • I confirm I have searched the Docs, GitHub Discussions, and Discord.

Describe the bug

There is a subtle but critical race condition in the Realtime subscription process for postgres_changes. The client libraries (e.g., supabase-js) emit a 'SUBSCRIBED' status prematurely, before the backend's PostgreSQL logical replication listener is fully initialized and ready to stream changes.

This leads to a state where the client believes it is successfully subscribed, but any database writes performed within a brief window (approx. 1-3 seconds) immediately following this confirmation will not trigger a postgres_changes event on the client. The event is silently missed.

This issue primarily affects time-sensitive applications that need to perform a database write as a direct result of a successful subscription confirmation.

To Reproduce

The behavior is reproducible in any client-side environment.

  1. Set up a basic client application and a database table (e.g., messages).
  2. In the client code, subscribe to postgres_changes for the messages table.
  3. In the .subscribe() callback, listen for the status === 'SUBSCRIBED' event. When this event is received, immediately enable a button that inserts a new row into the messages table.
  4. In the same component, set up an .on('postgres_changes', ...) listener that logs the payload to the console.
  5. Load the application and, the instant the button becomes enabled, click it.
  6. Observe the failure: the database insert succeeds, but the .on('postgres_changes', ...) listener does not fire and no payload is logged to the console. If you wait a few seconds after the button is enabled before clicking, the event is received correctly.
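
The steps above can be sketched in TypeScript roughly as follows. This is a minimal sketch, not code taken from an actual application: `ClientLike` and `ChannelLike` are hypothetical structural types that mirror only the supabase-js calls used here (`channel`, `on`, `subscribe`, `from(...).insert`), so a real client created with `createClient` may need a cast. The channel name `messages-repro`, the `content` column, and the `reproduce` helper are likewise assumptions.

```typescript
// Hypothetical structural types covering only the calls this sketch uses.
interface ChannelLike {
  on(
    type: 'postgres_changes',
    filter: { event: string; schema: string; table: string },
    cb: (payload: unknown) => void,
  ): ChannelLike;
  subscribe(cb: (status: string) => void): ChannelLike;
}
interface ClientLike {
  channel(name: string): ChannelLike;
  from(table: string): {
    insert(row: Record<string, unknown>): PromiseLike<{ error: unknown }>;
  };
}

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

// Subscribes, waits for 'SUBSCRIBED', inserts after `delayMs`, then waits
// `waitMs` and reports whether the postgres_changes listener fired.
async function reproduce(
  client: ClientLike,
  delayMs: number,
  waitMs: number,
): Promise<boolean> {
  let received = false;
  await new Promise<void>((resolve) => {
    client
      .channel('messages-repro')
      .on(
        'postgres_changes',
        { event: 'INSERT', schema: 'public', table: 'messages' },
        () => { received = true; },
      )
      .subscribe((status) => {
        if (status === 'SUBSCRIBED') resolve();
      });
  });
  await sleep(delayMs); // 0 simulates clicking the instant the button enables
  const { error } = await client.from('messages').insert({ content: 'probe' });
  if (error) throw new Error('insert failed');
  await sleep(waitMs); // give the event time to arrive
  return received;
}
```

Under the behavior described in this report, `reproduce(client, 0, 5000)` would be expected to return `false`, while a call with a `delayMs` of a few seconds would be expected to return `true`.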

Expected behavior

The 'SUBSCRIBED' status should be a reliable signal that the end-to-end connection is established and the client is ready to receive all subsequent postgres_changes events.

When a client performs a database write immediately after receiving the 'SUBSCRIBED' status, the corresponding postgres_changes event should be reliably delivered to the client's listener. There should be no missed events due to an internal initialization delay.

System information

  • OS: All
  • Browser (if applies): All
  • Version of supabase-js: Affects all v2.x.x versions.
  • Version of Node.js: N/A (client-side issue)

This is a platform-level issue and likely affects all client libraries that use the Realtime service in the same way.

Additional context

This issue stems from a discrepancy between the public API contract and the internal workings of the Realtime server.

The Interplay Between Realtime Core and Client Libraries:

  1. Two-Stage Backend Process: An analysis of the Realtime Elixir source code reveals a complex, multi-stage, asynchronous process for establishing a postgres_changes subscription.

    • Stage 1 (Fast): The client joins a Phoenix WebSocket channel. This is a very quick operation.
    • Stage 2 (Slow): The Realtime server then initiates a separate, slower workflow managed by Realtime.Tenants.Connect and Realtime.Tenants.ReplicationConnection. This involves connecting to the project's database, checking for/creating a publication, creating a temporary replication slot, and finally starting the logical replication stream. This process can take several seconds.
  2. Misleading API Contract:

    • The Realtime backend does not emit a single, high-level event that signals the completion of Stage 2. The final "ready" state can only be inferred by observing a second, low-level system message in the WebSocket traffic.
    • The client libraries (like supabase-js) are designed to listen for the confirmation of Stage 1 (the Phoenix channel join). They treat the phx_reply to the join as a successful subscription and translate it into the public 'SUBSCRIBED' status for the developer, even though Stage 2 may still be in progress.

This abstraction, while simple, is misleading. The client library is functioning as designed, but the event it is designed to listen for does not accurately represent the readiness of the postgres_changes stream. The bug is in the API contract between the core service and its clients.
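
To put a number on the gap between the two stages, one could record the raw WebSocket frames with timestamps and compare when each confirmation arrives. The sketch below assumes frames are captured as objects with `topic`, `event`, and `payload` fields, that the first `phx_reply` with status `ok` on the channel's topic is the join reply, and that the Stage 2 system message carries `extension: 'postgres_changes'` and `status: 'ok'`. Those payload fields, the `Frame` type, and the `blindWindowMs` helper are assumptions for illustration, not a documented contract.

```typescript
// Hypothetical shape of a captured frame; `receivedAt` is a millisecond
// timestamp recorded by whatever tool captured the traffic.
interface Frame {
  topic: string;
  event: string;
  payload: { status?: string; extension?: string };
  receivedAt: number;
}

// Returns the time between the join confirmation (Stage 1) and the
// postgres_changes system message (Stage 2) for one topic, or null if
// either frame is missing from the capture.
function blindWindowMs(frames: Frame[], topic: string): number | null {
  const own = frames.filter((f) => f.topic === topic);
  const joined = own.find(
    (f) => f.event === 'phx_reply' && f.payload.status === 'ok',
  );
  const streaming = own.find(
    (f) =>
      f.event === 'system' &&
      f.payload.extension === 'postgres_changes' &&
      f.payload.status === 'ok',
  );
  if (!joined || !streaming) return null;
  return streaming.receivedAt - joined.receivedAt;
}
```

Writes issued inside the returned interval are the ones this report describes as silently missed.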

A robust solution would involve either the backend sending a definitive "replication ready" event, or the client libraries becoming smarter by waiting for the internal system message before emitting the public 'SUBSCRIBED' status.
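
Until one of those fixes exists, the second option can be approximated in application code. The following is a minimal sketch of a readiness gate that reports ready only after both the 'SUBSCRIBED' status and the system message have been seen. `ReadinessGate` and `isPostgresChangesReady` are hypothetical names, and the system payload fields checked here (`extension: 'postgres_changes'`, `status: 'ok'`) are an assumption about the message format rather than a documented guarantee.

```typescript
// Assumed shape of the Stage 2 system message; verify against real traffic.
function isPostgresChangesReady(payload: unknown): boolean {
  if (typeof payload !== 'object' || payload === null) return false;
  const p = payload as { extension?: unknown; status?: unknown };
  return p.extension === 'postgres_changes' && p.status === 'ok';
}

class ReadinessGate {
  private joined = false;
  private streaming = false;
  private resolveReady!: () => void;
  // Resolves the first time both signals have been observed.
  readonly ready: Promise<void>;

  constructor() {
    this.ready = new Promise<void>((resolve) => {
      this.resolveReady = resolve;
    });
  }

  // Feed this from the .subscribe() status callback.
  onStatus(status: string): void {
    if (status === 'SUBSCRIBED') {
      this.joined = true;
    } else {
      // Any other status (e.g. CLOSED, CHANNEL_ERROR) invalidates readiness.
      this.joined = false;
      this.streaming = false;
    }
    this.check();
  }

  // Feed this from a 'system' listener on the same channel.
  onSystem(payload: unknown): void {
    if (isPostgresChangesReady(payload)) this.streaming = true;
    this.check();
  }

  get isReady(): boolean {
    return this.joined && this.streaming;
  }

  private check(): void {
    if (this.isReady) this.resolveReady();
  }
}

// Possible wiring with supabase-js (not executed here):
//   const gate = new ReadinessGate();
//   supabase.channel('messages')
//     .on('system', {}, (payload) => gate.onSystem(payload))
//     .on('postgres_changes', { event: 'INSERT', schema: 'public', table: 'messages' }, handler)
//     .subscribe((status) => gate.onStatus(status));
//   await gate.ready; // only now enable the insert button
```

Because a promise resolves only once, `ready` covers the initial subscription only; after a reconnect, `isReady` is the value to consult.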

Labels: bug (Something isn't working), realtime-js (Related to the realtime-js library.)
