Skip to content

[Bug]: deleting multiple continuous aggregates with refresh policies in the same transaction sometimes deadlocks #8636

@jedwards1211

Description

@jedwards1211

What type of bug is this?

Locking issue

What subsystems and features are affected?

Continuous aggregate

What happened?

We're using triggers to dynamically create and delete continuous aggregates for different timezones, based upon which timezones are present in a Sites table.

We intermittently get deadlocks when multiple Sites are deleted in one transaction. The deadlock is between our trigger that deletes the continuous aggregates and the background worker execution.

Looking at the timescaledb source code, what I believe the sequence of events is

[pid 55687] Runs trigger for DELETE FROM "Sites" ... that will drop caggs corresponding to jobs A and B
[pid 55687] DROP MATERIALIZED VIEW FOR job A acquires the advisory lock for job A
[pid 55687] DROP MATERIALIZED VIEW FOR job A acquires the ROW EXCLUSIVE lock on bgw_job
[pid 55728] background worker entry for job B acquires the advisory lock for job B
[pid 55728] background worker entry for job B waits to acquire a SHARE lock on bgw_job, blocked by pid 55687
[pid 55687] DROP MATERIALIZED VIEW FOR job B waits to acquire an advisory lock on job B (blocked by pid 55728)

It would be nice to be able to delete multiple continuous aggregates in the same transaction without running the risk of a deadlock. We have no simple way to make a trigger delete each continuous aggregate in a separate transaction; we could make the trigger create a separate job to delete each continuous aggregate, or we could always have a recurring job that cleans up unneeded caggs, but either way would be more complicated than if we can just delete the caggs from this trigger.

TimescaleDB version affected

2.15.1

PostgreSQL version used

16.3

What operating system did you use?

macOS Sequoia 15.6.1

What installation method did you use?

Docker

What platform did you run on?

Not applicable

Relevant log output and stack trace

db-1  | 2025-09-16 00:58:19.878 UTC [55727] ERROR:  canceling statement due to user request
db-1  | 2025-09-16 00:58:19.880 UTC [1] LOG:  background worker "Refresh Continuous Aggregate Policy [1495]" (PID 55727) exited with exit code 1
db-1  | 2025-09-16 00:58:20.887 UTC [55687] ERROR:  deadlock detected
db-1  | 2025-09-16 00:58:20.887 UTC [55687] DETAIL:  Process 55687 waits for AccessExclusiveLock on advisory lock [2996774,1496,0,29749]; blocked by process 55728.
db-1  |         Process 55728 waits for ShareLock on relation 3017475 of database 2996774; blocked by process 55687.
db-1  |         Process 55687: DELETE FROM "Sites" WHERE "id" IN (423)
db-1  |         Process 55728: CALL _timescaledb_functions.policy_refresh_continuous_aggregate()
db-1  | 2025-09-16 00:58:20.887 UTC [55687] HINT:  See server log for query details.
db-1  | 2025-09-16 00:58:20.887 UTC [55687] CONTEXT:  SQL statement "DROP MATERIALIZED VIEW IF EXISTS accounting_entries_1day_etc_utc CASCADE;"
db-1  |         PL/pgSQL function delete_continuous_aggregates_for_deleted_site() line 16 at EXECUTE
db-1  | 2025-09-16 00:58:20.887 UTC [55687] STATEMENT:  DELETE FROM "Sites" WHERE "id" IN (423)
db-1  | 2025-09-16 00:58:20.888 UTC [47268] LOG:  job 1495 failed
db-1  | 2025-09-16 00:58:20.889 UTC [55728] LOG:  job 1496 threw an error
db-1  | 2025-09-16 00:58:20.890 UTC [55728] ERROR:  canceling statement due to user request
db-1  | 2025-09-16 00:58:20.890 UTC [55728] CONTEXT:  SQL statement "SELECT pg_catalog.max("startTime") FROM public."AccountingEntries""
db-1  | 2025-09-16 00:58:20.892 UTC [1] LOG:  background worker "Refresh Continuous Aggregate Policy [1496]" (PID 55728) exited with exit code 1

How can we reproduce the bug?

CREATE TABLE "Sites" (
    "id" INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    "timezone" TEXT NOT NULL,
    "createdAt" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    "updatedAt" TIMESTAMPTZ NOT NULL NOT NULL DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE "AccountingEntries" (
    "siteId" INTEGER NOT NULL,
    "amount" FLOAT NOT NULL,
    "startTime" TIMESTAMPTZ NOT NULL,
    "interval" INTERVAL NOT NULL DEFAULT '5 minutes',
    "createdAt" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
    CONSTRAINT "AccountingEntries_siteId_fkey" FOREIGN KEY ("siteId") REFERENCES "Sites" ("id") ON UPDATE CASCADE ON DELETE CASCADE,
    CONSTRAINT "AccountingEntries_interval_check" CHECK (
        "interval" = '5 minutes'
        OR "interval" = '15 minutes'
        OR "interval" = '1 hour'
        OR "interval" = '1 month'
    )
);
SELECT create_hypertable('"AccountingEntries"', 'startTime');

--------------------------------------------------------------------------------
-- Triggers to create and delete continuous aggregates

CREATE OR REPLACE FUNCTION convert_timezone_to_snake_case(tz TEXT) RETURNS TEXT AS $$ BEGIN
    RETURN lower(regexp_replace(tz, '[^a-zA-Z0-9]+', '_', 'g'));
END; $$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION create_continuous_aggregates_for_timezone() 
RETURNS trigger AS $$
DECLARE
    new_tz TEXT := NEW.timezone;
    new_tz_snake TEXT := convert_timezone_to_snake_case(new_tz);
    interval_settings TEXT[] := ARRAY['15min', '1hour', '1day', '1month'];
    interval_durations TEXT[] := ARRAY[
        '15 minutes', '1 hour', '1 day', '1 month'
    ];
    policies_schedule INTERVAL[] := ARRAY[
        INTERVAL '15 minutes', INTERVAL '1 hour', INTERVAL '1 day', INTERVAL '1 month'
    ];
    idx INT;
    interval_name TEXT;
    duration TEXT;
    schedule INTERVAL;
    timezone_in_use BOOLEAN;
BEGIN
    FOR idx IN 1..array_length(interval_settings, 1) LOOP
        interval_name := interval_settings[idx];
        duration := interval_durations[idx];
        schedule := policies_schedule[idx];

        -- Create the new materialized view
        EXECUTE format($sql$
            CREATE MATERIALIZED VIEW IF NOT EXISTS accounting_entries_%s_%s 
            WITH (timescaledb.continuous) AS
            SELECT time_bucket('%s', "startTime", '%s') AS "startTime",
                "siteId",
                SUM(amount) AS "amount",
                '%s' AS "interval"
            FROM "AccountingEntries"
            WHERE "interval" IN ('5 minutes', '15 minutes', '1 hour', '1 day', '1 month')
            GROUP BY 1, 2, 3
            WITH NO DATA;
        $sql$, interval_name, new_tz_snake, duration, new_tz, duration);

        EXECUTE format($sql$
            SELECT add_continuous_aggregate_policy(
                'accounting_entries_%s_%s',
                start_offset => NULL,
                end_offset => NULL,
                schedule_interval => '%s',
                if_not_exists => TRUE
            );
        $sql$, interval_name, new_tz_snake, schedule);
    END LOOP;

    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION delete_continuous_aggregates_for_deleted_site() 
RETURNS trigger AS $$
DECLARE
    interval_settings TEXT[] := ARRAY['15min', '1hour', '1day', '1month'];
    interval_name TEXT;
BEGIN
    IF NOT EXISTS (SELECT FROM "Sites" WHERE timezone = OLD.timezone) THEN
        FOR interval_name IN SELECT UNNEST(interval_settings) LOOP
            EXECUTE format(
                'DROP MATERIALIZED VIEW IF EXISTS accounting_entries_%s_%s CASCADE;',
                interval_name,
                convert_timezone_to_snake_case(OLD.timezone)
            );
        END LOOP;
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER delete_continuous_aggregates_trigger
AFTER DELETE ON "Sites"
FOR EACH ROW
EXECUTE FUNCTION delete_continuous_aggregates_for_deleted_site();

CREATE TRIGGER create_continuous_aggregates_trigger_insert
AFTER INSERT ON "Sites"
FOR EACH ROW
EXECUTE FUNCTION create_continuous_aggregates_for_timezone();

--------------------------------------------------------------------------------

INSERT INTO "Sites" (timezone) VALUES
    ('America/Chicago'), ('America/New_York'), ('Etc/UTC');

-- if the following statement is run at just the right time when a background
-- job is starting, it will cause a deadlock:

DELETE FROM "Sites" WHERE timezone IN ('America/Chicago', 'America/New_York');

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions