-
Notifications
You must be signed in to change notification settings - Fork 971
Description
What type of bug is this?
Locking issue
What subsystems and features are affected?
Continuous aggregate
What happened?
We're using triggers to dynamically create and delete continuous aggregates for different timezones, based upon which timezones are present in a Sites
table.
We intermittently get deadlocks when multiple Sites are deleted in one transaction. The deadlock is between our trigger that deletes the continuous aggregates and the background worker execution.
Looking at the timescaledb source code, what I believe the sequence of events is
[pid 55687] Runs trigger for DELETE FROM "Sites" ... that will drop caggs corresponding to jobs A and B
[pid 55687] DROP MATERIALIZED VIEW FOR job A acquires the advisory lock for job A
[pid 55687] DROP MATERIALIZED VIEW FOR job A acquires the ROW EXCLUSIVE lock on bgw_job
[pid 55728] background worker entry for job B acquires the advisory lock for job B
[pid 55728] background worker entry for job B waits to acquire a SHARE lock on bgw_job, blocked by pid 55687
[pid 55687] DROP MATERIALIZED VIEW FOR job B waits to acquire an advisory lock on job B (blocked by pid 55728)
It would be nice to be able to delete multiple continuous aggregates in the same transaction without running the risk of a deadlock. We have no simple way to make a trigger delete each continuous aggregate in a separate transaction; we could make the trigger create a separate job to delete each continuous aggregate, or we could always have a recurring job that cleans up unneeded caggs, but either way would be more complicated than if we can just delete the caggs from this trigger.
TimescaleDB version affected
2.15.1
PostgreSQL version used
16.3
What operating system did you use?
macOS Sequoia 15.6.1
What installation method did you use?
Docker
What platform did you run on?
Not applicable
Relevant log output and stack trace
db-1 | 2025-09-16 00:58:19.878 UTC [55727] ERROR: canceling statement due to user request
db-1 | 2025-09-16 00:58:19.880 UTC [1] LOG: background worker "Refresh Continuous Aggregate Policy [1495]" (PID 55727) exited with exit code 1
db-1 | 2025-09-16 00:58:20.887 UTC [55687] ERROR: deadlock detected
db-1 | 2025-09-16 00:58:20.887 UTC [55687] DETAIL: Process 55687 waits for AccessExclusiveLock on advisory lock [2996774,1496,0,29749]; blocked by process 55728.
db-1 | Process 55728 waits for ShareLock on relation 3017475 of database 2996774; blocked by process 55687.
db-1 | Process 55687: DELETE FROM "Sites" WHERE "id" IN (423)
db-1 | Process 55728: CALL _timescaledb_functions.policy_refresh_continuous_aggregate()
db-1 | 2025-09-16 00:58:20.887 UTC [55687] HINT: See server log for query details.
db-1 | 2025-09-16 00:58:20.887 UTC [55687] CONTEXT: SQL statement "DROP MATERIALIZED VIEW IF EXISTS accounting_entries_1day_etc_utc CASCADE;"
db-1 | PL/pgSQL function delete_continuous_aggregates_for_deleted_site() line 16 at EXECUTE
db-1 | 2025-09-16 00:58:20.887 UTC [55687] STATEMENT: DELETE FROM "Sites" WHERE "id" IN (423)
db-1 | 2025-09-16 00:58:20.888 UTC [47268] LOG: job 1495 failed
db-1 | 2025-09-16 00:58:20.889 UTC [55728] LOG: job 1496 threw an error
db-1 | 2025-09-16 00:58:20.890 UTC [55728] ERROR: canceling statement due to user request
db-1 | 2025-09-16 00:58:20.890 UTC [55728] CONTEXT: SQL statement "SELECT pg_catalog.max("startTime") FROM public."AccountingEntries""
db-1 | 2025-09-16 00:58:20.892 UTC [1] LOG: background worker "Refresh Continuous Aggregate Policy [1496]" (PID 55728) exited with exit code 1
How can we reproduce the bug?
CREATE TABLE "Sites" (
"id" INT GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
"timezone" TEXT NOT NULL,
"createdAt" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
"updatedAt" TIMESTAMPTZ NOT NULL NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE "AccountingEntries" (
"siteId" INTEGER NOT NULL,
"amount" FLOAT NOT NULL,
"startTime" TIMESTAMPTZ NOT NULL,
"interval" INTERVAL NOT NULL DEFAULT '5 minutes',
"createdAt" TIMESTAMPTZ NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT "AccountingEntries_siteId_fkey" FOREIGN KEY ("siteId") REFERENCES "Sites" ("id") ON UPDATE CASCADE ON DELETE CASCADE,
CONSTRAINT "AccountingEntries_interval_check" CHECK (
"interval" = '5 minutes'
OR "interval" = '15 minutes'
OR "interval" = '1 hour'
OR "interval" = '1 month'
)
);
SELECT create_hypertable('"AccountingEntries"', 'startTime');
--------------------------------------------------------------------------------
-- Triggers to create and delete continuous aggregates
CREATE OR REPLACE FUNCTION convert_timezone_to_snake_case(tz TEXT) RETURNS TEXT AS $$ BEGIN
RETURN lower(regexp_replace(tz, '[^a-zA-Z0-9]+', '_', 'g'));
END; $$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION create_continuous_aggregates_for_timezone()
RETURNS trigger AS $$
DECLARE
new_tz TEXT := NEW.timezone;
new_tz_snake TEXT := convert_timezone_to_snake_case(new_tz);
interval_settings TEXT[] := ARRAY['15min', '1hour', '1day', '1month'];
interval_durations TEXT[] := ARRAY[
'15 minutes', '1 hour', '1 day', '1 month'
];
policies_schedule INTERVAL[] := ARRAY[
INTERVAL '15 minutes', INTERVAL '1 hour', INTERVAL '1 day', INTERVAL '1 month'
];
idx INT;
interval_name TEXT;
duration TEXT;
schedule INTERVAL;
timezone_in_use BOOLEAN;
BEGIN
FOR idx IN 1..array_length(interval_settings, 1) LOOP
interval_name := interval_settings[idx];
duration := interval_durations[idx];
schedule := policies_schedule[idx];
-- Create the new materialized view
EXECUTE format($sql$
CREATE MATERIALIZED VIEW IF NOT EXISTS accounting_entries_%s_%s
WITH (timescaledb.continuous) AS
SELECT time_bucket('%s', "startTime", '%s') AS "startTime",
"siteId",
SUM(amount) AS "amount",
'%s' AS "interval"
FROM "AccountingEntries"
WHERE "interval" IN ('5 minutes', '15 minutes', '1 hour', '1 day', '1 month')
GROUP BY 1, 2, 3
WITH NO DATA;
$sql$, interval_name, new_tz_snake, duration, new_tz, duration);
EXECUTE format($sql$
SELECT add_continuous_aggregate_policy(
'accounting_entries_%s_%s',
start_offset => NULL,
end_offset => NULL,
schedule_interval => '%s',
if_not_exists => TRUE
);
$sql$, interval_name, new_tz_snake, schedule);
END LOOP;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION delete_continuous_aggregates_for_deleted_site()
RETURNS trigger AS $$
DECLARE
interval_settings TEXT[] := ARRAY['15min', '1hour', '1day', '1month'];
interval_name TEXT;
BEGIN
IF NOT EXISTS (SELECT FROM "Sites" WHERE timezone = OLD.timezone) THEN
FOR interval_name IN SELECT UNNEST(interval_settings) LOOP
EXECUTE format(
'DROP MATERIALIZED VIEW IF EXISTS accounting_entries_%s_%s CASCADE;',
interval_name,
convert_timezone_to_snake_case(OLD.timezone)
);
END LOOP;
END IF;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER delete_continuous_aggregates_trigger
AFTER DELETE ON "Sites"
FOR EACH ROW
EXECUTE FUNCTION delete_continuous_aggregates_for_deleted_site();
CREATE TRIGGER create_continuous_aggregates_trigger_insert
AFTER INSERT ON "Sites"
FOR EACH ROW
EXECUTE FUNCTION create_continuous_aggregates_for_timezone();
--------------------------------------------------------------------------------
INSERT INTO "Sites" (timezone) VALUES
('America/Chicago'), ('America/New_York'), ('Etc/UTC');
-- if the following statement is run at just the right time when a background
-- job is starting, it will cause a deadlock:
DELETE FROM "Sites" WHERE timezone IN ('America/Chicago', 'America/New_York');