fix(ci): repair GCP resource cleanup so it deletes old disks by gustavovalverde · Pull Request #10767 · ZcashFoundation/zebra

gustavovalverde · 2026-06-22T13:32:15Z

Motivation

The daily "Delete GCP resources" job deleted nothing: it passed each disk to gcloud as a single quoted "<name> --zone=<zone>" argument, which gcloud rejected as underspecified. Orphaned per-commit test disks then piled up until the dev project's regional SSD quota filled and integration tests failed at instance creation with Quota 'SSD_TOTAL_GB' exceeded.

Solution

Pass the disk name and its --zone/--region flag as separate arguments, add --quiet, and delete only unattached disks. The instance, template, and cache-image cleanups get the same --quiet fix. Scope the cleanup to the dev project.
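
The quoting problem can be reproduced without touching GCP. The sketch below is illustrative only: `show_args` is a hypothetical stand-in for gcloud that reports how many arguments it received, and the disk name and zone are made up.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for gcloud: prints the argument count and each argument.
show_args() {
  printf '%d args:' "$#"
  printf ' [%s]' "$@"
  printf '\n'
}

NAME="test-disk-abc123"   # hypothetical disk name
ZONE="us-east1-b"         # hypothetical zone

# Broken pattern: name and flag folded into one string, then quoted,
# so the callee sees a single argument containing a space.
COMBINED="${NAME} --zone=${ZONE}"
BROKEN="$(show_args "${COMBINED}")"

# Fixed pattern: name and flag are separate words, plus --quiet.
FIXED="$(show_args "${NAME}" "--zone=${ZONE}" --quiet)"

echo "${BROKEN}"
echo "${FIXED}"
```

In the broken case the flag is never parsed as a flag, which is consistent with gcloud treating the resource as underspecified.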

AI Disclosure

AI tools were used: Claude Code.

PR Checklist

The PR title follows conventional commits format.
The PR follows the contribution guidelines.
This change was discussed in an issue or with the team beforehand.
The solution is tested.
The documentation and changelogs are up to date.

The daily cleanup deleted nothing: it passed each disk to gcloud as a single quoted "<name> --zone=<zone>" argument, which gcloud rejected as underspecified. The swallowed failure let orphaned per-commit test disks pile up until the regional SSD_TOTAL_GB quota filled and integration tests failed at instance creation. Pass the name and --zone/--region as separate arguments, add --quiet, and delete only unattached disks. Scope the cleanup to the dev project.

Copilot

Pull request overview

This PR fixes the daily "Delete GCP resources" workflow, which was silently deleting nothing. The previous scripts used sed to fold a resource's name and its --zone/--region flag into a single string, then passed that string as one shell‑quoted argument (e.g. "<name> --zone=<zone>"); gcloud rejected it as underspecified. As a result, orphaned per‑commit test disks accumulated until the dev project's regional SSD quota filled and integration tests failed with Quota 'SSD_TOTAL_GB' exceeded. The fix passes the name and location flag as separate arguments, adds --quiet so non‑interactive runs don't abort at the confirmation prompt, restricts disk deletion to unattached disks (-users:*), and scopes the job to the dev environment.

Changes:

Rewrote disk/instance deletion loops to parse tab‑separated value(...) output with while IFS=$'\t' read and pass --zone/--region as separate args.
Added --quiet to disk, instance, template, and cache‑image deletes; limited disk cleanup to unattached disks via the -users:* filter.
Replaced the [dev, prod] cleanup matrix with a single environment: dev, narrowing cleanup to the dev project.
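
A minimal sketch of the loop shape described in the first two items, not the PR's actual script. `LISTING` is a canned stand-in for the tab-separated listing output; the two-column name/zone layout, the disk names, and the zones are assumptions. The loop runs in dry-run mode by prefixing each command with `echo`.

```shell
#!/usr/bin/env bash
# Canned stand-in for tab-separated "name<TAB>zone" listing output.
# The trailing empty line exercises the empty-input guard.
LISTING=$'disk-a\tus-east1-b\ndisk-b\tus-east1-c\n'

RUNNER=(echo)   # dry run: print each command instead of executing it
PLANNED=()      # record of the commands that would run

while IFS=$'\t' read -r NAME ZONE; do
  # Skip blank lines so an empty listing never produces a delete call.
  [[ -z "${NAME}" ]] && continue
  # Name and location flag are separate array elements, hence separate arguments.
  CMD=(gcloud compute disks delete "${NAME}" "--zone=${ZONE}" --quiet)
  PLANNED+=("${CMD[*]}")
  "${RUNNER[@]}" "${CMD[@]}"
done <<< "${LISTING}"
```

Building the command as an array keeps each argument intact even if a value ever contained whitespace.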

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `.github/workflows/scripts/gcp-delete-old-disks.sh` | Refactors to a `delete_disks` helper; separates name/location args, adds `--quiet`, and filters to unattached disks. |
| `.github/workflows/scripts/gcp-delete-old-instances.sh` | Replaces IFS-juggling loop with `while read`; passes `--zone` separately and adds `--quiet`. |
| `.github/workflows/scripts/gcp-delete-old-templates.sh` | Adds `--quiet` to the template delete. |
| `.github/workflows/scripts/gcp-delete-old-cache-images.sh` | Adds `--quiet` to the image delete. |
| `.github/workflows/zfnd-delete-gcp-resources.yml` | Removes the `[dev, prod]` matrix in favor of `environment: dev` and drops now‑obsolete comments. |

The shell refactors are correct: the here-string loops run in the current shell (so exit and error handling work), IFS=$'\t' is scoped to the read builtin, empty input is guarded with the [[ -z "${NAME}" ]] check, and the -users:* filter is valid gcloud syntax for unattached disks. The one item worth a maintainer's attention is that this also drops prod cleanup, which is a behavioral change bundled with the bugfix.
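
The two shell properties mentioned here (the here-string loop runs in the current shell, and `IFS` is scoped to `read`) can be checked with a small sketch. The pipeline variant is shown only as a common alternative for contrast; it is not necessarily what the old scripts did, and the input data is made up.

```shell
#!/usr/bin/env bash
INPUT=$'a\t1\nb\t2'
IFS_BEFORE="${IFS}"

# Pipeline: in bash's default configuration the loop runs in a subshell,
# so changes to PIPE_COUNT are lost when the pipeline finishes.
PIPE_COUNT=0
printf '%s\n' "${INPUT}" | while IFS=$'\t' read -r KEY VAL; do
  PIPE_COUNT=$((PIPE_COUNT + 1))
done

# Here-string: the loop runs in the current shell, so HERE_COUNT survives
# (and an `exit` inside the loop would end the whole script).
HERE_COUNT=0
while IFS=$'\t' read -r KEY VAL; do
  HERE_COUNT=$((HERE_COUNT + 1))
done <<< "${INPUT}"

# IFS was set only for each `read` invocation; the shell's own IFS is unchanged.
[[ "${IFS}" == "${IFS_BEFORE}" ]] && IFS_UNCHANGED=yes || IFS_UNCHANGED=no

echo "pipe=${PIPE_COUNT} here=${HERE_COUNT} ifs_unchanged=${IFS_UNCHANGED}"
```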

gustavovalverde · 2026-06-26T12:36:46Z

@oxarbitrage ready

mergify · 2026-06-26T14:34:12Z

Queued — the merge queue status continues in this comment ↓.

mergify · 2026-06-26T14:34:27Z

github-advanced-security AI found potential problems Jun 22, 2026

Comment thread .github/workflows/zfnd-deploy-integration-tests-gcp.yml Fixed

Comment thread .github/workflows/zfnd-deploy-integration-tests-gcp.yml Fixed

Comment thread .github/workflows/zfnd-deploy-integration-tests-gcp.yml Fixed

gustavovalverde force-pushed the fix/gcp-disk-cleanup branch from dd936aa to 5c156ae Compare June 22, 2026 14:01

gustavovalverde changed the title ~~fix(ci): repair GCP disk cleanup and close test-disk leak~~ fix(ci): repair GCP resource cleanup so it deletes old disks Jun 22, 2026

oxarbitrage reviewed Jun 22, 2026

Comment thread .github/workflows/scripts/gcp-delete-old-disks.sh Outdated

Merge branch 'main' into fix/gcp-disk-cleanup

d9149fe

gustavovalverde requested a review from Copilot June 26, 2026 10:32

Copilot started reviewing on behalf of gustavovalverde June 26, 2026 10:32 View session

Copilot AI reviewed Jun 26, 2026

Comment thread .github/workflows/zfnd-delete-gcp-resources.yml

fix(ci): keep persistent zebrad-cache disks out of cleanup

effb95d

The "zebrad-" disk filter matched zebrad-cache-* chain-state disks, which the -users:* guard only protects while attached. Exclude them so an unattached cached-state disk is never deleted.
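
One way to express this exclusion is a name check applied before any delete. This is a sketch under assumptions: the commit may instead exclude these disks in the gcloud list filter, `is_deletable` is a hypothetical helper, and the sample disk names are invented. Only the `zebrad-` and `zebrad-cache-*` patterns come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a disk name is eligible for cleanup.
is_deletable() {
  case "$1" in
    zebrad-cache-*) return 1 ;;  # persistent chain-state disk: never delete
    zebrad-*)       return 0 ;;  # other zebrad- disks: eligible
    *)              return 1 ;;  # outside this sketch's scope
  esac
}

DELETABLE=()
KEPT=()
# Invented sample names, for illustration only.
for NAME in zebrad-cache-abc123-mainnet zebrad-pr-1234-test other-disk; do
  if is_deletable "${NAME}"; then
    DELETABLE+=("${NAME}")
  else
    KEPT+=("${NAME}")
  fi
done

echo "deletable: ${DELETABLE[*]}"
echo "kept: ${KEPT[*]}"
```

The `zebrad-cache-*` branch must come before the broader `zebrad-*` branch, because `case` takes the first matching pattern.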

gustavovalverde added the C-bug (Category: This is a bug), A-devops (Area: Pipelines, CI/CD and Dockerfiles), I-heavy (Problems with excessive memory, disk, or CPU usage), and P-High 🔥 labels Jun 26, 2026

oxarbitrage approved these changes Jun 26, 2026

mergify Bot added the queued label Jun 26, 2026

mergify Bot added a commit that referenced this pull request Jun 26, 2026

Merge of #10767

c393049

mergify Bot mentioned this pull request Jun 26, 2026

merge queue: checking main (d9a69f1), #10823 and #10767 together #10839

Closed

40 tasks

mergify Bot added dequeued and removed queued labels Jun 26, 2026

gustavovalverde wants to merge 3 commits into main from fix/gcp-disk-cleanup

6 participants
