Add offline backup for foremanctl by sjha4 · Pull Request #507 · theforeman/foremanctl

sjha4 · 2026-05-13T15:23:17Z

Implements comprehensive offline backup functionality for Foreman deployments:

Backs up all databases (foreman, candlepin, pulp, 5 IOP DBs)
Backs up podman secrets, networks, volumes, quadlet files
Backs up systemd units and foremanctl state
Includes metadata with container image digests for restore compatibility
Preflight checks for running tasks and database integrity (amcheck)
Automatic service restoration on failure

Why are you introducing these changes? (Problem description, related links)

What are the changes introduced in this pull request?

Offline backup

How to test this pull request

I used a foremanctl box with a normal deploy. On this box, clone the foremanctl repo and check out this branch.
cd /root/foremanctl
source .venv/bin/activate
export OBSAH_STATE=/var/lib/foremanctl

Then try ./foremanctl --help

Also,

(.venv) [root@ip-10-0-167-40 foremanctl]# ./foremanctl backup  --help
usage: foremanctl backup [-h] [-v] [--incremental] [--online] [--skip-pulp-content] [--tar-volume-size TAR_VOLUME_SIZE] [--wait-for-tasks]
                         backup_dir

Create offline backup of Foreman databases and configuration

positional arguments:
  backup_dir            Directory where backup files will be stored

optional arguments:
  -h, --help            show this help message and exit
  -v, --verbose         verbose output
  --incremental         Perform incremental backup (not yet implemented)
  --online              Perform online backup without stopping services (not yet implemented)
  --skip-pulp-content   Skip Pulp content directory backup (not yet implemented)
  --tar-volume-size TAR_VOLUME_SIZE
                        Split tar archives at specified size in MB (not yet implemented, for Pulp content only)
  --wait-for-tasks      Wait for running tasks to complete instead of failing immediately

The help output has placeholders for --incremental, --online, --skip-pulp-content and --tar-volume-size, which will be implemented in follow-up cards/PRs.

You can run:

cd /root/foremanctl
source .venv/bin/activate
export OBSAH_STATE=/var/lib/foremanctl
./foremanctl backup /var/tmp/foreman-backup-test --wait-for-tasks

I have a dummy restore script which can be used for testing, and also for getting the steps to run manually when testing:

Download the script

wget https://gist.githubusercontent.com/sjha4/35d98b318f15753a678a406fb0fb14ad/raw/test-restore-final.sh

Make it executable

chmod +x test-restore-final.sh

Run restore

./test-restore-final.sh /path/to/backup/foreman-backup-TIMESTAMP

Steps to reproduce:

On a foremanctl box, run foremanctl backup BACKUP_DIR

Checklist

Tests added/updated (if applicable)
Documentation updated (if applicable)

sjha4 · 2026-05-13T19:33:17Z

+    # Critical volumes to backup
+    critical_podman_volumes:
+      - iop-core-kafka-data
+      - iop-service-vmaas-data


I have not tested IOP backups outside of DB extensively with this. Will need some eyes here.

ianballou

I did an automated test run. Take these with a grain of salt, but I wanted to post early in case they're real, to leave extra time to test and fix.

Firstly, the DBs did get backed up, so awesome, but there were some hiccups that stopped the full process from running:

Bug #1: podman_network.yaml fails when no custom networks exist

File: src/playbooks/backup/tasks/podman_network.yaml
Severity: Blocker — prevents backup from completing
Description: The shell command podman network ls --format '{{.Name}}' | grep -v '^podman$' | while read net; do ... returns exit code 1 when there are no custom networks, because grep -v finds no matching lines.
Fix: Add || true after the grep, or use a different approach:

# Option A: tolerate empty result
  failed_when: networks_json.rc not in [0, 1]

# Option B: check first, skip if no custom networks
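
Option A above could look roughly like the following as a complete task. This is only a sketch: the register name `networks_json` comes from the snippet above, the second task is a hypothetical placeholder for the per-network export, and `grep` is deliberately the last command in the pipe so that its exit code is the one Ansible sees.

```yaml
- name: List custom podman networks
  ansible.builtin.shell:
    cmd: podman network ls --format {% raw %}'{{.Name}}'{% endraw %} | grep -v '^podman$'
  register: networks_json
  changed_when: false
  # grep exits 0 when it selected lines, 1 when it selected none, 2 on a real error
  failed_when: networks_json.rc not in [0, 1]

- name: Report custom networks (hypothetical placeholder for the export step)
  ansible.builtin.debug:
    msg: "Custom network found: {{ item }}"
  loop: "{{ networks_json.stdout_lines }}"
  # an empty list simply means there is nothing to do
  when: networks_json.stdout_lines | length > 0
```

Accepting only return codes 0 and 1 keeps genuine failures (for example podman itself erroring out) visible, which a blanket `|| true` would hide.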

Bug #2: Wrong Foreman tasks API endpoint

File: src/playbooks/backup/tasks/preflight.yaml
Severity: High — preflight silently skips running task detection
Description: Uses https://{{ fqdn }}/api/v2/tasks?state=running which returns 404. The correct endpoint is https://{{ fqdn }}/foreman_tasks/api/tasks?state=running&search=state%3Drunning. Because failed_when: false is set, the error is silently ignored.
Impact: Backups will proceed even with running Foreman tasks, risking data inconsistency.

Bug #3: `pg_isready` and `pg_dump` not available on host

... I cut the output here, I'm not sure why these commands weren't on my box. It's not related to this PR I don't think.

Bug #4: Hardcoded parameters.yaml path in metadata task

File: src/playbooks/backup/tasks/metadata.yaml
Severity: Low — affects metadata accuracy only
Description: ansible.builtin.slurp reads from /var/lib/foremanctl/parameters.yaml but foremanctl's state directory is configurable via OBSAH_STATE. In dev/vagrant setups, the actual path is different (e.g., /vagrant/.var/lib/foremanctl/parameters.yaml).
Impact: enabled_features: [] in metadata despite features being configured. Doesn't affect DB dump correctness.
Fix: Use the state_dir variable instead of hardcoding the path.

sjha4 · 2026-05-14T18:12:47Z

Will update the tasks endpoint and parameters.yaml path..

About the podman networks, those are created for IOP.. Like https://github.com/theforeman/foremanctl/blob/master/src/roles/iop_network/tasks/main.yaml so I'd assume we have that present in production deployments. We can add some handling for when it's not.

aidenfine · 2026-05-14T18:14:33Z

Maybe nitpick but does it make sense to include the not yet implemented flags when doing backup --help? Would you be opposed to just leaving them out in this PR?

sjha4 · 2026-05-14T18:27:29Z

Maybe nitpick but does it make sense to include the not yet implemented flags when doing backup --help? Would you be opposed to just leaving them out in this PR?

I am fine either way but it's helpful guidance for future PRs and documentation to look at.

ianballou · 2026-05-14T18:10:43Z

@@ -0,0 +1,40 @@
+---
+- name: List podman networks


Are podman networks something we need to backup?

Required for IOP. Not sure if we need a backup for this. I guess restore can just rely on foremanctl deploy if it can, and the backup files here can be for reference/verification.

foremanctl can redeploy all of that, no need to backup

ianballou · 2026-05-14T18:11:33Z

+- name: Export critical volumes
+  ansible.builtin.command:
+    cmd: podman volume export {{ item }} -o {{ backup_dir_full }}/volume-{{ item }}.tar
+  loop: "{{ critical_podman_volumes }}"


Cool, this just backs up the important volumes that foreman-maintain backs up today.

ianballou · 2026-05-14T18:16:45Z

@@ -0,0 +1,190 @@
+---
+# Preflight checks for backup operation


Part of me wonders if the checks should be abstracted out to roles if they will be helpful outside of backup. I suppose we could cross that bridge when we get there. Eventually I envision foremanctl having a library of checks much like foreman-maintain. These checks will go into different playbooks, but even then the checks may run or not run depending on the configuration. Certain flavors or configurations will make some checks applicable and others not.

There is src/roles/checks/ today, which looks like it is meant to be a single spot for all checks.

If any of these checks seem applicable to deployment or perhaps even health which is going to be implemented soon, it might be worth starting to pull the checks out somewhere.

The main things here are checking for active tasks in Foreman and Pulp, and running amcheck on the DBs to be backed up. The DB index checks can be helpful for health and generic checks. Will move those into the checks/ roles. 👍🏼

ianballou · 2026-05-14T18:24:53Z

@@ -0,0 +1,28 @@
+---
+- name: Backup podman quadlet container definitions


I'm thinking about what we would do with quadlet files. These files will contain secrets, volume info, image references, FQDNs. Perhaps some of these we can expect to remain the same, but is the volume info and image info going to remain the same?

Currently, we do not enforce that the z-stream between versions of Foreman/Satellite have to be the same, which tells me these quadlet files may need to be regenerated within the context of the new environment.

Ack..The restore would generate these on the system with foremanctl deploy..I am wondering if there's value in backing up the container definitions regardless for reference..

Yeah, I suppose it wouldn't hurt as long as their collection is simple and safe.

ianballou · 2026-05-14T18:34:17Z

+  register: foremanctl_state_dir
+
+- name: Backup foremanctl state directory
+  community.general.archive:


In order to support incremental exports, tar --listed-incremental needs to be used. It seems this is not supported by community.general.archive, so it might be better to avoid its use altogether.

- name: Backup config files
  ansible.builtin.command:
    cmd: >
      tar --create --gzip
      --listed-incremental={{ backup_dir_full }}/.config.snar
      --ignore-failed-read
      --file {{ backup_dir_full }}/foremanctl_state.tar.gz
      {{ config_file_paths | join(' ') }}

Also we use config_files but foremanctl-state, bit of an inconsistency there.

There is no need to do incrementals for the configs. (Yes, we do them today in foreman-maintain, but I think it's more work than it's worth.)

ehelms · 2026-05-14T20:13:13Z

+    dest: "{{ backup_dir_full }}/quadlet-files.tar.gz"
+    format: gz
+    mode: '0644'
+  when: database_mode == 'internal'


I've seen this in a few spots, why is this scoped to only internal database? This seems unrelated to the database.

I only tried this on an internal DB, so the scoping was intentional. However, it is not needed for non-DB backups. Will update. 👍🏼

evgeni · 2026-05-19T13:30:31Z

@@ -0,0 +1,237 @@
+---
+# Detect which databases exist on the system


Can't we take the info from the enabled features?

+1, I think foreman-maintain previously did a lot of discovery, but with foremanctl we should be able to rely on the saved user configuration.

evgeni · 2026-05-19T13:31:25Z

+      - item.stat.exists
+      - item.stat.size > 0
+    fail_msg: >-
+      Database dump failed or produced empty file:


If it failed, pg_dump would have exited non-zero, right? Then we'd never reach this step.

evgeni · 2026-05-19T13:33:36Z

+---
+- name: Check if foremanctl state directory exists
+  ansible.builtin.stat:
+    path: "{{ lookup('env', 'OBSAH_STATE') }}"


this is already available via the obsah_state_path variable (see theforeman/obsah#86)
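
A minimal sketch of the same check using that variable instead of the environment lookup. It assumes `obsah_state_path` is defined for the play as described in the comment above; the register name is the one already used in this PR.

```yaml
- name: Check if foremanctl state directory exists
  ansible.builtin.stat:
    # obsah_state_path is assumed to be provided by obsah, per the review comment
    path: "{{ obsah_state_path }}"
  register: foremanctl_state_dir
```

This would also avoid a hardcoded `/var/lib/foremanctl` path in setups where `OBSAH_STATE` points elsewhere.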

evgeni · 2026-05-19T13:35:29Z

+    format: gz
+    mode: '0644'
+    exclude_path:
+      - "{{ lookup('env', 'OBSAH_STATE') }}/certs"


For some reason I was thinking podman secrets would be enough to get these but I think that was incorrect..Will back this up..

evgeni · 2026-05-19T13:36:07Z

+- name: Get hostname
+  ansible.builtin.command:
+    cmd: hostname -f
+  register: hostname_result


ansible_facts['fqdn'] exists

evgeni · 2026-05-19T13:39:48Z

+
+- name: Get OS version
+  ansible.builtin.command:
+    cmd: cat /etc/redhat-release


I am sure we can get that from ansible directly
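
Both the hostname and the OS version can likely come from gathered facts rather than shelling out. The sketch below assumes fact gathering is enabled for the play; the fact names `backup_hostname` and `backup_os_version` are hypothetical and only illustrate the idea.

```yaml
- name: Record host details from gathered facts
  ansible.builtin.set_fact:
    # replaces the `hostname -f` command
    backup_hostname: "{{ ansible_facts['fqdn'] }}"
    # replaces `cat /etc/redhat-release`; also works on non-Red Hat systems
    backup_os_version: "{{ ansible_facts['distribution'] }} {{ ansible_facts['distribution_version'] }}"
```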

evgeni · 2026-05-19T13:40:25Z

+
+- name: Gather package facts
+  ansible.builtin.package_facts:
+    manager: rpm


do we need to force rpm here? I would have expected no, and then foremanctl will continue working on Debian
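
For illustration, a sketch that leaves the package manager detection to Ansible. `auto` is the module default, so the parameter could also be dropped entirely; the `podman` package in the second task is just an example name.

```yaml
- name: Gather package facts
  ansible.builtin.package_facts:
    manager: auto   # module default; detects the package manager on the host

- name: Show installed versions of one package (example package name)
  ansible.builtin.debug:
    msg: "{{ ansible_facts.packages['podman'] | map(attribute='version') | list }}"
  when: "'podman' in ansible_facts.packages"
```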

evgeni · 2026-05-19T13:45:10Z

+
+- name: Query container images
+  ansible.builtin.command:
+    cmd: podman images --format json


https://docs.ansible.com/projects/ansible/latest/collections/containers/podman/podman_image_info_module.html seems like the better fit?
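
A rough sketch of what the module-based variant might look like. The module returns its results under `images`; the names `container_images` and `image_metadata` are hypothetical, and the metadata keys mirror the ones used elsewhere in this PR.

```yaml
- name: Query container images
  containers.podman.podman_image_info:
  register: container_images

- name: Build image metadata
  ansible.builtin.set_fact:
    image_metadata: >-
      {{ image_metadata | default([]) + [{
           'name': (item.RepoTags | default([], true) | first | default('<none>')),
           'digest': (item.RepoDigests | default([], true) | first | default('')),
           'id': item.Id
         }] }}
  loop: "{{ container_images.images }}"
  loop_control:
    label: "{{ item.Id }}"   # keep the task output short
```

Using `default([], true)` covers both a missing key and a null value before `first` is applied.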

evgeni · 2026-05-19T13:46:12Z

+    - parameters_slurp is succeeded
+    - parameters_slurp.content is defined
+
+- name: Set enabled features from parameters


you should have access to enabled_features var already

evgeni · 2026-05-19T13:47:01Z

@@ -0,0 +1,112 @@
+---
+# Backup podman secrets


all these come from foremanctl data, no need to backup these

The only secrets I was thinking of that need preserving are pulp-django-secret-key and pulp-symmetric-key, which we'd need for pulp data restore. Those are saved in /var/lib/pulp/, which we are backing up, so they would be restored from there. Will remove the secrets and networks backup here. 👍🏼

evgeni · 2026-05-19T13:47:20Z

+---
+- name: List podman volumes
+  ansible.builtin.command:
+    cmd: podman volume ls --format {% raw %}'{{.Name}}'{% endraw %}


I am sure there is a module in containers.podman for this
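
One candidate is `containers.podman.podman_volume_info`, which returns the inspected volumes under `volumes`. A minimal sketch, with hypothetical register and fact names:

```yaml
- name: List podman volumes
  containers.podman.podman_volume_info:
  register: podman_volumes

- name: Collect volume names
  ansible.builtin.set_fact:
    podman_volume_names: "{{ podman_volumes.volumes | map(attribute='Name') | list }}"
```

This also removes the need for the `{% raw %}` escaping around the Go template in the `podman volume ls` command.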

evgeni · 2026-05-19T13:49:05Z

not a full review (I stopped somewhere around the secrets backup), but overall this feels a lot like "let's write a huge bash script and then wrap it in YAML" and not like Ansible :(

sjha4 · 2026-05-19T16:42:59Z

Based on direction of reviews, I am thinking that we do not need to backup anything which can be safely regenerated by foremanctl deploy..That would mean I can take out podman_secrets, podman networks, podman volumes, quadlet and systemd file backups. That would leave us with DBs, /var/lib/pulp for pulp content and foremanctl state .. Am I missing anything?

sjha4 · 2026-05-20T16:58:41Z

Updated PR to drop backups for podman network, volume and secrets. We can rely on deploy to recreate these. The backup now only backs up DBs, pulp content and foremanctl state.. Also addressed some reviews around using modules where applicable.

ianballou

Just a couple of small bugs, once I worked through them I got a backup!

$ ls /var/tmp/foreman-backup-test/foreman-backup-20260520T193319/
candlepin.dump  foreman.dump  metadata.yml  pulp.dump

ianballou · 2026-05-20T19:16:34Z

+
+- name: Check for running Foreman tasks
+  ansible.builtin.uri:
+    url: "https://{{ foreman_server_fqdn }}/foreman_tasks/api/tasks?state=running&per_page=1"


Suggested change

url: "https://{{ foreman_server_fqdn }}/foreman_tasks/api/tasks?state=running&per_page=1"

url: "https://{{ foreman_server_fqdn }}/foreman_tasks/api/tasks?search=state%3Drunning&per_page=1"

Just state=running doesn't seem to filter anything in foreman-tasks. I had to include search for the tasks to be filtered. Otherwise all of my tasks were returned, not just the running ones.

Also, can we not use Foreman Ansible Modules for this search?

ianballou · 2026-05-20T19:18:48Z

@@ -0,0 +1,259 @@
+---
+- name: Backup Foreman databases and configuration
+  hosts: all


Without this, if you try doing this via the vagrant-libvirt hypervisor and not localhost, it will try running backup on all your machines. Also, it matches the other playbooks.

Suggested change

hosts: all

hosts:
  - quadlet

ianballou · 2026-05-20T19:33:36Z

+          'name': (item.RepoTags | first) if item.RepoTags | default([]) | length > 0 else '<none>',
+          'digest': (item.RepoDigests | first) if item.RepoDigests | default([]) | length > 0 else '',
+          'id': item.Id,
+          'created': item.Created | int


This ended up being 0 for me, I think because item.Created returns a timestamp.
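
The behaviour is consistent with how the Jinja2 `int` filter works: a value that cannot be parsed as a number falls back to the filter's default of 0. The sketch below uses made-up sample values purely to illustrate that.

```yaml
- name: Show what the int filter does with a timestamp string
  ansible.builtin.debug:
    msg:
      - "{{ '2026-05-20T19:33:19Z' | int }}"   # not a number, so int falls back to 0
      - "{{ '1747769599' | int }}"             # a numeric string converts as expected
```

If `item.Created` is a timestamp string, one simple option would be to store it unchanged (`'created': item.Created`) rather than casting it.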

ianballou · 2026-05-20T19:34:47Z

+
+- name: Set Foreman running tasks count
+  ansible.builtin.set_fact:
+    foreman_running_tasks: "{{ foreman_tasks_check.json.total | default(0) | int }}"


Suggested change

foreman_running_tasks: "{{ foreman_tasks_check.json.total | default(0) | int }}"

foreman_running_tasks: "{{ foreman_tasks_check.json.subtotal | default(0) | int }}"

The total is pre-filtering.

ianballou · 2026-05-20T19:35:10Z

+    url: "https://{{ foreman_server_fqdn }}/foreman_tasks/api/tasks?state=running&per_page=1"
+    method: GET
+    user: "{{ foreman_initial_admin_username }}"
+    password: "{{ foreman_initial_admin_password }}"
+    force_basic_auth: true
+    validate_certs: false
+    return_content: true
+  register: foreman_tasks_wait
+  until: foreman_tasks_wait.json.total | default(0) == 0


Same issues as above with the querying and json total.
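
Putting the two suggestions together, the wait task might end up looking something like this. It is only a sketch: the variables and register name are taken from the diff above, while the `retries` and `delay` values are assumptions chosen for illustration.

```yaml
- name: Wait for running Foreman tasks to finish
  ansible.builtin.uri:
    # filter through search=; state=running alone did not filter in the review above
    url: "https://{{ foreman_server_fqdn }}/foreman_tasks/api/tasks?search=state%3Drunning&per_page=1"
    method: GET
    user: "{{ foreman_initial_admin_username }}"
    password: "{{ foreman_initial_admin_password }}"
    force_basic_auth: true
    validate_certs: false
    return_content: true
  register: foreman_tasks_wait
  # subtotal is the count after filtering; total is the count before filtering
  until: (foreman_tasks_wait.json.subtotal | default(0) | int) == 0
  retries: 30   # assumed value
  delay: 10     # assumed value, in seconds
```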

sjha4 · 2026-05-21T15:49:18Z

Pushed some changes based on the last set of reviews.. 👍🏼

sjha4 · 2026-05-26T17:50:38Z

Before Ian left for vacation 🍹, he sent me his review, which has been addressed in the last commit.

## foremanctl backup PR #507 — Test Report v3

**Commit:** `6740de3` | **Box:** katello-production | **Date:** 2026-05-21

### Verdict: Backup works end-to-end ✅ | Two non-blocking issues 🔍

**All three v2 bugs are fixed.** The backup ran to successful completion — no errors, no rescue block. All 3 databases dumped in `pg_dump -Fc` format, verified valid. Services stopped and restarted cleanly. API healthy post-backup. Preflight correctly queries foreman_tasks with `?search=state%3Drunning` and checks `subtotal`. Only `quadlet` host targeted.

### New Issues

| # | Issue | Severity | Description |
|---|-------|----------|-------------|
| **1** | Container images duplicated in metadata | **Low** | Metadata task is called twice (before dumps + after pulp). The image list uses `default([]) + [...]` so the second run appends duplicates. 6 images appear as 12. |
| **2** | Pulp encryption keys not backed up when media empty | **Medium** | `database_fields.symmetric.key` and `django_secret_key` are included in the pulp content tar, which is gated on `pulp_media_files.matched > 0`. If media is empty, keys aren't backed up, but they're critical for restore. |
| - | Dead code: `critical_podman_volumes` var | **Cleanup** | Defined but never referenced. Leftover from removed podman volumes backup. |
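
One possible way to address the two issues, sketched under assumptions: `container_images_metadata`, `new_image_entries` and `pulp_key_files` are hypothetical variable names (the last one standing in for a list of the key file paths), and the archive name is made up. Only `backup_dir_full` comes from the PR.

```yaml
- name: Append image metadata and drop repeats
  ansible.builtin.set_fact:
    # unique(attribute='id') keeps the first entry per image id,
    # so running the metadata task twice no longer doubles the list
    container_images_metadata: >-
      {{ (container_images_metadata | default([]) + new_image_entries)
         | unique(attribute='id') | list }}

- name: Archive Pulp key files regardless of media content
  community.general.archive:
    path: "{{ pulp_key_files }}"   # hypothetical list of key file paths
    dest: "{{ backup_dir_full }}/pulp-keys.tar.gz"
    format: gz
    mode: '0600'
  # intentionally not gated on pulp_media_files.matched
```

Keeping the keys in their own task separates them from the media check, so an empty media directory would no longer skip them.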

Implements comprehensive offline backup functionality for Foreman deployments:
- Backs up all databases (foreman, candlepin, pulp, 5 IOP DBs)
- Backs up podman secrets, networks, volumes, quadlet files
- Backs up systemd units and foremanctl state
- Includes metadata with container image digests for restore compatibility
- Preflight checks for running tasks and database integrity (amcheck)
- Automatic service restoration on failure

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

pr-processor Bot added the Not yet reviewed label May 13, 2026

sjha4 marked this pull request as ready for review May 13, 2026 19:28

sjha4 commented May 13, 2026

Comment thread src/playbooks/backup/tasks/metadata.yaml Outdated

pr-processor Bot removed the Not yet reviewed label May 13, 2026

sjha4 commented May 13, 2026

sjha4 force-pushed the backup-offline-implementation branch 2 times, most recently from 15ba9dd to c6de1eb Compare May 14, 2026 14:14

ianballou reviewed May 14, 2026

ehelms reviewed May 14, 2026

sjha4 force-pushed the backup-offline-implementation branch 4 times, most recently from 4da7174 to 10c3e00 Compare May 15, 2026 17:44

aneta-petrova mentioned this pull request May 18, 2026

Update backup documentation for foremanctl theforeman/foreman-documentation#4850

Draft

10 tasks

evgeni reviewed May 19, 2026

sjha4 force-pushed the backup-offline-implementation branch 2 times, most recently from a759002 to 90c76de Compare May 20, 2026 16:12

sjha4 commented May 20, 2026

Comment thread src/playbooks/backup/tasks/pulp_content.yaml Outdated

sjha4 force-pushed the backup-offline-implementation branch from 90c76de to fc68cdc Compare May 20, 2026 16:46

ianballou reviewed May 20, 2026

sjha4 force-pushed the backup-offline-implementation branch from fc68cdc to 6740de3 Compare May 21, 2026 15:40

sjha4 force-pushed the backup-offline-implementation branch from 6740de3 to e17ecbc Compare May 26, 2026 17:22

sjha4 requested review from ehelms and evgeni May 27, 2026 14:32

sjha4 force-pushed the backup-offline-implementation branch from e17ecbc to fe3ed91 Compare May 29, 2026 19:33

