Backups and Restore
FreeSDN ships two distinct backup mechanisms with different scopes. Using the wrong one during an incident is costly - understand each before you need them.
Two backup mechanisms
| Mechanism | What it captures | Use case |
|---|---|---|
| pg-backup (DB dumps) | Full Postgres state: users, audit log, device inventory, encrypted credentials, agent registry, plugin state, all module data | Disaster recovery - restore a complete instance |
| Configuration Backup module (.fsdn archives) | Portable config snapshot: Sites, Controllers, Devices, users, automation rules. No credentials. | Migration between instances, dev-to-prod copy, config version history |
pg-backup: daily GPG-encrypted DB dumps
The freesdn-pg-backup container is part of the core stack - no profile required. It dumps both databases daily:
- freesdn - the primary PostgreSQL DB (19 schemas: access, agents, ai, analytics, audit, backup, cameras, collector, core, devices, enterprise, events, fabric, firewall, gateway, hypervisor, network, voip, vpn)
- freesdn_logs - the TimescaleDB observability DB
Dumps land in the pg_backups volume with 7-day local retention.
GPG encryption (required in production)
By default, pg-backup fails closed when GPG is not configured: it refuses to write unencrypted dumps. This applies to every deployment tier, because the guard checks only whether GPG is active, not ENVIRONMENT. Any stack that lacks a valid GPG recipient must set BACKUP_ALLOW_PLAINTEXT=1 explicitly:
- The homelab .env.lite template includes a commented-out opt-in. Uncomment it only if you have no GPG key and understand the risk.
- The pro and max tier templates include GPG placeholder values that you must fill in before running: generate a dedicated keypair, set BACKUP_GPG_RECIPIENT to the real recipient email, and place the exported public key at ./secrets/backup-public.asc. These tiers must not set BACKUP_ALLOW_PLAINTEXT.

Configure a GPG recipient before running in production.
Setup:
1. On a secure restore host (not the production server), generate a dedicated backup key pair:

   ```shell
   gpg --full-generate-key
   # Choose RSA 4096, no expiry
   gpg --export --armor backup@example.com > backup-public.asc
   ```

2. Copy backup-public.asc to ./secrets/backup-public.asc in the repo.
3. Set in your env file:

   ```shell
   BACKUP_GPG_RECIPIENT=backup@example.com
   BACKUP_GPG_PUBLIC_KEY_PATH=./secrets/backup-public.asc
   ```

The container holds only the public key and cannot decrypt its own dumps. Store the private key and its passphrase in a secrets manager completely separate from the production server.
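Before relying on the first scheduled dump, it may be worth confirming that the exported key file really contains the recipient you configured. The sketch below is one way to check this. The function name `key_file_has_recipient` is illustrative and not part of FreeSDN, and the sketch assumes a GnuPG release that supports `--show-keys`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of FreeSDN): succeed only if the key file
# contains a user ID that includes the given email address.
key_file_has_recipient() {
  local key_file="$1" email="$2"
  [ -r "$key_file" ] || return 1
  # --show-keys lists the keys found in the file without importing them;
  # --with-colons produces a machine-readable listing with one uid: line per user ID.
  gpg --show-keys --with-colons "$key_file" 2>/dev/null \
    | grep '^uid:' | grep -qF "$email"
}

# Example (run from the repo root):
#   key_file_has_recipient ./secrets/backup-public.asc backup@example.com \
#     || echo "recipient not found in key file" >&2
```

The match is a plain substring test on the user ID, so treat it as a quick sanity check and not as proof that encryption will succeed.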
Manually trigger a backup
The scheduled dump runs every 24 hours. To force one immediately:

```shell
# Note: these manual dumps are gzip-compressed but not GPG-encrypted.
docker compose --env-file .env.pro exec pg-backup bash -c '
  set -euo pipefail  # stop on the first error, including a failed pg_dump in a pipeline
  STAMP=$(date +%Y%m%d_%H%M%S)
  pg_dump -h postgres -U "$PGUSER" -d "${POSTGRES_DB:-freesdn}" --no-owner --no-privileges \
    | gzip > /backups/manual_freesdn_$STAMP.sql.gz
  pg_dump -h logdb -U "$LOGDB_USER" -d "$LOGDB_DB" --no-owner --no-privileges \
    | gzip > /backups/manual_logdb_$STAMP.sql.gz
  ls -lh /backups/manual_*.sql.gz
'
```

List and retrieve backups
```shell
# List all dumps in the backup volume
docker compose --env-file .env.pro exec pg-backup ls -lh /backups

# Copy a GPG-encrypted dump to the host (production - files end in .sql.gz.gpg)
# -T disables TTY allocation so the binary stream is not altered on the way out.
docker compose --env-file .env.pro exec -T pg-backup cat /backups/freesdn_20260520_030000.sql.gz.gpg \
  > ./freesdn_20260520_030000.sql.gz.gpg

# Copy a plaintext dump to the host (dev/homelab with BACKUP_ALLOW_PLAINTEXT=1 - files end in .sql.gz)
docker compose --env-file .env.pro exec -T pg-backup cat /backups/freesdn_20260520_030000.sql.gz \
  > ./freesdn_20260520_030000.sql.gz
```

Off-site DR via rclone (dr profile)
Enable the dr Compose profile to run a sidecar that syncs encrypted dumps off-site using rclone. See Compose Profiles for setup steps.
Off-site retention defaults to 30 days, managed independently from local retention. Use a bucket with object-lock / write-once policy to protect against ransomware.
Default RPO with the shipped daily-dump model: up to 24 hours. For tighter recovery points, configure WAL streaming as described in the DR procedure docs.
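The 24-hour RPO only holds if a recent dump actually exists. As a minimal sketch, the check below succeeds when a directory holds at least one freesdn_* dump newer than a given age. The function name `has_fresh_dump` and the idea of running it from a monitoring job are assumptions for illustration, not FreeSDN features; the file name pattern follows the dump names used in the examples on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of FreeSDN): succeed if DIR holds at least one
# freesdn_* dump modified within the last MAX_AGE_MINUTES (default 1440 = 24 h).
has_fresh_dump() {
  local dir="$1" max_age_minutes="${2:-1440}"
  # -mmin -N matches files modified less than N minutes ago (GNU and BSD find).
  find "$dir" -maxdepth 1 -type f -name 'freesdn_*.sql.gz*' \
    -mmin "-$max_age_minutes" 2>/dev/null | grep -q .
}

# Example: alert when no dump is newer than 24 hours.
#   has_fresh_dump /backups 1440 || echo "no fresh freesdn dump found" >&2
```

Point it at any location where the dumps are visible, such as a host directory that mirrors the pg_backups volume.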
The Configuration Backup module
The Configuration Backup module (at /backup in the UI) produces portable .fsdn archives. Use them to:
- Migrate configuration from a staging instance to production
- Snapshot configuration before a major change
- Seed a new Site with an existing Site’s configuration
A .fsdn archive carries no credentials. After restoring a config archive to a new instance, re-enter all module secrets (Controller passwords, NVR credentials, etc.).
The module supports selective restore (individual sections), strict semver schema gating (a mis-versioned section is rejected, not silently applied), and automatic rollback slots (a pre-restore snapshot is captured before any non-dry-run restore).
Restore procedure
From a pg_dump (disaster recovery)
Full restore from scratch. Estimated time: 30-60 minutes for DB-only loss; 3-4 hours for total data-center loss including provisioning.
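Because the restore drops the existing databases, it can be worth confirming that the dump you intend to use is readable before starting. The following is a minimal sketch under that assumption; `verify_plain_dump` and `verify_encrypted_dump` are hypothetical helper names, and the encrypted variant assumes the private key is available to gpg on the restore host.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of FreeSDN) for checking a dump before restoring it.

# Plaintext dump (.sql.gz): test the gzip stream without extracting it.
verify_plain_dump() {
  gzip -t "$1" 2>/dev/null
}

# Encrypted dump (.sql.gz.gpg): decrypt into a pipe and test the gzip stream.
# The subshell enables pipefail so a gpg failure is not masked by gzip.
verify_encrypted_dump() {
  (
    set -o pipefail
    gpg --decrypt "$1" 2>/dev/null | gzip -t 2>/dev/null
  )
}

# Example:
#   verify_encrypted_dump ./freesdn_20260520_030000.sql.gz.gpg \
#     || { echo "dump is unreadable, do not drop the database" >&2; exit 1; }
```

Neither check writes decrypted data to disk. Both only confirm that the compressed stream is complete, not that the SQL inside it will apply cleanly.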
```shell
# Variables such as $POSTGRES_USER are expanded by the host shell here,
# so export them before you start.

# 1. Stop write surfaces (keep DBs running if they exist)
docker compose --env-file .env.pro stop api worker worker-io scheduler

# 2. Drop and recreate the target databases
# Each statement gets its own -c because DROP DATABASE cannot run inside a
# multi-statement command.
docker compose --env-file .env.pro exec postgres \
  psql -U "$POSTGRES_USER" -d postgres \
  -c "DROP DATABASE IF EXISTS freesdn;" \
  -c "CREATE DATABASE freesdn OWNER $POSTGRES_USER;"

docker compose --env-file .env.pro exec logdb \
  psql -U "$LOGDB_USER" -d postgres \
  -c "DROP DATABASE IF EXISTS freesdn_logs;" \
  -c "CREATE DATABASE freesdn_logs OWNER $LOGDB_USER;"

# 3. Restore from the dumps
# Production (GPG-encrypted, .sql.gz.gpg): decrypt first, then pipe into psql.
# The private key must be available on the restore host (not the production server).
gpg --decrypt ./freesdn_20260520_030000.sql.gz.gpg | gunzip -c | \
  docker compose --env-file .env.pro exec -T postgres \
  psql -U "$POSTGRES_USER" -d "$POSTGRES_DB"

gpg --decrypt ./logdb_20260520_030000.sql.gz.gpg | gunzip -c | \
  docker compose --env-file .env.pro exec -T logdb \
  psql -U "$LOGDB_USER" -d "$LOGDB_DB"

# Dev/homelab only (BACKUP_ALLOW_PLAINTEXT=1, .sql.gz - never in production):
# gunzip -c ./freesdn_20260520_030000.sql.gz | \
#   docker compose --env-file .env.pro exec -T postgres \
#   psql -U "$POSTGRES_USER" -d "$POSTGRES_DB"
#
# gunzip -c ./logdb_20260520_030000.sql.gz | \
#   docker compose --env-file .env.pro exec -T logdb \
#   psql -U "$LOGDB_USER" -d "$LOGDB_DB"

# 4. Run migrations (detects existing schema, stamps to head - safe to re-run)
docker compose --env-file .env.pro up -d api worker worker-io scheduler
docker compose --env-file .env.pro exec api python scripts/migrate.py

# 5. Invalidate stale sessions (forces re-login - required after every restore)
docker compose --env-file .env.pro exec postgres \
  psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c \
  "TRUNCATE core.user_sessions; UPDATE core.users SET token_version = COALESCE(token_version, 0) + 1;"
```

Post-restore validation
Always run this after any restore:
```shell
BASE=https://freesdn.example.com

# Health checks must pass before allowing traffic
curl -fsS "$BASE/api/v1/health/live"
curl -fsS "$BASE/api/v1/health/ready"

# Audit chain integrity - the most important check
# Requires super_admin role. Use a super_admin Bearer token or super_admin session cookie.
# -f is omitted so a 403 body is visible if the wrong role is used.
curl -sS -H "Authorization: Bearer <super_admin_token>" "$BASE/api/v1/audit/validate?limit=100000" | jq
# Expected: {"valid": true, "broken_at": null, ...}
```

A response with valid: false and broken_reason: "tampered" indicates either a partial commit during restore or genuine tampering. Cross-reference your off-site immutable backup before taking further action.
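For scripted restore drills, the audit check can be turned into a pass/fail gate instead of being read by eye. This is a sketch only: `audit_chain_ok` is a hypothetical helper, and it assumes the response carries the valid and broken_at fields shown in the expected output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of FreeSDN): read the audit validation response
# on stdin and succeed only if it reports an intact chain.
audit_chain_ok() {
  # jq -e sets the exit status from the result: 0 for true, non-zero otherwise.
  jq -e '.valid == true and .broken_at == null' >/dev/null
}

# Example:
#   curl -sS -H "Authorization: Bearer <super_admin_token>" \
#     "$BASE/api/v1/audit/validate?limit=100000" | audit_chain_ok \
#     || echo "audit chain validation failed" >&2
```

A non-JSON body, such as an error page, also makes the helper fail, which is the safer default for a gate.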
RPO and RTO targets
| Scenario | Default RPO | Target RTO |
|---|---|---|
| Daily dump, local restore | Up to 24 hours | 30-60 min (DB-only loss) / 3-4 hours (total loss) |
| Daily dump + off-site sync | Up to 24 hours | 3-4 hours (includes downloading the off-site dump) |
| WAL streaming (operator-configured) | ~5 minutes | 3-4 hours + WAL replay time |
These are planning targets, not contractual SLAs. Define and validate your own RPO/RTO with quarterly restore drills.
Next steps: High Availability - Valkey automatic failover and the Postgres standby topology.