Staged Writes

Every vendor write in FreeSDN passes through a staged-write pipeline before it touches a live device. This page explains what that pipeline is, why it exists, how the dual gate works, and what you need to configure to let writes through.

What staging is and why it exists

Network devices are stateful, and a bad push often cannot be undone. FreeSDN therefore treats live-device writes as a privileged, audited operation rather than a direct consequence of an API call.

When you call a write endpoint - create a VPN tunnel, change a firewall rule, push a VLAN - the API does not contact the controller. It writes a pending-change record to the database and returns 201 Created. The change sits there, visible to operators, until someone explicitly applies it. Only the apply step contacts the controller, and only if both halves of the dual gate are open.

This gives you:

A queue you can inspect and discard before anything changes in production.
An audit trail of who staged what and when, and who applied or discarded each change.
A safe staging period to review changes across multiple feature areas before committing any of them.
Protection against automation mistakes - a runaway script that calls write endpoints repeatedly produces a growing pending queue, not a flood of live device mutations.
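
The runaway-script case can be illustrated with a small in-memory model. This is only a sketch: `FakeController` and `StagingQueue` are hypothetical stand-ins for a live device and the pending-change table, not FreeSDN classes.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeController:
    """Hypothetical stand-in for a live device; counts how often it is contacted."""
    calls: int = 0

    def push(self, payload: dict[str, Any]) -> None:
        self.calls += 1


@dataclass
class StagingQueue:
    """Hypothetical in-memory stand-in for the pending-change table."""
    controller: FakeController
    rows: list[dict[str, Any]] = field(default_factory=list)

    def stage(self, feature: str, operation: str, payload: dict[str, Any]) -> int:
        # Staging only records the change; the controller is never touched.
        self.rows.append({"feature": feature, "operation": operation,
                          "payload": payload, "status": "pending"})
        return 201  # HTTP status the write endpoint returns


controller = FakeController()
queue = StagingQueue(controller)
# A runaway script calls a write endpoint 50 times...
statuses = [queue.stage("firewall.rule", "create", {"n": i}) for i in range(50)]
# ...which grows the queue but never contacts the device.
print(len(queue.rows), controller.calls)  # prints: 50 0
```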

The dual gate

Two independent conditions must be true before an apply call is allowed to push to the device. Either condition failing is a hard block - there is no override.

Condition	What controls it	Default
Read-only flag is false	`ADAPTER_READ_ONLY` env var (global) and its legacy alias `OMADA_READ_ONLY`; either one being `true` keeps the gate closed	`true` (read-only)
Apply call carries `force=true`	Request body field on the apply endpoint	Must be explicitly sent

Setting ADAPTER_READ_ONLY=false is a deliberate operator decision. OMADA_READ_ONLY feeds the same check via OR and also defaults to true, so the gate opens only when neither flag is true (see Environment variables). Do not open it in a production deployment until you have reviewed the pending-change workflow and confirmed your team understands the apply step.

force=true in the apply body is a second, per-operation opt-in. It cannot be sent accidentally by a passive API consumer - it must be an explicit choice at apply time.
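
As a minimal sketch, the gate reduces to a two-input AND. The function name `may_push` is made up for illustration; the real check lives in the apply path.

```python
def may_push(read_only: bool, force: bool) -> bool:
    """Return True only when both halves of the dual gate are open."""
    return (not read_only) and force


# Enumerate all four combinations; only read_only=False with force=True passes.
for read_only in (True, False):
    for force in (True, False):
        print(f"read_only={read_only!s:5} force={force!s:5} -> {may_push(read_only, force)}")
```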

Environment variables

Variable	Default	Purpose
`ADAPTER_READ_ONLY`	`true`	Global read-only gate; applies to all vendor adapters
`OMADA_READ_ONLY`	`true`	Legacy alias for `ADAPTER_READ_ONLY`; retained for backward compatibility. Both flags feed the same `is_read_only()` check via OR - if either is `true`, no adapter (not just Omada) can push live writes.

Set these in your .env.<tier> file or in the environment before starting the stack. The values are loaded at startup by app/core/config.py (lines 494, 498) and are not hot-reloadable - a restart is required after changing them.
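
The OR relationship between the two flags can be sketched as follows. Only the name `is_read_only()` and the OR semantics come from this page; the string parsing and the fail-closed fallback are assumptions, and the actual logic in app/core/config.py may differ.

```python
import os
from typing import Mapping

# Assumption: these spellings count as "false"; anything else is treated as true.
_FALSE = {"0", "false", "no", "off"}


def _flag(env: Mapping[str, str], name: str) -> bool:
    """Parse one flag; unset or unrecognised values fall back to read-only."""
    raw = env.get(name, "true").strip().lower()
    return raw not in _FALSE


def is_read_only(env: Mapping[str, str] = os.environ) -> bool:
    # OR of both flags: either one being true keeps every adapter read-only.
    return _flag(env, "ADAPTER_READ_ONLY") or _flag(env, "OMADA_READ_ONLY")


# Clearing only one flag is not enough, because the other still defaults to true.
print(is_read_only({"ADAPTER_READ_ONLY": "false"}))
print(is_read_only({"ADAPTER_READ_ONLY": "false", "OMADA_READ_ONLY": "false"}))
```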

Pending-change records

Each staged write is a row in core.adapter_pending_changes (the schema is shared across vendor surfaces; the table was originally named omada_pending_changes and renamed in migration 019). The row carries:

feature - a dot-separated string that identifies the write domain (e.g., vpn.tunnel, firewall.rule, proxmox.vm.destroy).
operation - create, update, or delete.
payload - the full request payload as a JSONB blob. Incoming payloads are capped at 1 MiB by BodySizeLimitMiddleware to prevent DB bloat.
target_id - optional, the vendor resource ID being modified.
notes - free-text operator annotation (optional).
organization_id - tenant scope; apply and discard operations return 404 if change.organization_id does not match the caller’s org.
status - pending (not yet applied), applying (apply in-flight, row locked by the apply worker), applied (successfully pushed to the controller), discarded (operator chose not to apply), or failed (apply attempted but the controller rejected it).
applied_at - UTC timestamp written when the change is applied. The applier’s identity is recorded in the updated_by audit column on the row but is not returned in PendingChangeResponse; query the audit log for attribution.
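
The fields above could be modelled roughly like this. The class is illustrative, not the actual ORM model, and the transition table is an assumption inferred from the status descriptions (pending to applying or discarded, applying to applied or failed).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Optional


class Status(str, Enum):
    PENDING = "pending"
    APPLYING = "applying"
    APPLIED = "applied"
    DISCARDED = "discarded"
    FAILED = "failed"


# Assumed lifecycle; the page lists the states but not the legal moves.
_ALLOWED = {
    Status.PENDING: {Status.APPLYING, Status.DISCARDED},
    Status.APPLYING: {Status.APPLIED, Status.FAILED},
}


@dataclass
class PendingChange:
    feature: str                      # e.g. "vpn.tunnel"
    operation: str                    # "create", "update", or "delete"
    payload: dict[str, Any]
    organization_id: str
    target_id: Optional[str] = None
    notes: Optional[str] = None
    status: Status = Status.PENDING
    applied_at: Optional[datetime] = None

    def move_to(self, new: Status) -> None:
        """Advance the status, rejecting moves outside the assumed lifecycle."""
        if new not in _ALLOWED.get(self.status, set()):
            raise ValueError(f"illegal transition {self.status.value} -> {new.value}")
        self.status = new
        if new is Status.APPLIED:
            self.applied_at = datetime.now(timezone.utc)  # UTC, as documented
```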

API endpoints

All routes sit under /api/v1/gateway-<area>/. There are two URL patterns depending on whether the adapter is site-scoped (Omada) or controller-scoped (OPNsense, pfSense, MikroTik, OpenWrt, Proxmox, and all other non-Omada adapters).

Stage, list, and manage changes

Omada adapters (gateway-vpn, gateway-firewall, gateway-wifi, gateway-bulk, gateway-profiles, gateway-firmware, gateway-hotspot, gateway-routing, gateway-system): changes are scoped to a specific site within the controller.

Method	Path	Purpose
`POST`	`/api/v1/gateway-<area>/{controller_id}/sites/{site_id}/changes/{feature}`	Stage a change; returns 201 `PendingChangeResponse`
`GET`	`/api/v1/gateway-<area>/{controller_id}/sites/{site_id}/changes`	List pending changes (filter by `feature_prefix`, `status`, `limit`)

Firewall and network adapters (OPNsense, pfSense, MikroTik, OpenWrt, Proxmox, and all non-Omada adapters): changes are scoped to the controller only - there is no /sites/{site_id}/ segment.

Method	Path	Purpose
`POST`	`/api/v1/gateway-<area>/{controller_id}/changes/{feature}`	Stage a change; returns 201 `PendingChangeResponse`
`GET`	`/api/v1/gateway-<area>/{controller_id}/changes`	List pending changes (filter by `feature_prefix`, `status`, `limit`)
`GET`	`/api/v1/gateway-vpn/changes/by-gateway/{gateway_id}?vendor=<vendor>`	Single-query fanout across vendors for the pending-changes drawer; `vendor` is required

The stage body is PendingChangeRequest:

{
  "payload": { ...vendor-specific fields... },
  "target_id": "optional-vendor-resource-id",
  "notes": "optional operator annotation"
}

The operation is passed as a query parameter: ?operation=create|update|delete.
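
A small helper shows how the two URL patterns and the request body fit together. The function names are hypothetical, and the `area` and ID values in the usage lines are placeholders rather than real identifiers.

```python
import json
from typing import Any, Optional
from urllib.parse import quote, urlencode


def stage_path(area: str, controller_id: str, feature: str, operation: str,
               site_id: Optional[str] = None) -> str:
    """Build the stage path; pass site_id only for site-scoped (Omada) areas."""
    if operation not in {"create", "update", "delete"}:
        raise ValueError(f"unknown operation: {operation}")
    scope = quote(controller_id, safe="")
    if site_id is not None:
        scope += f"/sites/{quote(site_id, safe='')}"
    query = urlencode({"operation": operation})
    return f"/api/v1/gateway-{area}/{scope}/changes/{quote(feature, safe='')}?{query}"


def stage_body(payload: dict[str, Any], target_id: Optional[str] = None,
               notes: Optional[str] = None) -> str:
    """Serialise a PendingChangeRequest, leaving out unset optional fields."""
    body: dict[str, Any] = {"payload": payload}
    if target_id is not None:
        body["target_id"] = target_id
    if notes is not None:
        body["notes"] = notes
    return json.dumps(body)


# Site-scoped (Omada) versus controller-scoped; all IDs are placeholders.
print(stage_path("vpn", "ctl-1", "vpn.tunnel", "create", site_id="site-9"))
print(stage_path("vpn", "ctl-1", "vpn.tunnel", "delete"))
```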

Apply and discard

Method	Path	Purpose
`POST`	`/api/v1/gateway-vpn/changes/{change_id}/apply`	Apply a pending change to the live device
`POST`	`/api/v1/gateway-vpn/changes/{change_id}/discard`	Discard without applying

The apply body is ApplyPendingChangeRequest:

{ "force": true }

Omitting force or sending false returns an error even if ADAPTER_READ_ONLY=false. Both halves of the gate must be satisfied.

Permissions on apply and discard

Although its path sits under gateway-vpn, the apply endpoint serves every feature domain. It resolves the required permission at apply time by reading change.feature from the fetched row, not from the request URL. This means staging a change in one session and applying it in another still enforces the correct permission check against the applier’s credentials.

The permission map (from adapter_omada_vpn.py, _required_apply_permission):

Feature prefix	Required permission
`system.*`, `monitoring.*`	`controller:write`
`vpn.*`	`vpn:write`
`firewall.*`, `opnsense.*`, `pfsense.*`	`firewall:write`
`proxmox.*`	`hypervisor:write`
`mikrotik.*`	`network:write` (escalates to `controller:write` for device-rooting sub-features)
`unifi.*`	`network:write` (escalates to `controller:write` for destructive subset)
(default)	`network:write`
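
The base mapping in this table can be sketched as a prefix lookup. This is not the code in adapter_omada_vpn.py: the function below is an illustration, and it deliberately leaves out the `controller:write` escalation for MikroTik and UniFi sub-features because this page does not enumerate which sub-features escalate.

```python
# Each entry pairs a tuple of feature prefixes with the permission it requires.
_PREFIX_PERMISSIONS = (
    (("system.", "monitoring."), "controller:write"),
    (("vpn.",), "vpn:write"),
    (("firewall.", "opnsense.", "pfsense."), "firewall:write"),
    (("proxmox.",), "hypervisor:write"),
    (("mikrotik.", "unifi."), "network:write"),  # escalation not modelled here
)


def required_apply_permission(feature: str) -> str:
    """Resolve the permission from the stored feature string, not the URL."""
    for prefixes, permission in _PREFIX_PERMISSIONS:
        if feature.startswith(prefixes):  # str.startswith accepts a tuple
            return permission
    return "network:write"  # default row of the table


for feature in ("vpn.tunnel", "opnsense.system.reboot", "proxmox.vm.destroy", "wifi.ssid"):
    print(feature, "->", required_apply_permission(feature))
```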

Tenant isolation is structural: apply and discard return 404 (not 403) if change.organization_id does not match the caller’s organization. The response does not reveal whether the change exists in a different tenant.

Catastrophic-operation gates

Some operations are irreversible or fleet-wide. FreeSDN applies an additional role check for these: the caller must hold at least the site_admin role in addition to the write permission above. The check runs at both stage time and apply time.

Why gate at both stage time and apply time?

The stage-time gate (called enforce_catastrophic_stage_role) closes a queue-poisoning window: without it, a lower-privilege operator could stage a catastrophic change and wait for a higher-privilege operator to apply it. Both gating points must hold independently.

The stage-time gate is applied router-wide to all 12 Proxmox stage routers via a FastAPI dependency (__init__.py, lines 222-270). MikroTik and OPNsense stage endpoints apply it inline.

Catastrophic feature set

The following feature values trigger the elevated role requirement (_CATASTROPHIC_FEATURE_PREFIXES in adapter_omada_vpn.py):

Proxmox

Feature	What it does
`vm.destroy`	Permanent VM deletion
`node.shutdown`, `node.reboot`	Host power operations
`snapshot.rollback`, `snapshot.delete`	Snapshot lifecycle
`backup.restore`, `backup.prune`	Backup restore and prune
`storage.delete_volume`	Volume deletion
`certificate_upload`, `certificate_delete`	Certificate management
`vm.guest_agent_exec`	Arbitrary guest execution
`vm.guest_agent_file_write`	Arbitrary file write into a guest
`vm.remote_migrate`	Cross-cluster VM migration (delete-source default)
`vm.cloudinit`	Cloud-init reconfiguration (writes credentials to VM)
`container.destroy`	Permanent container deletion
`container.remote_migrate`	Cross-cluster container migration (delete-source default)

OPNsense / pfSense

OPNsense and pfSense share the same sub-feature names under their respective top-level prefix:

opnsense.system.reboot, opnsense.system.halt, opnsense.system.firmware_update, opnsense.system.firmware_upgrade, opnsense.system.backup_restore, opnsense.system.config_restore

pfsense.system.reboot, pfsense.system.halt, pfsense.system.firmware_update, pfsense.system.firmware_upgrade, pfsense.system.backup_restore, pfsense.system.config_restore

MikroTik

mikrotik.system.reboot, mikrotik.system.shutdown, mikrotik.system.backup_load, mikrotik.system.tool_fetch, mikrotik.system.export_config, mikrotik.system.firmware.install, mikrotik.system.package.uninstall, mikrotik.system.backup.restore

UniFi

unifi.devices.restart, unifi.devices.disable

Omada

firmware.upgrade, bulk.device.factory_reset, bulk.device.reboot, system.admin, system.backup.restore, system.ssl_cert, system.controller_factory_reset
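
The role check, enforced at both gating points, might look like the sketch below. It uses only a handful of the features listed above, and the role ranks are hypothetical: `site_admin` is the only role this page names, so the other tiers exist purely for illustration.

```python
# A small subset of the catastrophic features listed above.
CATASTROPHIC_PREFIXES = (
    "proxmox.vm.destroy",
    "opnsense.system.reboot",
    "pfsense.system.halt",
    "mikrotik.system.reboot",
    "unifi.devices.restart",
    "firmware.upgrade",
    "bulk.device.factory_reset",
)

# Hypothetical ranks; only "site_admin" is a role name taken from this page.
ROLE_RANK = {"viewer": 0, "operator": 1, "site_admin": 2, "org_admin": 3}


class Forbidden(Exception):
    """Raised when the caller's role is below site_admin."""


def enforce_catastrophic_role(feature: str, role: str) -> None:
    if not feature.startswith(CATASTROPHIC_PREFIXES):
        return  # ordinary feature: only the write permission applies
    if ROLE_RANK.get(role, -1) < ROLE_RANK["site_admin"]:
        raise Forbidden(f"{feature} requires at least site_admin")


def stage(feature: str, role: str) -> str:
    enforce_catastrophic_role(feature, role)  # stage-time gate
    return "pending"


def apply(feature: str, role: str) -> str:
    enforce_catastrophic_role(feature, role)  # apply-time gate, checked again
    return "applied"
```

Because `stage` runs the same check as `apply`, a lower-privilege caller cannot queue a catastrophic change for someone else to apply.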

Path-traversal allow-list on the raw passthrough

The Omada adapter exposes a raw passthrough endpoint at POST /api/v1/gateway-raw/{controller_id}/call for advanced use cases where no typed endpoint exists. This route is guarded by:

The controller:write permission.
The full dual gate (OMADA_READ_ONLY=false AND force=true).
A path-traversal allow-list that rejects any raw path containing whitespace, control characters, or backslashes before the request is forwarded.
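
The three rejection rules named above can be expressed as a short predicate. This sketch covers only those rules; the real allow-list may check more than this, and the function name and sample paths are invented for illustration.

```python
def is_raw_path_allowed(raw_path: str) -> bool:
    """Reject paths containing whitespace, control characters, or backslashes."""
    for ch in raw_path:
        is_control = ord(ch) < 0x20 or ord(ch) == 0x7F
        if ch == "\\" or ch.isspace() or is_control:
            return False
    return True


# Placeholder paths, not real controller routes.
for candidate in ("/api/v2/sites", "/api/v2/sites list", "/api\\v2", "/api/v2\x00"):
    print(repr(candidate), is_raw_path_allowed(candidate))
```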

Reading your pending queue

Before applying changes, review what is staged:

List pending changes for a controller and site:

GET /api/v1/gateway-vpn/{controller_id}/sites/{site_id}/changes?status=pending

Review each change’s feature, operation, and payload.
For a cross-vendor view (useful when a gateway has multiple adapter types behind it), supply the required vendor parameter (mikrotik, pfsense, opnsense, openwrt, or unifi):
```
GET /api/v1/gateway-vpn/changes/by-gateway/{gateway_id}?vendor=mikrotik&status=pending
```

Discard any change you do not want to apply:

POST /api/v1/gateway-vpn/changes/{change_id}/discard

Apply a change you have reviewed:
```
POST /api/v1/gateway-vpn/changes/{change_id}/apply
Content-Type: application/json

{ "force": true }
```
The apply endpoint writes applied_at (UTC timestamp) to the change row. The applier’s identity is captured in the internal audit column updated_by but is not included in the PendingChangeResponse payload; query the audit log for attribution.
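
From a script, the apply call could be prepared with the standard library as sketched here. The base URL, change ID, and token are placeholders, and the Bearer authorization header is an assumption about how the JWT is presented.

```python
import json
import urllib.request


def build_apply_request(base_url: str, change_id: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) an apply request carrying force=true."""
    url = f"{base_url.rstrip('/')}/api/v1/gateway-vpn/changes/{change_id}/apply"
    return urllib.request.Request(
        url,
        data=json.dumps({"force": True}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


req = build_apply_request("https://freesdn.example", "42", "<jwt>")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it; left out so the sketch has no side effects.
```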

Enabling writes in production

Work through this checklist in order:

Review all pending changes in your queue. Applying all of them may be your intent, but verify first.
Set ADAPTER_READ_ONLY=false in your .env.pro (or .env.max) file. OMADA_READ_ONLY is OR-ed into the same check and also defaults to true, so set it to false as well; leaving either flag true keeps every adapter read-only.
Restart the API container - the flag is read at startup.
Confirm the flag is active by checking application logs at startup for the settings summary.
Apply changes individually using the apply endpoint with "force": true. Do not bulk-apply across feature domains without reviewing each payload.

Multi-tenancy and the staging pipeline

Tenant isolation in the staging pipeline is enforced at every step, not just at authentication:

Stage: the organization_id from the authenticated caller’s JWT is written into the pending-change row at creation time.
List: queries filter by organization_id; cross-tenant rows are never returned.
Apply / discard: the endpoint fetches the change row and compares change.organization_id to user.organization_id. A mismatch returns 404, not 403 - the response does not confirm the change exists.
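
The apply/discard lookup can be sketched so that a missing row and a foreign-tenant row are indistinguishable to the caller. The in-memory table and the `NotFound` exception are stand-ins, not FreeSDN code.

```python
from typing import Any


class NotFound(Exception):
    """Stand-in for an HTTP 404 response."""
    status_code = 404


# Hypothetical in-memory table keyed by change ID.
_CHANGES: dict[int, dict[str, Any]] = {
    1: {"organization_id": "org-a", "feature": "vpn.tunnel"},
}


def fetch_change_for(change_id: int, caller_org: str) -> dict[str, Any]:
    row = _CHANGES.get(change_id)
    # Same error for "no such row" and "row belongs to another tenant",
    # so the response never confirms that a foreign change exists.
    if row is None or row["organization_id"] != caller_org:
        raise NotFound("pending change not found")
    return row
```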

Honesty notes

“Staged” does not mean “applied.” A 201 response from a write endpoint means the change is in the queue. No device has been contacted. Operators who do not check the pending queue may believe their changes are live when they are not.
Audit-log rows are written by the apply step, not the stage step. If you inspect the audit log and see no record of a change, check the pending queue: the change may be staged but not yet applied.
The raw passthrough bypasses typed validation. The allow-list guards against path traversal but does not validate the payload against a schema, so malformed payloads are forwarded as-is to the controller.
Catastrophic feature detection is prefix-based. If a vendor surfaces a new destructive feature under an unrecognized prefix, it will fall through to the default permission (network:write) without the elevated role check. Review _CATASTROPHIC_FEATURE_PREFIXES in adapter_omada_vpn.py when adding new vendor surfaces.

Next steps

Security Model - authentication, RBAC, and the 7-tier role hierarchy.
Roles and Permissions - full permission map including vpn:write, firewall:write, and controller:write.
Multi-Tenancy - how org-scoped queries and per-user site grants interact with the staging pipeline.
Production Hardening Checklist - the full checklist for safely enabling writes in a production deployment.