Compute (Hypervisor)

The Compute module (id hypervisor, v1.0.0) connects FreeSDN to one or more Proxmox VE clusters. It gives you a unified view of cluster health, node resources, virtual machine and container lifecycle, snapshots, scheduled backups, storage pools, SDN zones, and per-guest firewall rules - all under the same 7-tier RBAC and staged-write safety contract that governs every other FreeSDN adapter.

The Proxmox adapter spans adapter.py, client.py, constants.py, and models.py, and ships with full dual-gate enforcement, a circuit breaker, central secret redaction, tenant scoping, SSRF protection, and role gates. See Supported Vendors for the complete contract matrix.

Connecting a Proxmox cluster

1. Create an API token in Proxmox VE

In the Proxmox web UI navigate to Datacenter → Permissions → API Tokens. Create a token for a user with PVEAdmin or an equivalent privilege set. Note the token ID (format user@realm!tokenname) and the token secret - the secret is shown only once.

API token authentication is preferred over username/password. If you use username/password, the adapter caches the PVE ticket for 90 minutes (tickets are valid 2 hours; the cache is deliberately shorter to avoid serving an expiring ticket). On a 401 from an idempotent request it drops the cached ticket and retries once.
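
The ticket-caching behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: TicketCache, request_with_retry, and the login/send callables are hypothetical names introduced here, and only the 90-minute window and the retry-once-on-401 rule come from the text.

```python
import time
from typing import Callable, Optional, Tuple

TICKET_TTL_SECONDS = 90 * 60  # cache window from the text; PVE tickets last 2 hours


class TicketCache:
    """Hypothetical helper: holds one ticket and re-logs in once the TTL passes."""

    def __init__(self, login: Callable[[], str],
                 clock: Callable[[], float] = time.monotonic):
        self._login = login
        self._clock = clock
        self._ticket: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        expired = self._clock() - self._fetched_at >= TICKET_TTL_SECONDS
        if self._ticket is None or expired:
            self._ticket = self._login()
            self._fetched_at = self._clock()
        return self._ticket

    def invalidate(self) -> None:
        self._ticket = None


def request_with_retry(cache: TicketCache,
                       send: Callable[[str], Tuple[int, object]],
                       idempotent: bool) -> Tuple[int, object]:
    """Call send(ticket); on a 401 from an idempotent request, drop the
    cached ticket, log in again, and retry exactly once."""
    status, body = send(cache.get())
    if status == 401 and idempotent:
        cache.invalidate()
        status, body = send(cache.get())
    return status, body
```

Non-idempotent requests are deliberately not retried in this sketch, since replaying them could repeat a side effect.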

2. Register the controller in FreeSDN

POST /api/v1/controllers
Content-Type: application/json

{
  "name": "pve-cluster-1",
  "controller_type": "proxmox",
  "host": "pve.lan",
  "port": 8006,
  "use_ssl": true,
  "verify_ssl": false,
  "config": {
    "token_id": "root@pam!freesdn",
    "token_secret": "<token-secret>"
  },
  "site_id": "<site-uuid>"
}

Default connection parameters: port 8006, SSL on, certificate verification off (Proxmox nodes ship with self-signed certs). Set verify_ssl: true only if your cluster uses a trusted CA.
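
If you script controller registration, a small helper that assembles the request body can catch a malformed token ID before the request is sent. The sketch below is illustrative only: build_controller_payload and TOKEN_ID_RE are hypothetical names, and the regular expression is an assumed reading of the user@realm!tokenname format rather than the validation FreeSDN itself performs.

```python
import re

# Assumed shape of a token ID: user@realm!tokenname (format stated in step 1).
TOKEN_ID_RE = re.compile(r"[^@!\s]+@[^@!\s]+![^@!\s]+")


def build_controller_payload(name, host, token_id, token_secret, site_id,
                             port=8006, verify_ssl=False):
    """Return the JSON body for POST /api/v1/controllers with the defaults
    described above (port 8006, SSL on, certificate verification off)."""
    if not TOKEN_ID_RE.fullmatch(token_id):
        raise ValueError("token_id must look like user@realm!tokenname")
    return {
        "name": name,
        "controller_type": "proxmox",
        "host": host,
        "port": port,
        "use_ssl": True,
        "verify_ssl": verify_ssl,
        "config": {"token_id": token_id, "token_secret": token_secret},
        "site_id": site_id,
    }
```

Pass verify_ssl=True only when the cluster presents a certificate from a trusted CA.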

3. Trigger an initial sync

POST /api/v1/discovery/controllers/<controller-uuid>

The sync loop runs automatically every 120 seconds (configurable). Each Proxmox node becomes a ProxmoxNode record in the unified device inventory. Individual VMs and containers are tracked in hypervisor.virtual_machines.

Module settings

Setting	Default	Range	Description
`sync_interval`	`120`	30-3600 s	How often FreeSDN polls the cluster for node and VM state
`show_templates`	`false`	bool	Show VM templates in the Virtual Machines list

Configure via Hypervisor → Settings in the UI or:

PUT /api/v1/modules/org/{organization_id}/hypervisor/settings
Content-Type: application/json

{ "sync_interval": 60, "show_templates": false }
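
A client-side check of the documented ranges can save a round trip. The following is a sketch under stated assumptions: validate_settings is a hypothetical helper, and the server's own validation and error format may differ.

```python
DEFAULTS = {"sync_interval": 120, "show_templates": False}


def validate_settings(settings: dict) -> dict:
    """Merge settings over the documented defaults and enforce the ranges
    from the table above (sync_interval 30-3600 s, show_templates bool)."""
    unknown = set(settings) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown settings: {sorted(unknown)}")
    merged = {**DEFAULTS, **settings}
    interval = merged["sync_interval"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(interval, bool) or not isinstance(interval, int):
        raise ValueError("sync_interval must be an integer number of seconds")
    if not 30 <= interval <= 3600:
        raise ValueError("sync_interval must be between 30 and 3600 seconds")
    if not isinstance(merged["show_templates"], bool):
        raise ValueError("show_templates must be a boolean")
    return merged
```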

Feature domains

The adapter covers 12 feature domains:

Domain	What you get
Cluster	Status, quorum, resources, log, replication jobs, cluster-wide options
Nodes	Inventory, CPU/memory/disk metrics, RRD history, sensors, services, physical disks, syslog, APT updates, certificates, subscription
VMs (QEMU)	List, config, create, power ops, clone, migrate, resize, template conversion, pending config diff, CloudInit, console proxy, bulk ops
Containers (LXC)	Same lifecycle as VMs; remote-migrate included
Snapshots	Create (with optional RAM state), rollback, delete per VM/CT
Backups	Scheduled job CRUD, manual run, prune, restore from PBS or local storage, backup age report
Storage pools	Browse pools and content by type, ISO/template upload (4 GB cap), volume delete, prune preview
Tasks	List recent tasks, status detail, log tail, stop a running task
HA	Resource and group CRUD
SDN	Zone and VNet CRUD, dependency-safe zone delete, apply pending SDN config
Ceph	Cluster status and detail (404 if Ceph is not deployed on the node)
Firewall	Cluster, node, and per-guest firewall rule CRUD

Key API endpoints

All endpoints mount under /api/v1/hypervisor/. Reads require a valid session (viewer+). Writes require at least the site_admin role. Browse the full surface at /api/v1/docs (set ENABLE_DOCS=true in non-production environments).

Cluster and fleet

Method	Path	Purpose
GET	`/controllers/{id}/dashboard`	Cluster dashboard summary
GET	`/controllers/{id}/cluster/status`	Quorum state, node count, PVE version
GET	`/controllers/{id}/cluster/resources`	All resources; `?type=node|qemu|lxc|storage|sdn`
GET	`/controllers/{id}/cluster/log`	Cluster log; `?max_entries=1..5000` (default 50)
GET	`/fleet/dashboard`	Cross-cluster summary across all Proxmox controllers; `?site_id`
GET	`/fleet/task-statistics`	Cross-cluster task statistics; `?site_id`

Nodes

Method	Path	Purpose
GET	`/controllers/{id}/nodes`	List nodes
GET	`/controllers/{id}/nodes/{node}`	Node detail (CPU, memory, disk, PVE version)
GET	`/controllers/{id}/nodes/{node}/services`	Node service list
GET	`/controllers/{id}/nodes/{node}/disks`	Physical disks
GET	`/controllers/{id}/nodes/{node}/disks/smart`	SMART data; `?disk=/dev/sda`
GET	`/controllers/{id}/nodes/{node}/syslog`	Syslog tail; `?limit=1..500`
GET	`/controllers/{id}/nodes/{node}/sensors`	Sensor/temperature readings
GET	`/controllers/{id}/nodes/{node}/rrd`	RRD history (LTTB-downsampled); `?timeframe=hour|day|week|month|year&max_points=10..5000`
POST	`/controllers/{id}/nodes/{node}/reboot`	Reboot node (`site_admin`)
POST	`/controllers/{id}/nodes/{node}/shutdown`	Shut down node (`site_admin`)

The node path parameter is validated against ^[a-zA-Z0-9._-]+$ (max 63 chars).
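
The same rule can be applied client-side before building a URL. A minimal sketch, where is_valid_node is a hypothetical helper name and only the pattern and length limit come from the text:

```python
import re

NODE_RE = re.compile(r"^[a-zA-Z0-9._-]+$")
NODE_MAX_LEN = 63


def is_valid_node(name: str) -> bool:
    """True if name matches the documented pattern and length limit.
    fullmatch is used so a trailing newline cannot slip past the '$'."""
    return len(name) <= NODE_MAX_LEN and NODE_RE.fullmatch(name) is not None
```

Because "/" is not in the allowed character set, path-traversal strings such as ../etc are rejected by this check.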

VMs and containers

Method	Path	Purpose
GET	`/controllers/{id}/vms`	All VMs across nodes; `?type=qemu|lxc`
GET	`/controllers/{id}/nodes/{node}/vms`	VMs on a specific node
GET	`/controllers/{id}/nodes/{node}/containers`	LXC containers on a node
GET	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/config`	VM/CT config (secrets redacted)
POST	`/controllers/{id}/vms`	Create QEMU VM (`site_admin`)
POST	`/controllers/{id}/containers`	Create LXC container (`site_admin`)
POST	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/action`	Power action: `start|stop|shutdown|reboot|suspend|resume` (`site_admin`)
PUT	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/config`	Update VM/CT config (`site_admin`)
POST	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/clone`	Clone to new VM (`site_admin`)
POST	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/migrate`	Migrate to another node (`site_admin`)
PUT	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/resize`	Resize disk (`site_admin`)
DELETE	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}`	Delete VM/CT - irreversible (`site_admin`)
POST	`/controllers/{id}/bulk-action`	Run an action on multiple VMs/CTs (`site_admin`)
POST	`/controllers/{id}/bulk-migrate`	Migrate multiple VMs to a target node (`site_admin`)

vmid is validated as an integer in the range 100-999,999,999. vm_type accepts only qemu or lxc.
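
A sketch of the equivalent check in client code, assuming you want to fail fast before issuing a request. parse_guest_ref is a hypothetical helper, not part of the FreeSDN API.

```python
VMID_MIN = 100
VMID_MAX = 999_999_999
VM_TYPES = ("qemu", "lxc")


def parse_guest_ref(vm_type: str, vmid) -> tuple:
    """Validate vm_type and vmid against the documented rules and return
    them as a (vm_type, int) pair."""
    if vm_type not in VM_TYPES:
        raise ValueError("vm_type must be 'qemu' or 'lxc'")
    # Accept ints and digit strings only; bools and floats are rejected.
    if isinstance(vmid, bool) or not isinstance(vmid, (int, str)):
        raise ValueError("vmid must be an integer")
    number = int(vmid)  # raises ValueError for non-numeric strings
    if not VMID_MIN <= number <= VMID_MAX:
        raise ValueError("vmid must be between 100 and 999999999")
    return vm_type, number
```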

Snapshots

Method	Path	Purpose
GET	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/snapshots`	List snapshots
POST	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/snapshots`	Create snapshot; body: `snapname` (alphanum/`_-`, ≤40), `description` (≤255), `vmstate` bool
POST	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/snapshots/{snapname}/rollback`	Roll back to snapshot (`site_admin`)
DELETE	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/snapshots/{snapname}`	Delete snapshot (`site_admin`)
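
As an illustration of the create-snapshot body constraints, the sketch below builds and checks the payload. build_snapshot_body is a hypothetical helper, and the regular expression is an assumed reading of "alphanum/_-, ≤40"; the server may apply additional rules (for example, on the first character).

```python
import re

# Assumed interpretation: 1 to 40 characters drawn from letters, digits, "_" and "-".
SNAPNAME_RE = re.compile(r"[A-Za-z0-9_-]{1,40}")


def build_snapshot_body(snapname: str, description: str = "",
                        vmstate: bool = False) -> dict:
    """Return a create-snapshot request body, or raise ValueError."""
    if not SNAPNAME_RE.fullmatch(snapname):
        raise ValueError("snapname must be 1-40 chars of letters, digits, '_' or '-'")
    if len(description) > 255:
        raise ValueError("description must be at most 255 characters")
    # vmstate=True asks for the optional RAM state to be included.
    return {"snapname": snapname, "description": description, "vmstate": bool(vmstate)}
```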

Storage pools and content

Method	Path	Purpose
GET	`/controllers/{id}/nodes/{node}/storage`	List storage pools with usage stats
GET	`/controllers/{id}/nodes/{node}/storage/{storage}/content`	Browse content; `?content=images|iso|backup|rootdir|vztmpl|snippets&vmid=`
POST	`/controllers/{id}/nodes/{node}/storage/{storage}/upload`	Upload ISO or template (multipart, 4 GB cap)
GET	`/controllers/{id}/nodes/{node}/storage/{storage}/prune-preview`	Preview what a prune would remove
POST	`/controllers/{id}/nodes/{node}/storage/{storage}/prune`	Execute prune with retention policy (`site_admin`)
DELETE	`/controllers/{id}/nodes/{node}/storage/{storage}/content/{volume}`	Delete a storage volume (`site_admin`)

Uploads are streamed in 1 MB chunks to a temp file and then posted to PVE. The temp file is removed in a finally block. Uploads beyond 4 GB receive HTTP 413.
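
The upload pattern described here (chunked spooling, a size cap, and cleanup in a finally block) can be sketched as follows. This is not the adapter's code: spool_upload, UploadTooLarge, and the forward callable are hypothetical, and mapping the exception to HTTP 413 is left to the caller.

```python
import os
import tempfile

CHUNK_SIZE = 1024 * 1024        # 1 MB chunks, as described above
MAX_UPLOAD_BYTES = 4 * 1024**3  # 4 GB cap


class UploadTooLarge(Exception):
    """Raised when the stream exceeds the cap; a caller would map this to HTTP 413."""


def spool_upload(stream, forward, max_bytes=MAX_UPLOAD_BYTES, chunk_size=CHUNK_SIZE):
    """Copy stream to a temp file chunk by chunk, then hand the file path to
    forward(path). The temp file is always removed, even on error."""
    fd, path = tempfile.mkstemp(prefix="upload-")
    total = 0
    try:
        with os.fdopen(fd, "wb") as tmp:
            while True:
                chunk = stream.read(chunk_size)
                if not chunk:
                    break
                total += len(chunk)
                if total > max_bytes:
                    raise UploadTooLarge(f"upload exceeds {max_bytes} bytes")
                tmp.write(chunk)
        return forward(path)
    finally:
        os.remove(path)
```

Checking the running total inside the loop means an oversized upload is rejected as soon as it crosses the cap, without first writing the whole body to disk.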

Backups

Method	Path	Purpose
GET	`/controllers/{id}/backup/jobs`	List scheduled backup jobs
POST	`/controllers/{id}/backup/jobs`	Create backup job (`site_admin`)
PUT	`/controllers/{id}/backup/jobs/{job_id}`	Update backup job (`site_admin`)
DELETE	`/controllers/{id}/backup/jobs/{job_id}`	Delete backup job (`site_admin`)
POST	`/controllers/{id}/nodes/{node}/{vm_type}/{vmid}/backup`	Run manual backup; body: storage, mode, compress (`site_admin`)
POST	`/controllers/{id}/backup/restore`	Restore from archive; body: node, vm_type, archive, vmid, storage, start_after_restore, unique_mac (`site_admin`)
GET	`/controllers/{id}/backup/age-report`	Age report; `?threshold_hours=1..8760` (default 24)

SDN zones and VNets

Method	Path	Purpose
GET	`/controllers/{id}/sdn/zones`	List SDN zones
GET	`/controllers/{id}/sdn/vnets`	List VNets
POST	`/controllers/{id}/sdn/zones`	Create zone (`site_admin`)
POST	`/controllers/{id}/sdn/vnets`	Create VNet (`site_admin`)
DELETE	`/controllers/{id}/sdn/zones/{zone}`	Delete zone - returns 409 with blocking VNet names if dependents exist (`site_admin`)
DELETE	`/controllers/{id}/sdn/vnets/{vnet}`	Delete VNet (`site_admin`)
POST	`/controllers/{id}/sdn/apply`	Apply pending SDN configuration (`site_admin`)

Zone deletes are dependency-safe: the API refuses with HTTP 409 (listing the blocking VNets) rather than leaving orphaned VNets.
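
The dependency check amounts to looking for VNets that still reference the zone before deleting it. A minimal sketch, assuming each VNet record carries vnet and zone fields; check_zone_delete and the response shape are hypothetical and may not match the real API body.

```python
def check_zone_delete(zone: str, vnets: list) -> tuple:
    """Return (status_code, body). 409 with the blocking VNet names if any
    VNet still belongs to the zone, otherwise 200."""
    blocking = sorted(v["vnet"] for v in vnets if v.get("zone") == zone)
    if blocking:
        return 409, {
            "detail": f"zone '{zone}' still has dependent VNets",
            "blocking_vnets": blocking,
        }
    return 200, {"detail": f"zone '{zone}' can be deleted"}
```

A caller that receives 409 would delete the listed VNets first and then retry the zone delete.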

HA resources and groups

Method	Path	Purpose
GET	`/controllers/{id}/ha/resources`	List HA resources
POST	`/controllers/{id}/ha/resources`	Add VM/CT to HA; `sid` format `(vm|ct):\d+` (`site_admin`)
DELETE	`/controllers/{id}/ha/resources/{sid}`	Remove from HA (`site_admin`)
GET	`/controllers/{id}/ha/groups`	List HA groups
POST	`/controllers/{id}/ha/groups`	Create HA group (`site_admin`)
DELETE	`/controllers/{id}/ha/groups/{group}`	Delete HA group (`site_admin`)

Guest agent (QEMU)

Method	Path	Purpose
GET	`/controllers/{id}/nodes/{node}/qemu/{vmid}/agent/info`	Guest network interfaces (502 if agent unavailable)
POST	`/controllers/{id}/nodes/{node}/qemu/{vmid}/agent/exec`	Execute command in guest; body: `command`, `input_data` - output redacted (`site_admin`)
GET	`/controllers/{id}/nodes/{node}/qemu/{vmid}/agent/exec-status/{pid}`	Poll exec stdout/status - redacted (`site_admin`)
POST	`/controllers/{id}/nodes/{node}/qemu/{vmid}/agent/file-read`	Read file from guest filesystem - redacted (`site_admin`)
POST	`/controllers/{id}/nodes/{node}/qemu/{vmid}/agent/file-write`	Write file into guest filesystem (`site_admin`)

Endpoints marked site_admin require that role as a minimum. The QEMU guest agent package must be installed and running inside the VM.

Staged writes

FreeSDN’s Proxmox integration uses a dual-gate safety contract for all mutations. The gate has two independent conditions; both must be cleared before a write reaches the cluster:

Environment gate: both ADAPTER_READ_ONLY=false and OMADA_READ_ONLY=false must be set on the api container (each defaults to true). The Proxmox client ORs the two flags, so leaving either at true keeps all writes refused, regardless of whether an Omada controller is registered. OMADA_READ_ONLY is a legacy alias for the global adapter write gate, not a feature toggle for Omada users.
Call-site gate: the staging applier must pass force=True on the internal adapter call. The module service layer does not pass force=True. Only the gateway-proxmox staging endpoints pass it, and only after an operator applies a pending change.
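
The two gates combine as a simple boolean check, sketched below. This is an illustration under assumptions: is_read_only and write_allowed are hypothetical names, and the parsing rule (anything other than the literal "false" keeps the gate closed) is a conservative guess rather than the adapter's documented behaviour.

```python
import os
from typing import Mapping


def _flag_is_read_only(env: Mapping[str, str], name: str) -> bool:
    # Unset means the default (true); only an explicit "false" opens the flag.
    return env.get(name, "true").strip().lower() != "false"


def is_read_only(env: Mapping[str, str] = os.environ) -> bool:
    """Environment gate: the two flags are ORed, so either one left at
    true keeps the adapter read-only."""
    return (_flag_is_read_only(env, "ADAPTER_READ_ONLY")
            or _flag_is_read_only(env, "OMADA_READ_ONLY"))


def write_allowed(force: bool, env: Mapping[str, str] = os.environ) -> bool:
    """A write reaches the cluster only when both gates are cleared."""
    return bool(force) and not is_read_only(env)
```

With this model, write_allowed(True, {}) is False because both flags default to true, and clearing only one of the two flags still leaves writes refused.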

What this means in practice

With default settings (ADAPTER_READ_ONLY=true), every write endpoint in /api/v1/hypervisor/… is refused by the adapter read-only gate and returns an AdapterError - the request does not record a pending change and does not touch the cluster. The HypervisorService layer never passes force=True; the read-only gate therefore blocks all writes unconditionally when the default is in effect.

To stage mutations, use the gateway-proxmox staging endpoints described below, which create PendingChange records and queue them for operator review. An operator with site_admin+ role reviews and applies from the Hypervisor UI (Pending Changes tab) or via:

POST /api/v1/gateway-vpn/changes/{change-id}/apply

The staging endpoints live under /api/v1/gateway-proxmox-{vm,container,snapshot,storage,backup,cluster}/ (gated at include time by enforce_catastrophic_stage_role). You do not call them directly from user-facing code - the UI and Fabric executor drive them.

Staging flow for a VM operation

1. Operator authors a change in the UI (e.g., stop a VM for maintenance).
2. FreeSDN creates a PendingChange record with feature proxmox.vm.stop, stores the payload, and returns a PendingChangeResponse.
3. A site_admin reviews the pending change in the UI.
4. On apply: the staging applier calls the adapter with force=True; the dual-gate clears; Proxmox executes the stop; the change is marked applied.
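
The flow above can be modelled as a small state transition. The sketch is deliberately simplified and hypothetical: this PendingChange dataclass, the status strings, and the adapter_call signature are illustrative stand-ins, not FreeSDN's actual models, and role checks are omitted.

```python
from dataclasses import dataclass


@dataclass
class PendingChange:
    feature: str            # e.g. "proxmox.vm.stop"
    payload: dict
    status: str = "pending"


def stage(feature: str, payload: dict) -> PendingChange:
    """Record the intended mutation without touching the cluster."""
    return PendingChange(feature=feature, payload=dict(payload))


def apply_change(change: PendingChange, adapter_call) -> PendingChange:
    """Apply a reviewed change. Only this path passes force=True; if the
    adapter raises, the change stays pending."""
    if change.status != "pending":
        raise ValueError(f"change is already {change.status}")
    adapter_call(change.feature, change.payload, force=True)
    change.status = "applied"
    return change
```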

Fabric integration

The hypervisor module exposes five Fabric operation targets. All are staged writes - an operator must sign off before they execute.

Operation id	Required inputs	Permission
`hypervisor.vm.snapshot`	`controller_id`, `node`, `vmid`, `snapname`	`hypervisor.manage_snapshots`
`hypervisor.vm.start`	`controller_id`, `node`, `vmid` (+ `vm_type` qemu/lxc)	`hypervisor.manage_vms`
`hypervisor.vm.stop`	same	`hypervisor.manage_vms`
`hypervisor.vm.shutdown`	same	`hypervisor.manage_vms`
`hypervisor.vm.reboot`	same	`hypervisor.manage_vms`

Example wiring: OPNsense firewall rule applied → snapshot affected VMs. Author this as a Fabric Connection targeting hypervisor.vm.snapshot and wire it to the controller.change.applied event from your firewall controller. See Fabric for wiring syntax and Connection authoring.

Permissions

Permission code	Minimum role	Covers
`hypervisor.view`	`viewer`	Read-only access to all cluster, node, VM, and storage data
`hypervisor.manage_vms`	`site_admin`	VM/CT power operations, console, guest agent, bulk ops
`hypervisor.manage_snapshots`	`site_admin`	Create, rollback, and delete snapshots
`hypervisor.manage_backups`	`site_admin`	Create, update, and trigger backup jobs
`hypervisor.manage_nodes`	`site_admin`	Node-level operations (reboot, shutdown, services)

Frontend: the Hypervisor page

Navigate to Hypervisor in the left sidebar (route /hypervisor). Choose a controller from the dropdown when you have multiple Proxmox clusters registered.

With no controller selected, the page shows a fleet dashboard: clusters online/total, total nodes, VMs, containers, and aggregate CPU/memory/storage utilization drawn from /fleet/dashboard.

With a controller selected, the page shows these tabs:

Tab	Contents
Dashboard	Cluster health, quorum state, HA active count, per-node resource bars
Nodes	Node list with CPU/memory/disk sparklines; click a node for a detail drawer with sub-tabs: Overview · VMs · Containers · Services · Disks · Network · Sensors
Virtual Machines	VM list with status, vCPU, memory; power actions; bulk action bar
Containers	LXC container list; same operations as VMs
Storage	Pool browser with content-type filter (all/ISO/templates/backup/disk images/snippets); upload and restore dialogs
Tasks	Recent task list with status and log tail
Backup	Scheduled job list; manual backup trigger; backup age report
Firewall	Cluster and per-guest firewall rule tables
HA	HA resource and group management
Pools	Resource pool list

Additional component tabs available in the drawer and via navigation: Ceph, Replication, PBS (Proxmox Backup Server), Certificates, SDN, Monitoring (RRD charts), Updates (APT), Subscriptions, Templates (when show_templates=true), Cluster Log, Kiosk Mode.

Adapter internals

Authentication and connection

The Proxmox client (ProxmoxClientConfig) connects to {host}:{port} (default 8006) over HTTPS. It supports two auth modes:

API token (preferred): token_id (user@realm!tokenname) + token_secret. Tokens are Fernet-decrypted at runtime; they never appear in logs or error messages.
Ticket auth: username/password/realm. The client caches the ticket for 90 minutes (PVE tickets are valid 2 hours); on a 401 from an idempotent request it drops the cached ticket and retries once.

Safety mechanisms

Read-only gate: every POST, PUT, PATCH, and DELETE request checks _is_adapter_read_only() before proceeding. If the gate is closed, the request is recorded as read_only_blocked in metrics and an AdapterError is raised.
Circuit breaker: 5 consecutive failures open the breaker for 60 seconds. Idempotent timeouts retry with jittered backoff.
Path-traversal guard: _validate_path(path) runs at every _request chokepoint.
Rate limiter: 120 requests/minute, 10 concurrent connections. A dedicated 2-slot semaphore handles large uploads so a 4 GB ISO transfer does not starve API calls.
Response size cap: check_response_size(resp) bounds device response bodies.
Secret redaction: redact_secrets (central, ~90 sensitive key patterns, camelCase-aware) is applied to every adapter read, including full VM/CT config responses (GET …/config). The narrower _SENSITIVE_CONFIG_KEYS set (cipassword, sshkeys, args, hookscript) is stripped from pending-config (GET …/pending) responses only. CloudInit cipassword, sshkeys, and ipconfigN values are redacted. PVE ticket fragments and URLs are stripped from error messages.
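
To make the circuit-breaker numbers concrete, here is a minimal sketch of a breaker that opens after 5 consecutive failures and stays open for 60 seconds. The CircuitBreaker class is hypothetical and simplified: it resets fully after the cooldown instead of using a half-open probe state, which the real client may or may not do.

```python
import time


class CircuitBreaker:
    """Opens after `threshold` consecutive failures; rejects calls until
    `cooldown` seconds have passed, then closes again."""

    def __init__(self, threshold=5, cooldown=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.cooldown = cooldown
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown:
            # Cooldown elapsed: close the breaker and start counting afresh.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self) -> None:
        self._failures = 0  # failures must be consecutive to trip the breaker

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.threshold:
            self._opened_at = self._clock()
```

A caller would check allow() before each request and report the outcome with record_success() or record_failure().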

RRD downsampling

All RRD endpoints use LTTB (Largest-Triangle-Three-Buckets) downsampling. The max_points parameter (10-5000, default 500) controls output resolution. This keeps chart queries fast even for year-range timeframes.
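
For readers unfamiliar with LTTB, the sketch below is a plain-Python rendering of the standard algorithm: keep the first and last points, split the rest into buckets, and from each bucket keep the point that forms the largest triangle with the previously kept point and the average of the next bucket. It is an independent illustration, not the adapter's implementation; the function name and the (x, y) tuple format are assumptions.

```python
def lttb(points, max_points):
    """Downsample a list of (x, y) tuples to at most max_points entries
    using Largest-Triangle-Three-Buckets."""
    if max_points < 3:
        raise ValueError("max_points must be at least 3")
    n = len(points)
    if max_points >= n:
        return list(points)

    bucket = (n - 2) / (max_points - 2)  # width of each interior bucket
    sampled = [points[0]]
    a = 0  # index of the most recently kept point

    for i in range(max_points - 2):
        # Average of the next bucket serves as the third triangle vertex.
        nxt_start = int((i + 1) * bucket) + 1
        nxt_end = min(int((i + 2) * bucket) + 1, n)
        nxt = points[nxt_start:nxt_end]
        avg_x = sum(p[0] for p in nxt) / len(nxt)
        avg_y = sum(p[1] for p in nxt) / len(nxt)

        start = int(i * bucket) + 1
        end = int((i + 1) * bucket) + 1
        ax, ay = points[a]
        best, best_area = start, -1.0
        for j in range(start, end):
            # Twice the triangle area; the constant factor does not matter.
            area = abs((ax - avg_x) * (points[j][1] - ay)
                       - (ax - points[j][0]) * (avg_y - ay))
            if area > best_area:
                best, best_area = j, area
        sampled.append(points[best])
        a = best

    sampled.append(points[-1])
    return sampled
```

Because the selection favours points that deviate most from their neighbours, short spikes in CPU or memory history tend to survive downsampling, which simple every-Nth-point decimation would often drop.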

Gotchas and limitations

Proxmox VE only. Proxmox Mail Gateway (PMG) and Proxmox Backup Server (PBS, as a standalone appliance) are not managed here. The adapter only talks to PVE clusters.
Cluster membership changes must be made outside FreeSDN. The adapter does not expose adding or removing cluster nodes. Use the Proxmox UI or SSH for those operations.
Node status may lag by up to sync_interval seconds. If a node goes offline between sync cycles, FreeSDN’s status field reflects the last successful poll, not real-time state.
Templates hidden by default. VM templates do not appear in the Virtual Machines list unless you set show_templates: true in module settings.
Ceph tab returns 404 when Ceph is not deployed. This is expected - the adapter passes the 404 through cleanly rather than raising an error.
SDN zone delete is dependency-safe. Attempting to delete a zone that has dependent VNets returns HTTP 409 with the blocking VNet names. Delete the VNets first.
Upload cap is 4 GB. Uploading ISOs or templates larger than 4 GB returns HTTP 413. Split or pre-download large images directly on the PVE node.
Remote-migrate requires a separate Proxmox cluster as the target. Both source and target clusters must be reachable from the FreeSDN API container.

Next steps

Supported Vendors - Proxmox adapter contract, maturity tier, and known limitations.
Staged Changes - how the pending-change queue works across all adapters.
Fabric - wire hypervisor operations to events from other modules.
Roles and Permissions - full 7-tier role hierarchy and how site grants interact with module permissions.
Storage (TrueNAS) - companion module for ZFS pool health and staged blob writes.