Enterprise Overview

FreeSDN is a self-hosted, multi-tenant SDN controller. The enterprise layer is the operational plane that sits above individual modules: it lets you manage many organisations and sites from one installation, enforce SLAs, correlate events into incidents, template device configs across a site hierarchy, run bulk firmware upgrades with staged rollout, and produce tamper-evident audit trails - all without a vendor cloud in the loop.

This page maps what exists today. Read it before diving into the per-feature pages.


| Area | Short description | Status |
| --- | --- | --- |
| Multi-org / MSP | Org to Site Group to Site hierarchy; per-user site grants; org quotas | Available |
| SLA monitoring | Policy thresholds, breach detection, acknowledge, compliance summary | Available |
| SLA reports | On-demand PDF/CSV generation; scheduled report records | Partial - see caveat |
| Alert rules engine | Event-bus rule engine to Alerts to multi-channel notifications | Available |
| Event correlation | Pattern-match events into Incidents; lifecycle + assignment | Available |
| Notifications | Email, Slack, Teams, webhook, in-app, SMS, and WhatsApp | Available |
| Audit log + tamper evidence | HMAC-SHA256 hash chain; export; security events | Available |
| Config templates | Hierarchical templates (org to site_group to site to device_group) | Available |
| Device lifecycle FSM | Discovered to managed to decommissioned state model | Available |
| Device health scores | Composite health score with org/site/device rollup | Available |
| Reconciliation | Drift detection + auto-remediate loop; on-demand + scheduled | Available |
| Bulk operations | Reboot / push-config / firmware-update with staged rollout | Available |
| Topology map | L2/L3 graph per site or org; saved layouts; auto-layout algorithms | Available |
| Config version history | Snapshot, diff, rollback per device | Available |
| Streaming telemetry | Agent WebSocket present; no sub-second gRPC pipeline | Partial |
| Scheduled report delivery | Schedules persist in DB; no Celery runner executes them | Partial - see caveat |

                     ┌─────────────────────────────────────────────┐
                     │          FreeSDN enterprise layer           │
                     │                                             │
Orgs / Sites ───────►│ Alert Rules       ──► Alerts                │
Config Templates     │ Event Correlation ──► Incidents             │
Device Groups        │ SLA Policies      ──► Breaches              │
Site Groups          │ Reconciliation    ──► Drift / Auto-fix      │
                     │ Bulk Operations   ──► Staged rollout        │
                     │ Health Scores     ──► Dashboard             │
                     │ Audit Log         ──► Hash chain + export   │
                     │ Notifications     ──► 7 channels            │
                     └────────────┬────────────────────────────────┘
                                  │ event bus
                     ┌────────────▼────────────────────────────────┐
                     │             10 FreeSDN modules              │
                     │ Network / Cameras / VoIP / Firewall …       │
                     └─────────────────────────────────────────────┘

Every feature here is org-scoped: queries filter by user.organization_id at the service layer. Many endpoints also enforce per-user site grants (UserSiteAccess) so operators can be restricted to a subset of sites within their org.


FreeSDN uses an Org → Site Group → Site → Device hierarchy. A single installation can host multiple organisations (tenants). Each org is isolated: every list, get, create, and update endpoint verifies organization_id before returning data or persisting rows.

Per-user site grants let you give an operator access to exactly the sites they manage - useful for MSP staff who each own a customer subset. The grant primitive lives in app/core/site_access.py; the API lives under /api/v1/organizations/{org_id}/site-access.

| Tier | Score | Typical use |
| --- | --- | --- |
| super_admin | 100 | Platform-wide; can see all orgs |
| admin | 80 | Org-wide all capabilities |
| org_admin | 60 | Org management; audit access |
| site_admin | 40 | Full control within assigned sites |
| operator | 20 | Day-to-day ops; no user management |
| viewer | 10 | Read-only |
| guest | 0 | Highly restricted |

Role assignment is strict-lower-than: you cannot assign a role at or above your own level.
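The strict-lower-than rule can be sketched as a single comparison on the tier scores. This is a minimal illustration rather than FreeSDN's actual code: the scores come from the role table, while `ROLE_SCORES` and `can_assign_role` are hypothetical names.

```python
# Tier scores copied from the role table; the names below are hypothetical.
ROLE_SCORES = {
    "super_admin": 100,
    "admin": 80,
    "org_admin": 60,
    "site_admin": 40,
    "operator": 20,
    "viewer": 10,
    "guest": 0,
}


def can_assign_role(actor_role: str, target_role: str) -> bool:
    """Return True only if target_role scores strictly below actor_role."""
    return ROLE_SCORES[target_role] < ROLE_SCORES[actor_role]
```

Under this sketch an admin can grant org_admin but not admin, and a guest cannot assign any role, including guest.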

Quotas are off by default (ENFORCE_ORG_QUOTAS=false). Self-hosted installs are unlimited. If you run FreeSDN as a SaaS you can enable quotas and assign tiers:

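| Tier | Max users | Max sites | Max devices | Audit retention |
| --- | --- | --- | --- | --- |
| FREE | 3 | 1 | 10 | 7 days |
| STARTER | 10 | 5 | 100 | 30 days |
| PROFESSIONAL | 50 | 20 | 500 | 90 days |
| ENTERPRISE | 500 | 100 | 5,000 | 365 days |
| UNLIMITED | unlimited | unlimited | unlimited | unlimited |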

Quotas are enforced atomically (SELECT … FOR UPDATE) to prevent TOCTOU races on concurrent device adoption or member additions.
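The race that row locking prevents can be shown without a database. The sketch below is an in-memory analogy, not FreeSDN's implementation: a `threading.Lock` stands in for `SELECT … FOR UPDATE`, and `OrgQuota`, `adopt_device`, and `adopt_concurrently` are hypothetical names. The point is that the check and the increment happen inside one critical section, so two concurrent adoptions cannot both pass the check against the same stale count.

```python
import threading


class OrgQuota:
    """In-memory stand-in for a locked org quota row (illustration only)."""

    def __init__(self, max_devices: int) -> None:
        self.max_devices = max_devices
        self.device_count = 0
        self._lock = threading.Lock()  # plays the role of SELECT ... FOR UPDATE

    def adopt_device(self) -> bool:
        # Check and increment under the same lock: no gap between
        # "time of check" and "time of use".
        with self._lock:
            if self.device_count >= self.max_devices:
                return False
            self.device_count += 1
            return True


def adopt_concurrently(quota: OrgQuota, attempts: int) -> int:
    """Run `attempts` adoptions in parallel threads; return how many succeeded."""
    results = []

    def worker() -> None:
        results.append(quota.adopt_device())

    threads = [threading.Thread(target=worker) for _ in range(attempts)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results.count(True)
```

With a limit of 10 devices (the FREE tier value), 50 concurrent attempts admit exactly 10 devices in this sketch.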


Most enterprise endpoints use fine-grained permission scopes. The table below shows the scopes used across the enterprise layer:

| Scope | Used by |
| --- | --- |
| config:read / config:write | Templates, SLA policies, site/device groups, reconcile, bulk ops |
| device:read / device:write | Health, lifecycle, topology layouts |
| alert:read / alert:create / alert:update / alert:delete | Alert rules and alerts |
| event:read / event:write | Correlation rules and incidents |
| ORG_ADMIN or SUPER_ADMIN (role check) | Audit logs, notification providers, org management |
| SUPER_ADMIN only | Audit chain validation |

Audit endpoints use role checks rather than fine-grained scopes because they carry cross-tenant visibility implications for super_admin operators.


The enterprise API spans several router prefixes. The table below lists prefixes and what lives under each. See the per-feature pages for full endpoint details.

| Prefix | What it contains |
| --- | --- |
| /api/v1/enterprise | Templates, site groups, device groups, device config, lifecycle, health, reconcile, bulk ops, config versions |
| /api/v1/sla | SLA policies, breaches, compliance summary, reports, report schedules |
| /api/v1/correlation | Correlation rules, incidents, incident events, manual trigger |
| /api/v1/alert-rules | Alert rules, alerts, acknowledge/resolve/suppress, evaluate |
| /api/v1/notifications | Notification providers, send, in-app notifications, preferences |
| /api/v1/audit | Audit logs, security events, activity/security summaries, export, chain validate |
| /api/v1/topology | Topology graph, saved layouts, auto-layout |
| /api/v1/organizations | Org CRUD, dashboard, site-access grants |

Browse the full platform surface in the interactive OpenAPI docs at /api/v1/docs in non-production environments when ENABLE_DOCS is true (its default). Docs are unconditionally disabled in production regardless of ENABLE_DOCS.


Honest status: what is wired and what is not

The following features have complete backend + frontend implementations exercised against real hardware or integration tests:

  • Three-state device config model (desired / pushed / running)
  • Device lifecycle FSM with event emission
  • Config template hierarchy with deep-merge and secret redaction
  • Device health scores (6-component composite, recomputed every 5 minutes by Celery)
  • Reconciliation loop (drift detection, auto-remediate flag, scheduled every 5 minutes)
  • Bulk operations with staged rollout and auto-rollback option
  • Config version history with diff and rollback
  • Event correlation into incidents
  • SLA monitoring with per-metric thresholds, breach lifecycle, and compliance summary
  • Alert rules engine with multi-channel notification dispatch
  • Notification providers (SMTP, Slack, Teams, webhook, Twilio SMS/WhatsApp)
  • Topology graph with saved layouts
  • Tamper-evident audit log with HMAC-SHA256 hash chain

These points are specific to the enterprise surface. For the platform-wide security model see Security overview.

Cross-tenant IDOR hardening. Every endpoint that accepts a foreign UUID (scope_id, parent_id, site_id, assigned_to) verifies org ownership before inserting or returning data. Foreign probes return 404, not 403, to avoid leaking existence. This behavior is covered by the automated security regression suite.
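The 404-not-403 behavior can be sketched as one lookup path shared by missing and foreign rows. This is an illustration under assumptions, not the FreeSDN service code: `Site`, `NotFound`, and `get_site_for_org` are hypothetical names, and a plain dict stands in for the database.

```python
from dataclasses import dataclass
from uuid import UUID


class NotFound(Exception):
    """Stand-in for an HTTP 404 response (hypothetical)."""


@dataclass(frozen=True)
class Site:
    id: UUID
    organization_id: UUID


def get_site_for_org(sites: dict, site_id: UUID, organization_id: UUID) -> Site:
    """Return the site only if it belongs to the caller's organisation."""
    site = sites.get(site_id)
    # A missing row and a foreign row take the same branch, so the caller
    # cannot distinguish "does not exist" from "exists in another org".
    if site is None or site.organization_id != organization_id:
        raise NotFound("site not found")
    return site
```

Returning the same error for both cases is what keeps a probe with a guessed UUID from learning whether that UUID exists in another tenant.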

Audit chain tamper evidence. Each AuditLogRecord carries prev_hash and row_hmac = HMAC-SHA256(key, prev_hash || canonical_json(record)). Chain validation walks the full history and reports the first broken link.
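A minimal sketch of the chain formula and the validation walk, using only the Python standard library. Several details are assumptions that the text above does not specify: canonical JSON is taken to mean sorted keys with compact separators, `prev_hash` is taken to be the previous row's `row_hmac`, the first row uses an all-zero placeholder, and the function names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional

GENESIS_HASH = "0" * 64  # assumption: placeholder prev_hash for the first row


def canonical_json(record: dict) -> bytes:
    # Assumption: canonical form = sorted keys, no insignificant whitespace.
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def row_hmac(key: bytes, prev_hash: str, record: dict) -> str:
    """HMAC-SHA256(key, prev_hash || canonical_json(record)) as hex."""
    message = prev_hash.encode("utf-8") + canonical_json(record)
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def append_record(chain: list, key: bytes, record: dict) -> None:
    """Append a record linked to the previous row's row_hmac."""
    prev_hash = chain[-1]["row_hmac"] if chain else GENESIS_HASH
    chain.append(
        {"record": record, "prev_hash": prev_hash,
         "row_hmac": row_hmac(key, prev_hash, record)}
    )


def first_broken_link(chain: list, key: bytes) -> Optional[int]:
    """Walk the full history; return the index of the first bad row, or None."""
    expected_prev = GENESIS_HASH
    for index, row in enumerate(chain):
        expected_mac = row_hmac(key, row["prev_hash"], row["record"])
        linked = row["prev_hash"] == expected_prev
        intact = hmac.compare_digest(row["row_hmac"], expected_mac)
        if not (linked and intact):
            return index
        expected_prev = row["row_hmac"]
    return None
```

In this sketch, editing a record changes its expected HMAC, and deleting a row breaks the `prev_hash` link of the row that followed it, so both kinds of tampering surface at a specific index.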

Notification provider config. Provider configs are capped at 256 KiB. Display names reject CR/LF/control characters (header-injection defense). Responses return a redacted config_summary, not raw credentials.
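The two input checks described above can be sketched as follows. The 256 KiB limit comes from the text; everything else is an assumption: the function names are hypothetical, "control character" is interpreted as Unicode category `Cc` (which includes CR and LF), and size is measured on the UTF-8 encoded JSON serialization, which may differ from how FreeSDN measures it.

```python
import json
import unicodedata

MAX_CONFIG_BYTES = 256 * 1024  # 256 KiB cap stated in the text


def validate_display_name(name: str) -> str:
    """Reject CR, LF, and other control characters (header-injection defense)."""
    for char in name:
        if unicodedata.category(char) == "Cc":
            raise ValueError("display name contains a control character")
    return name


def validate_config_size(config: dict) -> int:
    """Return the serialized size in bytes, or raise if it exceeds the cap."""
    size = len(json.dumps(config).encode("utf-8"))
    if size > MAX_CONFIG_BYTES:
        raise ValueError(f"provider config is {size} bytes; limit is {MAX_CONFIG_BYTES}")
    return size
```

A display name such as `"Ops\r\nBcc: attacker@example.com"` fails the first check because it carries CR and LF, which is the pattern an email header injection relies on.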

Alert evaluation cross-tenant fix. The manual evaluate endpoint previously accepted organization_id from the request body (IDOR). It now always derives the org from the authenticated user’s JWT. The required permission was raised from alert:create to alert:update because evaluation consumes notification-channel quotas.

Bulk operation site pre-check. Before queuing a bulk job, _resolve_bulk_target_site_ids verifies has_site_permission for every target site. If the resolved target set is empty, the endpoint returns 400 rather than queuing a silent no-op job. A site or device_group scope_id that belongs to another org returns 404.

Infrastructure health redaction. GET /api/v1/enterprise/health/infrastructure redacts framework version strings (FastAPI, Pydantic, PostgreSQL, Redis) from non-admin callers to reduce CVE reconnaissance surface.


These background tasks run automatically when the scheduler container (Celery beat) is up:

| Task name | Interval | Queue | What it does |
| --- | --- | --- | --- |
| enterprise-reconcile-all | Every 5 min | sync | Drift detection + auto-remediate for all managed devices |
| enterprise-recompute-health | Every 5 min | metrics | Recompute health scores for all devices |
| enterprise-snapshot-daily-health | Nightly 01:00 UTC | default | Store daily health snapshot for trend history |
| alert-rules-evaluate-all | Every 3 min | default | Evaluate all enabled alert rules |
| alert-rules-auto-resolve | Every 10 min | default | Auto-resolve timed-out alerts |
| alert-rules-unsuppress-expired | Every 5 min | default | Lift expired alert suppressions |
| sla-evaluate-all | Every 5 min | metrics | Evaluate SLA policies, create/resolve breaches |
| correlation-scan-events | Every 5 min | default | Correlate new events into incidents |
| correlation-auto-resolve | Every 15 min | default | Auto-resolve stale incidents |

These tasks are registered in celery_app.py and only run when a worker is present on the relevant queue. The Pro and Max deployment tiers include an io-worker container that handles the sync and metrics queues.
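For orientation, the interval-based tasks could be expressed in the shape Celery's `beat_schedule` setting expects (an entry name mapped to `task`, `schedule`, and `options`). This is a sketch, not the contents of celery_app.py: the task names, intervals, and queues come from the table, but reusing the entry name as the task path is an assumption, and `queues_in_use` is a hypothetical helper. The nightly snapshot is omitted because it needs a crontab schedule (for example `celery.schedules.crontab(hour=1, minute=0)`) rather than an interval.

```python
from datetime import timedelta

# (entry name, interval in minutes, queue), taken from the task table.
INTERVAL_TASKS = [
    ("enterprise-reconcile-all", 5, "sync"),
    ("enterprise-recompute-health", 5, "metrics"),
    ("alert-rules-evaluate-all", 3, "default"),
    ("alert-rules-auto-resolve", 10, "default"),
    ("alert-rules-unsuppress-expired", 5, "default"),
    ("sla-evaluate-all", 5, "metrics"),
    ("correlation-scan-events", 5, "default"),
    ("correlation-auto-resolve", 15, "default"),
]

BEAT_SCHEDULE = {
    name: {
        "task": name,  # assumption: the real registered task path may differ
        "schedule": timedelta(minutes=minutes),
        "options": {"queue": queue},
    }
    for name, minutes, queue in INTERVAL_TASKS
}


def queues_in_use(schedule: dict) -> set:
    """Queues that need at least one worker for every task to run."""
    return {entry["options"]["queue"] for entry in schedule.values()}
```

Listing the queues this way makes the dependency explicit: if no worker consumes `sync` or `metrics`, the reconcile, health, and SLA tasks are scheduled but never executed.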


Enterprise-grade capabilities require the Pro or Max tier. Lite is for homelabs.

| Tier | Env file | Enterprise capabilities |
| --- | --- | --- |
| Lite | .env.lite | Basic monitoring only; 1 worker |
| Pro | .env.pro | Full enterprise + io-worker + Flower monitoring |
| Max | .env.max | Pro + PgBouncer connection pooling + Valkey Sentinel HA + off-site DR |

Valkey Sentinel HA (Max tier) provides automatic cache/broker failover (~5 s detection, ~9 s promotion). A PostgreSQL standby is available in Max, but automatic PostgreSQL failover is not available in this release: operators must promote the standby manually.

See Deployment overview for exact docker compose commands.


| Variable | Default | Purpose |
| --- | --- | --- |
| ENFORCE_ORG_QUOTAS | false | Enable SaaS-style org tier quotas |
| AUDIT_HMAC_KEY | (falls back to SECRET_KEY) | HMAC key for audit chain; set explicitly for key rotation |
| PUBLIC_BASE_URL | http://localhost:8000 | Externally-reachable URL for this FreeSDN instance; override to your production domain. Used in notification action URLs and agent WebSocket config. |

All variables use the bare name (no FREESDN_ prefix).
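The defaults and the AUDIT_HMAC_KEY fallback can be sketched as a small loader. This is an illustration, not FreeSDN's settings module: `load_enterprise_settings` is a hypothetical name, and the set of strings accepted as "true" is an assumption.

```python
import os
from typing import Mapping, Optional


def load_enterprise_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read the three variables by bare name, applying the documented defaults."""
    env = os.environ if env is None else env
    enforce = env.get("ENFORCE_ORG_QUOTAS", "false").strip().lower()
    return {
        # Assumption: these spellings count as true; anything else is false.
        "enforce_org_quotas": enforce in {"1", "true", "yes"},
        # Falls back to SECRET_KEY when AUDIT_HMAC_KEY is unset or empty.
        "audit_hmac_key": env.get("AUDIT_HMAC_KEY") or env.get("SECRET_KEY", ""),
        "public_base_url": env.get("PUBLIC_BASE_URL", "http://localhost:8000"),
    }
```

Setting AUDIT_HMAC_KEY explicitly decouples the audit chain key from SECRET_KEY, so rotating one does not silently change the other.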


  • SLA monitoring - define policies, review breaches, generate reports
  • Alert rules - create rules, manage the alert lifecycle, configure notification routing
  • Event correlation - pattern-match events into incidents, assign and resolve
  • Notifications - configure SMTP, Slack, Teams, webhook, and SMS providers
  • Config templates - author hierarchical device config templates with secret handling
  • Device lifecycle - move devices through the FSM from discovery to decommission
  • Health dashboard - read composite health scores, site ranking, and trend history
  • Reconciliation - understand drift detection and trigger on-demand reconciliation
  • Bulk operations - run staged firmware upgrades and config pushes across device groups
  • Audit log - query the hash-chain audit trail, export, and validate chain integrity
  • Topology - view and persist L2/L3 network maps
  • Organizations - manage tenants, per-user site grants, and org quotas