
Alerting & Notifications

The alerting subsystem has two collaborating parts:

  • Alert rules engine - evaluates rules against the internal event bus every 3 minutes and creates Alert records when conditions are met.
  • Notification providers - deliver those alerts (and any other dispatch call) to one or more external channels concurrently.

Together they replace ad-hoc email scripts and give you a single place to define what to watch, who to tell, and how.


An AlertRule subscribes to a named event type (or pattern) on the internal event bus. When the evaluator sees a matching event, it creates an Alert record and fans out notifications through every configured provider.

Each rule carries:

| Field | Description |
| --- | --- |
| name | Human-readable label shown in the UI |
| description | Optional context for operators |
| rule_type | threshold, pattern, anomaly, or custom - selects the evaluator |
| scope | Targeting scope - one of organization, site, device_group, device |
| scope_ids | JSONB list of UUIDs for the chosen scope (omit for organization) |
| severity | info, warning, or critical - maps to notification priority |
| status | active, disabled, or draft - disabled/draft rules are skipped by the evaluator |
| conditions | JSONB dict whose keys depend on rule_type (see below) |
| auto_resolve_after_seconds | If set, the evaluator resolves the alert automatically after this many seconds (integer, minimum 60) |
| notification_channels | JSONB dict mapping channel name to channel-specific config (e.g. {"email": {"to": ["ops@example.com"]}, "slack": {"channel": "#alerts"}}). Valid channel keys: email, slack, teams, webhook, in_app, sms, whatsapp. |
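
The field constraints above can be checked client-side before a rule is submitted. The following is a minimal sketch, not the backend's validator: the function name `validate_rule` and the choice to check only fields that are present are assumptions made for illustration.

```python
from typing import Any

# Allowed values, copied from the field table above.
RULE_TYPES = {"threshold", "pattern", "anomaly", "custom"}
SCOPES = {"organization", "site", "device_group", "device"}
SEVERITIES = {"info", "warning", "critical"}
STATUSES = {"active", "disabled", "draft"}
CHANNELS = {"email", "slack", "teams", "webhook", "in_app", "sms", "whatsapp"}


def validate_rule(rule: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means no issue was found.

    Hypothetical helper: it only checks fields that are present, because the
    documentation does not state which fields are mandatory.
    """
    problems: list[str] = []
    enums = {
        "rule_type": RULE_TYPES,
        "scope": SCOPES,
        "severity": SEVERITIES,
        "status": STATUSES,
    }
    for field, allowed in enums.items():
        if field in rule:
            value = rule[field]
            if not isinstance(value, str) or value not in allowed:
                problems.append(f"{field} must be one of {sorted(allowed)}")

    seconds = rule.get("auto_resolve_after_seconds")
    if seconds is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds < 60:
            problems.append("auto_resolve_after_seconds must be an integer >= 60")

    unknown = set(rule.get("notification_channels", {})) - CHANNELS
    if unknown:
        problems.append(f"unknown notification channels: {sorted(unknown)}")
    return problems
```

Calling `validate_rule(payload)` before a POST gives early feedback; the backend remains the authority on what is accepted.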

conditions structure by rule_type:

| rule_type | Required keys in conditions | Example |
| --- | --- | --- |
| threshold | metric, operator (>, <, >=, <=, ==, !=), value | {"metric": "cpu_utilization", "operator": ">", "value": 90} |
| pattern | event_type (glob, e.g. device.offline or device.*), min_count | {"event_type": "device.offline", "min_count": 3} |
| anomaly | metric, std_dev_threshold | {"metric": "traffic_in", "std_dev_threshold": 3.0} |
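
To make the threshold and pattern conditions concrete, here is a rough sketch of how such conditions could be evaluated. It is an illustration only: the function names are hypothetical, the glob semantics are assumed to match Python's `fnmatch`, and the default of 1 for a missing `min_count` is an assumption.

```python
import operator
from fnmatch import fnmatchcase

# Map the operator strings from the table to Python comparison functions.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def threshold_met(conditions: dict, sample: float) -> bool:
    """True when `sample` compared with conditions['value'] satisfies the operator."""
    compare = OPERATORS[conditions["operator"]]
    return compare(sample, conditions["value"])


def pattern_met(conditions: dict, event_types: list[str]) -> bool:
    """True when enough observed event types match the glob in conditions.

    Assumption: `event_types` holds the events seen in one evaluation window;
    the real window size is not documented here.
    """
    glob = conditions["event_type"]
    matches = sum(1 for event_type in event_types if fnmatchcase(event_type, glob))
    return matches >= conditions.get("min_count", 1)  # default of 1 is assumed


cpu_rule = {"metric": "cpu_utilization", "operator": ">", "value": 90}
print(threshold_met(cpu_rule, 95.0))
```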

scope narrows which resources a rule fires for. The table below shows valid combinations:

| scope | scope_ids content | Example use |
| --- | --- | --- |
| organization | omit (must be empty) | Fire for anything in your org |
| site | list of site UUIDs | Fire only for events on specific sites |
| device_group | list of device-group UUIDs | Fire for a named group of devices |
| device | list of device UUIDs | Fire for exact devices |

Before a rule is saved, the backend calls _verify_scope_ids, which checks that every UUID in scope_ids belongs to the caller’s organization. A foreign or non-existent UUID returns 404 (not 403), so the response does not reveal whether the resource exists in another organization.
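
The ownership check can be pictured with the sketch below. This is not the actual `_verify_scope_ids` implementation: the `NotFound` exception, the `owned_ids` argument, and the function signature are stand-ins invented for illustration.

```python
from uuid import UUID


class NotFound(Exception):
    """Stand-in for an HTTP 404 response (hypothetical)."""


def verify_scope_ids(scope: str, scope_ids: list[str], owned_ids: set[UUID]) -> list[UUID]:
    """Return parsed UUIDs, or raise if any target is outside the caller's organization.

    `owned_ids` is assumed to hold every UUID of the chosen scope type that
    belongs to the caller's organization.
    """
    if scope == "organization":
        if scope_ids:
            raise ValueError("scope_ids must be empty for organization scope")
        return []

    verified: list[UUID] = []
    for raw in scope_ids:
        candidate = UUID(raw)  # raises ValueError on a malformed UUID string
        if candidate not in owned_ids:
            # Same error for "belongs to someone else" and "does not exist",
            # so the caller cannot probe for foreign resources.
            raise NotFound("scope target not found")
        verified.append(candidate)
    return verified
```

Raising one error type for both cases is what keeps the 404 from leaking existence.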

| Severity | Typical use |
| --- | --- |
| info | Low-priority informational events, resolved events |
| warning | Default; degraded performance, capacity thresholds |
| critical | Service-affecting outages, security events |

An alert moves through these states:

firing
├─► acknowledged (operator marks seen)
├─► resolved (operator or auto-resolve clears it)
└─► suppressed (snooze for N minutes)

  • Suppress takes a suppress_minutes value and an optional reason. When the suppression expires, the alert-rules-unsuppress-expired Celery task lifts it automatically (runs every 5 minutes).
  • Auto-resolve runs via alert-rules-auto-resolve every 10 minutes. Alerts from rules with auto_resolve_after_seconds set are closed without manual intervention.
  • Acknowledge accepts an optional free-text note stored on the record.

| Celery task | Interval | What it does |
| --- | --- | --- |
| alert-rules-evaluate-all | Every 3 min | Runs the full rule set for your org |
| alert-rules-auto-resolve | Every 10 min | Resolves timed-out alerts |
| alert-rules-unsuppress-expired | Every 5 min | Lifts expired suppressions |

These tasks run on the default Celery queue. If the worker container is down, no evaluation happens until it recovers - there is no fallback evaluator.
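
Because these tasks run on fixed intervals, an alert is not resolved or unsuppressed at the exact moment its timer elapses. The sketch below illustrates the timing under the assumption that each task simply compares elapsed time against the configured value on every run; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def auto_resolve_due(fired_at: datetime, auto_resolve_after_seconds: Optional[int],
                     now: datetime) -> bool:
    """True once the alert has been firing for at least the configured time."""
    if auto_resolve_after_seconds is None:
        return False  # rule has no auto-resolve configured
    return now - fired_at >= timedelta(seconds=auto_resolve_after_seconds)


def suppression_expired(suppressed_at: datetime, suppress_minutes: int,
                        now: datetime) -> bool:
    """True once the suppression window has fully elapsed."""
    return now >= suppressed_at + timedelta(minutes=suppress_minutes)


def worst_case_delay(threshold: timedelta, task_interval: timedelta) -> timedelta:
    """Upper bound on when a periodic task acts: the threshold plus one full interval."""
    return threshold + task_interval


fired = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
print(auto_resolve_due(fired, 1800, fired + timedelta(minutes=31)))
print(worst_case_delay(timedelta(seconds=1800), timedelta(minutes=10)))
```

Under these assumptions, a rule with auto_resolve_after_seconds of 1800 closes its alerts somewhere between 30 and 40 minutes after firing, and a suppression can outlast its nominal end by up to 5 minutes.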

You can also trigger evaluation manually: POST /api/v1/alert-rules/evaluate.


All endpoints are under the prefix /api/v1/alert-rules. Each endpoint requires one of the fine-grained alert:* permission scopes shown in the Permission column.

| Method | Path | Purpose | Permission |
| --- | --- | --- | --- |
| GET | /api/v1/alert-rules/rules | List rules (status?, type?, site_id?) | alert:read |
| POST | /api/v1/alert-rules/rules | Create rule (verifies scope_ids) | alert:create |
| GET | /api/v1/alert-rules/rules/{rule_id} | Get one rule | alert:read |
| PATCH | /api/v1/alert-rules/rules/{rule_id} | Update rule (re-verifies scope_ids) | alert:update |
| DELETE | /api/v1/alert-rules/rules/{rule_id} | Soft-delete rule | alert:delete |
| GET | /api/v1/alert-rules/stats | Rule + alert statistics (site_id?) | alert:read |

| Method | Path | Purpose | Permission |
| --- | --- | --- | --- |
| GET | /api/v1/alert-rules/alerts | List alerts (status?, severity?, rule_id?, site_id?, limit ≤ 200) | alert:read |
| GET | /api/v1/alert-rules/alerts/{alert_id} | Get one alert | alert:read |
| POST | /api/v1/alert-rules/alerts/{alert_id}/acknowledge | Acknowledge with optional note | alert:update |
| POST | /api/v1/alert-rules/alerts/{alert_id}/resolve | Resolve alert | alert:update |
| POST | /api/v1/alert-rules/alerts/{alert_id}/suppress | Suppress for N minutes | alert:update |
| POST | /api/v1/alert-rules/evaluate | Manually evaluate all rules now | alert:update |

  1. Open Alert Rules in the sidebar (route /alert-rules).
  2. Click New rule.
  3. Set name, event_type, and severity.
  4. Choose a scope. For a site-scoped rule, select one or more sites from the picker - their UUIDs populate scope_ids.
  5. Optionally set auto_resolve_after_seconds if the alert should self-clear.
  6. Under Channels, tick the delivery channels you want (email, Slack, etc.). Each must have a configured provider - see Notification providers below.
  7. Save. The rule is active immediately; the next evaluator cycle (within 3 minutes) will pick it up.

To test without waiting, call:

POST /api/v1/alert-rules/evaluate
Authorization: Bearer <token>

Then check /api/v1/alert-rules/alerts for any newly fired alerts.
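
The same two calls can be scripted. The sketch below uses only the Python standard library; the base URL, the helper names, and the assumption that responses are JSON are illustrative rather than taken from the API reference.

```python
import json
import urllib.request
from typing import Any, Optional


def build_request(base_url: str, token: str, method: str, path: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated request without sending it."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Authorization": f"Bearer {token}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=data, headers=headers, method=method
    )


def send(request: urllib.request.Request) -> Any:
    """Send the request and decode the body, assuming the API answers with JSON."""
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read()
        return json.loads(body) if body else None


if __name__ == "__main__":
    base = "https://freesdn.example.com"  # placeholder host
    token = "<token>"                     # placeholder credential
    send(build_request(base, token, "POST", "/api/v1/alert-rules/evaluate"))
    print(send(build_request(base, token, "GET", "/api/v1/alert-rules/alerts")))
```

Keeping request construction separate from sending makes the helper easy to unit-test without a live instance.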


A notification provider is a named, stored delivery configuration for one channel. You can have multiple providers for the same channel (e.g. two Slack workspaces, one per team).

| Channel | Provider type | Auth model |
| --- | --- | --- |
| Email | smtp | SMTP server + credentials |
| Slack | slack_webhook | Incoming webhook URL |
| Microsoft Teams | teams_webhook | Incoming webhook URL |
| Webhook (generic) | generic_webhook | URL + optional HMAC secret + custom headers |
| In-app | built-in | No external config needed |
| SMS | twilio_sms | Twilio Account SID + Auth Token + from-number |
| WhatsApp | twilio_whatsapp | Twilio Account SID + Auth Token + from-number |

Fetch the full config schema for any provider type - including required fields and validation rules - from:

GET /api/v1/notifications/providers/types

This returns {type, name, channel, icon, config_schema} for each supported type.

  • Provider config blobs are capped at 256 KiB per record.
  • Display names reject CR, LF, and other control characters (header-injection defense).
  • API responses return a redacted config_summary, never raw credentials.
  • Generic webhook HMAC secrets are stored encrypted; the HMAC is computed server-side on dispatch.
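
These safeguards can be mirrored on the client to catch problems before a request is sent. The sketch below is an approximation: how the backend measures config size, which characters it counts as control characters, and which keys it redacts are not documented here, so the serialized-JSON size, the Unicode `Cc` category, and the `SECRET_KEYS` set are all assumptions.

```python
import json
import unicodedata

MAX_CONFIG_BYTES = 256 * 1024  # 256 KiB cap from the list above

# Hypothetical set of keys to hide; the real redaction list may differ.
SECRET_KEYS = frozenset({"password", "hmac_secret", "auth_token"})


def display_name_ok(name: str) -> bool:
    """Reject CR, LF, and any other character in Unicode category Cc (control)."""
    return not any(unicodedata.category(ch) == "Cc" for ch in name)


def config_size_ok(config: dict) -> bool:
    """Check the UTF-8 size of the JSON-serialized config against the cap."""
    return len(json.dumps(config).encode("utf-8")) <= MAX_CONFIG_BYTES


def redact(config: dict) -> dict:
    """Return a copy with secret values masked, similar in spirit to config_summary."""
    return {key: ("***" if key in SECRET_KEYS else value) for key, value in config.items()}


print(display_name_ok("Ops Slack"))
print(redact({"url": "https://example.com/hook", "hmac_secret": "s3cret"}))
```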

Provider management requires ORG_ADMIN or SUPER_ADMIN.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/notifications/providers | List providers (channel?, enabled_only?) |
| GET | /api/v1/notifications/providers/types | Supported types with config schemas |
| POST | /api/v1/notifications/providers | Create provider |
| GET | /api/v1/notifications/providers/{provider_id} | Get provider (config summary only) |
| PUT | /api/v1/notifications/providers/{provider_id} | Update provider |
| DELETE | /api/v1/notifications/providers/{provider_id} | Delete provider |
| POST | /api/v1/notifications/providers/{provider_id}/verify | Test stored provider connectivity |
| POST | /api/v1/notifications/providers/{provider_id}/test | Send a test message (test_email query param required) |

To add an SMTP email provider:

  1. Navigate to Notification Providers (/notification-providers).
  2. Click Add provider and choose SMTP Email.
  3. Fill in host, port, username, password, TLS settings, and a from_email.
  4. Save, then click Verify to confirm connectivity (sends no email).
  5. Click Test and supply a test_email address to send a real test message.

To add a Slack provider:

  1. In your Slack workspace, create an Incoming Webhook app and copy the webhook URL.
  2. Add a provider with type slack_webhook and paste the URL.
  3. Verify connectivity, then optionally send a test.

Example generic_webhook provider definition:

{
  "type": "generic_webhook",
  "name": "PagerDuty ingest",
  "config": {
    "url": "https://events.pagerduty.com/v2/enqueue",
    "method": "POST",
    "headers": {"Content-Type": "application/json"},
    "hmac_secret": "your-secret-here"
  }
}

When hmac_secret is set, FreeSDN computes HMAC-SHA256(secret, body) and attaches it as X-FreeSDN-Signature on every outbound request. The receiving end can verify it to confirm the call originated from your FreeSDN instance.
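
On the receiving side, verification could look like the sketch below. It assumes the X-FreeSDN-Signature header carries the lowercase hex digest of the raw request body with no prefix; the encoding is not specified above, so confirm it against a real request before relying on this.

```python
import hashlib
import hmac


def sign(secret: str, body: bytes) -> str:
    """Compute HMAC-SHA256 over the raw body, hex-encoded (encoding is assumed)."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify(secret: str, body: bytes, received_signature: str) -> bool:
    """Compare in constant time to avoid leaking how many characters matched."""
    expected = sign(secret, body).encode("ascii")
    return hmac.compare_digest(expected, received_signature.encode("utf-8"))


raw_body = b'{"title": "Device offline"}'
header_value = sign("your-secret-here", raw_body)  # what the sender would attach
print(verify("your-secret-here", raw_body, header_value))
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, since even whitespace changes alter the digest.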

POST /api/v1/notifications/send
Authorization: Bearer <token>
Content-Type: application/json

{
  "channel": "slack",
  "recipient": "#alerts",
  "title": "Device offline",
  "body": "Switch sw-01 at Site A stopped responding."
}

Template-based send:

POST /api/v1/notifications/send/template

Both endpoints require ORG_ADMIN or SUPER_ADMIN.


When an alert rule fires, the dispatch call fans out across all configured channels concurrently via asyncio.gather. Each channel is independent: a failure on Slack does not block email delivery.

Per-user mute preferences apply before dispatch. If a user has muted a category, the dispatch logs SKIPPED for that user’s in-app channel rather than delivering.
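
The fan-out pattern described above can be sketched as follows. This is a simplified illustration, not the dispatcher's code: the `dispatch` function, the `SENT` and `FAILED` labels, and modelling mutes as a set of channel names (rather than per-user categories) are assumptions. Only `asyncio.gather` and the `SKIPPED` status come from the description.

```python
import asyncio
from typing import Awaitable, Callable

Sender = Callable[[str], Awaitable[None]]


async def dispatch(message: str, senders: dict[str, Sender],
                   muted: frozenset[str] = frozenset()) -> dict[str, str]:
    """Send `message` through every non-muted channel concurrently."""
    active = {name: send for name, send in senders.items() if name not in muted}

    # return_exceptions=True turns a failing channel into a result value,
    # so one broken provider cannot cancel or block the others.
    results = await asyncio.gather(
        *(send(message) for send in active.values()), return_exceptions=True
    )

    outcome = {name: "SKIPPED" for name in senders if name in muted}
    for name, result in zip(active, results):
        outcome[name] = "FAILED" if isinstance(result, BaseException) else "SENT"
    return outcome


async def _ok(message: str) -> None:
    await asyncio.sleep(0)


async def _broken(message: str) -> None:
    raise RuntimeError("slack webhook unreachable")


print(asyncio.run(dispatch("Device offline", {"email": _ok, "slack": _broken})))
```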


In-app notifications are per-user and require no provider configuration. They appear in the bell icon in the top navigation.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/notifications/in-app | List notifications (paginated envelope with unread_count) |
| POST | /api/v1/notifications/in-app/{notification_id}/read | Mark one read |
| POST | /api/v1/notifications/in-app/read-all | Mark all read |
| GET | /api/v1/notifications/in-app/unread-count | Badge count for the bell |
| POST | /api/v1/notifications/in-app/mark | Bulk mark read or dismiss |

The list response returns {items, total, limit, offset, unread_count}. Pass unread_only=true for only unseen notifications, or include_dismissed=true to include archived ones.
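
A client can walk the paginated envelope as sketched below. The `fetch_page` callable is a placeholder for whatever performs the GET request, and passing `limit` and `offset` as request parameters is an assumption based on the fields in the response.

```python
from typing import Callable, Iterator

# fetch_page(limit, offset) is expected to return the documented envelope:
# {"items": [...], "total": int, "limit": int, "offset": int, "unread_count": int}
FetchPage = Callable[[int, int], dict]


def iter_notifications(fetch_page: FetchPage, limit: int = 50) -> Iterator[dict]:
    """Yield every notification, requesting one page at a time."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop on an empty page (defensive) or once `total` has been reached.
        if not items or offset >= page["total"]:
            break


def fake_fetch(limit: int, offset: int) -> dict:
    """In-memory stand-in for the HTTP call, used for demonstration only."""
    data = [{"id": i} for i in range(5)]
    return {"items": data[offset:offset + limit], "total": len(data),
            "limit": limit, "offset": offset, "unread_count": 2}


print([n["id"] for n in iter_notifications(fake_fetch, limit=2)])
```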


Each user can control which channels they receive on and set quiet hours.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/notifications/preferences | Get current preferences (defaults: all channels enabled) |
| PUT | /api/v1/notifications/preferences | Update channels, quiet hours, category settings |
| PATCH | /api/v1/notifications/preferences/mute | Mute or snooze a category (expires_at=null = permanent) |
| DELETE | /api/v1/notifications/preferences/mute/{category} | Unmute (returns 404 if not muted) |

Users access these settings from Settings → Notifications.


{
  "name": "Device offline",
  "rule_type": "pattern",
  "conditions": {"event_type": "device.offline"},
  "scope": "organization",
  "scope_ids": [],
  "severity": "critical",
  "auto_resolve_after_seconds": 1800,
  "notification_channels": {
    "email": {"to": ["ops@example.com"]},
    "slack": {"channel": "#alerts"}
  }
}

Alert on SLA breach for two specific sites

{
  "name": "SLA breach - Production sites",
  "rule_type": "pattern",
  "conditions": {
    "event_type": "sla.breach.created"
  },
  "scope": "site",
  "scope_ids": ["site-uuid-a", "site-uuid-b"],
  "severity": "critical",
  "notification_channels": {
    "email": {"to": ["ops@example.com"]},
    "teams": {"webhook_url": "https://teams.microsoft.com/l/..."},
    "in_app": {"user_ids": ["user-uuid-1", "user-uuid-2"]}
  }
}

Suppress noisy alerts during a maintenance window

POST /api/v1/alert-rules/alerts/{alert_id}/suppress
Content-Type: application/json

{
  "suppress_minutes": 120,
  "reason": "Scheduled maintenance window 02:00-04:00 UTC"
}
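
When the suppression should end with the maintenance window rather than after a fixed duration, suppress_minutes can be derived from the window's end time. The helper below is a small sketch with a hypothetical name; it rounds up so the suppression never ends before the window does.

```python
import math
from datetime import datetime, timezone


def suppress_minutes_until(window_end: datetime, now: datetime) -> int:
    """Whole minutes from `now` to `window_end`, rounded up."""
    remaining_seconds = (window_end - now).total_seconds()
    if remaining_seconds <= 0:
        raise ValueError("maintenance window has already ended")
    return math.ceil(remaining_seconds / 60)


start = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 4, 0, tzinfo=timezone.utc)
payload = {
    "suppress_minutes": suppress_minutes_until(end, now=start),
    "reason": "Scheduled maintenance window 02:00-04:00 UTC",
}
print(payload)
```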

| Action | Required permission |
| --- | --- |
| Read rules and alerts | alert:read |
| Create a rule | alert:create |
| Update a rule or alert lifecycle action | alert:update |
| Delete a rule | alert:delete |
| Manage notification providers | ORG_ADMIN or SUPER_ADMIN |
| Send notifications programmatically | ORG_ADMIN or SUPER_ADMIN |

Role assignment follows the 7-tier ladder (super_admin → guest). You cannot assign a role at or above your own level. See Enterprise overview for the full role table.


Alerts are not firing

  • Check that the worker container is running and the default queue is being consumed.
  • Verify the rule’s status is active - rules with status: disabled or status: draft are skipped by the evaluator.
  • Call POST /api/v1/alert-rules/evaluate to force an immediate evaluation and watch the response for any error details.
  • Check Flower (if the monitoring profile is active) at port 5555 to confirm the evaluate_all_alert_rules task is completing without errors.

Notifications are not being delivered

  • Open the provider record and click Verify to confirm connectivity.
  • Send a Test message and review the response body for the provider error.
  • Check whether the user has muted the relevant category in their preferences.
  • For email: confirm SMTP port, TLS mode, and that the from_email is accepted by the relay.
  • For webhooks: confirm the endpoint is reachable from the FreeSDN API container (not just your browser).

POST /api/v1/alert-rules/evaluate returns 403

The endpoint requires alert:update, not alert:create. Verify your API key or JWT role includes that scope.

Suppress is not lifting automatically

The alert-rules-unsuppress-expired task runs every 5 minutes. If it is overdue, check the Celery worker logs for failures or queue backlog.


  • SLA monitoring - define policies, review breaches, and generate compliance reports
  • Event correlation - group related alerts into managed incidents with assignment
  • Audit log - query the tamper-evident log for alert and notification history
  • Enterprise overview - feature matrix, caveats, and role ladder