Alerting & Notifications

The alerting subsystem has two collaborating parts:

  • Alert rules engine - evaluates rules against the internal event bus every 3 minutes and creates Alert records when conditions are met.
  • Notification providers - deliver those alerts (and any other dispatch call) to one or more external channels concurrently.

Together they replace ad-hoc email scripts and give you a single place to define what to watch, who to tell, and how.


An AlertRule subscribes to a named event type (or pattern) on the internal event bus. When the evaluator sees a matching event, it creates an Alert record and fans out notifications through every configured provider.

Each rule carries:

| Field | Description |
| --- | --- |
| name | Human-readable label shown in the UI |
| description | Optional context for operators |
| rule_type | threshold, pattern, anomaly, or custom - selects the evaluator |
| scope | Targeting scope - one of organization, site, device_group, device |
| scope_ids | JSONB list of UUIDs for the chosen scope (omit for organization) |
| severity | info, warning, or critical - maps to notification priority |
| status | active, disabled, or draft - disabled/draft rules are skipped by the evaluator |
| conditions | JSONB dict whose keys depend on rule_type (see below) |
| auto_resolve_after_seconds | If set, the evaluator resolves the alert automatically after this many seconds (integer, minimum 60) |
| notification_channels | JSONB dict mapping channel name to channel-specific config (e.g. {"email": {"to": ["ops@example.com"]}, "slack": {"channel": "#alerts"}}). Valid channel keys: email, slack, teams, webhook, in_app, sms, whatsapp. |

conditions structure by rule_type:

| rule_type | Required keys in conditions | Example |
| --- | --- | --- |
| threshold | metric, operator (>, <, >=, <=, ==, !=), value | {"metric": "cpu_utilization", "operator": ">", "value": 90} |
| pattern | event_type (glob, e.g. device.offline or device.*), min_count | {"event_type": "device.offline", "min_count": 3} |
| anomaly | metric, std_dev_threshold | {"metric": "traffic_in", "std_dev_threshold": 3.0} |
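
To make the threshold and pattern rows concrete, here is a minimal Python sketch of how such conditions could be checked. It is not the FreeSDN evaluator: the function names, the `sample` mapping of metric values, the list of observed event types, and the fallback of `min_count` to 1 are all assumptions made for illustration.

```python
import operator
from fnmatch import fnmatchcase

# Comparison operators listed for threshold rules, mapped to Python callables.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
}


def threshold_met(conditions: dict, sample: dict) -> bool:
    """Return True when the sampled metric satisfies a threshold condition.

    `sample` is a hypothetical mapping of metric name to its latest value.
    """
    metric = conditions["metric"]
    if metric not in sample:
        return False  # no data for this metric, so nothing to compare
    compare = OPERATORS[conditions["operator"]]
    return compare(sample[metric], conditions["value"])


def pattern_met(conditions: dict, event_types: list[str]) -> bool:
    """Return True when enough observed event types match the glob.

    Falling back to a min_count of 1 is an assumption of this sketch.
    """
    glob = conditions["event_type"]
    matches = sum(1 for name in event_types if fnmatchcase(name, glob))
    return matches >= conditions.get("min_count", 1)
```

With the threshold example from the table, a `cpu_utilization` sample of 95 satisfies the condition and a sample of exactly 90 does not, because the operator is a strict `>`.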

scope narrows which resources a rule fires for. The table below shows valid combinations:

| scope | scope_ids content | Example use |
| --- | --- | --- |
| organization | omit (must be empty) | Fire for anything in your org |
| site | list of site UUIDs | Fire only for events on specific sites |
| device_group | list of device-group UUIDs | Fire for a named group of devices |
| device | list of device UUIDs | Fire for exact devices |

Before a rule is saved, the backend calls _verify_scope_ids, which checks that every UUID in scope_ids belongs to the caller’s organization. A foreign or non-existent UUID returns 404 (not 403), so the response does not reveal whether the resource exists in another organization.
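
The ownership check can be pictured with a short Python sketch. This is not the actual _verify_scope_ids implementation: `ScopeNotFound`, the `verify_scope_ids` signature, and the `owned_ids` set (standing in for a database lookup limited to the caller's organization) are hypothetical. The point it illustrates is that a foreign ID and a non-existent ID take the same error path.

```python
class ScopeNotFound(Exception):
    """Hypothetical error that an API layer would translate to HTTP 404."""

    status_code = 404


def verify_scope_ids(scope: str, scope_ids: list[str], owned_ids: set[str]) -> None:
    """Raise ScopeNotFound unless every ID belongs to the caller's organization.

    `owned_ids` stands in for a lookup restricted to the caller's
    organization and to the resource type that matches `scope`.
    """
    if scope == "organization":
        if scope_ids:
            raise ValueError("scope_ids must be empty for organization scope")
        return
    for scope_id in scope_ids:
        if scope_id not in owned_ids:
            # Same error whether the ID belongs to another organization or
            # does not exist at all, so callers cannot probe for existence.
            raise ScopeNotFound(scope_id)
```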

| Severity | Typical use |
| --- | --- |
| info | Low-priority informational events, resolved events |
| warning | Default; degraded performance, capacity thresholds |
| critical | Service-affecting outages, security events |

An alert moves through these states:

firing
├─► acknowledged (operator marks seen)
├─► resolved (operator or auto-resolve clears it)
└─► suppressed (snooze for N minutes)

  • Suppress takes a suppress_minutes value and an optional reason. When the suppression expires, the alert-rules-unsuppress-expired Celery task lifts it automatically (runs every 5 minutes).
  • Auto-resolve runs via alert-rules-auto-resolve every 10 minutes. Alerts whose rule sets auto_resolve_after_seconds are resolved without manual intervention.
  • Acknowledge accepts an optional free-text note stored on the record.

| Celery task | Interval | What it does |
| --- | --- | --- |
| alert-rules-evaluate-all | Every 3 min | Runs the full rule set for your org |
| alert-rules-auto-resolve | Every 10 min | Resolves timed-out alerts |
| alert-rules-unsuppress-expired | Every 5 min | Lifts expired suppressions |
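
The auto-resolve sweep boils down to a time comparison per open alert. The sketch below shows that comparison in Python under two assumptions that the page does not state: that the timer starts when the alert fires, and that the task only acts when it runs. `auto_resolve_due` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def auto_resolve_due(
    fired_at: datetime,
    auto_resolve_after_seconds: Optional[int],
    now: datetime,
) -> bool:
    """Return True when an alert has been open long enough to auto-resolve.

    Rules without auto_resolve_after_seconds never auto-resolve.
    """
    if auto_resolve_after_seconds is None:
        return False
    return now - fired_at >= timedelta(seconds=auto_resolve_after_seconds)


if __name__ == "__main__":
    fired = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
    print(auto_resolve_due(fired, 1800, fired + timedelta(minutes=31)))
```

Under those assumptions, an alert with auto_resolve_after_seconds set to 1800 becomes eligible 30 minutes after firing, but it may stay open for up to roughly 40 minutes because the sweep runs only every 10 minutes.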

These tasks run on the default Celery queue. If the worker container is down, no evaluation happens until it recovers - there is no fallback evaluator.

You can also trigger evaluation manually: POST /api/v1/alert-rules/evaluate.


All endpoints are under the prefix /api/v1/alert-rules. Fine-grained permission scopes are used throughout.

| Method | Path | Purpose | Permission |
| --- | --- | --- | --- |
| GET | /api/v1/alert-rules/rules | List rules (status?, type?, site_id?) | alert:read |
| POST | /api/v1/alert-rules/rules | Create rule (verifies scope_ids) | alert:create |
| GET | /api/v1/alert-rules/rules/{rule_id} | Get one rule | alert:read |
| PATCH | /api/v1/alert-rules/rules/{rule_id} | Update rule (re-verifies scope_ids) | alert:update |
| DELETE | /api/v1/alert-rules/rules/{rule_id} | Soft-delete rule | alert:delete |
| GET | /api/v1/alert-rules/stats | Rule + alert statistics (site_id?) | alert:read |

| Method | Path | Purpose | Permission |
| --- | --- | --- | --- |
| GET | /api/v1/alert-rules/alerts | List alerts (status?, severity?, rule_id?, site_id?, limit ≤ 200) | alert:read |
| GET | /api/v1/alert-rules/alerts/{alert_id} | Get one alert | alert:read |
| POST | /api/v1/alert-rules/alerts/{alert_id}/acknowledge | Acknowledge with optional note | alert:update |
| POST | /api/v1/alert-rules/alerts/{alert_id}/resolve | Resolve alert | alert:update |
| POST | /api/v1/alert-rules/alerts/{alert_id}/suppress | Suppress for N minutes | alert:update |
| POST | /api/v1/alert-rules/evaluate | Manually evaluate all rules now | alert:update |

  1. Open Alert Rules in the sidebar (route /alert-rules).
  2. Click New rule.
  3. Set name, event_type, and severity.
  4. Choose a scope. For a site-scoped rule, select one or more sites from the picker - their UUIDs populate scope_ids.
  5. Optionally set auto_resolve_after_seconds if the alert should self-clear.
  6. Under Channels, tick the delivery channels you want (email, Slack, etc.). Each must have a configured provider - see Notification providers below.
  7. Save. The rule is active immediately; the next evaluator cycle (within 3 minutes) will pick it up.

To test without waiting, call:

POST /api/v1/alert-rules/evaluate
Authorization: Bearer <token>

Then check /api/v1/alert-rules/alerts for any newly fired alerts.
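
The same two calls can be scripted. The Python sketch below only builds the requests with the standard library; the base URL, the token, and the helper names are placeholders, and the filter names are taken from the alerts endpoint's documented query parameters.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_PREFIX = "/api/v1/alert-rules"


def evaluate_request(base_url: str, token: str) -> Request:
    """Build the POST that triggers an immediate evaluation."""
    return Request(
        f"{base_url}{API_PREFIX}/evaluate",
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )


def list_alerts_request(base_url: str, token: str, **filters) -> Request:
    """Build the GET that lists alerts, with optional query filters."""
    query = f"?{urlencode(filters)}" if filters else ""
    return Request(
        f"{base_url}{API_PREFIX}/alerts{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


if __name__ == "__main__":
    # Placeholder host and token; substitute your own before sending.
    req = list_alerts_request("https://freesdn.example.com", "TOKEN", severity="critical")
    print(req.get_method(), req.full_url)
```

Pass either request to urllib.request.urlopen to send it. The token needs the alert:update scope for the evaluate call and alert:read for the listing.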


A notification provider is a named, stored delivery configuration for one channel. You can have multiple providers for the same channel (e.g. two Slack webhooks, one per team).

| Channel | Provider type | Auth model |
| --- | --- | --- |
| Email | smtp | SMTP server + credentials |
| Slack | slack_webhook | Incoming webhook URL |
| Microsoft Teams | teams_webhook | Incoming webhook URL |
| Webhook (generic) | generic_webhook | URL + optional HMAC secret + custom headers |
| In-app | built-in | No external config needed |
| SMS | twilio_sms | Twilio Account SID + Auth Token + from-number |
| WhatsApp | twilio_whatsapp | Twilio Account SID + Auth Token + from-number |

Fetch the full config schema for any provider type - including required fields and validation rules - from:

GET /api/v1/notifications/providers/types

This returns {type, name, channel, icon, config_schema} for each supported type.

  • Provider config blobs are capped at 256 KiB per record.
  • Display names reject CR, LF, and other control characters (header-injection defense).
  • API responses return a redacted config_summary, never raw credentials.
  • Generic webhook HMAC secrets are stored encrypted; the HMAC is computed server-side on dispatch.

Provider management requires ORG_ADMIN or SUPER_ADMIN.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/notifications/providers | List providers (channel?, enabled_only?) |
| GET | /api/v1/notifications/providers/types | Supported types with config schemas |
| POST | /api/v1/notifications/providers | Create provider |
| GET | /api/v1/notifications/providers/{provider_id} | Get provider (config summary only) |
| PUT | /api/v1/notifications/providers/{provider_id} | Update provider |
| DELETE | /api/v1/notifications/providers/{provider_id} | Delete provider |
| POST | /api/v1/notifications/providers/{provider_id}/verify | Test stored provider connectivity |
| POST | /api/v1/notifications/providers/{provider_id}/test | Send a test message (test_email query param required) |

To add an SMTP email provider:

  1. Navigate to Notification Providers (/notification-providers).
  2. Click Add provider and choose SMTP Email.
  3. Fill in host, port, username, password, TLS settings, and a from_email.
  4. Save, then click Verify to confirm connectivity (sends no email).
  5. Click Test and supply a test_email address to send a real test message.

To add a Slack provider:

  1. In your Slack workspace, create an Incoming Webhook app and copy the webhook URL.
  2. Add a provider with type slack_webhook and paste the URL.
  3. Verify connectivity, then optionally send a test.

A generic webhook provider looks like this:

{
  "type": "generic_webhook",
  "name": "PagerDuty ingest",
  "config": {
    "url": "https://events.pagerduty.com/v2/enqueue",
    "method": "POST",
    "headers": {"Content-Type": "application/json"},
    "hmac_secret": "your-secret-here"
  }
}

When hmac_secret is set, FreeSDN computes HMAC-SHA256(secret, body) and attaches it as X-FreeSDN-Signature on every outbound request. The receiving end can verify it to confirm the call originated from your FreeSDN instance.
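
On the receiving side, verification could look like the Python sketch below. The page does not say how the signature is encoded, so this assumes a lowercase hex digest with no prefix in X-FreeSDN-Signature; adjust the comparison if your instance sends a different encoding. The function names are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: str, body: bytes) -> str:
    """Compute HMAC-SHA256 over the raw request body as a hex string."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def signature_is_valid(secret: str, body: bytes, header_value: str) -> bool:
    """Check a received signature header against the raw request body.

    Assumes the header carries a bare hex digest (an assumption, not a
    documented format).
    """
    candidate = header_value.strip().lower()
    if not candidate.isascii():
        return False  # compare_digest rejects non-ASCII strings
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sign_body(secret, body), candidate)
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, since a change in whitespace or key order would alter the digest.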

POST /api/v1/notifications/send
Authorization: Bearer <token>
Content-Type: application/json

{
  "channel": "slack",
  "recipient": "#alerts",
  "title": "Device offline",
  "body": "Switch sw-01 at Site A stopped responding."
}

Template-based send:

POST /api/v1/notifications/send/template

Both endpoints require ORG_ADMIN or SUPER_ADMIN.


When an alert rule fires, the dispatch call fans out across all configured channels concurrently via asyncio.gather. Each channel is independent: a failure on Slack does not block email delivery.
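
The isolation between channels can be illustrated with a small, self-contained Python sketch. It is not FreeSDN's dispatcher: the sender functions and the SENT/FAILED labels are invented for the example, and passing return_exceptions=True is an assumption about how failures are kept from propagating.

```python
import asyncio


async def send_email(alert: dict) -> str:
    """Stand-in for an email provider that succeeds."""
    return f"email sent: {alert['title']}"


async def send_slack(alert: dict) -> str:
    """Stand-in for a Slack provider that fails."""
    raise ConnectionError("slack webhook unreachable")


async def fan_out(alert: dict, senders: dict) -> dict:
    """Dispatch to every channel concurrently and report a status per channel."""
    names = list(senders)
    # return_exceptions=True turns a raised error into a result value, so
    # one failing channel does not cancel or mask the others.
    results = await asyncio.gather(
        *(senders[name](alert) for name in names),
        return_exceptions=True,
    )
    return {
        name: "FAILED" if isinstance(result, Exception) else "SENT"
        for name, result in zip(names, results)
    }


if __name__ == "__main__":
    senders = {"email": send_email, "slack": send_slack}
    print(asyncio.run(fan_out({"title": "Device offline"}, senders)))
```

In this sketch the Slack sender raises, yet the email result is still reported, which mirrors the behavior described above.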

Per-user mute preferences apply before dispatch. If a user has muted a category, the dispatch logs SKIPPED for that user’s in-app channel rather than delivering.


In-app notifications are per-user and require no provider configuration. They appear in the bell icon in the top navigation.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/notifications/in-app | List notifications (paginated envelope with unread_count) |
| POST | /api/v1/notifications/in-app/{notification_id}/read | Mark one read |
| POST | /api/v1/notifications/in-app/read-all | Mark all read |
| GET | /api/v1/notifications/in-app/unread-count | Badge count for the bell |
| POST | /api/v1/notifications/in-app/mark | Bulk mark read or dismiss |

The list response returns {items, total, limit, offset, unread_count}. Pass unread_only=true for only unseen notifications, or include_dismissed=true to include archived ones.
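
A client can walk the envelope page by page. The Python sketch below assumes that limit and offset are also accepted as request parameters, which the envelope suggests but the page does not state. `fetch_page` is a hypothetical callable that performs the GET and returns the decoded JSON.

```python
from typing import Callable, Iterator


def iter_notifications(
    fetch_page: Callable[[int, int], dict],
    limit: int = 50,
) -> Iterator[dict]:
    """Yield every notification by following the items/total envelope.

    `fetch_page(limit, offset)` must return a dict with at least the
    `items` and `total` keys of the list response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop on an empty page or once every item has been seen.
        if not items or offset >= page["total"]:
            break
```

Stopping on an empty page as well as on the total guards against looping forever if notifications are dismissed while the client is paging.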


Each user can control which channels they receive on and set quiet hours.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/notifications/preferences | Get current preferences (defaults: all channels enabled) |
| PUT | /api/v1/notifications/preferences | Update channels, quiet hours, category settings |
| PATCH | /api/v1/notifications/preferences/mute | Mute or snooze a category (expires_at=null = permanent) |
| DELETE | /api/v1/notifications/preferences/mute/{category} | Unmute (returns 404 if not muted) |
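
The expires_at rule for mutes reduces to one comparison. The Python sketch below shows it for a category that already has a mute entry; `mute_is_active` is a hypothetical helper, and treating the expiry instant itself as no longer muted is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def mute_is_active(expires_at: Optional[datetime], now: datetime) -> bool:
    """Return True while an existing mute entry still applies.

    A missing expires_at (null in the API) means the mute is permanent.
    """
    if expires_at is None:
        return True
    return now < expires_at


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(mute_is_active(now + timedelta(hours=2), now))  # snoozed for 2 hours
```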

Users access these settings from Settings → Notifications.


{
  "name": "Device offline",
  "rule_type": "pattern",
  "conditions": {"event_type": "device.offline"},
  "scope": "organization",
  "scope_ids": [],
  "severity": "critical",
  "auto_resolve_after_seconds": 1800,
  "notification_channels": {
    "email": {"to": ["ops@example.com"]},
    "slack": {"channel": "#alerts"}
  }
}

Alert on SLA breach for two specific sites

{
  "name": "SLA breach - Production sites",
  "rule_type": "pattern",
  "conditions": {
    "event_type": "sla.breach.created"
  },
  "scope": "site",
  "scope_ids": ["site-uuid-a", "site-uuid-b"],
  "severity": "critical",
  "notification_channels": {
    "email": {"to": ["ops@example.com"]},
    "teams": {"webhook_url": "https://teams.microsoft.com/l/..."},
    "in_app": {"user_ids": ["user-uuid-1", "user-uuid-2"]}
  }
}

Suppress noisy alerts during a maintenance window

POST /api/v1/alert-rules/alerts/{alert_id}/suppress
Content-Type: application/json

{
  "suppress_minutes": 120,
  "reason": "Scheduled maintenance window 02:00-04:00 UTC"
}

| Action | Required permission |
| --- | --- |
| Read rules and alerts | alert:read |
| Create a rule | alert:create |
| Update a rule or alert lifecycle action | alert:update |
| Delete a rule | alert:delete |
| Manage notification providers | ORG_ADMIN or SUPER_ADMIN |
| Send notifications programmatically | ORG_ADMIN or SUPER_ADMIN |

Role assignment follows the 7-tier ladder (super_admin → guest). You cannot assign a role at or above your own level. See Enterprise overview for the full role table.


Alerts are not firing

  • Check that the worker container is running and the default queue is being consumed.
  • Verify the rule’s status is active - rules with status: disabled or status: draft are skipped by the evaluator.
  • Call POST /api/v1/alert-rules/evaluate to force an immediate evaluation and watch the response for any error details.
  • Check Flower (if the monitoring profile is active) at port 5555 to confirm the evaluate_all_alert_rules task is completing without errors.

Notifications are not being delivered

  • Open the provider record and click Verify to confirm connectivity.
  • Send a Test message and review the response body for the provider error.
  • Check whether the user has muted the relevant category in their preferences.
  • For email: confirm SMTP port, TLS mode, and that the from_email is accepted by the relay.
  • For webhooks: confirm the endpoint is reachable from the FreeSDN API container (not just your browser).

POST /api/v1/alert-rules/evaluate returns 403

The endpoint requires alert:update, not alert:create. Verify your API key or JWT role includes that scope.

Suppress is not lifting automatically

The alert-rules-unsuppress-expired task runs every 5 minutes. If it is overdue, check the Celery worker logs for failures or queue backlog.


  • SLA monitoring - define policies, review breaches, and generate compliance reports
  • Event correlation - group related alerts into managed incidents with assignment
  • Audit log - query the tamper-evident log for alert and notification history
  • Enterprise overview - feature matrix, caveats, and role ladder

All product names, logos, and brands are property of their respective owners. FreeSDN is an independent project and is not affiliated with or endorsed by the vendors it integrates with. See Trademarks.