Observability

The Observability module (module id collector, v1.0.0) runs three passive asyncio UDP listeners - SNMP traps, Syslog, and NetFlow - inside the api container and persists everything to TimescaleDB (the logdb). Nothing polls your devices. Devices push telemetry to FreeSDN.

The REST API mounts at /api/v1/collector/. The UI lives at Observability and Observability → Log Explorer in the sidebar.

| Protocol | Standard | Default port | What gets stored |
| --- | --- | --- | --- |
| SNMP traps | SNMPv1 + v2c | UDP 162 | Source IP, enterprise OID, trap type (generic / v2c), varbinds (list of OID + value); community string is embedded in the message field only, not stored as a separate column; severity is not populated for SNMP traps (syslog only) |
| Syslog | RFC 3164 + RFC 5424 | UDP 514 | Facility, severity, hostname, app name, message, raw line (capped at 2 000 chars) |
| NetFlow | v5 + v9 | UDP 2055 | 5-tuple (src IP / dst IP / src port / dst port / protocol), bytes in/out, packets, 1-minute bucket time, DPI app name + category |

All records carry the organization_id of your instance (multi-tenant safe). Incoming packets whose source IP resolves to a known managed device are linked to that device_id automatically (60-second TTL cache). Packets from unrecognized IPs are silently dropped and never enter the database. Each receiver counts these drops in an internal _dropped_unknown_source counter, which is not exposed in the status API. To check whether an unknown source is the cause of missing data, add the device to FreeSDN under its correct IP and watch for data to appear.

| Code | Required for |
| --- | --- |
| collector.logs.read | Search and view SNMP and syslog log entries; used by the Log Explorer |
| collector.flows.read | View NetFlow records and traffic analytics (top talkers, protocol breakdown) |
| collector.config | Enable/disable receivers, set ports, update retention and source allow-list |

viewer and above can hold read permissions. Config changes require the collector.config permission. org_admin and super_admin users have it implicitly; lower roles (including site_admin) must be granted it explicitly via module settings.

All configuration is per-organization. Changes are applied live via hot-reload - no container restart required.

Before you enable any receiver, decide which IP ranges send telemetry to FreeSDN. You will set allowed_source_ips to a list of CIDRs in the config body below. An empty list blocks everything.

Requires collector.config permission.

PUT /api/v1/collector/config
Content-Type: application/json

{
  "snmp_enabled": true,
  "snmp_port": 162,
  "snmp_community": "public",
  "syslog_enabled": true,
  "syslog_port": 514,
  "netflow_enabled": true,
  "netflow_port": 2055,
  "log_retention_days": 30,
  "flow_retention_days": 7,
  "allowed_source_ips": ["10.0.0.0/8", "192.168.1.0/24"]
}

All fields are optional in the request body - omit any you do not want to change. Allowed mutable fields and their defaults:

| Field | Type | Default | Notes |
| --- | --- | --- | --- |
| snmp_enabled | bool | false | Enable SNMP trap listener |
| snmp_port | int | 162 | Bind port |
| snmp_community | string | "public" | Accepted community string for v1/v2c |
| syslog_enabled | bool | false | Enable syslog listener |
| syslog_port | int | 514 | Bind port |
| netflow_enabled | bool | false | Enable NetFlow listener |
| netflow_port | int | 2055 | Bind port |
| log_retention_days | int | 30 | Days to retain SNMP + syslog records (not validated server-side; use a positive integer) |
| flow_retention_days | int | 7 | Days to retain NetFlow records (not validated server-side; use a positive integer) |
| allowed_source_ips | list of CIDRs | [] | Empty = block all. List every range that sends telemetry |

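
Because every field is optional, a client only needs to send the keys it wants to change. The following is a minimal sketch of building such a partial body on the client side. The helper name build_config_patch is hypothetical, and the client-side checks (positive retention values, well-formed CIDRs) are an assumption added here because retention is not validated server-side; none of this is part of FreeSDN itself.

```python
import ipaddress
import json

# Mutable fields taken from the table above.
ALLOWED_FIELDS = {
    "snmp_enabled", "snmp_port", "snmp_community",
    "syslog_enabled", "syslog_port",
    "netflow_enabled", "netflow_port",
    "log_retention_days", "flow_retention_days",
    "allowed_source_ips",
}


def build_config_patch(**changes):
    """Return a JSON body containing only the fields being changed (hypothetical helper)."""
    unknown = set(changes) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown config fields: {sorted(unknown)}")
    # Retention is not validated server-side, so check it before sending.
    for key in ("log_retention_days", "flow_retention_days"):
        if key in changes:
            value = changes[key]
            if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
                raise ValueError(f"{key} must be a positive integer")
    # ip_network raises ValueError for malformed CIDRs or CIDRs with host bits set.
    for cidr in changes.get("allowed_source_ips", []):
        ipaddress.ip_network(cidr)
    return json.dumps(changes, sort_keys=True)


body = build_config_patch(syslog_enabled=True, allowed_source_ips=["10.0.0.0/8"])
print(body)
```

The resulting string can be sent as the body of PUT /api/v1/collector/config with whatever HTTP client and authentication your deployment uses.
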
GET /api/v1/collector/status

The response reports each service: running, port, rejected (allowlist drops), and for NetFlow, dropped (queue overflow).

4. (Optional) Read back the current config

GET /api/v1/collector/config

Returns the stored config, or defaults if none has been saved yet.

The UDP listeners bind inside the api container. Caddy handles TLS/TCP only - it does not proxy UDP. You must expose the collector ports directly on the api service in your Compose file.

services:
  api:
    ports:
      - "162:162/udp"
      - "514:514/udp"
      - "2055:2055/udp"

If you run behind a Docker NAT or Docker Desktop, add the same entries to your docker-compose.override.yml.

How source-IP filtering and multi-tenancy work

When a UDP packet arrives, the module applies two checks before any parsing:

  1. Allowlist check - the source IP must fall within one of the CIDRs in allowed_source_ips. Packets outside the list are counted as rejected and discarded.
  2. Tenant resolution - the source IP is matched against Device.ip_address in the database (60-second cache, up to 4 096 entries). If no managed device matches, the packet is counted as _dropped_unknown_source and discarded. This means only packets from devices you have already added to FreeSDN are persisted.

Both checks must pass. A packet from a known CIDR but an unknown device address is still dropped. Add the device to FreeSDN under its correct IP before expecting its telemetry to appear.
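
The two checks above can be sketched in a few lines of Python. This is an illustration of the decision order only, not the module's actual code: the function name classify_packet and the plain dict standing in for the Device.ip_address lookup are assumptions, and the 60-second cache is left out.

```python
import ipaddress


def classify_packet(source_ip, allowed_cidrs, device_ips):
    """Return 'rejected', 'dropped_unknown_source', or 'accepted'.

    device_ips is a hypothetical stand-in for the Device.ip_address lookup:
    a dict mapping IP address strings to device ids.
    """
    addr = ipaddress.ip_address(source_ip)
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    # Check 1: allowlist. An empty list matches nothing, so it blocks everything.
    if not any(addr in net for net in networks):
        return "rejected"
    # Check 2: tenant resolution against known managed devices.
    if source_ip not in device_ips:
        return "dropped_unknown_source"
    return "accepted"


devices = {"10.1.2.3": "device-1"}
print(classify_packet("10.1.2.3", ["10.0.0.0/8"], devices))
print(classify_packet("10.9.9.9", ["10.0.0.0/8"], devices))
print(classify_packet("172.16.0.1", ["10.0.0.0/8"], devices))
```

The second call shows the case that is easy to miss: the address is inside an allowed CIDR, so the rejected counter does not move, but the packet is still discarded because no managed device has that IP.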

| Data type | Default | Setting key | Maximum |
| --- | --- | --- | --- |
| SNMP + syslog entries | 30 days | log_retention_days | No enforced limit |
| NetFlow flow records | 7 days | flow_retention_days | No enforced limit |

Pruning runs as part of the Celery scheduler service. TimescaleDB handles chunk compression and automatic retention policies to reduce on-disk size for high-volume deployments.

Observability → Log Explorer (GET /api/v1/collector/logs) gives you a filterable, paginated view of collected SNMP trap and syslog entries. The detail drawer shows the full raw_data, enterprise_oid, trap_type, and varbinds for SNMP entries.

Query parameters:

| Parameter | Values | Description |
| --- | --- | --- |
| source_type | snmp_trap, syslog | Filter to one receiver type |
| severity | emergency, alert, critical, error, warning, notice, informational, debug | Syslog severity level |
| device_id | UUID | Filter to a specific managed device |
| start_time / end_time | ISO-8601 | Time window (defaults to the last 1 hour) |
| q | string | Full-text search in the message field (SQL ILIKE) |
| page | int ≥ 1 | Pagination page |
| size | 1-500 | Results per page |

Example - find critical syslog messages in an explicit one-hour window:

GET /api/v1/collector/logs?source_type=syslog&severity=critical&start_time=2026-06-06T00:00:00Z&end_time=2026-06-06T01:00:00Z
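
A query string like the one above can be generated rather than typed by hand. The sketch below builds the same URL from a timezone-aware end time; the helper name build_log_query and the host freesdn.example are placeholders, not part of the product.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def build_log_query(base_url, end, hours=1, **filters):
    """Build a log-search URL covering the `hours` before `end` (hypothetical helper)."""
    end_utc = end.astimezone(timezone.utc)
    start_utc = end_utc - timedelta(hours=hours)
    params = dict(filters)  # e.g. source_type, severity, device_id, q, page, size
    params["start_time"] = start_utc.strftime(TIME_FORMAT)
    params["end_time"] = end_utc.strftime(TIME_FORMAT)
    # safe=":" keeps the timestamps readable instead of percent-encoding colons.
    return f"{base_url}/api/v1/collector/logs?{urlencode(params, safe=':')}"


url = build_log_query(
    "https://freesdn.example",  # placeholder host
    datetime(2026, 6, 6, 1, 0, tzinfo=timezone.utc),
    source_type="syslog",
    severity="critical",
)
print(url)
```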

Returns counts broken down by severity, source type, and top sending IPs:

GET /api/v1/collector/logs/stats?hours=24

hours accepts 1-168 (1 week). Default is 24.

GET /api/v1/collector/logs/{log_id}

Returns the complete record including raw_data (the original unparsed line, capped at 2 000 chars) and all decoded fields.

GET /api/v1/collector/flows

Accepted parameters: device_id, source_ip, dest_ip, protocol (IP protocol number as integer), start_time, end_time, page, size (1-500).

GET /api/v1/collector/flows/top-talkers?hours=1&limit=10&sort_by=bytes&site_id=<uuid>

Returns the top-N source IPs ranked by bytes or packet count. Parameters:

| Parameter | Values | Default |
| --- | --- | --- |
| hours | 1-168 | 1 |
| limit | 1-50 | 10 |
| sort_by | bytes, packets | bytes |
| site_id | UUID (optional) | Aggregates across whole org if omitted |

site_id is validated against your site grants - a site you cannot access returns 403.
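
To make the ranking concrete, here is a small sketch of what "top-N source IPs by bytes or packets" means when applied to a list of flow records. It runs on in-memory dicts; the field names source_ip, bytes, and packets are assumptions for illustration, and the real endpoint performs its aggregation server-side.

```python
from collections import defaultdict


def top_talkers(flows, limit=10, sort_by="bytes"):
    """Rank source IPs by total bytes or packets (illustrative only)."""
    if sort_by not in ("bytes", "packets"):
        raise ValueError("sort_by must be 'bytes' or 'packets'")
    totals = defaultdict(lambda: {"bytes": 0, "packets": 0})
    for flow in flows:
        entry = totals[flow["source_ip"]]
        entry["bytes"] += flow["bytes"]
        entry["packets"] += flow["packets"]
    ranked = sorted(totals.items(), key=lambda item: item[1][sort_by], reverse=True)
    return [(ip, counters[sort_by]) for ip, counters in ranked[:limit]]


sample = [
    {"source_ip": "10.0.0.5", "bytes": 1200, "packets": 10},
    {"source_ip": "10.0.0.9", "bytes": 900, "packets": 40},
    {"source_ip": "10.0.0.5", "bytes": 300, "packets": 5},
]
print(top_talkers(sample, limit=2, sort_by="bytes"))
print(top_talkers(sample, limit=1, sort_by="packets"))
```

Note that the two sort orders can disagree: a host sending many small packets may lead by packet count while ranking lower by bytes.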

GET /api/v1/collector/flows/protocol-breakdown?hours=24

Returns traffic volume grouped by IP protocol number with human-readable names for common protocols (TCP, UDP, ICMP, GRE, ESP, OSPF). hours accepts 1-168.

Every NetFlow record is classified before storage using a built-in O(1) lookup table of 60+ rules (protocol + destination port → application name + category). The classifier covers common infrastructure protocols (DNS, SNMP, NTP, BGP, RADIUS, LDAP, Kerberos), network services (HTTP, HTTPS, FTP, SSH, SMB), databases (MySQL, PostgreSQL, Redis), VPNs (OpenVPN, WireGuard), VoIP (SIP, RTP, RTSP), and popular applications (Steam, Xbox, Zoom, MQTT, CoAP, among others).

Application categories: web, streaming, conferencing, email, file_transfer, vpn_tunnel, dns, database, gaming, social, infrastructure, security, iot, voip, other.
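
A constant-time classifier of this kind can be pictured as a dictionary keyed by (protocol, destination port). The sketch below is an assumption about the shape of such a table, not the shipped rule set: the five entries, their category assignments, and the ("unknown", "other") fallback are all illustrative.

```python
# (IP protocol number, destination port) -> (application name, category)
# Illustrative entries only; the real built-in table has 60+ rules.
BUILTIN_RULES = {
    (17, 53): ("DNS", "dns"),
    (6, 443): ("HTTPS", "web"),
    (6, 22): ("SSH", "infrastructure"),
    (6, 5432): ("PostgreSQL", "database"),
    (17, 51820): ("WireGuard", "vpn_tunnel"),
}


def classify_flow(protocol, dst_port):
    """Look up a flow in a single dict access; fall back to an assumed default."""
    return BUILTIN_RULES.get((protocol, dst_port), ("unknown", "other"))


print(classify_flow(17, 53))
print(classify_flow(6, 5432))
print(classify_flow(6, 65000))
```

Keying on the tuple keeps the lookup independent of the number of rules, which matters when every flow record is classified before storage.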

You can add custom classification rules via the database (ApplicationClassificationRule table in the collector schema). Custom rules are applied by priority and can override built-in entries. Port ranges are supported up to 1 000 ports per rule.

Classified app_name and app_category are stored on each FlowRecord row but are not returned by the flow-search (GET /api/v1/collector/flows), top-talkers, or protocol-breakdown endpoints. DPI analytics are exposed through a dedicated set of endpoints under /api/v1/dpi/ (summary, app-breakdown, app-trends, per-client usage) that group and aggregate the stored fields directly.

The AI Assistant can query Observability data directly when both modules are enabled. Two tools are available in the chat:

| Tool | Required permission |
| --- | --- |
| search_collector_logs | collector.logs.read |
| get_top_talkers | collector.flows.read |

The AI Assistant is globally off by default and requires a separate opt-in. See AI Assistant for setup.

All endpoints require authentication. Org isolation is enforced in the service layer - you only see your organization’s data. Source status and config endpoints require collector.config; log and flow queries require their respective read permission.

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /api/v1/collector/config | Read current per-org collector config (returns defaults if none saved) |
| PUT | /api/v1/collector/config | Update config + hot-reload listeners |
| GET | /api/v1/collector/status | Per-service running status, port, rejected + dropped counts |
| GET | /api/v1/collector/logs | Search/filter log entries (SNMP + syslog) |
| GET | /api/v1/collector/logs/stats | Aggregate counts by severity / source type / IP |
| GET | /api/v1/collector/logs/{log_id} | Full log detail including raw data |
| GET | /api/v1/collector/flows | Search/filter NetFlow records |
| GET | /api/v1/collector/flows/top-talkers | Top-N source IPs by bytes or packets |
| GET | /api/v1/collector/flows/protocol-breakdown | Traffic by IP protocol, named |

The SNMP trap receiver is a pure-Python BER decoder with no dependency on net-snmp. It handles:

  • SNMPv1 trap PDU (0xA4) - skips agent-address, generic-trap, specific-trap, timestamp fields, then reads varbinds.
  • SNMPv2c trap PDU (0xA7) - skips request-id, error-status, error-index, then reads varbinds.

BER tag types decoded: INTEGER, OCTET_STRING, NULL, OID, SEQUENCE. Each trap is stored with the enterprise OID, trap type (generic or v2c), and varbinds as JSONB. The community string is embedded in the message field rather than stored as a separate column.
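
For readers unfamiliar with BER, the sketch below shows how the five tag types listed above can be decoded from a tag-length-value stream. It is a generic illustration of the encoding, not the receiver's source code; the function names are hypothetical, and the OID decoder assumes a first arc of 0 or 1, which covers the usual 1.3.6.1 prefix.

```python
def read_tlv(data, offset=0):
    """Return (tag, value_bytes, next_offset) for the TLV starting at offset."""
    tag = data[offset]
    length = data[offset + 1]
    offset += 2
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        n = length & 0x7F
        length = int.from_bytes(data[offset:offset + n], "big")
        offset += n
    end = offset + length
    if end > len(data):
        raise ValueError("truncated BER value")
    return tag, data[offset:end], end


def decode_oid(value):
    """Decode an OID; the first byte packs the first two arcs as 40 * x + y."""
    parts = [value[0] // 40, value[0] % 40]
    sub = 0
    for byte in value[1:]:
        sub = (sub << 7) | (byte & 0x7F)
        if not byte & 0x80:  # high bit clear marks the last byte of a sub-identifier
            parts.append(sub)
            sub = 0
    return ".".join(str(p) for p in parts)


def decode_value(tag, value):
    if tag == 0x02:  # INTEGER
        return int.from_bytes(value, "big", signed=True)
    if tag == 0x04:  # OCTET STRING
        return bytes(value)
    if tag == 0x05:  # NULL
        return None
    if tag == 0x06:  # OID
        return decode_oid(value)
    if tag == 0x30:  # SEQUENCE: decode each child in order
        items, offset = [], 0
        while offset < len(value):
            child_tag, child_value, offset = read_tlv(value, offset)
            items.append(decode_value(child_tag, child_value))
        return items
    raise ValueError(f"unsupported BER tag 0x{tag:02X}")


# One varbind: SEQUENCE { OID 1.3.6.1.4.1.9, INTEGER 5 }
varbind = bytes.fromhex("300b06062b0601040109020105")
tag, value, _ = read_tlv(varbind)
print(decode_value(tag, value))
```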

The syslog parser tries RFC 5424 first, then RFC 3164. Lines that match neither format are stored verbatim as the message field. Facility and severity are decoded from the PRI byte (facility = pri >> 3, severity = pri & 0x07). The raw line is capped at 2 000 characters.
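
The PRI arithmetic is easy to check by hand. Below is a minimal sketch that extracts the PRI value from the start of a line and applies the two formulas; the function name decode_pri and the regular expression are illustrative, and the severity names follow the order used by the severity filter in the Log Explorer.

```python
import re

# Index in this list equals the numeric syslog severity (0-7).
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(line):
    """Return (facility, severity_name), or None if the line has no valid PRI."""
    match = PRI_RE.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # 23 * 8 + 7 is the largest valid PRI value
        return None
    facility = pri >> 3
    severity = pri & 0x07
    return facility, SEVERITIES[severity]


print(decode_pri("<34>Oct 11 22:14:15 mymachine su: 'su root' failed"))
print(decode_pri("no priority here"))
```

For PRI 34 the result is facility 4 with severity critical, since 34 = 4 * 8 + 2.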

  • NetFlow v5: fixed 48-byte records per flow, maximum 30 records per packet.
  • NetFlow v9: template-based. Each source IP has its own template cache (up to 256 templates) with LRU eviction. A maximum of 1 024 distinct source IPs are tracked simultaneously, preventing one chatty exporter from evicting another source’s templates.

Flows buffer in memory (up to 100 000 pending), then flush to TimescaleDB every 60 seconds. Queue overflow increments the dropped counter visible in GET /api/v1/collector/status. A final flush runs on shutdown.
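
The benefit of a per-source template cache is easiest to see in code. The sketch below is one possible way to structure it with nested OrderedDicts; the class name TemplateCache is hypothetical, and evicting the least recently used source once the source limit is reached is an assumption, since only the limit itself is documented.

```python
from collections import OrderedDict


class TemplateCache:
    """Per-source LRU cache of NetFlow v9 templates (illustrative sketch)."""

    def __init__(self, max_sources=1024, max_templates=256):
        self.max_sources = max_sources
        self.max_templates = max_templates
        self._sources = OrderedDict()  # source_ip -> OrderedDict(template_id -> fields)

    def put(self, source_ip, template_id, fields):
        templates = self._sources.get(source_ip)
        if templates is None:
            if len(self._sources) >= self.max_sources:
                # Assumption: drop the least recently used source when full.
                self._sources.popitem(last=False)
            templates = self._sources[source_ip] = OrderedDict()
        self._sources.move_to_end(source_ip)
        templates[template_id] = fields
        templates.move_to_end(template_id)
        if len(templates) > self.max_templates:
            # Eviction only ever touches this source's own templates.
            templates.popitem(last=False)

    def get(self, source_ip, template_id):
        templates = self._sources.get(source_ip)
        if templates is None or template_id not in templates:
            return None
        self._sources.move_to_end(source_ip)
        templates.move_to_end(template_id)
        return templates[template_id]


cache = TemplateCache(max_sources=2, max_templates=2)
cache.put("10.0.0.2", 256, ["quiet-exporter-template"])
for template_id in (256, 257, 258):  # a chatty exporter overflows its own limit
    cache.put("10.0.0.1", template_id, [f"template-{template_id}"])
print(cache.get("10.0.0.1", 256))
print(cache.get("10.0.0.2", 256))
```

The chatty exporter loses its own oldest template, while the quiet exporter's template survives because the two sources never share an eviction queue.
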

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| No data in Log Explorer or flow analytics | Receiver not enabled OR allowed_source_ips is empty | Enable the receiver in config; add your network CIDRs to allowed_source_ips |
| Status shows listener not running on port 162 or 514 | Port < 1024 privilege failure | Add CAP_NET_BIND_SERVICE to the api container or move to a port > 1024 |
| Data arrives (source IP passes allowlist - rejected count stays flat) but nothing is stored | Source device not added to FreeSDN under its correct IP - the tenant resolver cannot map the source IP to a device_id | Add the device to FreeSDN under its correct IP address |
| rejected count rises in status | Source IP not in allowed_source_ips CIDR | Add the CIDR to the allow-list in config |
| NetFlow records stop after exporter restart | v9 templates not re-learned | Normal - the exporter must re-send templates; can take up to 60 seconds |
| syslog_port: 514 fails on Linux host (non-container) | Host rsyslog holds the port | Change syslog_port to 10514 (or any port > 1024) and reconfigure senders |

  • Deployment tiers and Compose profiles - understand the logdb (TimescaleDB) service that backs Observability storage and how to size it per tier.
  • AI Assistant - enable the chat interface to query logs and top-talkers conversationally.
  • Automation connections - wire collector data into automation rules (e.g. trigger an alert when log volume from a host spikes).
  • API overview - full OpenAPI spec location and authentication; enable ENABLE_DOCS=true in a non-production environment to browse Swagger.