AI Assistant

The AI Assistant module (id ai, v1.0.0) provides an agentic LLM chat interface that can query your FreeSDN network state, run diagnostics, inspect automation rules, and search observability data. It supports three LLM providers - OpenAI, Anthropic, and Ollama - accessed via httpx directly (no vendor SDKs). All provider calls, rate limits, and token budgets are fail-closed by design.

Read AI Governance for the full three-layer security model before enabling this module in production.

How the chat loop works

You send a message to POST /api/v1/ai/chat.
The service loads the last 50 messages of conversation history for context.
The message (and history) are dispatched to the configured provider with the 11 registered FreeSDN tools.
The model may invoke one or more tools. The loop runs at most 5 iterations - if the model still requests tools after the fifth iteration, a partial-findings answer is returned that names the tools that ran.
Each tool result is capped at 4,096 characters to prevent context-window blowout and runaway token costs.
Tool results in the prompt are wrapped in <tool_result name="..."> tags and the system prompt explicitly instructs the model to treat them as untrusted data, not instructions (prompt-injection defense).
The final answer is returned to the caller and persisted in the conversation.
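
The steps above can be sketched as a bounded loop. This is a minimal illustration, not the service's actual code: `model_step`, the reply shape, and the wording of the partial-findings answer are assumptions made for the example; only the iteration cap, the 4,096-character truncation, and the `<tool_result>` wrapping come from the description above.

```python
from typing import Callable

MAX_ITERATIONS = 5            # loop cap described above
MAX_TOOL_RESULT_CHARS = 4096  # per-result cap described above


def wrap_tool_result(name: str, result: str) -> str:
    """Truncate a tool result and wrap it in the untrusted-data tags."""
    truncated = result[:MAX_TOOL_RESULT_CHARS]
    return f'<tool_result name="{name}">{truncated}</tool_result>'


def run_chat_loop(model_step: Callable[[list], dict], tools: dict, messages: list) -> str:
    """Run the tool loop.

    model_step is a hypothetical callable that returns either
    {"answer": str} or {"tool_calls": [(tool_name, kwargs), ...]}.
    """
    tools_run = []
    for _ in range(MAX_ITERATIONS):
        reply = model_step(messages)
        if "answer" in reply:
            return reply["answer"]
        for name, kwargs in reply["tool_calls"]:
            result = tools[name](**kwargs)
            tools_run.append(name)
            messages.append({"role": "tool", "content": wrap_tool_result(name, result)})
    # Cap reached while the model still wants tools: return partial findings.
    return "Partial findings (iteration limit reached). Tools run: " + ", ".join(tools_run)
```

A model stub that always asks for another tool call exercises the cap: the loop stops after five iterations and the answer lists each tool that ran.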

Providers

Provider	Policy required	Base URL restriction
OpenAI	`cloud_approved`	Must be `api.openai.com` or `AI_PROVIDER_PROXY_HOSTS` override
Anthropic	`cloud_approved`	Must be `api.anthropic.com` or `AI_PROVIDER_PROXY_HOSTS` override
Ollama (self-hosted)	`local_only` or `cloud_approved`	Any host, except cloud-metadata endpoints (`169.254.169.254`, `fd00:ec2::254`, `metadata.google.internal`)

Cloud provider base URLs are restricted to a static allowlist. An org_admin cannot redirect an API key to an arbitrary host - this prevents an admin-level SSRF where the key is forwarded to an attacker-controlled endpoint. If you run a reverse proxy in front of OpenAI or Anthropic, add its hostname to AI_PROVIDER_PROXY_HOSTS (comma-separated) in the API container environment.

Ollama allows any host but blocks cloud-metadata endpoints, so a misconfigured OLLAMA_BASE_URL cannot be used to reach the instance metadata service.
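
As a rough sketch of the host checks described above (not the module's real validator), the rule can be written with the standard library alone. The function name and the `proxy_hosts` parameter are hypothetical; `proxy_hosts` stands in for the comma-separated value of AI_PROVIDER_PROXY_HOSTS.

```python
from urllib.parse import urlparse

CLOUD_HOSTS = {"openai": "api.openai.com", "anthropic": "api.anthropic.com"}
METADATA_HOSTS = {"169.254.169.254", "fd00:ec2::254", "metadata.google.internal"}


def is_base_url_allowed(provider: str, base_url: str, proxy_hosts: str = "") -> bool:
    """Return True if base_url is acceptable for the given provider."""
    host = (urlparse(base_url).hostname or "").lower()
    if not host:
        return False  # fail closed on URLs without a host
    if provider == "ollama":
        # Any host is allowed except cloud-metadata endpoints.
        return host not in METADATA_HOSTS
    if provider not in CLOUD_HOSTS:
        return False  # unknown provider: fail closed
    allowed = {CLOUD_HOSTS[provider]}
    allowed |= {h.strip().lower() for h in proxy_hosts.split(",") if h.strip()}
    return host in allowed
```

For example, `is_base_url_allowed("openai", "https://evil.example.com/v1")` is rejected, while the same call with `proxy_hosts="evil.example.com"` would pass, which is why the override belongs in the container environment rather than in anything an org_admin can edit.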

Three-layer governance

The AI module enforces a three-layer model. All three layers must allow a request for it to proceed.

Layer 1: Global kill-switch  (env var, super_admin, default OFF)
    │
    └── Layer 2: Per-org policy  (org_admin, DB, default "disabled")
            │
            └── Layer 3: Field selector  -  automation rules declare explicit
                         input_fields; wildcards are not permitted

Full details are at AI Governance. The short version:

Layer 1 - global kill-switch

# .env or API container environment
LLM_GLOBALLY_ENABLED=true

Default: false. When off, all /api/v1/ai/* endpoints return 503, automation rules with LLM action types are skipped (logged, not errored), and the AI settings section is hidden in the UI. This is the super_admin’s single override that trumps all per-org policy.

Layer 2 - per-org policy

Each organization independently sets one of three modes (default: disabled):

Policy	What it allows
`disabled` (default)	No LLM calls for this org
`local_only`	Ollama only - no data leaves the deployment
`cloud_approved`	Ollama + OpenAI + Anthropic using org-owned API keys
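
The first two layers can be pictured as two sequential checks. The sketch below is illustrative only: the function name, the way the env var is parsed, and the 403 returned for a `disabled` org are assumptions; the 503 for the global switch and the 403 for a cloud provider under `local_only` match the behavior documented on this page.

```python
import os

# Providers each policy permits, per the table above.
POLICY_PROVIDERS = {
    "disabled": set(),
    "local_only": {"ollama"},
    "cloud_approved": {"ollama", "openai", "anthropic"},
}


def check_gates(org_policy: str, provider: str, env=None) -> int:
    """Return an HTTP-style status: 200 allowed, 503 global switch off, 403 policy denies."""
    env = os.environ if env is None else env
    # Layer 1: global kill-switch, default off.
    if env.get("LLM_GLOBALLY_ENABLED", "false").lower() != "true":
        return 503
    # Layer 2: per-org policy; unknown policies permit nothing (fail closed).
    if provider not in POLICY_PROVIDERS.get(org_policy, set()):
        return 403
    return 200
```

Passing `env` explicitly keeps the check easy to test without touching the real process environment.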

Configure via Settings → AI Providers (requires ai.admin or org_admin):

PUT /api/v1/ai/providers/anthropic
Content-Type: application/json

{
  "api_key": "sk-ant-...",
  "default_model": "claude-haiku-4-5-20251001",
  "is_enabled": true,
  "llm_org_policy": "cloud_approved",
  "monthly_token_budget": 200000
}

API keys are Fernet-encrypted at rest. Responses return api_key_set: true/false - the key itself is never returned.

Layer 3 - field selector

Automation rules that trigger LLM actions must declare an explicit input_fields list. Wildcards are not permitted. Only the named fields from the trigger-event payload are forwarded to the model; the selected field names (not their values) are recorded in the audit log.
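
A field selector of this kind might look like the following sketch (Python 3.9+). The function names and the choice to raise `ValueError` are assumptions for illustration; the rule format used by the automation engine is documented in AI Governance.

```python
def select_fields(payload: dict, input_fields: list[str]) -> dict:
    """Return only the explicitly named fields from a trigger-event payload."""
    if not input_fields:
        raise ValueError("input_fields must be a non-empty explicit list")
    for name in input_fields:
        if "*" in name:
            raise ValueError(f"wildcards are not permitted: {name!r}")
    # Fields missing from the payload are skipped rather than invented.
    return {name: payload[name] for name in input_fields if name in payload}


def audit_field_names(selected: dict) -> list[str]:
    """Field names (never values) suitable for an audit record."""
    return sorted(selected)
```

With a payload of `{"message": "link down", "src_ip": "10.0.0.1", "password": "x"}` and `input_fields=["message"]`, only `message` reaches the model and only the name `message` is logged.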

Outbound PII redaction (additional protection, not a governance gate)

Before a message is sent to OpenAI or Anthropic, _redact_messages_for_cloud replaces:

IPv4 addresses → <IP>
MAC addresses → <MAC>
Password/secret/token assignments → <REDACTED>
JWT-shaped strings → <JWT>

This redaction applies only to the outgoing payload for cloud providers. The operator UI and Ollama (local) receive verbatim data. Redaction is regex-based and conservative - it reduces exposure but is not a guarantee that all PII is stripped. It is not one of the three governance gates: it runs unconditionally on every cloud call regardless of policy settings.
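
To make the idea concrete, here is a small regex-based redactor in the same spirit. It is not the module's `_redact_messages_for_cloud`: the patterns, the function names, and the decision to keep the key name in a secret assignment are all assumptions, and like any regex approach it will miss some inputs.

```python
import re

# (pattern, replacement) pairs, applied in order.
_PATTERNS = [
    (re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"), "<JWT>"),
    (re.compile(r"(?i)\b(password|secret|token)(\s*[=:]\s*)\S+"), r"\1\2<REDACTED>"),
    (re.compile(r"\b(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b"), "<MAC>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<IP>"),
]


def redact_text(text: str) -> str:
    """Apply every pattern to a single string."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


def redact_for_cloud(messages: list[dict]) -> list[dict]:
    """Return redacted copies; the original messages are left verbatim."""
    return [{**m, "content": redact_text(m["content"])} for m in messages]
```

Returning copies rather than mutating in place mirrors the behavior described above, where the operator UI keeps the verbatim text and only the outgoing cloud payload is altered.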

Token budgets and rate limiting

Monthly token budget

Each organization has a monthly token budget (default: 100,000 tokens). The service:

Estimates token usage over the full message list (history + new message) via tiktoken, with a 4,096-token buffer for the model response. The estimate deliberately errs high so that the check stays fail-closed.
Applies the budget check before calling the provider.
Uses an atomic Redis INCRBY (key llm:budget:{org}:{YYYY}:{MM}, 32-day TTL) with DB fallback for durability.
At 80% consumption, publishes an ai.budget.warning event (wireable via Fabric).
When the budget is exceeded, calls return 429 - no charges accumulate silently.
Budget counters reset on the first day of each calendar month. The reset is performed by a Celery Beat task (ai.reset_monthly_budgets) that runs daily at 00:05 UTC.
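
The budget flow above can be approximated with an in-memory counter. Everything here is a sketch under assumptions: the class and method names are invented, a plain dict stands in for the Redis INCRBY counter, the estimate is charged up front, and the warning fires once when usage crosses 80%. Only the key format, the 4,096-token buffer, the 80% threshold, and the 429 status come from the list above.

```python
from datetime import datetime, timezone

RESPONSE_BUFFER = 4096  # tokens reserved for the model response
WARNING_RATIO = 0.8     # threshold for ai.budget.warning


def budget_key(org_id: str, now: datetime) -> str:
    """Build the monthly counter key, e.g. llm:budget:org1:2025:03."""
    return f"llm:budget:{org_id}:{now:%Y}:{now:%m}"


class BudgetTracker:
    """In-memory stand-in for the Redis-backed counter."""

    def __init__(self, monthly_budget: int = 100_000):
        self.monthly_budget = monthly_budget
        self.counters: dict[str, int] = {}
        self.events: list[str] = []

    def reserve(self, org_id: str, estimated_prompt_tokens: int, now: datetime) -> int:
        """Return 200 if the call may proceed, 429 if it would exceed the budget."""
        key = budget_key(org_id, now)
        needed = estimated_prompt_tokens + RESPONSE_BUFFER
        used = self.counters.get(key, 0)
        if used + needed > self.monthly_budget:
            return 429  # checked before any provider call is made
        self.counters[key] = used + needed
        threshold = WARNING_RATIO * self.monthly_budget
        if used < threshold <= used + needed:
            self.events.append("ai.budget.warning")
        return 200
```

Because the key embeds the year and month, a new month naturally starts from a fresh counter even before any reset task runs.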

Check current usage:

GET /api/v1/ai/governance/usage

{
  "llm_org_policy": "cloud_approved",
  "monthly_token_budget": 200000,
  "tokens_used_this_month": 14300,
  "percentage_used": 7.2,
  "budget_remaining": 185700
}

The 11 tools

The assistant has access to 11 read-only tools across four categories. Each tool requires a specific FreeSDN permission. If the calling user lacks a tool’s permission, the model receives an error result for that call - the tool is not silently hidden.

Category	Tool	Required permission
Network state	`get_devices`	`device:read`
Network state	`get_device_detail`	`device:read`
Network state	`get_vlans`	`network:read`
Network state	`get_alerts`	`device:read`
Network state	`get_sites`	`site:read`
Diagnostics	`get_device_health`	`device:read`
Diagnostics	`get_bandwidth_usage`	`device:read`
Observability	`search_collector_logs`	`collector.logs.read`
Observability	`get_top_talkers`	`collector.flows.read`
Automation	`list_automation_rules`	`automation.rules.read`
Automation	`get_execution_history`	`automation.rules.read`

Plugin authors can register up to 20 additional AI tools per plugin via self.register_ai_tool(). Plugin-namespaced tools (plugin_*) are permission-gated at execution time and fail closed if the permission check is missing - this is a defense-in-depth measure against the plugin AI-tool permission-bypass class of vulnerability (PS-11). See Plugin System.
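
The permission behavior described in this section can be illustrated with a small dispatcher (Python 3.9+). The `Tool` dataclass and `execute_tool` are hypothetical names, not the plugin API; the point is that a denied call yields an error result the model can see, and a tool with no declared permission is refused rather than run.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Tool:
    name: str
    handler: Callable[[], str]
    required_permission: Optional[str]  # None models a missing permission declaration


def execute_tool(tool: Tool, user_permissions: set[str]) -> dict:
    """Run a tool only if the caller holds its permission; otherwise return an error."""
    if tool.required_permission is None:
        # Fail closed: an undeclared permission is treated as a denial.
        return {"error": f"{tool.name}: no permission declared; refusing to run"}
    if tool.required_permission not in user_permissions:
        return {"error": f"{tool.name}: missing permission {tool.required_permission}"}
    return {"result": tool.handler()}
```

A user holding only `device:read` could run `get_devices` but would receive an error result for `get_vlans`, which requires `network:read`.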

List the tools registered at runtime (including any from loaded plugins):

GET /api/v1/ai/tools

Structured operations for automation

In addition to free-form chat, the AI service exposes three structured operations used by the automation engine. These are not chat - they take an explicit field list and return a typed result.

Operation	Use case	Output
`llm_classify`	Classify syslog severity, categorize alert type	One label from a declared list
`llm_extract`	Extract IP/device name from unstructured syslog	JSON matching a declared schema
`llm_summarize`	Digest multiple alerts into a plain-text summary	Plain text, up to 500 words

The field selector in automation rules applies here: the rule author explicitly selects which fields from the trigger-event payload to send to the model. (This is separate from the outbound PII redaction described above - the field selector controls which fields reach the model at all, before any cloud-provider redaction is applied.) Wildcards are not permitted. The selected field names (not their values) are what gets stored in the audit log.

Audit log

Every LLM call writes a row to ai.llm_call_logs. The log is privacy-preserving by design:

Stored: provider, model, operation type, field names sent (not values), token counts (prompt + completion), latency, success/failure, error message, rule_id, execution_id.
Not stored: input text, output text, field values.

If you need to reconstruct what was sent in a specific call, correlate the execution_id with AutomationExecutionRecord.trigger_data (which is your system’s record, not the AI module’s).

GET /api/v1/ai/governance/logs?page=1&size=50

Requires ai.admin or org_admin.

Conversation management

Each chat session is an AIConversation row, owned by the user who created it. Conversations are private - an org admin cannot read another user’s conversation history. The provider used for a conversation is pinned at creation time: follow-up messages on an existing conversation always use the stored provider, not the one passed in the current request. This prevents a stale client dropdown from silently routing history to a provider that cannot decode it.
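
Provider pinning reduces to a simple precedence rule, sketched here with a hypothetical helper and a plain dict standing in for the AIConversation row:

```python
from typing import Optional


def resolve_provider(conversation: Optional[dict], requested: str) -> str:
    """Use the provider stored on an existing conversation; otherwise the requested one."""
    if conversation is not None:
        return conversation["provider"]  # pinned at creation time
    return requested
```

A follow-up message that asks for `openai` on a conversation created with `ollama` therefore still resolves to `ollama`.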

Action	Endpoint
Start a new conversation	`POST /api/v1/ai/chat` (omit `conversation_id`)
Continue a conversation	`POST /api/v1/ai/chat` with `conversation_id`
List your conversations	`GET /api/v1/ai/conversations`
Get a conversation with all messages	`GET /api/v1/ai/conversations/{id}`
Delete a conversation	`DELETE /api/v1/ai/conversations/{id}`

Permissions

Permission code	Required for
`ai.chat`	Send messages to the AI assistant
`ai.admin`	Configure providers, view audit logs, set org policy, check usage

ai.chat is not assigned to any role by default - it must be explicitly granted to users by a super_admin or org_admin (e.g. via a custom role with the ai.chat permission code). Without an explicit grant, no user below super_admin can use AI chat. ai.admin is implicitly held by org_admin, admin, and super_admin roles (via is_org_admin), or can be explicitly granted to any user.

API endpoint reference

Method	Path	Purpose	Permission
`POST`	`/api/v1/ai/chat`	Send a message; runs the agentic loop	`ai.chat` + rate limit + governance
`GET`	`/api/v1/ai/conversations`	List caller’s conversations (limit 100)	authenticated (own user)
`GET`	`/api/v1/ai/conversations/{id}`	Get conversation + all messages	authenticated (own user)
`DELETE`	`/api/v1/ai/conversations/{id}`	Delete conversation and messages	authenticated (own user)
`GET`	`/api/v1/ai/providers`	List provider configs + governance state (`api_key_set` bool only)	`ai.admin` or `org_admin`
`PUT`	`/api/v1/ai/providers/{provider_id}`	Configure a provider	`ai.admin` or `org_admin`
`POST`	`/api/v1/ai/providers/{provider_id}/test`	Test connectivity using stored key	`ai.admin` or `org_admin`
`GET`	`/api/v1/ai/tools`	List registered tools (name, description, required permission)	authenticated
`GET`	`/api/v1/ai/governance/usage`	Current month token usage and budget	`ai.admin` or `org_admin`
`GET`	`/api/v1/ai/governance/logs`	LLM call audit logs (paginated)	`ai.admin` or `org_admin`

The chat body accepts: message (1-8,000 characters), conversation_id (optional UUID to continue), and provider (one of openai, anthropic, ollama).
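
The body constraints can be expressed as a small validating dataclass. This is only a client-side sketch: the class name is invented, and treating `provider` as required is an assumption, since the reference above does not say whether it may be omitted.

```python
import uuid
from dataclasses import dataclass
from typing import Optional

PROVIDERS = {"openai", "anthropic", "ollama"}


@dataclass
class ChatRequest:
    message: str
    provider: str
    conversation_id: Optional[str] = None  # omit to start a new conversation

    def __post_init__(self) -> None:
        if not 1 <= len(self.message) <= 8000:
            raise ValueError("message must be 1-8,000 characters")
        if self.provider not in PROVIDERS:
            raise ValueError(f"provider must be one of {sorted(PROVIDERS)}")
        if self.conversation_id is not None:
            uuid.UUID(self.conversation_id)  # raises ValueError if not a valid UUID

    def to_json(self) -> dict:
        """Body for POST /api/v1/ai/chat; conversation_id is left out when unset."""
        body = {"message": self.message, "provider": self.provider}
        if self.conversation_id is not None:
            body["conversation_id"] = self.conversation_id
        return body
```

Validating before sending avoids spending a rate-limited request on a body the API would reject anyway.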

Setting up for the first time

Enable globally - a super_admin sets the env var and restarts the API container:

# In .env.pro (or whichever tier file you use)
LLM_GLOBALLY_ENABLED=true

docker compose --env-file .env.pro restart api

Set org policy - an org_admin navigates to Settings → AI Providers and configures a policy and API key (or Ollama URL for local-only).
Test connectivity:
```
POST /api/v1/ai/providers/anthropic/test
```
The test uses the stored encrypted key - it does not accept a key in the request body.
Users with ai.chat can now open the AI Assistant page in the sidebar.

Gotchas

LLM_GLOBALLY_ENABLED is an env var, not a DB setting. Changing it requires restarting the api container - a PUT to any API endpoint cannot toggle it.
Two gates, not one. Enabling the global switch does not enable AI for any org. Each org’s policy starts at disabled and must be changed explicitly.
Cloud providers require cloud_approved policy. An org on local_only that sends a request with provider: openai receives 403.
The agentic loop caps at 5 iterations. Complex questions that need more rounds of tool calls than that will get a partial answer. Rephrase the question to be more specific if you hit this limit regularly.
Ollama must be reachable from inside the api container. Use the Docker service name or a host-accessible URL for OLLAMA_BASE_URL. http://localhost:11434 from inside the container points to the container itself, not your host.
Rate limiting is per user, but the token budget is per org. A single active user running rapid queries can exhaust the organization's monthly token budget quickly. Adjust monthly_token_budget on the OrgLLMPolicy for orgs with heavy expected usage.
Provider is pinned to the conversation. If you switch your default provider, existing conversations continue to use the provider they were started with.
PII redaction is regex-based. It reduces exposure for cloud calls but does not guarantee that all sensitive strings are caught. Review what data your team will be querying before setting cloud_approved.

Fabric integration

The AI module emits one native Fabric event:

Event type	Trigger	Typical use
`ai.budget.warning`	Token usage reaches 80% of monthly budget	Wire to `fabric.notify` to alert the org admin before the limit is hit

Wire this event in Automation → Connections. See Fabric for how to build a connection.

Next steps

AI Governance - full detail on the three-layer model, field selectors in automation rules, and what is explicitly out of scope.
Automation connections - build automation rules that use llm_classify, llm_extract, or llm_summarize actions.
Observability - the search_collector_logs and get_top_talkers AI tools query data collected by the Observability module.
Plugin System - register up to 20 additional AI tools from a plugin.
Configuration reference - LLM_GLOBALLY_ENABLED, OLLAMA_BASE_URL, AI_PROVIDER_PROXY_HOSTS.
Roles and permissions - super_admin and org_admin responsibilities for governing AI access.