AI Assistant
The AI Assistant module (id ai, v1.0.0) provides an agentic LLM chat interface that can query your FreeSDN network state, run diagnostics, inspect automation rules, and search observability data. It supports three LLM providers - OpenAI, Anthropic, and Ollama - accessed via httpx directly (no vendor SDKs). All provider calls, rate limits, and token budgets are fail-closed by design.
Read LLM Governance for the full three-layer security model before enabling this in production.
How the chat loop works
Section titled “How the chat loop works”- You send a message to
POST /api/v1/ai/chat. - The service loads the last 50 messages of conversation history for context.
- The message (and history) are dispatched to the configured provider with the 11 registered FreeSDN tools.
- The model may invoke one or more tools. The loop runs at most 5 iterations - if 5 tool calls complete and the model still wants more, a partial-findings answer is returned that names the tools that ran.
- Each tool result is capped at 4,096 characters to prevent context-window blowout and runaway token costs.
- Tool results in the prompt are wrapped in
<tool_result name="...">tags and the system prompt explicitly instructs the model to treat them as untrusted data, not instructions (prompt-injection defense). - The final answer is returned to the caller and persisted in the conversation.
Providers
Section titled “Providers”| Provider | Policy required | Base URL restriction |
|---|---|---|
| OpenAI | cloud_approved | Must be api.openai.com or AI_PROVIDER_PROXY_HOSTS override |
| Anthropic | cloud_approved | Must be api.anthropic.com or AI_PROVIDER_PROXY_HOSTS override |
| Ollama (self-hosted) | local_only or cloud_approved | Any host, except cloud-metadata endpoints (169.254.169.254, fd00:ec2::254, metadata.google.internal) |
Cloud provider base URLs are restricted to a static allowlist. An org_admin cannot redirect an API key to an arbitrary host - this prevents an admin-level SSRF where the key is forwarded to an attacker-controlled endpoint. If you run a reverse proxy in front of OpenAI or Anthropic, add its hostname to AI_PROVIDER_PROXY_HOSTS (comma-separated) in the API container environment.
Ollama allows any host but blocks cloud-metadata endpoints, so a misconfigured OLLAMA_BASE_URL cannot be used to reach the instance metadata service.
Three-layer governance
Section titled “Three-layer governance”The AI module enforces a three-layer model. All three layers must allow a request for it to proceed.
Layer 1: Global kill-switch (env var, super_admin, default OFF) │ └── Layer 2: Per-org policy (org_admin, DB, default "disabled") │ └── Layer 3: Field selector - automation rules declare explicit input_fields; wildcards are not permittedFull details are at AI Governance. The short version:
Layer 1 - global kill-switch
Section titled “Layer 1 - global kill-switch”# .env or API container environmentLLM_GLOBALLY_ENABLED=trueDefault: false. When off, all /api/v1/ai/* endpoints return 503, automation rules with LLM action types are skipped (logged, not errored), and the AI settings section is hidden in the UI. This is the super_admin’s single override that trumps all per-org policy.
Layer 2 - per-org policy
Section titled “Layer 2 - per-org policy”Each organization independently sets one of three modes (default: disabled):
| Policy | What it allows |
|---|---|
disabled (default) | No LLM calls for this org |
local_only | Ollama only - no data leaves the deployment |
cloud_approved | Ollama + OpenAI + Anthropic using org-owned API keys |
Configure via Settings → AI Providers (requires ai.admin or org_admin):
PUT /api/v1/ai/providers/anthropicContent-Type: application/json
{ "api_key": "sk-ant-...", "default_model": "claude-haiku-4-5-20251001", "is_enabled": true, "llm_org_policy": "cloud_approved", "monthly_token_budget": 200000}API keys are Fernet-encrypted at rest. Responses return api_key_set: true/false - the key itself is never returned.
Layer 3 - field selector
Section titled “Layer 3 - field selector”Automation rules that trigger LLM actions must declare an explicit input_fields list. Wildcards are not permitted. Only the named fields from the trigger-event payload are forwarded to the model; the selected field names (not their values) are recorded in the audit log.
Outbound PII redaction (additional protection, not a governance gate)
Section titled “Outbound PII redaction (additional protection, not a governance gate)”Before a message is sent to OpenAI or Anthropic, _redact_messages_for_cloud replaces:
- IPv4 addresses →
<IP> - MAC addresses →
<MAC> - Password/secret/token assignments →
<REDACTED> - JWT-shaped strings →
<JWT>
This redaction applies only to the outgoing payload for cloud providers. The operator UI and Ollama (local) receive verbatim data. Redaction is regex-based and conservative - it reduces exposure but is not a guarantee that all PII is stripped. It is not one of the three governance gates: it runs unconditionally on every cloud call regardless of policy settings.
Token budgets and rate limiting
Section titled “Token budgets and rate limiting”Monthly token budget
Section titled “Monthly token budget”Each organization has a monthly token budget (default: 100,000 tokens). The service:
- Estimates token usage over the full message list (history + new message) via tiktoken, with a 4,096-token buffer for the model response. Estimation overestimates to stay fail-closed.
- Applies the budget check before calling the provider.
- Uses an atomic Redis
INCRBY(keyllm:budget:{org}:{YYYY}:{MM}, 32-day TTL) with DB fallback for durability. - At 80% consumption, publishes an
ai.budget.warningevent (wireable via Fabric). - When the budget is exceeded, calls return
429- no charges accumulate silently. - Budget counters reset on the first day of each calendar month via a Celery Beat task (
ai.reset_monthly_budgets, 00:05 UTC daily).
Check current usage:
GET /api/v1/ai/governance/usage{ "llm_org_policy": "cloud_approved", "monthly_token_budget": 200000, "tokens_used_this_month": 14300, "percentage_used": 7.2, "budget_remaining": 185700}The 11 tools
Section titled “The 11 tools”The assistant has access to 11 read-only tools across four categories. Each tool requires a specific FreeSDN permission. If the calling user lacks a tool’s permission, the model receives an error result for that call - the tool is not silently hidden.
| Category | Tool | Required permission |
|---|---|---|
| Network state | get_devices | device:read |
| Network state | get_device_detail | device:read |
| Network state | get_vlans | network:read |
| Network state | get_alerts | device:read |
| Network state | get_sites | site:read |
| Diagnostics | get_device_health | device:read |
| Diagnostics | get_bandwidth_usage | device:read |
| Observability | search_collector_logs | collector.logs.read |
| Observability | get_top_talkers | collector.flows.read |
| Automation | list_automation_rules | automation.rules.read |
| Automation | get_execution_history | automation.rules.read |
Plugin authors can register up to 20 additional AI tools per plugin via self.register_ai_tool(). Plugin-namespaced tools (plugin_*) are permission-gated at execution time and fail closed if the permission check is missing - this is a defense-in-depth measure against the plugin AI-tool permission-bypass class of vulnerability (PS-11). See Plugin System.
List the tools registered at runtime (including any from loaded plugins):
GET /api/v1/ai/toolsStructured operations for automation
Section titled “Structured operations for automation”In addition to free-form chat, the AI service exposes three structured operations used by the automation engine. These are not chat - they take an explicit field list and return a typed result.
| Operation | Use case | Output |
|---|---|---|
llm_classify | Classify syslog severity, categorize alert type | One label from a declared list |
llm_extract | Extract IP/device name from unstructured syslog | JSON matching a declared schema |
llm_summarize | Digest multiple alerts into a plain-text summary | Plain text, up to 500 words |
The field selector in automation rules applies here: the rule author explicitly selects which fields from the trigger event payload to send to the model. (This is separate from the PII redaction in Layer 3 - it controls which fields reach the model at all, before any cloud-provider redaction is applied.) Wildcards are not permitted. The selected field names (not their values) are what gets stored in the audit log.
Audit log
Section titled “Audit log”Every LLM call writes a row to ai.llm_call_logs. The log is privacy-preserving by design:
- Stored: provider, model, operation type, field names sent (not values), token counts (prompt + completion), latency, success/failure, error message,
rule_id,execution_id. - Not stored: input text, output text, field values.
If you need to reconstruct what was sent in a specific call, correlate the execution_id with AutomationExecutionRecord.trigger_data (which is your system’s record, not the AI module’s).
GET /api/v1/ai/governance/logs?page=1&size=50Requires ai.admin or org_admin.
Conversation management
Section titled “Conversation management”Each chat session is an AIConversation row, owned by the user who created it. Conversations are private - an org admin cannot read another user’s conversation history. The provider used for a conversation is pinned at creation time: follow-up messages on an existing conversation always use the stored provider, not the one passed in the current request. This prevents a stale client dropdown from silently routing history to a provider that cannot decode it.
| Action | Endpoint |
|---|---|
| Start a new conversation | POST /api/v1/ai/chat (omit conversation_id) |
| Continue a conversation | POST /api/v1/ai/chat with conversation_id |
| List your conversations | GET /api/v1/ai/conversations |
| Get a conversation with all messages | GET /api/v1/ai/conversations/{id} |
| Delete a conversation | DELETE /api/v1/ai/conversations/{id} |
Permissions
Section titled “Permissions”| Permission code | Required for |
|---|---|
ai.chat | Send messages to the AI assistant |
ai.admin | Configure providers, view audit logs, set org policy, check usage |
ai.chat is not assigned to any role by default - it must be explicitly granted to users by a super_admin or org_admin (e.g. via a custom role with the ai.chat permission code). Without an explicit grant, no user below super_admin can use AI chat. ai.admin is implicitly held by org_admin, admin, and super_admin roles (via is_org_admin), or can be explicitly granted to any user.
API endpoint reference
Section titled “API endpoint reference”| Method | Path | Purpose | Permission |
|---|---|---|---|
POST | /api/v1/ai/chat | Send a message; runs the agentic loop | ai.chat + rate limit + governance |
GET | /api/v1/ai/conversations | List caller’s conversations (limit 100) | authenticated (own user) |
GET | /api/v1/ai/conversations/{id} | Get conversation + all messages | authenticated (own user) |
DELETE | /api/v1/ai/conversations/{id} | Delete conversation and messages | authenticated (own user) |
GET | /api/v1/ai/providers | List provider configs + governance state (api_key_set bool only) | ai.admin or org_admin |
PUT | /api/v1/ai/providers/{provider_id} | Configure a provider | ai.admin or org_admin |
POST | /api/v1/ai/providers/{provider_id}/test | Test connectivity using stored key | ai.admin or org_admin |
GET | /api/v1/ai/tools | List registered tools (name, description, required permission) | authenticated |
GET | /api/v1/ai/governance/usage | Current month token usage and budget | ai.admin or org_admin |
GET | /api/v1/ai/governance/logs | LLM call audit logs (paginated) | ai.admin or org_admin |
The chat body accepts: message (1-8,000 characters), conversation_id (optional UUID to continue), and provider (one of openai, anthropic, ollama).
Setting up for the first time
Section titled “Setting up for the first time”-
Enable globally - a
super_adminsets the env var and restarts the API container:Terminal window # In .env.pro (or whichever tier file you use)LLM_GLOBALLY_ENABLED=trueTerminal window docker compose --env-file .env.pro restart api -
Set org policy - an
org_adminnavigates to Settings → AI Providers and configures a policy and API key (or Ollama URL for local-only). -
Test connectivity:
POST /api/v1/ai/providers/anthropic/testThe test uses the stored encrypted key - it does not accept a key in the request body.
-
Users with
ai.chatcan now open the AI Assistant page in the sidebar.
Gotchas
Section titled “Gotchas”LLM_GLOBALLY_ENABLEDis an env var, not a DB setting. Changing it requires restarting theapicontainer - aPUTto any API endpoint cannot toggle it.- Two gates, not one. Enabling the global switch does not enable AI for any org. Each org’s policy starts at
disabledand must be changed explicitly. - Cloud providers require
cloud_approvedpolicy. An org onlocal_onlythat sends a request withprovider: openaireceives403. - The agentic loop caps at 5 iterations. Complex questions that need more than 5 tool calls will get a partial answer. Rephrase the question to be more specific if you hit this limit regularly.
- Ollama must be reachable from inside the
apicontainer. Use the Docker service name or a host-accessible URL forOLLAMA_BASE_URL.http://localhost:11434from inside the container points to the container itself, not your host. - Rate limiting is per user, not per org. A single active user running rapid queries can exhaust the monthly token budget quickly. Adjust
monthly_token_budgeton theOrgLLMPolicyfor orgs with heavy expected usage. - Provider is pinned to the conversation. If you switch your default provider, existing conversations continue to use the provider they were started with.
- PII redaction is regex-based. It reduces exposure for cloud calls but does not guarantee that all sensitive strings are caught. Review what data your team will be querying before setting
cloud_approved.
Fabric integration
Section titled “Fabric integration”The AI module emits one native Fabric event:
| Event type | Trigger | Typical use |
|---|---|---|
ai.budget.warning | Token usage reaches 80% of monthly budget | Wire to fabric.notify to alert the org admin before the limit is hit |
Wire this event in Automation → Connections. See Fabric for how to build a connection.
Next steps
Section titled “Next steps”- AI Governance - full detail on the three-layer model, field selectors in automation rules, and what is explicitly out of scope.
- Automation connections - build automation rules that use
llm_classify,llm_extract, orllm_summarizeactions. - Observability - the
search_collector_logsandget_top_talkersAI tools query data collected by this module. - Plugin System - register up to 20 additional AI tools from a plugin.
- Configuration reference -
LLM_GLOBALLY_ENABLED,OLLAMA_BASE_URL,AI_PROVIDER_PROXY_HOSTS. - Roles and permissions -
super_adminandorg_adminresponsibilities for governing AI access.