
AI Assistant

The AI Assistant module (id ai, v1.0.0) provides an agentic LLM chat interface that can query your FreeSDN network state, run diagnostics, inspect automation rules, and search observability data. It supports three LLM providers - OpenAI, Anthropic, and Ollama - accessed via httpx directly (no vendor SDKs). All provider calls, rate limits, and token budgets are fail-closed by design.

Read AI Governance for the full three-layer security model before enabling this in production.

  1. You send a message to POST /api/v1/ai/chat.
  2. The service loads the last 50 messages of conversation history for context.
  3. The message (and history) are dispatched to the configured provider with the 11 registered FreeSDN tools.
  4. The model may invoke one or more tools. The loop runs at most 5 iterations - if the model still requests tools after the fifth iteration, a partial-findings answer is returned that names the tools that ran.
  5. Each tool result is capped at 4,096 characters to prevent context-window blowout and runaway token costs.
  6. Tool results in the prompt are wrapped in <tool_result name="..."> tags and the system prompt explicitly instructs the model to treat them as untrusted data, not instructions (prompt-injection defense).
  7. The final answer is returned to the caller and persisted in the conversation.
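
The steps above can be sketched as a bounded loop. This is a minimal illustration, not the service's actual code: `ask_model`, the reply shape (`answer` / `tool_calls`), and the message dict layout are hypothetical stand-ins for the provider call.

```python
from typing import Callable

MAX_ITERATIONS = 5        # loop cap described in step 4
MAX_RESULT_CHARS = 4096   # per-tool-result cap described in step 5


def wrap_tool_result(name: str, result: str) -> str:
    """Truncate a tool result and wrap it in tool_result tags (steps 5 and 6)."""
    return f'<tool_result name="{name}">{result[:MAX_RESULT_CHARS]}</tool_result>'


def run_agent_loop(ask_model: Callable[[list], dict],
                   tools: dict[str, Callable[[], str]],
                   messages: list) -> str:
    """Run the tool loop until the model answers or the iteration cap is hit.

    ask_model is a hypothetical provider call returning either
    {"answer": str} or {"tool_calls": [tool_name, ...]}.
    """
    tools_run: list[str] = []
    for _ in range(MAX_ITERATIONS):
        reply = ask_model(messages)
        if "answer" in reply:
            return reply["answer"]
        for name in reply["tool_calls"]:
            wrapped = wrap_tool_result(name, tools[name]())
            messages.append({"role": "tool", "content": wrapped})
            tools_run.append(name)
    # Cap reached while the model still wanted tools: return partial findings.
    return "Partial findings; tools run: " + ", ".join(tools_run)
```

A model that never stops requesting tools is cut off after five rounds, and the returned text names every tool that ran.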

| Provider | Policy required | Base URL restriction |
| --- | --- | --- |
| OpenAI | cloud_approved | Must be api.openai.com or AI_PROVIDER_PROXY_HOSTS override |
| Anthropic | cloud_approved | Must be api.anthropic.com or AI_PROVIDER_PROXY_HOSTS override |
| Ollama (self-hosted) | local_only or cloud_approved | Any host, except cloud-metadata endpoints (169.254.169.254, fd00:ec2::254, metadata.google.internal) |

Cloud provider base URLs are restricted to a static allowlist. An org_admin cannot redirect an API key to an arbitrary host - this prevents an admin-level SSRF where the key is forwarded to an attacker-controlled endpoint. If you run a reverse proxy in front of OpenAI or Anthropic, add its hostname to AI_PROVIDER_PROXY_HOSTS (comma-separated) in the API container environment.

Ollama allows any host but blocks cloud-metadata endpoints, so a misconfigured OLLAMA_BASE_URL cannot be used to reach the instance metadata service.
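
The restriction rules above could be expressed roughly as follows. This is a sketch under assumptions: the function name and signature are hypothetical, and it only compares hostnames. It does not resolve DNS or follow redirects, which a production check may also need to consider.

```python
from urllib.parse import urlparse

CLOUD_HOSTS = {"openai": "api.openai.com", "anthropic": "api.anthropic.com"}
METADATA_HOSTS = {"169.254.169.254", "fd00:ec2::254", "metadata.google.internal"}


def is_base_url_allowed(provider: str, base_url: str, proxy_hosts: str = "") -> bool:
    """Return True if base_url is acceptable for the given provider.

    proxy_hosts mirrors the comma-separated AI_PROVIDER_PROXY_HOSTS value.
    """
    host = (urlparse(base_url).hostname or "").lower()
    if not host:
        return False  # fail closed on URLs without a usable hostname
    if provider == "ollama":
        # Any host is fine except the cloud-metadata endpoints.
        return host not in METADATA_HOSTS
    if provider in CLOUD_HOSTS:
        extra = {h.strip().lower() for h in proxy_hosts.split(",") if h.strip()}
        return host == CLOUD_HOSTS[provider] or host in extra
    return False  # unknown provider: fail closed
```

For example, `is_base_url_allowed("openai", "https://evil.example/v1")` is rejected, while the same call with `proxy_hosts="evil.example"` would pass, which is why the proxy list belongs in the container environment rather than in an org-level setting.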

The AI module enforces a three-layer model. All three layers must allow a request for it to proceed.

Layer 1: Global kill-switch (env var, super_admin, default OFF)
└── Layer 2: Per-org policy (org_admin, DB, default "disabled")
    └── Layer 3: Field selector - automation rules declare explicit
        input_fields; wildcards are not permitted

Full details are at AI Governance. The short version:

# .env or API container environment
LLM_GLOBALLY_ENABLED=true

Default: false. When off, all /api/v1/ai/* endpoints return 503, automation rules with LLM action types are skipped (logged, not errored), and the AI settings section is hidden in the UI. This is the super_admin’s single override that trumps all per-org policy.

Each organization independently sets one of three modes (default: disabled):

| Policy | What it allows |
| --- | --- |
| disabled (default) | No LLM calls for this org |
| local_only | Ollama only - no data leaves the deployment |
| cloud_approved | Ollama + OpenAI + Anthropic using org-owned API keys |

Configure via Settings → AI Providers (requires ai.admin or org_admin):

PUT /api/v1/ai/providers/anthropic
Content-Type: application/json

{
  "api_key": "sk-ant-...",
  "default_model": "claude-haiku-4-5-20251001",
  "is_enabled": true,
  "llm_org_policy": "cloud_approved",
  "monthly_token_budget": 200000
}

API keys are Fernet-encrypted at rest. Responses return api_key_set: true/false - the key itself is never returned.

Automation rules that trigger LLM actions must declare an explicit input_fields list. Wildcards are not permitted. Only the named fields from the trigger-event payload are forwarded to the model; the selected field names (not their values) are recorded in the audit log.
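
A field selector of this kind might look like the sketch below. The function name is hypothetical, and rejecting an empty list is an assumption drawn from the requirement that the list be explicit.

```python
def select_fields(payload: dict, input_fields: list[str]) -> tuple[dict, list[str]]:
    """Return (fields forwarded to the model, field names for the audit log)."""
    if not input_fields:
        # Assumption: an empty list is treated the same as a missing one.
        raise ValueError("input_fields must be an explicit, non-empty list")
    for name in input_fields:
        if "*" in name:
            raise ValueError(f"wildcards are not permitted: {name!r}")
    # Only named fields that exist in the trigger-event payload are forwarded.
    selected = {name: payload[name] for name in input_fields if name in payload}
    # The audit log records names only, never values.
    return selected, sorted(selected)
```

With an event such as `{"message": "link down", "host": "sw1", "password": "hunter2"}` and `input_fields=["message", "host"]`, the password never leaves the automation engine, and the audit entry contains only `["host", "message"]`.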

Outbound PII redaction (additional protection, not a governance gate)


Before a message is sent to OpenAI or Anthropic, _redact_messages_for_cloud replaces:

  • IPv4 addresses → <IP>
  • MAC addresses → <MAC>
  • Password/secret/token assignments → <REDACTED>
  • JWT-shaped strings → <JWT>

This redaction applies only to the outgoing payload for cloud providers. The operator UI and Ollama (local) receive verbatim data. Redaction is regex-based and conservative - it reduces exposure but is not a guarantee that all PII is stripped. It is not one of the three governance gates: it runs unconditionally on every cloud call regardless of policy settings.
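
To make the behaviour concrete, here is a small regex-based redactor in the same spirit. The patterns are illustrative assumptions, not the ones used by _redact_messages_for_cloud, and keeping the key name in front of `<REDACTED>` is a choice made for this sketch.

```python
import re

# Order matters: JWTs and secret assignments are replaced before MAC/IP patterns.
_PATTERNS = [
    (re.compile(r"\beyJ[\w-]+\.[\w-]+\.[\w-]+"), "<JWT>"),
    (re.compile(r"(?i)\b(password|secret|token)(\s*[=:]\s*)\S+"), r"\1\2<REDACTED>"),
    (re.compile(r"\b(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}\b"), "<MAC>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<IP>"),
]

CLOUD_PROVIDERS = {"openai", "anthropic"}


def redact(text: str) -> str:
    """Apply every redaction pattern in order."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


def redact_for_provider(provider: str, text: str) -> str:
    """Redact only for cloud providers; local providers receive verbatim text."""
    return redact(text) if provider in CLOUD_PROVIDERS else text
```

Patterns like these miss anything they were not written for (IPv6 addresses, hostnames, usernames), which is the practical meaning of "conservative" here.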

Each organization has a monthly token budget (default: 100,000 tokens). The service:

  • Estimates token usage over the full message list (history + new message) via tiktoken, with a 4,096-token buffer for the model response. The estimate deliberately errs high to stay fail-closed.
  • Applies the budget check before calling the provider.
  • Uses an atomic Redis INCRBY (key llm:budget:{org}:{YYYY}:{MM}, 32-day TTL) with DB fallback for durability.
  • At 80% consumption, publishes an ai.budget.warning event (wireable via Fabric).
  • When the budget is exceeded, calls return 429 - no charges accumulate silently.
  • Budget counters reset on the first day of each calendar month via a Celery Beat task (ai.reset_monthly_budgets), which runs daily at 00:05 UTC.
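
The budget flow above can be illustrated without Redis. In this sketch a plain dict stands in for the Redis counter, the function names are hypothetical, and refusing a call whose reservation would push usage past the budget is an assumption about how "exceeded" is evaluated.

```python
from datetime import datetime, timezone

RESPONSE_BUFFER = 4096   # tokens reserved for the model response
WARN_RATIO = 0.8         # threshold for the ai.budget.warning event


def budget_key(org_id: str, now: datetime) -> str:
    """Build the monthly counter key in the llm:budget:{org}:{YYYY}:{MM} layout."""
    return f"llm:budget:{org_id}:{now:%Y}:{now:%m}"


def check_and_reserve(counters: dict, key: str, prompt_tokens: int,
                      budget: int) -> tuple[bool, bool]:
    """Reserve tokens before the provider call.

    Returns (allowed, warning). allowed=False corresponds to a 429 response;
    warning=True means this call crossed the 80% threshold.
    """
    needed = prompt_tokens + RESPONSE_BUFFER
    used = counters.get(key, 0)
    if used + needed > budget:
        return False, False  # refuse before any provider call is made
    counters[key] = used + needed
    crossed = used < WARN_RATIO * budget <= counters[key]
    return True, crossed
```

Because the check happens before the provider call, a refused request consumes nothing; the counter only moves when a reservation succeeds.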

Check current usage:

GET /api/v1/ai/governance/usage

{
  "llm_org_policy": "cloud_approved",
  "monthly_token_budget": 200000,
  "tokens_used_this_month": 14300,
  "percentage_used": 7.2,
  "budget_remaining": 185700
}

The assistant has access to 11 read-only tools across four categories. Each tool requires a specific FreeSDN permission. If the calling user lacks a tool’s permission, the model receives an error result for that call - the tool is not silently hidden.

| Category | Tool | Required permission |
| --- | --- | --- |
| Network state | get_devices | device:read |
| Network state | get_device_detail | device:read |
| Network state | get_vlans | network:read |
| Network state | get_alerts | device:read |
| Network state | get_sites | site:read |
| Diagnostics | get_device_health | device:read |
| Diagnostics | get_bandwidth_usage | device:read |
| Observability | search_collector_logs | collector.logs.read |
| Observability | get_top_talkers | collector.flows.read |
| Automation | list_automation_rules | automation.rules.read |
| Automation | get_execution_history | automation.rules.read |
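
The "error result, not silently hidden" behaviour can be sketched as follows. The dispatcher, its result shape, and the `handlers` mapping are hypothetical; only the tool names and permission codes come from the table above.

```python
TOOL_PERMISSIONS = {
    "get_devices": "device:read",
    "get_vlans": "network:read",
    "search_collector_logs": "collector.logs.read",
}


def execute_tool(name: str, user_permissions: set[str], handlers: dict) -> dict:
    """Run a tool if the caller holds its permission; otherwise return an error result."""
    required = TOOL_PERMISSIONS.get(name)
    if required is None:
        # No permission mapping: fail closed rather than run unguarded.
        return {"tool": name, "error": "unknown tool or missing permission mapping"}
    if required not in user_permissions:
        # The model sees the denial, so it can explain it to the user.
        return {"tool": name, "error": f"permission denied: requires {required}"}
    return {"tool": name, "result": handlers[name]()}
```

Returning the denial as a tool result lets the model tell the user which permission is missing instead of behaving as if the data did not exist.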

Plugin authors can register up to 20 additional AI tools per plugin via self.register_ai_tool(). Plugin-namespaced tools (plugin_*) are permission-gated at execution time and fail closed if the permission check is missing - this is a defense-in-depth measure against the plugin AI-tool permission-bypass class of vulnerability (PS-11). See Plugin System.

List the tools registered at runtime (including any from loaded plugins):

GET /api/v1/ai/tools

In addition to free-form chat, the AI service exposes three structured operations used by the automation engine. These are not chat - they take an explicit field list and return a typed result.

| Operation | Use case | Output |
| --- | --- | --- |
| llm_classify | Classify syslog severity, categorize alert type | One label from a declared list |
| llm_extract | Extract IP/device name from unstructured syslog | JSON matching a declared schema |
| llm_summarize | Digest multiple alerts into a plain-text summary | Plain text, up to 500 words |

The field selector in automation rules applies here: the rule author explicitly selects which fields from the trigger event payload to send to the model. (This is Layer 3 of the governance model and is separate from outbound PII redaction - it controls which fields reach the model at all, before any cloud-provider redaction is applied.) Wildcards are not permitted. The selected field names (not their values) are what gets stored in the audit log.

Every LLM call writes a row to ai.llm_call_logs. The log is privacy-preserving by design:

  • Stored: provider, model, operation type, field names sent (not values), token counts (prompt + completion), latency, success/failure, error message, rule_id, execution_id.
  • Not stored: input text, output text, field values.

If you need to reconstruct what was sent in a specific call, correlate the execution_id with AutomationExecutionRecord.trigger_data (which is your system’s record, not the AI module’s).

GET /api/v1/ai/governance/logs?page=1&size=50

Requires ai.admin or org_admin.

Each chat session is an AIConversation row, owned by the user who created it. Conversations are private - an org admin cannot read another user’s conversation history. The provider used for a conversation is pinned at creation time: follow-up messages on an existing conversation always use the stored provider, not the one passed in the current request. This prevents a stale client dropdown from silently routing history to a provider that cannot decode it.
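
The pinning rule reduces to a single decision, shown here as a hypothetical helper (the function and parameter names are not taken from the codebase):

```python
from typing import Optional


def resolve_provider(stored_provider: Optional[str], requested_provider: str) -> str:
    """Pick the provider for a chat message.

    stored_provider is None for a new conversation; otherwise it is the
    provider pinned when the conversation was created.
    """
    if stored_provider is not None:
        return stored_provider  # follow-ups ignore the request's provider
    return requested_provider
```

So a follow-up that arrives with `provider: "openai"` on a conversation started with Ollama is still sent to Ollama.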

| Action | Endpoint |
| --- | --- |
| Start a new conversation | POST /api/v1/ai/chat (omit conversation_id) |
| Continue a conversation | POST /api/v1/ai/chat with conversation_id |
| List your conversations | GET /api/v1/ai/conversations |
| Get a conversation with all messages | GET /api/v1/ai/conversations/{id} |
| Delete a conversation | DELETE /api/v1/ai/conversations/{id} |

| Permission code | Required for |
| --- | --- |
| ai.chat | Send messages to the AI assistant |
| ai.admin | Configure providers, view audit logs, set org policy, check usage |

ai.chat is not assigned to any role by default - it must be explicitly granted to users by a super_admin or org_admin (e.g. via a custom role with the ai.chat permission code). Without an explicit grant, no user below super_admin can use AI chat. ai.admin is implicitly held by org_admin, admin, and super_admin roles (via is_org_admin), or can be explicitly granted to any user.

| Method | Path | Purpose | Permission |
| --- | --- | --- | --- |
| POST | /api/v1/ai/chat | Send a message; runs the agentic loop | ai.chat + rate limit + governance |
| GET | /api/v1/ai/conversations | List caller’s conversations (limit 100) | authenticated (own user) |
| GET | /api/v1/ai/conversations/{id} | Get conversation + all messages | authenticated (own user) |
| DELETE | /api/v1/ai/conversations/{id} | Delete conversation and messages | authenticated (own user) |
| GET | /api/v1/ai/providers | List provider configs + governance state (api_key_set bool only) | ai.admin or org_admin |
| PUT | /api/v1/ai/providers/{provider_id} | Configure a provider | ai.admin or org_admin |
| POST | /api/v1/ai/providers/{provider_id}/test | Test connectivity using stored key | ai.admin or org_admin |
| GET | /api/v1/ai/tools | List registered tools (name, description, required permission) | authenticated |
| GET | /api/v1/ai/governance/usage | Current month token usage and budget | ai.admin or org_admin |
| GET | /api/v1/ai/governance/logs | LLM call audit logs (paginated) | ai.admin or org_admin |

The chat body accepts: message (1-8,000 characters), conversation_id (optional UUID to continue), and provider (one of openai, anthropic, ollama).
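
A client can check these constraints before sending a request. The helper below is a hypothetical client-side sketch; the server remains the authority on validation, and how the body is sent (base URL, authentication) is left out because it depends on your deployment.

```python
import uuid
from typing import Optional

PROVIDERS = {"openai", "anthropic", "ollama"}


def build_chat_body(message: str, provider: str,
                    conversation_id: Optional[str] = None) -> dict:
    """Build a JSON-serializable body for POST /api/v1/ai/chat."""
    if not 1 <= len(message) <= 8000:
        raise ValueError("message must be 1-8,000 characters")
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider!r}")
    body = {"message": message, "provider": provider}
    if conversation_id is not None:
        # uuid.UUID raises ValueError when the string is not a valid UUID.
        body["conversation_id"] = str(uuid.UUID(conversation_id))
    return body
```

Omitting `conversation_id` starts a new conversation; passing one continues it.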

  1. Enable globally - a super_admin sets the env var and restarts the API container:

    # In .env.pro (or whichever tier file you use)
    LLM_GLOBALLY_ENABLED=true

    docker compose --env-file .env.pro restart api
  2. Set org policy - an org_admin navigates to Settings → AI Providers and configures a policy and API key (or Ollama URL for local-only).

  3. Test connectivity:

    POST /api/v1/ai/providers/anthropic/test

    The test uses the stored encrypted key - it does not accept a key in the request body.

  4. Users with ai.chat can now open the AI Assistant page in the sidebar.

  • LLM_GLOBALLY_ENABLED is an env var, not a DB setting. Changing it requires restarting the api container - a PUT to any API endpoint cannot toggle it.
  • Two gates, not one. Enabling the global switch does not enable AI for any org. Each org’s policy starts at disabled and must be changed explicitly.
  • Cloud providers require cloud_approved policy. An org on local_only that sends a request with provider: openai receives 403.
  • The agentic loop caps at 5 iterations. Complex questions that need more than 5 tool calls will get a partial answer. Rephrase the question to be more specific if you hit this limit regularly.
  • Ollama must be reachable from inside the api container. Use the Docker service name or a host-accessible URL for OLLAMA_BASE_URL. http://localhost:11434 from inside the container points to the container itself, not your host.
  • Rate limiting is per user, not per org, while the token budget is shared by the whole org. A single active user running rapid queries can therefore exhaust the org’s monthly token budget quickly. Adjust monthly_token_budget on the OrgLLMPolicy for orgs with heavy expected usage.
  • Provider is pinned to the conversation. If you switch your default provider, existing conversations continue to use the provider they were started with.
  • PII redaction is regex-based. It reduces exposure for cloud calls but does not guarantee that all sensitive strings are caught. Review what data your team will be querying before setting cloud_approved.

The AI module emits one native Fabric event:

| Event type | Trigger | Typical use |
| --- | --- | --- |
| ai.budget.warning | Token usage reaches 80% of monthly budget | Wire to fabric.notify to alert the org admin before the limit is hit |

Wire this event in Automation → Connections. See Fabric for how to build a connection.

  • AI Governance - full detail on the three-layer model, field selectors in automation rules, and what is explicitly out of scope.
  • Automation connections - build automation rules that use llm_classify, llm_extract, or llm_summarize actions.
  • Observability - the search_collector_logs and get_top_talkers AI tools query data collected by the Observability module.
  • Plugin System - register up to 20 additional AI tools from a plugin.
  • Configuration reference - LLM_GLOBALLY_ENABLED, OLLAMA_BASE_URL, AI_PROVIDER_PROXY_HOSTS.
  • Roles and permissions - super_admin and org_admin responsibilities for governing AI access.