Rate Limiting

FreeSDN applies rate limiting at the API gateway layer to protect against credential-stuffing, runaway automation, and denial-of-service. Limits are enforced in middleware before any route handler runs, so they apply uniformly across every endpoint including module routes and vendor adapter surfaces.

How limits work

The rate limiter (RateLimitMiddleware) uses a Redis sorted-set sliding window. Every request records a timestamp in a per-principal key and counts entries within the last minute or the last second (for burst). No approximations: the window slides continuously, so a burst of requests at 11:59:59 does not “reset” at 12:00:00.

There are two independent checks per request, evaluated in order:

Check	Default	Trigger	`Retry-After`
Burst	120 req/sec	More than 120 requests in any one-second window	`1`
Per-minute	600 req/min	More than 600 requests in any one-minute window	`60`

Exceeding either limit returns a 429 Too Many Requests response. The burst check is evaluated first, so a request that violates both limits receives Retry-After: 1 rather than Retry-After: 60.
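To make the two-check sliding window concrete, here is a minimal in-memory sketch. It is not the RateLimitMiddleware implementation: a sorted Python list stands in for the Redis sorted set, and the class and method names are hypothetical. It assumes rejected requests are recorded too, since every request records a timestamp.

```python
import bisect
from collections import defaultdict


class SlidingWindowLimiter:
    """Illustrative stand-in for a sorted-set sliding window (hypothetical names)."""

    def __init__(self, burst=120, rpm=600):
        self.burst = burst  # max requests in any one-second window
        self.rpm = rpm      # max requests in any one-minute window
        self._hits = defaultdict(list)  # bucket key -> sorted timestamps

    def check(self, key, now):
        """Record a request at time `now` (seconds).

        Returns None if allowed, otherwise the Retry-After value (1 or 60).
        """
        hits = self._hits[key]
        # Evict entries that have slid out of the one-minute window.
        del hits[:bisect.bisect_right(hits, now - 60.0)]
        bisect.insort(hits, now)
        # Burst check runs first: entries newer than one second ago.
        in_last_second = len(hits) - bisect.bisect_right(hits, now - 1.0)
        if in_last_second > self.burst:
            return 1
        # Per-minute check: everything still in the list is within 60 s.
        if len(hits) > self.rpm:
            return 60
        return None
```

Because old entries are evicted relative to the current timestamp on every call, there is no fixed boundary at which the count resets.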

Per-principal keying

The limiter identifies who is making the request:

Authenticated user - the sub claim is extracted from the freesdn_access JWT cookie using a local HMAC verification (no database or Redis round-trip). All requests made by the same user - regardless of which device, browser, or API client - share one bucket.
Unauthenticated / no valid JWT - the limiter falls back to rl:ip:<ip>. All unauthenticated traffic from the same IP address shares one bucket.

This means a single over-active API client counts against the authenticated user’s limit, not against an IP. If you operate behind NAT and share an IP with other FreeSDN users who are not yet authenticated, those users share the IP bucket.
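The keying decision can be sketched with the standard library alone. This is an illustration under stated assumptions, not the actual middleware: it assumes an HS256-signed token, the rl:user:<sub> key name is hypothetical (only rl:ip:<ip> appears in this document), and it omits the expiry and claim validation a real verifier would perform.

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(segment: str) -> bytes:
    # JWT segments drop base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def bucket_key(cookies: dict, client_ip: str, secret: bytes) -> str:
    """Return the rate-limit bucket key for a request (hypothetical helper)."""
    token = cookies.get("freesdn_access")
    if token:
        try:
            header_b64, payload_b64, sig_b64 = token.split(".")
            expected = hmac.new(
                secret, f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
            ).digest()
            # Local HMAC check only: no database or Redis round-trip.
            if hmac.compare_digest(expected, _b64url_decode(sig_b64)):
                sub = json.loads(_b64url_decode(payload_b64)).get("sub")
                if sub:
                    return f"rl:user:{sub}"  # assumed key format
        except (ValueError, TypeError, AttributeError):
            pass  # malformed token: treat as unauthenticated
    return f"rl:ip:{client_ip}"
```

Any token that fails verification falls through to the IP bucket, which matches the "no valid JWT" case above.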

Auth endpoints have stricter limits

Authentication routes (/api/v1/auth/*) have a tighter per-IP limit enforced inside the auth handlers themselves, independent of the middleware layer:

Scope	Limit
Per IP (auth routes)	5 requests / minute
Per username (failed attempts)	20 failures / 5 minutes
Password reset - per IP	5 requests / hour
Password reset - per email	3 requests / hour

These auth-specific limits are also backed by Redis. Independently of any rate limit, an account locks for 30 minutes after 5 failed login attempts.

Skipped endpoints

The following endpoints are excluded from rate limiting because they are high-frequency probes or have their own flow control:

GET /health and /health/*
GET /api/v1/health*
POST /api/v1/cameras/{camera_id}/stream-token
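If you want to predict client-side whether a call counts against your bucket, the exclusion list above can be expressed as a small matcher. This is a sketch: the function name is hypothetical, and it assumes GET applies to both /health and /health/* and that {camera_id} is a single path segment.

```python
import re

# (method, pattern) pairs mirroring the exclusion list above.
_EXEMPT = [
    ("GET", re.compile(r"^/health(/.*)?$")),
    ("GET", re.compile(r"^/api/v1/health.*$")),
    ("POST", re.compile(r"^/api/v1/cameras/[^/]+/stream-token$")),
]


def is_exempt(method: str, path: str) -> bool:
    """Return True if the request would skip rate limiting (illustrative)."""
    method = method.upper()
    return any(m == method and rx.match(path) for m, rx in _EXEMPT)
```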

Response headers

Non-429 (success) responses include the following rate-limit headers. On a 429 response only Retry-After is guaranteed; X-RateLimit-Limit and X-RateLimit-Remaining are not present on 429 responses. (X-Request-ID is still present on 429s because the outer RequestIDMiddleware adds it to every response.)

Header	Present on	Description
`X-RateLimit-Limit`	Success responses only	The per-minute cap in effect for this principal
`X-RateLimit-Remaining`	Success responses only	Requests remaining in the current minute window
`Retry-After`	429 responses only	Seconds to wait before retrying

Read X-RateLimit-Remaining proactively to slow down before you hit the limit. Retry-After is set to 1 on a burst 429 and 60 on a per-minute 429.

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60

{"detail": "Rate limit exceeded", "retry_after": 60}

The burst-limit variant returns "Rate limit exceeded (burst)" with "retry_after": 1. The rate-limit middleware returns a raw JSONResponse directly - it does not go through the global exception handlers, so there is no error envelope, code, message, or request_id field. Those fields appear only in custom exception-handler responses (adapter errors, auth errors, etc.).
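Since the 429 body has no error envelope, a client has to parse it separately from other error responses. A minimal sketch follows; the function name and the returned dictionary shape are hypothetical, and the fallback of 60 seconds is an assumption for responses that carry neither the header nor the body field.

```python
import json
from typing import Mapping, Optional


def classify_429(status_code: int, headers: Mapping[str, str], body: str) -> Optional[dict]:
    """Return {"kind", "retry_after"} for a 429, or None for any other status."""
    if status_code != 429:
        return None
    try:
        payload = json.loads(body)
    except ValueError:
        payload = {}
    if not isinstance(payload, dict):
        payload = {}
    # Prefer the header; fall back to the body field, then to 60 (assumed).
    raw = headers.get("Retry-After", payload.get("retry_after", 60))
    try:
        retry_after = int(raw)
    except (TypeError, ValueError):
        retry_after = 60
    # The burst variant is identified by its detail string.
    kind = "burst" if "(burst)" in str(payload.get("detail", "")) else "per-minute"
    return {"kind": kind, "retry_after": retry_after}
```

A plain dict is case-sensitive, so pass a case-insensitive mapping such as the headers object of your HTTP client when one is available.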

Handling 429 in your client

Follow this pattern for any API client or automation script:

Check every response for status 429 before processing the body.
Read Retry-After from the response headers.
Wait exactly that many seconds (do not use a fixed sleep or exponential backoff that ignores the header - the server already computed the correct delay).
Retry the original request once. If it 429s again, apply exponential backoff starting from the Retry-After value.
Propagate the X-Request-ID from the original 429 when you report errors to your operator.

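import time
import httpx

MAX_ATTEMPTS = 5

def api_get(client: httpx.Client, url: str, **kwargs):
    """GET with retries that honor Retry-After on 429 responses."""
    first_request_id = None
    for attempt in range(MAX_ATTEMPTS):
        resp = client.get(url, **kwargs)
        if resp.status_code != 429:
            resp.raise_for_status()
            return resp
        # Keep the X-Request-ID of the original 429 for error reports.
        first_request_id = first_request_id or resp.headers.get("X-Request-ID")
        if attempt < MAX_ATTEMPTS - 1:
            # Retry-After is 1 (burst) or 60 (per-minute); fall back to 60 if absent.
            wait = int(resp.headers.get("Retry-After", 60))
            # First retry waits exactly Retry-After; later retries double the wait.
            time.sleep(wait * (2 ** attempt))
    raise RuntimeError(
        f"Exceeded retry budget after rate limiting (X-Request-ID: {first_request_id})"
    )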

Configuration reference

Variable	Default	Description
`RATE_LIMIT_ENABLED`	`True`	When `False`, disables the per-IP in-memory rate limiter for public agent-download endpoints only (`/api/v1/agent-downloads/`). It does not disable the global `RateLimitMiddleware`; that middleware is unconditionally added during app startup. To disable the global middleware you would need to modify the `setup_middleware` call in `main.py`.
`RATE_LIMIT_DEFAULT`	`100/minute`	Label used in logs (actual enforcement is `RATE_LIMIT_RPM`)
`RATE_LIMIT_AUTH`	`5/minute`	Auth-route per-IP limit label
`RATE_LIMIT_RPM`	`600`	Per-principal per-minute hard cap
`RATE_LIMIT_BURST`	`120`	Per-principal per-second burst cap

Fail modes

Rate limiting depends on Redis (Valkey). When Redis is unavailable, the middleware's behavior depends on the endpoint category of the request:

Endpoint category	Redis unavailable
`/api/v1/auth/*`	Fail closed - 503. A Redis outage cannot disable credential-stuffing protection.
All other endpoints	Fail open - request proceeds. A Redis blip does not take down the API.
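The fail-open versus fail-closed split in the table can be sketched as a single decision function. This is illustrative only: the function name and return values are hypothetical, and the builtin ConnectionError stands in for whatever exception the Redis client actually raises.

```python
from typing import Callable, Optional

AUTH_PREFIX = "/api/v1/auth/"


def decide(path: str, check_limit: Callable[[], Optional[int]]) -> str:
    """Return "proceed", "reject_429", or "reject_503" for one request.

    check_limit is assumed to return None when the request is allowed,
    or a Retry-After value when a limit is exceeded.
    """
    try:
        retry_after = check_limit()
    except ConnectionError:  # stand-in for a Redis outage
        if path.startswith(AUTH_PREFIX):
            return "reject_503"  # fail closed: keep credential-stuffing protection
        return "proceed"         # fail open: a Redis blip does not take down the API
    return "proceed" if retry_after is None else "reject_429"
```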

Tips for integrations and automation

Spread requests across time. The 600 rpm default is generous for interactive use, but an automation loop that fires 600 requests in a one-second burst will hit the 120 req/sec burst limit immediately. Add a small sleep between requests (time.sleep(0.1) between each gives a natural 10 req/sec cadence with headroom).
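A fixed sleep after every request also pays the delay when the request itself was slow. One alternative, sketched below with a hypothetical Pacer class, only sleeps for whatever is left of the interval. The clock and sleep parameters exist so the pacing logic can be exercised without real waiting.

```python
import time


class Pacer:
    """Enforce a minimum spacing between calls (illustrative helper)."""

    def __init__(self, max_per_second: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve the next slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

Calling pacer.wait() before each request with Pacer(10) gives roughly the 10 req/sec cadence described above.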

Use scoped API keys, not user credentials. API keys bypass browser session overhead and carry an explicit scope ceiling - requests that exceed the key’s scope are rejected with 403 before touching any data. Rate-limit accounting is per-identity (cookie-session requests share one per-user bucket; API-key requests share one per-IP bucket), not per-endpoint or per-scope, so a scoped key does not grant extra request headroom and can still consume its bucket firing at endpoints the scope prohibits. See API Keys for how to create and scope keys.

Monitor X-RateLimit-Remaining at the application level. Log or alert when remaining drops below 20% of the limit so you can throttle proactively rather than reacting to 429s.
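The 20% threshold check is a few lines of code. The helper names below are hypothetical; the header names are the ones documented above. Because these headers are absent on 429 responses, missing or malformed values are treated as "unknown" rather than as an error.

```python
from typing import Mapping, Optional


def remaining_fraction(headers: Mapping[str, str]) -> Optional[float]:
    """Return remaining/limit, or None if the headers are missing or invalid."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, TypeError, ValueError):
        return None
    if limit <= 0:
        return None
    return remaining / limit


def should_slow_down(headers: Mapping[str, str], threshold: float = 0.2) -> bool:
    """True when the remaining budget has dropped below the threshold."""
    fraction = remaining_fraction(headers)
    return fraction is not None and fraction < threshold
```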

Batch where the API allows it. Several endpoints accept arrays of IDs or offer bulk variants (for example POST /api/v1/discovery/adopt/bulk). A single bulk call counts as one request against your limit.

Do not parallelize heavily without backpressure. Ten concurrent goroutines or threads hitting the same authenticated bucket will consume the burst allowance nearly instantly. Use a semaphore or connection pool sized well below the burst limit.
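As a sketch of the semaphore approach, the hypothetical helper below caps the number of tasks in flight regardless of how large the thread pool is. Each task is assumed to be a zero-argument callable that performs one request.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def run_bounded(tasks: Iterable[Callable[[], object]],
                max_in_flight: int = 8,
                workers: int = 32) -> List[object]:
    """Run tasks concurrently, never more than max_in_flight at once."""
    gate = threading.BoundedSemaphore(max_in_flight)

    def guarded(task):
        # Every task must acquire a slot before it may issue its request.
        with gate:
            return task()

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in the results.
        return list(pool.map(guarded, tasks))
```

A concurrency cap bounds requests in flight, not requests per second, so with very fast responses it is still worth honoring Retry-After.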

Next steps

Authentication - understand how JWT and API key principals are identified, which determines which rate-limit bucket applies to you.
API Keys - create scoped keys for automation so its requests are counted against the per-IP bucket rather than your user account's bucket.
Using the API - general conventions, error shapes, and CSRF handling.
Deployment: environment variables - full list of settings including RATE_LIMIT_RPM and RATE_LIMIT_BURST tuning.