
Rate Limiting

FreeSDN applies rate limiting at the API gateway layer to protect against credential-stuffing, runaway automation, and denial-of-service. Limits are enforced in middleware before any route handler runs, so they apply uniformly across every endpoint including module routes and vendor adapter surfaces.

The rate limiter (RateLimitMiddleware) uses a Redis sorted-set sliding window. Every request records a timestamp in a per-principal key and counts entries within the last minute or the last second (for burst). No approximations: the window slides continuously, so a burst of requests at 11:59:59 does not “reset” at 12:00:00.
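The sliding-window log described above can be sketched without Redis. The class below is an in-memory stand-in, assuming one sorted list of timestamps per key plays the role of the sorted set (in Redis the corresponding commands would be ZADD, ZREMRANGEBYSCORE, and ZCARD). The name SlidingWindowLog and its methods are hypothetical and are not taken from RateLimitMiddleware.

```python
import bisect
from collections import defaultdict


class SlidingWindowLog:
    """In-memory stand-in for a per-key sorted set of request timestamps."""

    def __init__(self) -> None:
        self._log: dict[str, list[float]] = defaultdict(list)

    def record(self, key: str, now: float) -> None:
        # Equivalent of adding a member scored by its timestamp.
        bisect.insort(self._log[key], now)

    def count(self, key: str, now: float, window: float) -> int:
        # Count entries with a timestamp in (now - window, now].
        entries = self._log[key]
        start = bisect.bisect_right(entries, now - window)
        end = bisect.bisect_right(entries, now)
        return end - start

    def prune(self, key: str, now: float, max_window: float) -> None:
        # Drop entries older than the largest window so the log stays small.
        entries = self._log[key]
        del entries[: bisect.bisect_right(entries, now - max_window)]


log = SlidingWindowLog()
key = "rl:ip:203.0.113.7"
for t in (58.5, 59.5, 59.9):          # three requests late in one clock minute
    log.record(key, t)

# Just after the minute boundary, the earlier requests still count.
in_last_minute = log.count(key, now=60.1, window=60.0)
in_last_second = log.count(key, now=60.1, window=1.0)
```

Because the count is always taken over the interval ending at the current request, crossing a clock-minute boundary changes nothing: the three requests above are still inside the one-minute window at t = 60.1.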

There are two independent checks per request, evaluated in order:

| Check | Default | Trigger | Retry-After (seconds) |
| --- | --- | --- | --- |
| Burst | 120 req/sec | More than 120 requests in any one-second window | 1 |
| Per-minute | 600 req/min | More than 600 requests in any one-minute window | 60 |

Either check returns a 429 Too Many Requests response. Because the burst check runs first, a request that exceeds both limits receives Retry-After: 1, not Retry-After: 60.
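The evaluation order can be illustrated with a small decision function. This is a sketch only: retry_after_for is a hypothetical name, and it assumes the counts passed in already include the current request.

```python
from typing import Optional

BURST_LIMIT = 120   # requests per second (documented default)
RPM_LIMIT = 600     # requests per minute (documented default)


def retry_after_for(second_count: int, minute_count: int) -> Optional[int]:
    """Return the Retry-After value for a 429, or None if the request is allowed.

    Assumption: both counts include the current request.
    """
    if second_count > BURST_LIMIT:    # burst check is evaluated first
        return 1
    if minute_count > RPM_LIMIT:      # per-minute check is evaluated second
        return 60
    return None
```

For example, retry_after_for(121, 601) yields 1 because the burst check short-circuits before the per-minute check is reached.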

The limiter identifies who is making the request:

  1. Authenticated user - the sub claim is extracted from the freesdn_access JWT cookie using a local HMAC verification (no database or Redis round-trip). All requests made by the same user - regardless of which device, browser, or API client - share one bucket.
  2. Unauthenticated / no valid JWT - the limiter falls back to rl:ip:<ip>. All unauthenticated traffic from the same IP address shares one bucket.

This means a single over-active API client counts against the authenticated user’s limit, not against an IP. If you operate behind NAT and share an IP with other FreeSDN users who are not yet authenticated, those users share the IP bucket.
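The principal lookup above can be sketched with the standard library alone. Several details here are assumptions rather than facts from FreeSDN: the token is taken to be HS256-signed, the user bucket key rl:user:<sub> is a made-up name (only rl:ip:<ip> appears in this page), and expiry checking is omitted for brevity. sign_demo_token exists only to build a token for the example.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64url_decode(part: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def subject_from_jwt(token: str, secret: bytes) -> Optional[str]:
    """Return the sub claim if the HMAC-SHA256 signature verifies, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        signed = f"{header_b64}.{payload_b64}".encode()
        expected = hmac.new(secret, signed, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        return json.loads(_b64url_decode(payload_b64)).get("sub")
    except (ValueError, TypeError, AttributeError):
        return None  # malformed token: treat as unauthenticated


def bucket_key(cookies: dict, client_ip: str, secret: bytes) -> str:
    sub = subject_from_jwt(cookies.get("freesdn_access", ""), secret)
    if sub:
        return f"rl:user:{sub}"      # hypothetical key name for the user bucket
    return f"rl:ip:{client_ip}"      # documented fallback key


def sign_demo_token(claims: dict, secret: bytes) -> str:
    # Demo-only helper for building a token to feed the functions above.
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"
```

A token that fails verification is handled the same way as a missing cookie, so a forged or corrupted JWT lands in the IP bucket rather than in some user's bucket.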

Authentication routes (/api/v1/auth/*) have a tighter per-IP limit enforced inside the auth handlers themselves, independent of the middleware layer:

| Scope | Limit |
| --- | --- |
| Per IP (auth routes) | 5 requests / minute |
| Per username (failed attempts) | 20 failures / 5 minutes |
| Password reset - per IP | 5 requests / hour |
| Password reset - per email | 3 requests / hour |

These per-auth limits run on Redis as well. After 5 failed login attempts the account locks for 30 minutes regardless of rate limits.

The following endpoints are excluded from rate limiting because they are high-frequency probes or have their own flow control:

  • GET /health and /health/*
  • GET /api/v1/health*
  • POST /api/v1/cameras/{camera_id}/stream-token

Non-429 (success) responses include the following rate-limit headers. On a 429 response only Retry-After is guaranteed; X-RateLimit-Limit and X-RateLimit-Remaining are not present on 429 responses. (X-Request-ID is still present on 429s because the outer RequestIDMiddleware adds it to every response.)

| Header | Present on | Description |
| --- | --- | --- |
| X-RateLimit-Limit | Success responses only | The per-minute cap in effect for this principal |
| X-RateLimit-Remaining | Success responses only | Requests remaining in the current minute window |
| Retry-After | 429 responses only | Seconds to wait before retrying |

Read X-RateLimit-Remaining proactively to slow down before you hit the limit. Retry-After is set to 1 on a burst 429 and 60 on a per-minute 429.

    HTTP/1.1 429 Too Many Requests
    Content-Type: application/json
    Retry-After: 60

    {"detail": "Rate limit exceeded", "retry_after": 60}

The burst-limit variant returns "Rate limit exceeded (burst)" with "retry_after": 1. The rate-limit middleware returns a raw JSONResponse directly - it does not go through the global exception handlers, so there is no error envelope, code, message, or request_id field. Those fields appear only in custom exception-handler responses (adapter errors, auth errors, etc.).
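Since the 429 body is not wrapped in the usual error envelope, a client needs a separate parsing path for it. A minimal sketch, assuming the two body shapes shown above are the only ones; RateLimitHit and parse_rate_limit_body are hypothetical names.

```python
import json
from typing import NamedTuple


class RateLimitHit(NamedTuple):
    burst: bool         # True for the burst variant, False for per-minute
    retry_after: int    # seconds, mirrored from the Retry-After header


def parse_rate_limit_body(body: str) -> RateLimitHit:
    """Parse the raw (non-enveloped) JSON body of a 429 response."""
    data = json.loads(body)
    return RateLimitHit(
        burst=data["detail"].endswith("(burst)"),
        retry_after=int(data["retry_after"]),
    )
```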

Follow this pattern for any API client or automation script:

  1. Check every response for status 429 before processing the body.
  2. Read Retry-After from the response headers.
  3. Wait exactly that many seconds (do not use a fixed sleep or exponential backoff that ignores the header - the server already computed the correct delay).
  4. Retry the original request once. If it 429s again, apply exponential backoff starting from the Retry-After value.
  5. Propagate the X-Request-ID from the original 429 when you report errors to your operator.
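
    import time

    import httpx


    def api_get(client: httpx.Client, url: str, **kwargs) -> httpx.Response:
        """GET url, honoring Retry-After on 429 with exponential backoff."""
        attempts = 5
        for attempt in range(attempts):
            resp = client.get(url, **kwargs)
            if resp.status_code != 429:
                resp.raise_for_status()  # surface other 4xx/5xx errors
                return resp
            if attempt == attempts - 1:
                break  # retry budget spent; do not sleep before giving up
            # First retry waits exactly Retry-After; later retries double it.
            wait = int(resp.headers.get("Retry-After", 60))
            time.sleep(wait * (2 ** attempt))
        request_id = resp.headers.get("X-Request-ID")  # include when reporting
        raise RuntimeError(f"Exceeded retry budget after rate limiting (X-Request-ID: {request_id})")
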
| Variable | Default | Description |
| --- | --- | --- |
| RATE_LIMIT_ENABLED | True | When False, disables the per-IP in-memory rate limiter for public agent-download endpoints only (/api/v1/agent-downloads/*). It does not disable the global RateLimitMiddleware; that middleware is unconditionally added during app startup. To disable the global middleware you would need to modify the setup_middleware call in main.py. |
| RATE_LIMIT_DEFAULT | 100/minute | Label used in logs (actual enforcement is RATE_LIMIT_RPM) |
| RATE_LIMIT_AUTH | 5/minute | Auth-route per-IP limit label |
| RATE_LIMIT_RPM | 600 | Per-principal per-minute hard cap |
| RATE_LIMIT_BURST | 120 | Per-principal per-second burst cap |

Rate limiting depends on Redis (Valkey). When Redis is unavailable, the middleware's behavior depends on the endpoint category:

| Endpoint category | Behavior when Redis is unavailable |
| --- | --- |
| /api/v1/auth/* | Fail closed - 503. A Redis outage cannot disable credential-stuffing protection. |
| All other endpoints | Fail open - request proceeds. A Redis blip does not take down the API. |
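The split between failing closed and failing open can be sketched as a single decision point. Everything named here is hypothetical: StoreUnavailable stands in for whatever connection error the Redis client raises, and gate is not a FreeSDN function.

```python
from typing import Callable


class StoreUnavailable(Exception):
    """Hypothetical stand-in for a Redis connection error."""


def gate(path: str, is_over_limit: Callable[[], bool]) -> int:
    """Return the status the limiter short-circuits with, or 0 to proceed."""
    try:
        return 429 if is_over_limit() else 0
    except StoreUnavailable:
        if path.startswith("/api/v1/auth/"):
            return 503      # fail closed: never skip the check on auth routes
        return 0            # fail open: let the request through
```

The design trade-off is availability against protection: ordinary endpoints stay reachable during an outage, while auth endpoints refuse service rather than run unprotected.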

Spread requests across time. The 600 rpm default is generous for interactive use, but an automation loop that fires 600 requests in a one-second burst will hit the 120 req/sec burst limit immediately. Add a small sleep between requests (time.sleep(0.1) between each gives a natural 10 req/sec cadence with headroom).
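A fixed sleep adds each request's own latency on top of the pause, so the real cadence ends up slower than intended. One possible alternative is a pacer that sleeps only for the remainder of each slot. This is a sketch: paced and its parameters are hypothetical, and the clock and sleep arguments exist so the pacer can be tested without waiting.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def paced(
    items: Iterable[T],
    per_second: float = 10.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items no faster than per_second, sleeping only as long as needed."""
    interval = 1.0 / per_second
    next_slot = clock()
    for item in items:
        delay = next_slot - clock()
        if delay > 0:
            sleep(delay)
        # Schedule the next slot from whichever is later: the plan or reality.
        next_slot = max(next_slot, clock()) + interval
        yield item
```

Wrapping a work list as `for device_id in paced(device_ids): ...` keeps a loop at or below 10 requests per second, well under the 120 req/sec burst default.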

Use scoped API keys, not user credentials. API keys bypass browser session overhead and carry an explicit scope ceiling - requests that exceed the key’s scope are rejected with 403 before touching any data. Rate-limit accounting is per-identity (cookie-session requests share one per-user bucket; API-key requests share one per-IP bucket), not per-endpoint or per-scope, so a scoped key does not grant extra request headroom and can still consume its bucket firing at endpoints the scope prohibits. See API Keys for how to create and scope keys.

Monitor X-RateLimit-Remaining at the application level. Log or alert when remaining drops below 20% of the limit so you can throttle proactively rather than reacting to 429s.
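The 20% rule can be expressed as a small predicate over response headers. This is a sketch with a hypothetical function name; it assumes a mapping with exact header casing, whereas HTTP client libraries such as httpx expose case-insensitive header objects.

```python
from typing import Mapping


def should_throttle(headers: Mapping[str, str], threshold: float = 0.2) -> bool:
    """True when X-RateLimit-Remaining is below threshold * X-RateLimit-Limit."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, ValueError):
        # Headers absent (for example on a 429) or malformed: no signal.
        return False
    return limit > 0 and remaining < threshold * limit
```

With the default 600 rpm cap, the predicate turns true once fewer than 120 requests remain in the current window.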

Batch where the API allows it. Several endpoints accept arrays of IDs or offer bulk variants (for example POST /api/v1/discovery/adopt/bulk). A single bulk call counts as one request against your limit.

Do not parallelize heavily without backpressure. Ten concurrent goroutines or threads hitting the same authenticated bucket will consume the burst allowance nearly instantly. Use a semaphore or connection pool sized well below the burst limit.
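One way to add backpressure in Python is a semaphore around each call. The sketch below is illustrative only: run_bounded and its default sizes are hypothetical, not recommended values from FreeSDN.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_bounded(
    fn: Callable[[T], R],
    items: Iterable[T],
    max_in_flight: int = 4,
    workers: int = 16,
) -> List[R]:
    """Run fn over items on a thread pool with at most max_in_flight calls active."""
    gate = threading.BoundedSemaphore(max_in_flight)

    def guarded(item: T) -> R:
        with gate:                  # blocks while max_in_flight calls are running
            return fn(item)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(guarded, items))   # results keep input order
```

Bounding in-flight calls limits concurrency, not request rate: if responses return quickly, a handful of workers can still exceed the burst limit, so this is best combined with pacing and Retry-After handling.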

  • Authentication - understand how JWT and API key principals are identified, which determines which rate-limit bucket applies to you.
  • API Keys - create scoped keys for automation so its requests are counted against the per-IP bucket rather than your user account's bucket.
  • Using the API - general conventions, error shapes, and CSRF handling.
  • Deployment: environment variables - full list of settings including RATE_LIMIT_RPM and RATE_LIMIT_BURST tuning.