Rate Limiting
FreeSDN applies rate limiting at the API gateway layer to protect against credential-stuffing, runaway automation, and denial-of-service. Limits are enforced in middleware before any route handler runs, so they apply uniformly across every endpoint including module routes and vendor adapter surfaces.
How limits work
The rate limiter (RateLimitMiddleware) uses a Redis sorted-set sliding window. Every
request records a timestamp in a per-principal key and counts entries within the last minute or
the last second (for burst). No approximations: the window slides continuously, so a burst of
requests at 11:59:59 does not “reset” at 12:00:00.
There are two independent checks per request, evaluated in order:
| Check | Default | Trigger | Retry-After |
|---|---|---|---|
| Burst | 120 req/sec | More than 120 requests in any one-second window | 1 |
| Per-minute | 600 req/min | More than 600 requests in any one-minute window | 60 |
Exceeding either limit returns a 429 Too Many Requests response. The burst check is evaluated
first, so a request that violates both limits receives Retry-After: 1 rather than Retry-After: 60.
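The two checks can be illustrated with a small in-memory model. This is only a sketch of the sliding-window idea, not the RateLimitMiddleware source: the class name, the method name, and the use of a sorted Python list in place of a Redis sorted set are all assumptions made for illustration. It records every request, as described above, and then counts entries in the last second and the last minute.

```python
import bisect
from collections import defaultdict

BURST_LIMIT = 120  # documented default: requests per one-second window
RPM_LIMIT = 600    # documented default: requests per one-minute window


class SlidingWindowLimiter:
    """Hypothetical in-memory stand-in for a per-principal Redis sorted set."""

    def __init__(self, burst=BURST_LIMIT, rpm=RPM_LIMIT):
        self.burst = burst
        self.rpm = rpm
        self._hits = defaultdict(list)  # bucket key -> sorted timestamps

    def check(self, key, now):
        """Return (allowed, retry_after) for a request arriving at `now` (seconds)."""
        hits = self._hits[key]
        # Evict entries that fell out of the one-minute window.
        del hits[:bisect.bisect_right(hits, now - 60.0)]
        # Record this request; every request is recorded, allowed or not.
        bisect.insort(hits, now)
        # Burst check first: entries newer than one second ago.
        in_last_second = len(hits) - bisect.bisect_right(hits, now - 1.0)
        if in_last_second > self.burst:
            return False, 1
        # Per-minute check second: everything still in the list.
        if len(hits) > self.rpm:
            return False, 60
        return True, None
```

Because old entries are evicted relative to the arrival time of each request, there is no fixed boundary at which the count resets; a request is only "forgotten" once it is a full 60 seconds old.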
Per-principal keying
The limiter identifies who is making the request:
- Authenticated user - the `sub` claim is extracted from the `freesdn_access` JWT cookie using a local HMAC verification (no database or Redis round-trip). All requests made by the same user - regardless of which device, browser, or API client - share one bucket.
- Unauthenticated / no valid JWT - the limiter falls back to `rl:ip:<ip>`. All unauthenticated traffic from the same IP address shares one bucket.
This means a single over-active API client counts against the authenticated user’s limit, not against an IP. If you operate behind NAT and share an IP with other FreeSDN users who are not yet authenticated, those users share the IP bucket.
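As a rough illustration of the keying decision, the sketch below verifies an HMAC-signed token locally and falls back to the IP bucket when verification fails. Several details are assumptions and not taken from the FreeSDN source: the function name, the HS256 (HMAC-SHA256) algorithm, and the `rl:user:<sub>` key format are hypothetical, and the sketch skips the expiry and header checks a real verifier would perform. Only the `rl:ip:<ip>` format comes from the text above.

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def principal_key(cookie_token, client_ip, secret: bytes) -> str:
    """Return a per-user bucket key if the token verifies, else the per-IP key."""
    if cookie_token:
        try:
            header_b64, payload_b64, sig_b64 = cookie_token.split(".")
            signed = f"{header_b64}.{payload_b64}".encode()
            expected = hmac.new(secret, signed, hashlib.sha256).digest()
            # Constant-time comparison; no database or Redis lookup involved.
            if hmac.compare_digest(expected, _b64url_decode(sig_b64)):
                sub = json.loads(_b64url_decode(payload_b64)).get("sub")
                if sub:
                    return f"rl:user:{sub}"  # hypothetical key format
        except (ValueError, TypeError, AttributeError):
            pass  # malformed token: treat the request as unauthenticated
    return f"rl:ip:{client_ip}"
```

Every client presenting a valid token for the same `sub` resolves to the same key, which is why devices and scripts belonging to one user draw from a single bucket.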
Auth endpoints have stricter limits
Authentication routes (/api/v1/auth/*) have a tighter per-IP limit enforced inside the auth
handlers themselves, independent of the middleware layer:
| Scope | Limit |
|---|---|
| Per IP (auth routes) | 5 requests / minute |
| Per username (failed attempts) | 20 failures / 5 minutes |
| Password reset - per IP | 5 requests / hour |
| Password reset - per email | 3 requests / hour |
These per-auth limits run on Redis as well. After 5 failed login attempts the account locks for 30 minutes regardless of rate limits.
Skipped endpoints
The following endpoints are excluded from rate limiting because they are high-frequency probes or have their own flow control:
- `GET /health` and `/health/*`
- `GET /api/v1/health*`
- `POST /api/v1/cameras/{camera_id}/stream-token`
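To check whether one of your own probe paths falls under these exclusions, the patterns can be approximated with regular expressions. This is a sketch only: the function name and the regex translation of each pattern are assumptions, and it assumes `/health/*` is exempt for GET like `/health`. The middleware's actual matching logic may differ.

```python
import re

# Regex approximations of the documented exclusions (assumed, not from the source).
_EXEMPT = [
    ("GET", re.compile(r"^/health(/.*)?$")),
    ("GET", re.compile(r"^/api/v1/health.*$")),
    ("POST", re.compile(r"^/api/v1/cameras/[^/]+/stream-token$")),
]


def is_exempt(method: str, path: str) -> bool:
    """Return True if the method and path match one of the excluded endpoints."""
    return any(method.upper() == m and rx.match(path) for m, rx in _EXEMPT)
```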
Response headers
Non-429 (success) responses include the following rate-limit headers. On a 429 response only
Retry-After is guaranteed; X-RateLimit-Limit and X-RateLimit-Remaining are not
present on 429 responses. (X-Request-ID is still present on 429s because the outer
RequestIDMiddleware adds it to every response.)
| Header | Present on | Description |
|---|---|---|
| `X-RateLimit-Limit` | Success responses only | The per-minute cap in effect for this principal |
| `X-RateLimit-Remaining` | Success responses only | Requests remaining in the current minute window |
| `Retry-After` | 429 responses only | Seconds to wait before retrying |
Read X-RateLimit-Remaining proactively to slow down before you hit the limit. Retry-After
is set to 1 on a burst 429 and 60 on a per-minute 429.
```http
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60

{"detail": "Rate limit exceeded", "retry_after": 60}
```

The burst-limit variant returns "Rate limit exceeded (burst)" with "retry_after": 1. The rate-limit middleware returns a raw JSONResponse directly - it does not go through the global exception handlers, so there is no error envelope, code, message, or request_id field. Those fields appear only in custom exception-handler responses (adapter errors, auth errors, etc.).
Handling 429 in your client
Follow this pattern for any API client or automation script:
- Check every response for status 429 before processing the body.
- Read Retry-After from the response headers.
- Wait exactly that many seconds (do not use a fixed sleep or exponential backoff that ignores the header - the server already computed the correct delay).
- Retry the original request once. If it 429s again, apply exponential backoff starting from the Retry-After value.
- Propagate the X-Request-ID from the original 429 when you report errors to your operator.
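```python
import time

import httpx


def api_get(client: httpx.Client, url: str, **kwargs):
    """GET with up to five attempts, honoring Retry-After on 429 responses."""
    for attempt in range(5):
        resp = client.get(url, **kwargs)
        if resp.status_code == 429:
            # Default to 60 seconds if the header is missing.
            wait = int(resp.headers.get("Retry-After", 60))
            # First retry waits exactly Retry-After; later retries double it.
            time.sleep(wait * (2 ** attempt))
            continue
        resp.raise_for_status()
        return resp
    raise RuntimeError("Exceeded retry budget after rate limiting")
```

Configuration reference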
| Variable | Default | Description |
|---|---|---|
| `RATE_LIMIT_ENABLED` | True | When False, disables the per-IP in-memory rate limiter for public agent-download endpoints only (/api/v1/agent-downloads/*). It does not disable the global RateLimitMiddleware; that middleware is unconditionally added during app startup. To disable the global middleware you would need to modify the setup_middleware call in main.py. |
| `RATE_LIMIT_DEFAULT` | 100/minute | Label used in logs (actual enforcement is RATE_LIMIT_RPM) |
| `RATE_LIMIT_AUTH` | 5/minute | Auth-route per-IP limit label |
| `RATE_LIMIT_RPM` | 600 | Per-principal per-minute hard cap |
| `RATE_LIMIT_BURST` | 120 | Per-principal per-second burst cap |
Fail modes
Rate limiting depends on Redis (Valkey). When Redis is unavailable, the middleware behaves differently depending on the endpoint category being requested:
| Endpoint category | Redis unavailable |
|---|---|
| /api/v1/auth/* | Fail closed - 503. A Redis outage cannot disable credential-stuffing protection. |
| All other endpoints | Fail open - request proceeds. A Redis blip does not take down the API. |
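The fail-open versus fail-closed split in the table can be sketched as a single decision function. The names here are hypothetical: `enforce` and the `check` callable are illustrative, and using `ConnectionError` to signal an unreachable store is an assumption, not the middleware's actual error handling.

```python
from typing import Callable, Optional

AUTH_PREFIX = "/api/v1/auth/"


def enforce(path: str, check: Callable[[], bool]) -> Optional[int]:
    """Return an HTTP status to respond with, or None to let the request proceed.

    `check` is a hypothetical callable that returns True when the caller is
    within limits and raises ConnectionError when the backing store is down.
    """
    try:
        allowed = check()
    except ConnectionError:
        # Auth routes fail closed (503); everything else fails open.
        return 503 if path.startswith(AUTH_PREFIX) else None
    return None if allowed else 429
```

For client authors the practical consequence is that a 503 from an auth route during an outage is expected behavior and should be retried later, not treated as a credential failure.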
Tips for integrations and automation
Spread requests across time. The 600 rpm default is generous for interactive use, but an
automation loop that fires 600 requests in a one-second burst will hit the 120 req/sec burst
limit immediately. Add a small sleep between requests (time.sleep(0.1) between each gives a
natural 10 req/sec cadence with headroom).
Use scoped API keys, not user credentials. API keys bypass browser session overhead and carry an explicit scope ceiling - requests that exceed the key’s scope are rejected with 403 before touching any data. Rate-limit accounting is per-identity (cookie-session requests share one per-user bucket; API-key requests share one per-IP bucket), not per-endpoint or per-scope, so a scoped key does not grant extra request headroom and can still consume its bucket firing at endpoints the scope prohibits. See API Keys for how to create and scope keys.
Monitor X-RateLimit-Remaining at the application level. Log or alert when remaining drops
below 20% of the limit so you can throttle proactively rather than reacting to 429s.
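A minimal sketch of the 20% check described above, assuming the headers arrive as a plain mapping with the exact header names documented earlier. The function name and the `threshold` parameter are hypothetical. Responses that lack the headers, such as 429s, are treated as "no signal".

```python
def should_throttle(headers, threshold=0.2):
    """Return True when remaining requests fall below `threshold` of the limit."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, ValueError):
        # Headers absent or unparsable: nothing to act on.
        return False
    return limit > 0 and remaining < threshold * limit
```

With the default cap of 600, this starts signaling once fewer than 120 requests remain in the current window. A plain `dict` lookup is case-sensitive, so pass the header object from your HTTP library if it offers case-insensitive access.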
Batch where the API allows it. Several endpoints accept arrays of IDs or offer bulk
variants (for example POST /api/v1/discovery/adopt/bulk). A single bulk call counts as one
request against your limit.
Do not parallelize heavily without backpressure. Ten concurrent goroutines or threads hitting the same authenticated bucket will consume the burst allowance nearly instantly. Use a semaphore or connection pool sized well below the burst limit.
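One way to add backpressure is a bounded worker pool, sketched below with the standard library. The function name and the default of 8 workers are illustrative assumptions. Note that this caps requests in flight, not requests per second, so fast endpoints may still need pacing on top of it.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, max_in_flight=8):
    """Run zero-argument callables with at most `max_in_flight` executing at once.

    Results are returned in the same order as `tasks`.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # The pool size is the concurrency bound; extra tasks wait in its queue.
        return list(pool.map(lambda task: task(), tasks))
```

Each task would typically wrap one API call, so the pool size is the maximum number of requests that can hit the shared bucket simultaneously.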
Next steps
- Authentication - understand how JWT and API key principals are identified, which determines which rate-limit bucket applies to you.
- API Keys - create scoped keys for automation so limits apply to the purpose-built principal rather than your user account.
- Using the API - general conventions, error shapes, and CSRF handling.
- Deployment: environment variables - full list of settings including `RATE_LIMIT_RPM` and `RATE_LIMIT_BURST` tuning.