# Security — Authentication and authorization
FraiseQL includes built-in rate limiting via a token bucket algorithm, configured under `[security.rate_limiting]`.
The rate limiter operates at two levels:

1. **Global:** a token bucket (`requests_per_second`, `burst_size`) applied to every request.
2. **Auth endpoints:** fixed windows on `/auth/start`, `/auth/callback`, `/auth/refresh`, and `/auth/logout` to prevent brute-force attacks.

Rate limiting is disabled by default. Enable it explicitly in production.
```toml
[security.rate_limiting]
enabled = true
requests_per_second = 100
requests_per_second_per_user = 500  # default: 10× requests_per_second
burst_size = 200
trust_proxy_headers = false  # set true only when behind a trusted reverse proxy
```

**Per-user limit:** Authenticated users receive 10× the global rate by default. This reflects that authenticated requests are identifiable (abuse is traceable) and that service accounts legitimately call at higher rates. Override with `requests_per_second_per_user` when the default is too permissive — for example, on public-facing APIs where “authenticated” just means “has an account.”
The auth endpoint limits are configured alongside the global bucket:

```toml
[security.rate_limiting]
enabled = true
requests_per_second = 100
burst_size = 200

# /auth/start
auth_start_max_requests = 5
auth_start_window_secs = 60

# /auth/callback
auth_callback_max_requests = 10
auth_callback_window_secs = 60

# /auth/refresh
auth_refresh_max_requests = 20
auth_refresh_window_secs = 300

# /auth/logout
auth_logout_max_requests = 30
auth_logout_window_secs = 60
```

Repeated failed logins trigger a lockout, independent of the request-rate limits:

```toml
[security.rate_limiting]
enabled = true
failed_login_max_attempts = 10
failed_login_lockout_secs = 900  # 15-minute lockout after 10 failures
```

The burst size lets clients send a short burst of requests above the steady-state rate without being throttled. This is important for page loads and app startups, where many requests fire simultaneously.
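The lockout rule can be sketched as a simple counter. This is an illustrative model of the `failed_login_max_attempts` / `failed_login_lockout_secs` semantics, not FraiseQL's internal implementation; the class and method names are invented:

```python
class LoginLockout:
    """Illustrative model: lock a client out after too many failed logins."""

    def __init__(self, max_attempts: int = 10, lockout_secs: int = 900):
        self.max_attempts = max_attempts
        self.lockout_secs = lockout_secs
        self.failures: dict[str, int] = {}        # client key -> consecutive failures
        self.locked_until: dict[str, float] = {}  # client key -> unlock timestamp

    def is_locked(self, key: str, now: float) -> bool:
        return self.locked_until.get(key, 0.0) > now

    def record_failure(self, key: str, now: float) -> None:
        self.failures[key] = self.failures.get(key, 0) + 1
        if self.failures[key] >= self.max_attempts:
            # 10th failure: lock for 15 minutes and reset the counter.
            self.locked_until[key] = now + self.lockout_secs
            self.failures[key] = 0

    def record_success(self, key: str) -> None:
        # A successful login clears the consecutive-failure count.
        self.failures.pop(key, None)
```

With the defaults, nine failures leave the client unlocked; the tenth locks it until `now + 900`.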
How it works: Tokens accumulate in the bucket at requests_per_second. At any moment, a client can consume up to burst_size tokens at once. Once the bucket empties, requests return HTTP 429 until tokens refill.
Example behavior:
| Time | Tokens Available | Requests Sent | Result |
|---|---|---|---|
| 0s | 200 (full) | 200 | All allowed |
| 0s | 0 | 1 | Rate limited (429) |
| 1s | 100 | 100 | All allowed (refilled) |
| 2s | 100 | 50 | Allowed; 50 tokens remain |
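The table above can be reproduced with a few lines of Python. This is a sketch of the token-bucket mechanics as described here, not FraiseQL's actual implementation; the class name and `allow()` signature are invented for illustration:

```python
class TokenBucket:
    def __init__(self, rate: float, burst: float):
        self.rate = rate        # tokens added per second
        self.capacity = burst   # maximum bucket size
        self.tokens = burst     # bucket starts full
        self.last = 0.0         # timestamp of the previous call

    def allow(self, now: float) -> bool:
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # the server would answer HTTP 429 here


bucket = TokenBucket(rate=100, burst=200)
assert all(bucket.allow(0.0) for _ in range(200))         # t=0s: full burst allowed
assert not bucket.allow(0.0)                              # 201st request at t=0s: 429
assert sum(bucket.allow(1.0) for _ in range(150)) == 100  # t=1s: 100 tokens refilled
```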
By default the rate limiter is in-memory per process. When running multiple FraiseQL instances behind a load balancer, each instance maintains its own token bucket independently — a client can distribute requests across replicas without hitting a shared limit.
For multi-replica deployments (Kubernetes, ECS, fly.io with multiple instances), configure the Redis backend. It requires the `redis-rate-limiting` Cargo feature in your deployment image:
```toml
[security.rate_limiting]
enabled = true
redis_url = "${REDIS_URL}"
requests_per_second = 100
```

The Redis backend uses an atomic Lua token-bucket script (`EVALSHA` with `NOSCRIPT` fallback). If Redis is unavailable, the limiter fails open: requests are allowed through and the `fraiseql_rate_limit_redis_errors_total` counter increments. Alert on that counter to detect Redis connectivity issues.
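The fail-open behavior boils down to "if the limiter backend errors, allow the request and count the error." A minimal, backend-agnostic sketch of that pattern (the function names and the metric hook are illustrative, not part of FraiseQL's API):

```python
def check_rate_limit(try_acquire, record_backend_error) -> bool:
    """Return True if the request may proceed.

    try_acquire: callable that consults the shared limiter (e.g. runs a
        token-bucket script in Redis) and returns True/False.
    record_backend_error: callable that increments an error counter.
    """
    try:
        return try_acquire()
    except ConnectionError:
        # Fail open: prefer availability over strict limiting when the
        # backend is unreachable, but make the failure observable.
        record_backend_error()
        return True


# Usage sketch with a stubbed-out backend:
errors = []

def broken_backend():
    raise ConnectionError("limiter backend unreachable")

allowed = check_rate_limit(broken_backend, lambda: errors.append(1))
```

Here `allowed` is `True` and one error has been recorded, mirroring the documented fail-open semantics.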
| Field | Default | Description |
|---|---|---|
| `enabled` | `false` | Rate limiting is opt-in |
| `requests_per_second` | `100` | Global token-bucket refill rate |
| `requests_per_second_per_user` | `requests_per_second` × 10 | Per-authenticated-user limit |
| `burst_size` | `200` | Maximum burst |
| `trust_proxy_headers` | `false` | Read `X-Real-IP`/`X-Forwarded-For` for the client IP (trusted proxy only) |
| `auth_start_max_requests` | `5` | Per 60s window |
| `auth_callback_max_requests` | `10` | Per 60s window |
| `auth_refresh_max_requests` | `20` | Per 300s window |
| `auth_logout_max_requests` | `30` | Per 60s window |
| `failed_login_max_attempts` | `10` | Failures before lockout |
| `failed_login_lockout_secs` | `900` | 15-minute lockout |
FraiseQL includes rate limit headers on all responses:
```http
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1704067260
```

When rate limited (HTTP 429):
```http
HTTP/1.1 429 Too Many Requests
Retry-After: 5
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1704067260
Content-Type: application/json

{
  "errors": [{
    "message": "Rate limit exceeded. Try again in 5 seconds.",
    "extensions": {
      "code": "RATE_LIMITED",
      "retryAfter": 5,
      "limit": 100,
      "reset": 1704067260
    }
  }]
}
```

Clients should back off and retry using `Retry-After`:

**JavaScript**

```javascript
async function executeWithRetry(query, variables, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch('/graphql', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, variables })
    });

    if (response.status === 429) {
      // Honor Retry-After (seconds); fall back to 5s if the header is missing.
      const retryAfter = Number(response.headers.get('Retry-After')) || 5;
      await new Promise(r => setTimeout(r, retryAfter * 1000));
      continue;
    }

    return response.json();
  }
  throw new Error('Rate limit exceeded after retries');
}
```

**Python**

```python
import time

import httpx


def execute_with_retry(query: str, variables: dict, max_retries: int = 3):
    for attempt in range(max_retries):
        response = httpx.post(
            "http://localhost:8080/graphql",
            json={"query": query, "variables": variables},
        )

        if response.status_code == 429:
            # Honor Retry-After (seconds); fall back to 5s if missing.
            retry_after = int(response.headers.get("Retry-After", 5))
            time.sleep(retry_after)
            continue

        return response.json()

    raise Exception("Rate limit exceeded after retries")
```

**Go**

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"strconv"
	"time"
)

func executeWithRetry(query string, variables map[string]interface{}, maxRetries int) (map[string]interface{}, error) {
	payload := map[string]interface{}{"query": query, "variables": variables}
	jsonPayload, _ := json.Marshal(payload)

	for attempt := 0; attempt < maxRetries; attempt++ {
		resp, err := http.Post(
			"http://localhost:8080/graphql",
			"application/json",
			bytes.NewBuffer(jsonPayload),
		)
		if err != nil {
			return nil, err
		}
		defer resp.Body.Close()

		if resp.StatusCode == 429 {
			// Honor Retry-After (seconds); fall back to 5s if missing.
			seconds, _ := strconv.Atoi(resp.Header.Get("Retry-After"))
			if seconds == 0 {
				seconds = 5
			}
			time.Sleep(time.Duration(seconds) * time.Second)
			continue
		}

		var result map[string]interface{}
		if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
			return nil, err
		}
		return result, nil
	}

	return nil, fmt.Errorf("rate limit exceeded after %d retries", maxRetries)
}
```

The following paths are always exempt from rate limiting:
```
GET /health   → always 200 OK
GET /ready    → always 200 OK
GET /metrics  → always 200 OK (Prometheus scrape endpoint)
```

These endpoints are not configurable — they are always exempt to ensure load balancer health checks and Kubernetes probes are never blocked.
| Metric | Type | Description |
|---|---|---|
| `fraiseql_rate_limit_hits_total` | Counter | Requests rejected by the rate limiter |
| `fraiseql_rate_limit_allowed_total` | Counter | Requests allowed through |
| `fraiseql_rate_limit_redis_errors_total` | Counter | Redis errors (fail-open; only present with the `redis-rate-limiting` feature) |
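For a quick check outside Prometheus, the two counters can be read straight off the `/metrics` text format. A sketch in Python (the parsing is simplified, assumes single unlabeled samples, and uses cumulative totals rather than windowed `rate()` values):

```python
def rate_limit_hit_ratio(metrics_text: str) -> float:
    """Fraction of requests rejected, computed from raw counter values."""
    hits = allowed = 0.0
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        if line.startswith("fraiseql_rate_limit_hits_total"):
            hits = float(line.split()[-1])
        elif line.startswith("fraiseql_rate_limit_allowed_total"):
            allowed = float(line.split()[-1])
    total = hits + allowed
    return hits / total if total else 0.0


scrape = "fraiseql_rate_limit_hits_total 5\nfraiseql_rate_limit_allowed_total 95\n"
ratio = rate_limit_hit_ratio(scrape)  # 5 of 100 requests rejected -> 0.05
```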
```promql
# Rate limit hit ratio (% of requests blocked)
sum(rate(fraiseql_rate_limit_hits_total[5m]))
/
sum(rate(fraiseql_rate_limit_allowed_total[5m]) + rate(fraiseql_rate_limit_hits_total[5m]))
```

```promql
# Alert: more than 5% of requests rate limited
sum(rate(fraiseql_rate_limit_hits_total[5m]))
/
sum(rate(fraiseql_rate_limit_hits_total[5m]) + rate(fraiseql_rate_limit_allowed_total[5m])) > 0.05
```

**Rate limits seem too aggressive**
- `burst_size` — the default (200) may be too small for page-load traffic patterns
- `requests_per_second` — may need adjusting for your traffic volume

**Rate limiting not triggering**
Check that:

- `enabled = true` is set
- `requests_per_second` is lower than your actual peak traffic — if the limit exceeds your peak, it never fires
- `trust_proxy_headers = true` is set when behind a proxy — without it, all clients share the proxy’s IP bucket and limits appear not to fire per-client

**Legitimate users being limited in multi-instance deployments**
Set `redis_url` in `[security.rate_limiting]` to share limits across replicas (requires the `redis-rate-limiting` feature).