Rate Limiting APIs
Overview
Implement rate limiting using sliding window, token bucket, and fixed window counter algorithms, with state held in Redis so that limits apply consistently across instances. Configure per-endpoint, per-user, and per-API-key limits with tiered quotas, burst allowances, and standard response headers that communicate limit status to API consumers.
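To make the sliding window idea concrete, the following is a minimal in-memory sketch of a sliding window log. The class name, option names, and the injectable `now` clock are hypothetical and chosen for illustration only. It keeps state in process memory, so it is not the Redis-backed distributed form this document targets.

```javascript
// Hypothetical in-memory sliding window log (illustration only, not distributed).
// Each client gets a list of request timestamps; only those inside the window count.
class SlidingWindowLog {
  constructor({ limit, windowMs, now = () => Date.now() }) {
    this.limit = limit;       // max requests allowed per window
    this.windowMs = windowMs; // window length in milliseconds
    this.now = now;           // injectable clock, handy for tests
    this.hits = new Map();    // clientId -> array of timestamps, oldest first
  }

  hit(clientId) {
    const t = this.now();
    const cutoff = t - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const log = (this.hits.get(clientId) || []).filter((ts) => ts > cutoff);
    const allowed = log.length < this.limit;
    if (allowed) log.push(t); // rejected requests are not recorded
    this.hits.set(clientId, log);
    // The oldest surviving timestamp decides when capacity next frees up.
    const resetMs = log.length > 0 ? log[0] + this.windowMs : t;
    return { allowed, remaining: this.limit - log.length, resetMs };
  }
}
```

A Redis-backed variant would usually keep each client's log in a shared data structure with an expiry instead of a local `Map`; that part is left out here.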
Prerequisites
- Redis 6+ for distributed rate limit state (required for multi-instance deployments)
- Rate limiting library: rate-limiter-flexible (Node.js), slowapi (Python/FastAPI), or Bucket4j (Java)
- API key or user identification mechanism for per-consumer tracking
- Monitoring for rate limit hit rates and rejected request metrics
- Documentation system for publishing rate limit policies to API consumers
Instructions
- Analyze endpoint traffic patterns using Read and Grep on access logs or metrics to determine appropriate rate limits per endpoint category (read-heavy, write-heavy, resource-intensive).
- Select the rate limiting algorithm per endpoint: token bucket for bursty traffic allowance, sliding window log for precise per-second limits, or fixed window counter for simple quota enforcement.
- Implement rate limiting middleware that extracts the client identifier (API key from header, user ID from JWT, or IP address as fallback) and checks against the configured limit.
- Configure tiered rate limits per API consumer plan: Free (100 req/min), Pro (1000 req/min), Enterprise (10000 req/min) with per-endpoint overrides for expensive operations.
- Add burst allowance using token bucket: allow 2x the sustained rate for 10 seconds to handle legitimate traffic spikes without penalizing well-behaved clients.
- Set standard rate limit response headers on every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset (Unix timestamp), and RateLimit-Policy (draft IETF standard).
- Return 429 Too Many Requests with a Retry-After header (seconds until the next allowed request) and a JSON body explaining the limit, current usage, and reset time.
- Implement a rate limit bypass for internal service-to-service calls, identified by a shared secret or mutual TLS, so that internal traffic does not consume consumer quotas.
- Write tests that verify rate limits engage at exact thresholds, headers reflect correct remaining counts, and limits reset at the configured window boundary.
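The tiered limits and burst allowance described in the steps above can be sketched with a small token bucket. This is an in-memory illustration under stated assumptions, not the Redis-backed middleware itself: `TokenBucket`, `burstCapacity`, `bucketForTier`, and `TIER_LIMITS_PER_MIN` are hypothetical names. It also assumes one particular reading of "2x the sustained rate for 10 seconds": a client sending at 2x drains the bucket at a net 1x rate, so a capacity of rate x (2 - 1) x 10 tokens lasts about 10 seconds.

```javascript
// Tier quotas taken from the instructions above (requests per minute).
const TIER_LIMITS_PER_MIN = { free: 100, pro: 1000, enterprise: 10000 };

// Assumed interpretation of the burst rule: capacity that lets a client run at
// `multiplier` times the sustained rate for `burstSec` seconds before throttling.
function burstCapacity(ratePerSec, multiplier = 2, burstSec = 10) {
  return ratePerSec * (multiplier - 1) * burstSec;
}

// Hypothetical in-memory token bucket with an injectable clock.
class TokenBucket {
  constructor({ ratePerSec, capacity, now = () => Date.now() }) {
    this.ratePerSec = ratePerSec; // sustained refill rate
    this.capacity = capacity;     // maximum stored tokens (burst size)
    this.now = now;
    this.tokens = capacity;       // start full
    this.lastRefill = now();
  }

  tryConsume(cost = 1) {
    const t = this.now();
    const elapsedSec = (t - this.lastRefill) / 1000;
    // Refill for the elapsed time, never exceeding capacity.
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.ratePerSec);
    this.lastRefill = t;
    if (this.tokens >= cost) {
      this.tokens -= cost;
      return { allowed: true, remaining: Math.floor(this.tokens), retryAfterSec: 0 };
    }
    // Whole seconds until enough tokens have accumulated for this request.
    const retryAfterSec = Math.ceil((cost - this.tokens) / this.ratePerSec);
    return { allowed: false, remaining: Math.floor(this.tokens), retryAfterSec };
  }
}

// Build a bucket for a plan tier; per-endpoint overrides are not modelled here.
function bucketForTier(tier, now) {
  const ratePerSec = TIER_LIMITS_PER_MIN[tier] / 60;
  return new TokenBucket({ ratePerSec, capacity: burstCapacity(ratePerSec), now });
}
```

Injecting the clock keeps the threshold and reset tests from the last step deterministic, since a test can advance time without sleeping.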
See ${CLAUDESKILLDIR}/references/implementation.md for the full implementation guide.
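As a rough illustration of the client identification, header, and 429 steps listed in the instructions, the helpers below build plain objects that a middleware could copy onto its response. All function names are hypothetical, the `x-api-key` header name and the `req.user` / `req.ip` fields assume an Express-like request populated by earlier middleware, and the `limit;w=window` value used for RateLimit-Policy follows one revision of the IETF draft; the syntax has changed between revisions, so check the revision you target.

```javascript
// Pick the most specific identifier available: API key, then user ID, then IP.
// Assumes req.user was set by earlier middleware after verifying a JWT.
function clientIdFrom(req) {
  const apiKey = req.headers["x-api-key"]; // header name is an assumption
  if (apiKey) return `key:${apiKey}`;
  if (req.user && req.user.id) return `user:${req.user.id}`;
  return `ip:${req.ip}`;
}

// Headers to attach to every response, allowed or rejected.
function rateLimitHeaders({ limit, remaining, resetEpochSec, windowSec }) {
  return {
    "X-RateLimit-Limit": String(limit),
    "X-RateLimit-Remaining": String(Math.max(0, remaining)),
    "X-RateLimit-Reset": String(resetEpochSec), // Unix timestamp in seconds
    "RateLimit-Policy": `${limit};w=${windowSec}`, // draft syntax, may differ by revision
  };
}

// Status, headers, and JSON body for a rejected request.
function tooManyRequests({ limit, used, resetEpochSec, windowSec, retryAfterSec }) {
  return {
    status: 429,
    headers: {
      ...rateLimitHeaders({ limit, remaining: limit - used, resetEpochSec, windowSec }),
      "Retry-After": String(retryAfterSec), // seconds until the next allowed request
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      error: "rate_limit_exceeded", // hypothetical error code
      message: `Limit of ${limit} requests per ${windowSec}s exceeded.`,
      limit,
      used,
      resetAt: new Date(resetEpochSec * 1000).toISOString(),
    }),
  };
}
```

Returning plain objects keeps these helpers independent of any particular web framework, which also makes the header assertions in the test step straightforward.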
Output
${CLAUDESKILLDIR}/src/middleware/rate-limiter.js - Rate limiting middleware with algorithm selection
${CLAUDESKILLDIR}/src/config/rate-limits.js - Per-endpoint and per-tier rate limit configur