githubEdit

Rate Limiter

Difficulty: Easy Topics: Token Bucket, Sliding Window, Distributed Systems Time: 45 minutes Companies: Google, Amazon, Stripe, Cloudflare

Problem Statement

Design a rate limiter that restricts the number of requests a user can make to an API within a time window.

Examples:

  • Maximum 10 requests per second per user

  • Maximum 1000 requests per hour per API key

  • Maximum 100,000 requests per day per IP address

Requirements

Functional

  • Limit requests per user/API key/IP

  • Return 429 (Too Many Requests) when limit exceeded

  • Support different time windows (second, minute, hour, day)

Non-Functional

  • Low latency (<1ms per check)

  • Highly available (99.99%)

  • Distributed (works across multiple servers)

  • Memory efficient

Algorithms

1. Fixed Window Counter

Problem: Burst at window boundaries (20 requests in 1 second if 10 at end + 10 at start)

2. Sliding Window Log

Pros: Accurate Cons: Memory intensive (stores all request timestamps)

3. Sliding Window Counter

Pros: Low memory footprint, smooths traffic efficiently. Cons: Assumes even distribution of requests in the previous window.

4. Token Bucket (Most Common)

Pros: Smooth rate limiting, allows bursts Cons: Slightly complex

Architecture

API Design

Distributed Rate Limiting

Challenge: Multiple servers need shared state and atomic operations to prevent race conditions.

Atomicity and Concurrency (SDE-3 Focus)

When scaling horizontally, get-then-set operations (like fetching tokens, checking, and updating) create race conditions.

Solution 1: Redis Lua Scripts Lua scripts execute atomically in Redis, preventing race conditions without needing distributed locks.

Solution 2: Redis Cluster Use hash tags {#hash_key} to ensure related keys route to the same Redis shard.

Interview Tips

Common Questions:

  • Q: "How would you rate limit across multiple datacenters?"

  • A: Global Redis cluster OR each datacenter has local limits (e.g., 80% of global). Local limits are faster but may slightly exceed total global intent. Can use a hybrid: local Redis for fast checks, async sync to global Cassandra/Redis.

  • Q: "How does rate limiting differ for B2B vs B2C?"

  • A: B2B (API Keys) usually needs higher limits, strict adherence, and analytics. Token bucket is ideal. B2C (User/IP) needs DDOS protection, softer limits, and often uses simpler algorithms like fixed window at the edge (WAF).

Decision Matrix:

Algorithm
Accuracy
Memory
Use Case

Fixed Window

Low (Burst edge)

Low

Simple, non-critical, Edge DoS protection

Sliding Log

High

High

Financial APIs, Strict auditing

Sliding Window Counter

High

Low

Low memory, smooth traffic

Token Bucket

Medium

Medium

Most APIs (recommended), Allows bursts

Leaky Bucket

High

Medium

Traffic shaping, Payment processing queues

Last updated