Difficulty: Medium
Topics: Concurrency, Design Patterns (Strategy), Token Bucket Algorithm
Context: Designing the internal, thread-safe class implementation, not the distributed system.
Phase 1: Requirements Gathering
Goals
Design a library to limit requests based on a defined policy.
Identify core entities: User, Request, Bucket.
Define behavior for allowed vs. denied requests.
1. Who are the actors?
Client Application: Sends requests that need to be rate-limited.
Rate Limiter System: Decides whether to allow or block a request.
2. What are the must-have features? (Core)
User-based Limiting: Limit requests per userId.
Configurable Rules: Define limits (e.g., 10 requests per second).
Boolean Response: Return true (Allowed) or false (Throttled).
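The must-have features above boil down to a one-method contract: given a userId, answer allowed or throttled. A minimal sketch of that contract in Java (the names `RateLimiter` and `allow` are illustrative assumptions, not fixed by the requirements):

```java
// Minimal sketch of the core contract: one method, boolean response.
// Names (RateLimiter, allow) are illustrative assumptions.
interface RateLimiter {
    boolean allow(String userId); // true = allowed, false = throttled
}

public class Main {
    public static void main(String[] args) {
        // A stub that always allows, just to exercise the contract.
        RateLimiter permissive = userId -> true;
        System.out.println(permissive.allow("user-1")); // prints "true"
    }
}
```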
1. Strategy Pattern
Description: Defines a family of algorithms, encapsulates each one, and makes them interchangeable. Strategy lets the algorithm vary independently from the clients that use it.
Why used: Allows switching between different rate-limiting algorithms (Token Bucket, Leaky Bucket, Sliding Window) dynamically based on configuration without changing the client code.
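A rough sketch of the Strategy arrangement described here, with two trivial stand-in algorithms so the swap is visible (all class names are illustrative assumptions; real strategies would be Token Bucket, Leaky Bucket, etc.):

```java
// Strategy sketch: interchangeable rate-limiting algorithms behind one interface.
interface RateLimitStrategy {
    boolean allow(String userId);
}

class AlwaysAllow implements RateLimitStrategy {     // stand-in for e.g. TokenBucket
    public boolean allow(String userId) { return true; }
}

class AlwaysThrottle implements RateLimitStrategy {  // stand-in for e.g. LeakyBucket
    public boolean allow(String userId) { return false; }
}

class RateLimiterContext {
    private RateLimitStrategy strategy;
    RateLimiterContext(RateLimitStrategy s) { this.strategy = s; }
    void setStrategy(RateLimitStrategy s) { this.strategy = s; } // swap at runtime
    boolean allow(String userId) { return strategy.allow(userId); }
}

public class Main {
    public static void main(String[] args) {
        RateLimiterContext ctx = new RateLimiterContext(new AlwaysAllow());
        System.out.println(ctx.allow("u1"));   // true
        ctx.setStrategy(new AlwaysThrottle()); // config change; client code untouched
        System.out.println(ctx.allow("u1"));   // false
    }
}
```

The client only ever talks to `RateLimiterContext`, which is what lets the algorithm vary without touching client code.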
2. Factory Pattern
Description: A creational pattern that provides an interface for creating objects in a superclass, but allows subclasses to alter the type of objects that will be created.
Why used: Useful for creating different types of rate limiters (e.g., TokenBucketLimiter, FixedWindowLimiter) based on user tier or configuration.
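A simple static-factory sketch of this idea (the config keys and class names are assumptions; the limiter bodies are placeholders):

```java
// Factory sketch: create a limiter implementation from a configuration key.
interface RateLimiter { boolean allow(String userId); }

class TokenBucketLimiter implements RateLimiter {
    public boolean allow(String userId) { return true; } // placeholder logic
}

class FixedWindowLimiter implements RateLimiter {
    public boolean allow(String userId) { return true; } // placeholder logic
}

class RateLimiterFactory {
    static RateLimiter create(String type) {
        switch (type) {
            case "token_bucket": return new TokenBucketLimiter();
            case "fixed_window": return new FixedWindowLimiter();
            default: throw new IllegalArgumentException("Unknown limiter: " + type);
        }
    }
}

public class Main {
    public static void main(String[] args) {
        RateLimiter limiter = RateLimiterFactory.create("token_bucket");
        System.out.println(limiter.getClass().getSimpleName()); // TokenBucketLimiter
    }
}
```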
Phase 5: Code Key Methods
Java Implementation (Thread-Safe Token Bucket)
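One possible shape for this implementation, shown as a sketch rather than a reference solution (field names, the nanosecond-based refill, and constructor parameters are assumptions):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a thread-safe token bucket. A bucket refills lazily on each call
// based on elapsed time, then tries to spend one token.
class TokenBucket {
    private final long capacity;        // max tokens the bucket can hold
    private final double refillPerNano; // tokens added per nanosecond
    private double tokens;              // current token count
    private long lastRefillNanos;       // last refill timestamp

    TokenBucket(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.tokens = capacity;
        this.lastRefillNanos = System.nanoTime();
    }

    // synchronized makes refill + check + decrement one atomic step,
    // so two threads cannot spend the same token.
    synchronized boolean allow() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefillNanos) * refillPerNano);
        lastRefillNanos = now;
        if (tokens >= 1) {
            tokens -= 1;
            return true;
        }
        return false;
    }
}

class TokenBucketRateLimiter {
    private final Map<String, TokenBucket> buckets = new ConcurrentHashMap<>();
    private final long capacity;
    private final double refillPerSecond;

    TokenBucketRateLimiter(long capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
    }

    boolean allow(String userId) {
        // computeIfAbsent is atomic on ConcurrentHashMap: one bucket per user.
        return buckets
            .computeIfAbsent(userId, id -> new TokenBucket(capacity, refillPerSecond))
            .allow();
    }
}

public class Main {
    public static void main(String[] args) {
        // Capacity 2, near-zero refill: the third immediate request is throttled.
        TokenBucketRateLimiter limiter = new TokenBucketRateLimiter(2, 0.001);
        System.out.println(limiter.allow("u1")); // true
        System.out.println(limiter.allow("u1")); // true
        System.out.println(limiter.allow("u1")); // false
    }
}
```

Note the split of responsibilities: `TokenBucket` owns the per-user counting logic, while `TokenBucketRateLimiter` only maps users to buckets, which is the Single Responsibility point made in the SOLID checklist below.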
Phase 6: Discussion
Concurrency
Q: Why synchronized on allow()?
A: To prevent the race condition where two threads both read tokens = 1, both pass the check, and both decrement, letting two requests through on a single token (or driving the count negative). The lock makes the check-and-decrement atomic.
Distributed Environments
Q: How to scale to multiple servers?
A: Local memory (HashMap) won't work if requests for the same user hit different servers.
Solution: Use Redis with Lua Scripts. Lua scripts execute atomically in Redis, performing the get tokens -> refill -> decrement -> set tokens logic in one step.
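The script body could look roughly like the sketch below. The key layout (a hash with `tokens` and `ts` fields) and the argument order are assumptions, not a standard; what matters is that Redis executes the whole script as one atomic unit:

```lua
-- KEYS[1] = per-user bucket key
-- ARGV[1] = capacity, ARGV[2] = refill rate (tokens/sec), ARGV[3] = now (seconds)
local capacity = tonumber(ARGV[1])
local rate     = tonumber(ARGV[2])
local now      = tonumber(ARGV[3])

local bucket = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(bucket[1]) or capacity  -- missing key => full bucket
local ts     = tonumber(bucket[2]) or now

-- refill based on elapsed time, then try to spend one token
tokens = math.min(capacity, tokens + (now - ts) * rate)
local allowed = 0
if tokens >= 1 then
  tokens = tokens - 1
  allowed = 1
end

redis.call('HSET', KEYS[1], 'tokens', tokens, 'ts', now)
return allowed
```

This would be invoked via `EVAL` (or `EVALSHA` for a cached script) from each application server.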
Memory Optimization
Q: How to handle millions of users?
A: The current map grows indefinitely. Implement a cleanup strategy:
Background Thread: Periodically scan the map and remove entries whose lastRefillTimestamp is more than an hour old.
LRU Cache: Use a Guava Cache or similar with expireAfterAccess.
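If a Guava dependency is undesirable, the same bounded-cache idea can be sketched with the JDK's access-ordered LinkedHashMap (class name and capacity here are illustrative; this evicts by count rather than by idle time):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stdlib LRU sketch: an access-ordered LinkedHashMap that evicts the
// least-recently-used bucket once a size cap is exceeded.
class LruBucketStore<V> extends LinkedHashMap<String, V> {
    private final int maxEntries;

    LruBucketStore(int maxEntries) {
        super(16, 0.75f, true);         // accessOrder = true -> LRU ordering
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxEntries;     // drop least-recently-used entry
    }
}

public class Main {
    public static void main(String[] args) {
        LruBucketStore<String> store = new LruBucketStore<>(2);
        store.put("u1", "bucket1");
        store.put("u2", "bucket2");
        store.get("u1");                // touch u1 so u2 becomes eldest
        store.put("u3", "bucket3");     // evicts u2
        System.out.println(store.keySet()); // prints "[u1, u3]"
    }
}
```

Note that LinkedHashMap is not thread-safe, so it would need external locking (e.g. Collections.synchronizedMap) in the concurrent limiter, which is one reason a Guava Cache with expireAfterAccess is the more common choice.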
SOLID Principles Checklist
S (Single Responsibility): TokenBucket handles logic for one user. TokenBucketRateLimiter manages mapping of users to buckets.
O (Open/Closed): Can add new Rate Limiters (e.g., SlidingWindowRateLimiter) implementing RateLimiter interface.
L (Liskov Substitution): TokenBucketRateLimiter can stand in for RateLimiter.
I (Interface Segregation): RateLimiter interface is simple (one method).
D (Dependency Inversion): Client depends on RateLimiter abstraction.