Caching & CDN
Comprehensive guide to caching strategies and content delivery networks for system design
Caching Fundamentals
What is Caching?
Caching stores frequently accessed data in a fast-access layer (memory) to reduce latency and load on slower backend systems (databases, APIs).
Key Principle: Locality of Reference
Temporal Locality: Recently accessed data is likely to be accessed again soon
Spatial Locality: Data near recently accessed data is likely to be accessed next
Why Cache?
Benefits:
✅ Reduced Latency: 10-100× faster responses
✅ Lower Database Load: Fewer queries to primary DB
✅ Cost Savings: Cheaper to serve from cache than to recompute or re-query
✅ Improved Scalability: Handle traffic spikes with cached data
Costs:
❌ Stale Data: Cache may return outdated information
❌ Complexity: Cache invalidation, consistency challenges
❌ Memory Cost: RAM more expensive than disk
❌ Cold Start: Cache misses on first request
Caching Strategies
1. Cache-Aside (Lazy Loading)
How it works: the application checks the cache first; on a hit it returns the cached value, and on a miss it reads the database and writes the result back into the cache (minimal sketch below).
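A minimal cache-aside sketch using redis-py; `db_fetch_user` is a hypothetical stand-in for your database query:

```python
import json

import redis

r = redis.Redis()

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)             # hit: served from memory
    user = db_fetch_user(user_id)             # hypothetical DB query (miss path)
    r.setex(key, 3600, json.dumps(user))      # populate cache with a 1-hour TTL
    return user
```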
Characteristics:
Application controls cache logic
Cache only populated on read (lazy)
Cache failures don't break application
Pros:
✅ Only caches data that's actually accessed
✅ Resilient (cache failure → slower but still works)
Cons:
❌ Cache miss penalty (extra latency on first request)
❌ Stale data possible (until TTL expires)
Use When:
Read-heavy workloads
Cache misses are acceptable (not latency-critical)
2. Read-Through Cache
How it works: the application asks only the cache; on a miss, the cache (or its client library) queries the database itself and stores the result before returning it.
Difference from Cache-Aside:
Cache is responsible for loading data (not application)
Cleaner application code
Example (Redis with read-through):
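Redis itself has no built-in read-through mode, so the pattern usually lives in a thin wrapper that owns the loading logic. A minimal sketch, where `loader` stands in for any database query function:

```python
import json

import redis

class ReadThroughCache:
    """The cache object owns the loading logic; callers never touch the DB."""

    def __init__(self, client: redis.Redis, loader, ttl: int = 600):
        self.client = client
        self.loader = loader  # key -> value, e.g. a DB query function
        self.ttl = ttl

    def get(self, key: str):
        cached = self.client.get(key)
        if cached is not None:
            return json.loads(cached)
        value = self.loader(key)                             # cache loads on miss
        self.client.setex(key, self.ttl, json.dumps(value))  # store for next time
        return value

# Application code only ever calls cache.get(key):
# cache = ReadThroughCache(redis.Redis(), loader=load_product)  # load_product is hypothetical
```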
Pros:
✅ Simpler application code
✅ Consistent cache loading logic
Cons:
❌ Tighter coupling (cache must know DB structure)
❌ Still has cache miss penalty
3. Write-Through Cache
How it works: the application writes to the database and updates the cache in the same synchronous operation, so reads always see fresh data (sketch below).
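A minimal write-through sketch; `db_write` is a hypothetical database call. Writing the database first means a cache failure can't hide a write that never became durable:

```python
import json

import redis

r = redis.Redis()

def update_profile(user_id: int, profile: dict) -> None:
    # 1. Write the source of truth first...
    db_write("UPDATE users SET profile = %s WHERE id = %s", profile, user_id)  # hypothetical
    # 2. ...then the cache, so reads immediately see the fresh value.
    r.setex(f"user:{user_id}", 3600, json.dumps(profile))
```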
Characteristics:
Every write goes to both cache and database
Cache is always consistent with database
Pros:
✅ No stale reads (cache always fresh)
✅ Fast subsequent reads
Cons:
❌ Higher write latency (2 writes instead of 1)
❌ Cache pollution (write data that's never read)
Use When:
Read/write ratio is high (same data written then read many times)
Consistency critical (financial data, user profiles)
4. Write-Behind (Write-Back) Cache
How it works: writes land in the cache (or a queue) and are acknowledged immediately; a background worker flushes them to the database asynchronously, often in batches (sketch below).
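A minimal write-behind sketch using a Redis list as the pending-write queue; `db_bulk_insert` is hypothetical. Note the loss window between popping and inserting:

```python
import json

import redis

r = redis.Redis()

def record_event(event: dict) -> None:
    # Acknowledge immediately; the durable write happens later, in batch.
    r.rpush("pending_writes", json.dumps(event))

def flush_pending(batch_size: int = 1000) -> None:
    # Background worker: drain up to batch_size queued writes, then bulk-insert.
    batch = []
    for _ in range(batch_size):
        item = r.lpop("pending_writes")
        if item is None:
            break
        batch.append(json.loads(item))
    if batch:
        db_bulk_insert("events", batch)  # hypothetical bulk insert
    # Items popped but not yet inserted are lost on a crash --
    # exactly the write-behind data-loss window described above.
```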
Characteristics:
Writes go to cache first
Database updated asynchronously (batched)
Pros:
✅ Very low write latency (cache is fast)
✅ Can batch writes (write 1000 updates at once)
Cons:
❌ Data loss risk (if cache crashes before DB write)
❌ Complex (need queue, workers, retry logic)
Use When:
Write-heavy workloads (logging, analytics)
Can tolerate some data loss (non-critical data)
Want to batch writes for efficiency
5. Refresh-Ahead (Predictive Refresh)
How it works: the cache proactively recomputes entries shortly before they expire (or when they are predicted to be read soon), so hot keys never take a miss (sketch below).
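A minimal refresh-ahead sketch that recomputes an entry once its remaining TTL drops below a threshold; `compute_trending` is hypothetical. A production version would refresh in a background task so no request pays the recompute cost:

```python
import json

import redis

r = redis.Redis()
TTL = 300            # 5-minute TTL
REFRESH_BELOW = 60   # recompute once fewer than 60 seconds remain

def get_trending(key: str):
    cached = r.get(key)
    remaining = r.ttl(key)  # seconds left; negative if missing or no expiry
    if cached is not None and remaining > REFRESH_BELOW:
        return json.loads(cached)              # comfortably fresh: plain hit
    value = compute_trending()                 # hypothetical expensive query
    r.setex(key, TTL, json.dumps(value))       # refresh before (or at) expiry
    return value
```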
Pros:
✅ No cache miss penalty for hot data
✅ Consistent performance
Cons:
❌ Complex (predict which keys to refresh)
❌ Wasted refreshes if data not accessed
Use When:
Predictable access patterns (homepage, trending data)
Cache misses are unacceptable (sub-10ms latency SLA)
Cache Invalidation
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
Strategies
1. TTL (Time-To-Live)
Pros: Simple, automatic cleanup
Cons: May serve stale data until TTL expires
Best for: Data that changes slowly (user profiles, product catalogs)
2. Event-Based Invalidation
Pros: No stale data (immediate invalidation)
Cons: More complex, must track all dependencies
Best for: Critical data (inventory, pricing)
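A sketch of event-based invalidation on the write path; `db_write` and `get_category` are hypothetical. Tracking every derived key (the category listing here) is what makes this approach complex:

```python
import redis

r = redis.Redis()

def update_price(product_id: int, new_price: float) -> None:
    # 1. Update the source of truth.
    db_write("UPDATE products SET price = %s WHERE id = %s", new_price, product_id)  # hypothetical
    # 2. Invalidate immediately so no reader sees the old price.
    r.delete(f"product:{product_id}")
    # 3. Also invalidate derived data; tracking these dependencies is the hard part.
    r.delete(f"category:{get_category(product_id)}")  # hypothetical lookup
```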
3. Versioning
Pros: Smooth rollouts, no invalidation storms
Cons: Storage overhead (multiple versions)
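A sketch of versioned keys: readers follow a version pointer, and publishing flips the pointer only after the new data is written. Key names are illustrative:

```python
import redis

r = redis.Redis()

def get_catalog():
    version = (r.get("catalog:version") or b"0").decode()
    return r.get(f"catalog:v{version}")  # readers follow the current pointer

def publish_catalog(new_data: str) -> None:
    new_version = r.incr("catalog:next_version")  # atomic counter
    r.set(f"catalog:v{new_version}", new_data)    # write the new version first
    r.set("catalog:version", new_version)         # then flip the pointer
    # Old versions linger until cleaned up -- the storage overhead noted above.
```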
Distributed Caching
Single-Node Cache (e.g., Redis single instance)
Pros: Simple, low latency
Cons: Single point of failure, limited by single machine memory
Distributed Cache Cluster
Example: Redis Cluster (shards keys across 16,384 fixed hash slots)
Sharding Strategy (Consistent Hashing, as used by Memcached-style client sharding):
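A minimal consistent-hash ring to show the idea; real deployments add virtual nodes and replication:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring (no virtual nodes, no replication)."""

    def __init__(self, nodes):
        self.ring = sorted((self._hash(n), n) for n in nodes)
        self.hashes = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self.hashes, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["cache-1", "cache-2", "cache-3"])
print(ring.node_for("user:42"))  # adding a node remaps only ~1/N of the keys
```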
Pros:
Horizontal scaling (add more nodes)
High availability (replicas)
Cons:
Complexity (cluster management)
Network overhead (cross-node requests)
Cache Replication
Pros: High availability, read scaling
Cons: Replication lag (eventual consistency)
Eviction Policies
When cache is full, which item to remove?
| Policy | Eviction rule | Best for |
| --- | --- | --- |
| LRU (Least Recently Used) | Evicts the least recently accessed item | General purpose |
| LFU (Least Frequently Used) | Evicts the least frequently accessed item | Stable hot-data patterns |
| FIFO (First In, First Out) | Evicts the oldest inserted item | Simple, predictable workloads |
| Random | Evicts a random item | Fast, low overhead |
| TTL | Evicts expired items first | Time-sensitive data |
Most Common: LRU
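To make LRU concrete, a toy implementation built on Python's OrderedDict:

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU: OrderedDict keeps keys ordered by recency of use."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value) -> None:
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used
```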
Content Delivery Networks (CDN)
What is a CDN?
A CDN is a geographically distributed network of caching servers (edge locations) that serves static content from servers closer to users.
How CDN Works
1. Push CDN (Proactive): you upload assets to the CDN ahead of time, so edges always have the content; you are responsible for pushing updates.
2. Pull CDN (Lazy): the CDN fetches content from your origin on the first request at each edge, then caches it; the first request pays a miss penalty, subsequent ones are fast.
CDN Caching
Cache-Control Headers:
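A sketch of typical Cache-Control values; Flask and the routes here are illustrative assumptions, not tied to any particular CDN:

```python
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/static/app.9f2c1a.js")
def fingerprinted_asset():
    resp = make_response("/* bundled JS */")
    # Content-addressed URL: the CDN may cache it for a year without revalidating.
    resp.headers["Cache-Control"] = "public, max-age=31536000, immutable"
    return resp

@app.route("/api/me")
def current_user():
    resp = make_response('{"name": "alice"}')
    # User-specific: shared caches (the CDN) must never store this.
    resp.headers["Cache-Control"] = "private, no-store"
    resp.headers["Content-Type"] = "application/json"
    return resp
```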
CDN Invalidation
Problem: Asset updated, but CDN still serves old version
Solutions:
1. Versioned URLs (Best Practice): embed a content hash or version in the asset URL so changed content gets a brand-new URL (fingerprinting sketch after this list)
2. Cache Purge (Manual): explicitly invalidate paths through the CDN's API; propagation across edges can take seconds to minutes
3. TTL Management: short TTLs for assets that change often, very long TTLs for immutable, versioned assets
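A sketch of content fingerprinting for versioned URLs; the hashing scheme and filename format are illustrative:

```python
import hashlib

def fingerprint(path: str) -> str:
    digest = hashlib.md5(open(path, "rb").read()).hexdigest()[:8]
    name, ext = path.rsplit(".", 1)
    return f"{name}.{digest}.{ext}"  # e.g. app.js -> app.9f2c1a3b.js

# New content yields a new URL, so stale CDN copies of the old URL simply
# stop being referenced -- no purge needed, and old and new versions can
# coexist safely during a rolling deploy.
```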
Real-World Examples
Example 1: Twitter Timeline Caching
Architecture (roughly): Client → API → Redis timeline cache, falling back to the timeline service and DB on a miss
Strategy: Cache-aside with short TTL (stale feed acceptable for 5 min)
Example 2: YouTube Video Streaming
Architecture (roughly): Viewer → nearest CDN edge, which pulls video segments from origin storage on the first request
Strategy: Pull CDN with long TTL (videos rarely change)
Example 3: E-commerce Product Catalog
Architecture (roughly): Client → API → cache; catalog writes emit invalidation events that delete the affected keys
Strategy: Event-based invalidation (pricing must be accurate)
Decision Matrix
| Use case | Strategy | Eviction | TTL |
| --- | --- | --- | --- |
| User profiles | Cache-aside | LRU | 1 hour |
| Product catalog (read-heavy) | Read-through | LRU | 10 min |
| Real-time analytics | Write-behind | LFU | No TTL |
| Static assets (images, JS) | CDN (Pull) | LRU | 1 year |
| Session data | Write-through | TTL | 30 min |
| Leaderboard (hot data) | Refresh-ahead | LFU | 5 min |
Interview Tips
When asked about caching:
Clarify requirements:
Read vs write ratio?
Staleness tolerance?
Latency SLA?
Discuss trade-offs:
Write-through: Consistency vs Latency
Cache-aside: Simplicity vs Staleness
Mention eviction:
"I'd use LRU eviction for general purpose"
"TTL handles automatic cleanup"
Consider failure modes:
Cache failure → Fallback to DB
Thundering herd → Cache stampede protection
Example Answer:
"For user profiles, I'd use cache-aside with Redis. On read, check cache first (sub-ms latency). On miss, query PostgreSQL, cache result with 1-hour TTL. Use LRU eviction when cache is full. On user update, invalidate cache to prevent stale data. This balances simplicity, performance, and consistency."
Senior Engineer Insights
Design trade-offs: Cache-aside is simple and resilient (cache down → DB); write-through gives consistency at the cost of write latency. Write-behind is fast but risks data loss—use only for non-critical or replayable data.
Cost: Memory (cache) is more expensive per GB than disk; size cache for hot working set (e.g. 80/20). CDN egress is often cheaper than origin egress; push popular assets to edge to reduce origin load and cost.
Operational complexity: Cache invalidation is hard; prefer TTL + versioned URLs where possible. Distributed cache adds cluster management and failure modes (e.g. split brain); use proven solutions (Redis Cluster, Memcached pools).
Observability: Monitor hit rate, miss rate, eviction rate, and latency (P99). Low hit rate → wrong keys or TTL; high eviction → undersized or hot key problem. Alert on cache unavailability and fallback to DB.
Resilience: Cache stampede on a hot-key miss → use a single-flight lock or probabilistic early expiry (sketch below). Cache failure should degrade gracefully (slower, not broken); don't put the cache on the critical path for correctness.
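A sketch of single-flight stampede protection using Redis SET NX as a short-lived lock; `expensive_compute` is hypothetical:

```python
import json
import time

import redis

r = redis.Redis()

def get_protected(key: str, ttl: int = 300):
    while True:
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
        # SET NX acts as a short-lived lock: only one caller recomputes.
        if r.set(f"lock:{key}", "1", nx=True, ex=10):
            try:
                value = expensive_compute(key)  # hypothetical slow DB query
                r.setex(key, ttl, json.dumps(value))
                return value
            finally:
                r.delete(f"lock:{key}")
        time.sleep(0.05)  # lost the race: back off, then re-check the cache
```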
Quick Revision
Strategies: Cache-aside (app loads on miss), read-through (cache loads on miss), write-through (write DB + cache), write-behind (write cache, async DB). Cache-aside most common; write-behind for high write throughput only when loss is acceptable.
Invalidation: TTL (simple, stale possible), event-based (delete/update on write), versioning (new key per version). "Two hard things: cache invalidation and naming things."
Eviction: LRU common; LFU for stable hot set; TTL for time-sensitive data.
CDN: Edge caches; push (you upload) vs pull (on first request); versioned URLs for immutable assets; purge for updates.
Interview talking points: "We use cache-aside with Redis; 1-hour TTL for user profiles; invalidate on update. We use LRU eviction. For static assets we use CDN with versioned URLs. If Redis is down we fall back to DB and accept higher latency."
Common mistakes: Caching without TTL or invalidation (stale forever); treating cache as source of truth; no fallback when cache is down; cache stampede on viral key.