
Caching & CDN

Comprehensive guide to caching strategies and content delivery networks for system design


Caching Fundamentals

What is Caching?

Caching stores frequently accessed data in a fast-access layer (memory) to reduce latency and load on slower backend systems (databases, APIs).

Key Principle: Locality of Reference

  • Temporal Locality: Recently accessed data likely to be accessed again

  • Spatial Locality: Data near recently accessed data likely to be accessed

Why Cache?

Benefits:

  • Reduced Latency: 10-100× faster responses

  • Lower Database Load: Fewer queries to primary DB

  • Cost Savings: Cheaper to serve from cache than compute

  • Improved Scalability: Handle traffic spikes with cached data

Costs:

  • Stale Data: Cache may return outdated information

  • Complexity: Cache invalidation, consistency challenges

  • Memory Cost: RAM more expensive than disk

  • Cold Start: Cache misses on first request


Caching Strategies

1. Cache-Aside (Lazy Loading)

How it works:
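The application checks the cache first, falls back to the database on a miss, and populates the cache itself. A minimal in-memory sketch (the dict stands in for Redis; `db_query` is a hypothetical backend call):

```python
import time

cache = {}  # stand-in for Redis: key -> (value, expires_at)
TTL = 3600  # seconds

def db_query(user_id):
    # hypothetical slow database lookup
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    entry = cache.get(user_id)
    if entry and entry[1] > time.time():
        return entry[0]                          # cache hit
    value = db_query(user_id)                    # cache miss: fall back to DB
    cache[user_id] = (value, time.time() + TTL)  # populate lazily
    return value

def update_user(user_id, data):
    # write to the DB (omitted here), then invalidate the cached copy
    # so the next read reloads fresh data
    cache.pop(user_id, None)
```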

Characteristics:

  • Application controls cache logic

  • Cache only populated on read (lazy)

  • Cache failures don't break application

Pros:

  • ✅ Only caches data that's actually accessed

  • ✅ Resilient (cache failure → slower but still works)

Cons:

  • ❌ Cache miss penalty (extra latency on first request)

  • ❌ Stale data possible (until TTL expires)

Use When:

  • Read-heavy workloads

  • Cache misses are acceptable (not latency-critical)


2. Read-Through Cache

How it works:

Cache library automatically handles DB query on miss.

Difference from Cache-Aside:

  • Cache is responsible for loading data (not application)

  • Cleaner application code

Example (Redis with read-through):
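A minimal sketch of the pattern: the cache object owns a `loader` callback, so application code never queries the database directly (an in-memory dict stands in for Redis; the lambda loader is a placeholder for a real DB query):

```python
import time

class ReadThroughCache:
    """Cache owns the loading logic: callers never touch the DB directly."""
    def __init__(self, loader, ttl=600):
        self.loader = loader   # function the cache calls on a miss
        self.ttl = ttl
        self.store = {}        # stand-in for Redis: key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry and entry[1] > time.time():
            return entry[0]                # cache hit
        value = self.loader(key)           # the cache, not the app, loads on a miss
        self.store[key] = (value, time.time() + self.ttl)
        return value

# usage: application code only ever sees the cache
products = ReadThroughCache(loader=lambda pid: {"id": pid, "price": 9.99})
```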

Pros:

  • ✅ Simpler application code

  • ✅ Consistent cache loading logic

Cons:

  • ❌ Tighter coupling (cache must know DB structure)

  • ❌ Still has cache miss penalty


3. Write-Through Cache

How it works:
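A sketch of the write path, with in-memory dicts standing in for the database and the cache:

```python
db = {}     # stand-in for the database
cache = {}  # stand-in for Redis

def write_through(key, value):
    db[key] = value      # 1) synchronous write to the DB
    cache[key] = value   # 2) synchronous write to the cache
    # the write is only acknowledged after both succeed

def read(key):
    return cache.get(key, db.get(key))  # cache is always fresh
```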

Characteristics:

  • Every write goes to both cache and database

  • Cache is always consistent with database

Pros:

  • ✅ No stale reads (cache always fresh)

  • ✅ Fast subsequent reads

Cons:

  • ❌ Higher write latency (2 writes instead of 1)

  • ❌ Cache pollution (write data that's never read)

Use When:

  • Data is written once and read many times (high read-to-write ratio)

  • Consistency critical (financial data, user profiles)


4. Write-Behind (Write-Back) Cache

How it works:
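A sketch with in-memory stand-ins; `flush` plays the role of the asynchronous background worker that batches pending writes:

```python
db = {}      # stand-in for the database
cache = {}   # stand-in for Redis
dirty = []   # queue of keys awaiting the async DB write

def write_behind(key, value):
    cache[key] = value   # fast: only the cache is touched
    dirty.append(key)    # remember to persist later

def flush():
    # background worker: batch all pending writes into one DB round trip
    batch = {k: cache[k] for k in dirty}
    db.update(batch)     # if the cache dies before this runs, data is lost
    dirty.clear()
```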

Characteristics:

  • Writes go to cache first

  • Database updated asynchronously (batched)

Pros:

  • ✅ Very low write latency (cache is fast)

  • ✅ Can batch writes (write 1000 updates at once)

Cons:

  • ❌ Data loss risk (if cache crashes before DB write)

  • ❌ Complex (need queue, workers, retry logic)

Use When:

  • Write-heavy workloads (logging, analytics)

  • Can tolerate some data loss (non-critical data)

  • Want to batch writes for efficiency


5. Refresh-Ahead (Predictive Refresh)

How it works:
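One common approach is to refresh an entry once most of its TTL has elapsed, so hot keys never actually expire. A sketch (an in-memory dict as the cache; `load` is a hypothetical backend call; `now` is injectable to make the timing explicit):

```python
import time

TTL = 300          # seconds
REFRESH_AT = 0.8   # refresh once 80% of the TTL has elapsed

cache = {}  # key -> (value, loaded_at)

def load(key):
    return f"value-for-{key}"  # hypothetical slow backend call

def get(key, now=None):
    now = time.time() if now is None else now
    entry = cache.get(key)
    if entry is None:
        value = load(key)            # cold miss: load synchronously
        cache[key] = (value, now)
        return value
    value, loaded_at = entry
    if now - loaded_at > TTL * REFRESH_AT:
        # hot key nearing expiry: refresh proactively (in production,
        # hand this to a background worker instead of blocking the read)
        cache[key] = (load(key), now)
    return value
```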

Pros:

  • ✅ No cache miss penalty for hot data

  • ✅ Consistent performance

Cons:

  • ❌ Complex (predict which keys to refresh)

  • ❌ Wasted refreshes if data not accessed

Use When:

  • Predictable access patterns (homepage, trending data)

  • Cache misses are unacceptable (sub-10ms latency SLA)


Cache Invalidation

"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton

Strategies

1. TTL (Time-To-Live)

Pros: Simple, automatic cleanup
Cons: May serve stale data until TTL expires

Best for: Data that changes slowly (user profiles, product catalogs)

2. Event-Based Invalidation

Pros: No stale data (immediate invalidation)
Cons: More complex, must track all dependencies

Best for: Critical data (inventory, pricing)
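A sketch: a subscriber reacts to a domain event (e.g. from a message bus) by deleting the affected key, so the next read fetches fresh data (in-memory dict as the cache; the event shape is illustrative):

```python
cache = {"price:42": 9.99}   # stand-in for Redis

def on_price_updated(event):
    # event-bus subscriber: delete the cached entry so the next
    # read goes to the database and caches the new price
    cache.pop(f"price:{event['product_id']}", None)

on_price_updated({"product_id": 42, "new_price": 7.99})
```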

3. Versioning

Pros: Smooth rollouts, no invalidation storms
Cons: Storage overhead (multiple versions)
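A sketch: embed a version in every cache key and bump it on publish; old entries are never read again and simply age out under the eviction policy (key format is illustrative):

```python
cache = {}
catalog_version = 1   # bumped on every catalog publish

def cache_key(product_id):
    return f"catalog:v{catalog_version}:{product_id}"

cache[cache_key(42)] = {"price": 9.99}   # stored under catalog:v1:42

# publishing a new catalog bumps the version; all v1 keys become
# unreachable (guaranteed miss) without any explicit invalidation
catalog_version = 2
```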


Distributed Caching

Single-Node Cache (e.g., Redis single instance)

Pros: Simple, low latency
Cons: Single point of failure, limited by a single machine's memory

Distributed Cache Cluster

Example: Redis Cluster

Sharding Strategy (Consistent Hashing):
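A minimal consistent-hash ring with virtual nodes (node names and the vnode count are illustrative). Each key is hashed onto the ring and assigned to the first node clockwise; adding or removing a node remaps only about 1/N of the keys:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps keys to nodes; adding/removing a node remaps only ~1/N of keys."""
    def __init__(self, nodes, vnodes=100):
        self.ring = []   # sorted list of (hash, node)
        for node in nodes:
            for i in range(vnodes):   # virtual nodes smooth the distribution
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        h = self._hash(key)
        # first virtual node clockwise from the key's position (with wraparound)
        idx = bisect.bisect(self.ring, (h,)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
```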

Pros:

  • Horizontal scaling (add more nodes)

  • High availability (replicas)

Cons:

  • Complexity (cluster management)

  • Network overhead (cross-node requests)

Cache Replication

Pros: High availability, read scaling
Cons: Replication lag (eventual consistency)


Eviction Policies

When cache is full, which item to remove?

| Policy | Description | Use Case |
| --- | --- | --- |
| LRU (Least Recently Used) | Evict the least recently accessed item | General purpose |
| LFU (Least Frequently Used) | Evict the least frequently accessed item | Hot data patterns |
| FIFO (First In, First Out) | Evict the oldest inserted item | Simple, predictable |
| Random | Evict a random item | Fast, low overhead |
| TTL | Evict expired items first | Time-sensitive data |

Most Common: LRU
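A minimal LRU cache sketch using `collections.OrderedDict`, which tracks recency via insertion order:

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used key once capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()   # order of keys tracks recency

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # evict the LRU entry
```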


Content Delivery Networks (CDN)

What is a CDN?

A CDN is a geographically distributed network of caching servers (edge locations) that serves static content from locations close to users, reducing latency and offloading the origin.

How CDN Works

1. Push CDN (Proactive): you upload assets to the CDN ahead of time, so edges always have the content; you manage uploads and updates yourself.

2. Pull CDN (Lazy): the CDN fetches an asset from your origin on the first request, then caches it at the edge; the first user in each region pays the miss latency.

CDN Caching

Cache-Control Headers:
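Illustrative header values for the common cases:

```http
Cache-Control: public, max-age=31536000, immutable    (versioned static asset: cache for 1 year)
Cache-Control: public, max-age=300                    (semi-static page: cache for 5 minutes)
Cache-Control: private, no-store                      (user-specific or sensitive: never cache)
```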

CDN Invalidation

Problem: Asset updated, but CDN still serves old version

Solutions:

1. Versioned URLs (Best Practice): embed a content hash or version in the asset URL (e.g. app.v2.js), so every deploy produces a new URL and old cached copies are never requested again.

2. Cache Purge (Manual): issue an invalidation request through the CDN's API to evict specific paths; propagation takes time, and purges are often rate-limited or billed.

3. TTL Management: set short TTLs on mutable assets so stale copies expire quickly, at the cost of more origin fetches.


Real-World Examples

Example 1: Twitter Timeline Caching

Architecture: client → timeline service → Redis (cached timeline) → on miss, rebuild the timeline from the tweet store

Strategy: Cache-aside with short TTL (stale feed acceptable for 5 min)

Example 2: YouTube Video Streaming

Architecture: client → nearest CDN edge → on miss, pull video segments from origin storage

Strategy: Pull CDN with long TTL (videos rarely change)

Example 3: E-commerce Product Catalog

Architecture: app server → cache (product and pricing data) → database; catalog updates publish events that invalidate affected keys

Strategy: Event-based invalidation (pricing must be accurate)


Decision Matrix

| Scenario | Caching Strategy | Eviction | TTL |
| --- | --- | --- | --- |
| User profiles | Cache-aside | LRU | 1 hour |
| Product catalog (read-heavy) | Read-through | LRU | 10 min |
| Real-time analytics | Write-behind | LFU | No TTL |
| Static assets (images, JS) | CDN (Pull) | LRU | 1 year |
| Session data | Write-through | TTL | 30 min |
| Leaderboard (hot data) | Refresh-ahead | LFU | 5 min |


Interview Tips

When asked about caching:

  1. Clarify requirements:

    • Read vs write ratio?

    • Staleness tolerance?

    • Latency SLA?

  2. Discuss trade-offs:

    • Write-through: Consistency vs Latency

    • Cache-aside: Simplicity vs Staleness

  3. Mention eviction:

    • "I'd use LRU eviction for general purpose"

    • "TTL handles automatic cleanup"

  4. Consider failure modes:

    • Cache failure → Fallback to DB

    • Thundering herd → Cache stampede protection

Example Answer:

"For user profiles, I'd use cache-aside with Redis. On read, check cache first (sub-ms latency). On miss, query PostgreSQL, cache result with 1-hour TTL. Use LRU eviction when cache is full. On user update, invalidate cache to prevent stale data. This balances simplicity, performance, and consistency."


Senior Engineer Insights

  • Design trade-offs: Cache-aside is simple and resilient (cache down → DB); write-through gives consistency at the cost of write latency. Write-behind is fast but risks data loss—use only for non-critical or replayable data.

  • Cost: Memory (cache) is more expensive per GB than disk; size cache for hot working set (e.g. 80/20). CDN egress is often cheaper than origin egress; push popular assets to edge to reduce origin load and cost.

  • Operational complexity: Cache invalidation is hard; prefer TTL + versioned URLs where possible. Distributed cache adds cluster management and failure modes (e.g. split brain); use proven solutions (Redis Cluster, Memcached pools).

  • Observability: Monitor hit rate, miss rate, eviction rate, and latency (P99). Low hit rate → wrong keys or TTL; high eviction → undersized or hot key problem. Alert on cache unavailability and fallback to DB.

  • Resilience: Cache stampede on hot key miss → use single-flighter or probabilistic early expiry. Cache failure should degrade gracefully (slower, not broken); avoid cache-as-critical-path for correctness.
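The stampede protection mentioned above can be sketched as a per-key single-flight loader: on a hot-key miss, one caller loads from the backend while concurrent callers wait on a lock, then all read the freshly cached value (in-memory stand-ins; `slow_load` is a hypothetical expensive DB query):

```python
import threading
import time

cache = {}
locks = {}                       # one lock per key (single-flight)
locks_guard = threading.Lock()   # protects the locks dict itself
loads = {"n": 0}                 # counts backend loads, for demonstration

def slow_load(key):
    loads["n"] += 1
    time.sleep(0.01)             # hypothetical expensive DB query
    return f"value-for-{key}"

def get(key):
    if key in cache:
        return cache[key]
    with locks_guard:
        lock = locks.setdefault(key, threading.Lock())
    with lock:                   # only one caller loads; the rest wait
        if key in cache:         # double-check: waiters find it filled
            return cache[key]
        cache[key] = slow_load(key)
        return cache[key]
```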


Quick Revision

  • Strategies: Cache-aside (app loads on miss), read-through (cache loads on miss), write-through (write DB + cache), write-behind (write cache, async DB). Cache-aside most common; write-behind for high write throughput only when loss is acceptable.

  • Invalidation: TTL (simple, stale possible), event-based (delete/update on write), versioning (new key per version). "Two hard things: cache invalidation and naming things."

  • Eviction: LRU common; LFU for stable hot set; TTL for time-sensitive data.

  • CDN: Edge caches; push (you upload) vs pull (on first request); versioned URLs for immutable assets; purge for updates.

  • Interview talking points: "We use cache-aside with Redis; 1-hour TTL for user profiles; invalidate on update. We use LRU eviction. For static assets we use CDN with versioned URLs. If Redis is down we fall back to DB and accept higher latency."

  • Common mistakes: Caching without TTL or invalidation (stale forever); treating cache as source of truth; no fallback when cache is down; cache stampede on viral key.
