Caching & CDN
Comprehensive guide to caching strategies and content delivery networks for system design
Caching Fundamentals
What is Caching?
Caching stores frequently accessed data in a fast-access layer (memory) to reduce latency and load on slower backend systems (databases, APIs).
Key Principle: Locality of Reference
Temporal Locality: Recently accessed data is likely to be accessed again soon
Spatial Locality: Data near recently accessed data is likely to be accessed next
Why Cache?
Benefits:
✅ Reduced Latency: 10-100× faster responses
✅ Lower Database Load: Fewer queries to primary DB
✅ Cost Savings: Cheaper to serve from cache than to recompute or re-query
✅ Improved Scalability: Handle traffic spikes with cached data
Costs:
❌ Stale Data: Cache may return outdated information
❌ Complexity: Cache invalidation, consistency challenges
❌ Memory Cost: RAM more expensive than disk
❌ Cold Start: Cache misses on first request
Caching Strategies
1. Cache-Aside (Lazy Loading)
How it works: the application checks the cache first; on a hit it returns the cached value, and on a miss it reads the database and writes the result back into the cache (minimal sketch below).
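A minimal cache-aside sketch using redis-py; `db_fetch_user` is a hypothetical stand-in for your database query:

```python
import json

import redis

r = redis.Redis()

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)             # hit: served from memory
    user = db_fetch_user(user_id)             # hypothetical DB query (miss path)
    r.setex(key, 3600, json.dumps(user))      # populate cache with a 1-hour TTL
    return user
```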
Characteristics:
Application controls cache logic
Cache only populated on read (lazy)
Cache failures don't break application
Pros:
✅ Only caches data that's actually accessed
✅ Resilient (cache failure → slower but still works)
Cons:
❌ Cache miss penalty (extra latency on first request)
❌ Stale data possible (until TTL expires)
Use When:
Read-heavy workloads
Cache misses are acceptable (not latency-critical)
2. Read-Through Cache
How it works: the application asks only the cache; on a miss, the cache (or its client library) queries the database itself and stores the result before returning it.
Difference from Cache-Aside:
Cache is responsible for loading data (not application)
Cleaner application code
Example (Redis with read-through):
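Redis itself has no built-in read-through mode, so the pattern usually lives in a thin wrapper that owns the loading logic. A minimal sketch, where `loader` stands in for any database query function:

```python
import json

import redis

class ReadThroughCache:
    """The cache object owns the loading logic; callers never touch the DB."""

    def __init__(self, client: redis.Redis, loader, ttl: int = 600):
        self.client = client
        self.loader = loader  # key -> value, e.g. a DB query function
        self.ttl = ttl

    def get(self, key: str):
        cached = self.client.get(key)
        if cached is not None:
            return json.loads(cached)
        value = self.loader(key)                             # cache loads on miss
        self.client.setex(key, self.ttl, json.dumps(value))  # store for next time
        return value

# Application code only ever calls cache.get(key):
# cache = ReadThroughCache(redis.Redis(), loader=load_product)  # load_product is hypothetical
```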
Pros:
✅ Simpler application code
✅ Consistent cache loading logic
Cons:
❌ Tighter coupling (cache must know DB structure)
❌ Still has cache miss penalty
3. Write-Through Cache
How it works: the application writes to the database and updates the cache in the same synchronous operation, so reads always see fresh data (sketch below).
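A minimal write-through sketch; `db_write` is a hypothetical database call. Writing the database first means a cache failure can't hide a write that never became durable:

```python
import json

import redis

r = redis.Redis()

def update_profile(user_id: int, profile: dict) -> None:
    # 1. Write the source of truth first...
    db_write("UPDATE users SET profile = %s WHERE id = %s", profile, user_id)  # hypothetical
    # 2. ...then the cache, so reads immediately see the fresh value.
    r.setex(f"user:{user_id}", 3600, json.dumps(profile))
```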
Characteristics:
Every write goes to both cache and database
Cache is always consistent with database
Pros:
✅ No stale reads (cache always fresh)
✅ Fast subsequent reads
Cons:
❌ Higher write latency (2 writes instead of 1)
❌ Cache pollution (write data that's never read)
Use When:
Read/write ratio is high (same data written then read many times)
Consistency critical (financial data, user profiles)
4. Write-Behind (Write-Back) Cache
How it works: writes land in the cache (or a queue) and are acknowledged immediately; a background worker flushes them to the database asynchronously, often in batches (sketch below).
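A minimal write-behind sketch using a Redis list as the pending-write queue; `db_bulk_insert` is hypothetical. Note the loss window between popping and inserting:

```python
import json

import redis

r = redis.Redis()

def record_event(event: dict) -> None:
    # Acknowledge immediately; the durable write happens later, in batch.
    r.rpush("pending_writes", json.dumps(event))

def flush_pending(batch_size: int = 1000) -> None:
    # Background worker: drain up to batch_size queued writes, then bulk-insert.
    batch = []
    for _ in range(batch_size):
        item = r.lpop("pending_writes")
        if item is None:
            break
        batch.append(json.loads(item))
    if batch:
        db_bulk_insert("events", batch)  # hypothetical bulk insert
    # Items popped but not yet inserted are lost on a crash --
    # exactly the write-behind data-loss window described above.
```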
Characteristics:
Writes go to cache first
Database updated asynchronously (batched)
Pros:
✅ Very low write latency (cache is fast)
✅ Can batch writes (write 1000 updates at once)
Cons:
❌ Data loss risk (if cache crashes before DB write)
❌ Complex (need queue, workers, retry logic)
Use When:
Write-heavy workloads (logging, analytics)
Can tolerate some data loss (non-critical data)
Want to batch writes for efficiency
5. Refresh-Ahead (Predictive Refresh)
How it works: the cache proactively recomputes entries shortly before they expire (or when they are predicted to be read soon), so hot keys never take a miss (sketch below).
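A minimal refresh-ahead sketch that recomputes an entry once its remaining TTL drops below a threshold; `compute_trending` is hypothetical. A production version would refresh in a background task so no request pays the recompute cost:

```python
import json

import redis

r = redis.Redis()
TTL = 300            # 5-minute TTL
REFRESH_BELOW = 60   # recompute once fewer than 60 seconds remain

def get_trending(key: str):
    cached = r.get(key)
    remaining = r.ttl(key)  # seconds left; negative if missing or no expiry
    if cached is not None and remaining > REFRESH_BELOW:
        return json.loads(cached)              # comfortably fresh: plain hit
    value = compute_trending()                 # hypothetical expensive query
    r.setex(key, TTL, json.dumps(value))       # refresh before (or at) expiry
    return value
```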
Pros:
✅ No cache miss penalty for hot data
✅ Consistent performance
Cons:
❌ Complex (predict which keys to refresh)
❌ Wasted refreshes if data not accessed
Use When:
Predictable access patterns (homepage, trending data)
Cache misses are unacceptable (sub-10ms latency SLA)
Cache Invalidation
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
Strategies
1. TTL (Time-To-Live)
Pros: Simple, automatic cleanup
Cons: May serve stale data until TTL expires
Best for: Data that changes slowly (user profiles, product catalogs)
2. Event-Based Invalidation
Pros: No stale data (immediate invalidation)
Cons: More complex, must track all dependencies
Best for: Critical data (inventory, pricing)
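A sketch of event-based invalidation on the write path; `db_write` and `get_category` are hypothetical. Tracking every derived key (the category listing here) is what makes this approach complex:

```python
import redis

r = redis.Redis()

def update_price(product_id: int, new_price: float) -> None:
    # 1. Update the source of truth.
    db_write("UPDATE products SET price = %s WHERE id = %s", new_price, product_id)  # hypothetical
    # 2. Invalidate immediately so no reader sees the old price.
    r.delete(f"product:{product_id}")
    # 3. Also invalidate derived data; tracking these dependencies is the hard part.
    r.delete(f"category:{get_category(product_id)}")  # hypothetical lookup
```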
3. Versioning
Pros: Smooth rollouts, no invalidation storms
Cons: Storage overhead (multiple versions)
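A sketch of versioned keys: readers follow a version pointer, and publishing flips the pointer only after the new data is written. Key names are illustrative:

```python
import redis

r = redis.Redis()

def get_catalog():
    version = (r.get("catalog:version") or b"0").decode()
    return r.get(f"catalog:v{version}")  # readers follow the current pointer

def publish_catalog(new_data: str) -> None:
    new_version = r.incr("catalog:next_version")  # atomic counter
    r.set(f"catalog:v{new_version}", new_data)    # write the new version first
    r.set("catalog:version", new_version)         # then flip the pointer
    # Old versions linger until cleaned up -- the storage overhead noted above.
```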
Distributed Caching
Single-Node Cache (e.g., Redis single instance)
Pros: Simple, low latency
Cons: Single point of failure, limited by single machine memory
Distributed Cache Cluster
Example: Redis Cluster (shards keys across 16,384 fixed hash slots)
Sharding Strategy (Consistent Hashing, as used by Memcached-style client sharding):
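A minimal consistent-hash ring to show the idea; real deployments add virtual nodes and replication:

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring (no virtual nodes, no replication)."""

    def __init__(self, nodes):
        self.ring = sorted((self._hash(n), n) for n in nodes)
        self.hashes = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self.hashes, self._hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = HashRing(["cache-1", "cache-2", "cache-3"])
print(ring.node_for("user:42"))  # adding a node remaps only ~1/N of the keys
```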
Pros:
Horizontal scaling (add more nodes)
High availability (replicas)
Cons:
Complexity (cluster management)
Network overhead (cross-node requests)
Cache Replication
Pros: High availability, read scaling
Cons: Replication lag (eventual consistency)
Eviction Policies
When cache is full, which item to remove?
| Policy | Eviction rule | Best for |
| --- | --- | --- |
| LRU (Least Recently Used) | Evicts the least recently accessed item | General purpose |
| LFU (Least Frequently Used) | Evicts the least frequently accessed item | Stable hot-data patterns |
| FIFO (First In, First Out) | Evicts the oldest inserted item | Simple, predictable workloads |
| Random | Evicts a random item | Fast, low overhead |
| TTL | Evicts expired items first | Time-sensitive data |
Most Common: LRU
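To make LRU concrete, a toy implementation built on Python's OrderedDict:

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU: OrderedDict keeps keys ordered by recency of use."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.items: OrderedDict = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None                      # miss
        self.items.move_to_end(key)          # mark as most recently used
        return self.items[key]

    def put(self, key, value) -> None:
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # evict the least recently used
```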
Content Delivery Networks (CDN)
What is a CDN?
A CDN is a geographically distributed network of caching servers (edge locations) that serves static content from servers closer to users.
How CDN Works
1. Push CDN (Proactive): you upload assets to the CDN ahead of time, so edges always have the content; you are responsible for pushing updates.
2. Pull CDN (Lazy): the CDN fetches content from your origin on the first request at each edge, then caches it; the first request pays a miss penalty, subsequent ones are fast.
CDN Caching
Cache-Control Headers:
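A sketch of typical Cache-Control values; Flask and the routes here are illustrative assumptions, not tied to any particular CDN:

```python
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/static/app.9f2c1a.js")
def fingerprinted_asset():
    resp = make_response("/* bundled JS */")
    # Content-addressed URL: the CDN may cache it for a year without revalidating.
    resp.headers["Cache-Control"] = "public, max-age=31536000, immutable"
    return resp

@app.route("/api/me")
def current_user():
    resp = make_response('{"name": "alice"}')
    # User-specific: shared caches (the CDN) must never store this.
    resp.headers["Cache-Control"] = "private, no-store"
    resp.headers["Content-Type"] = "application/json"
    return resp
```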
CDN Invalidation
Problem: Asset updated, but CDN still serves old version
Solutions:
1. Versioned URLs (Best Practice): embed a content hash or version in the asset URL so changed content gets a brand-new URL (fingerprinting sketch after this list)
2. Cache Purge (Manual): explicitly invalidate paths through the CDN's API; propagation across edges can take seconds to minutes
3. TTL Management: short TTLs for assets that change often, very long TTLs for immutable, versioned assets
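A sketch of content fingerprinting for versioned URLs; the hashing scheme and filename format are illustrative:

```python
import hashlib

def fingerprint(path: str) -> str:
    digest = hashlib.md5(open(path, "rb").read()).hexdigest()[:8]
    name, ext = path.rsplit(".", 1)
    return f"{name}.{digest}.{ext}"  # e.g. app.js -> app.9f2c1a3b.js

# New content yields a new URL, so stale CDN copies of the old URL simply
# stop being referenced -- no purge needed, and old and new versions can
# coexist safely during a rolling deploy.
```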
Real-World Examples
Example 1: Twitter Timeline Caching
Architecture (roughly): Client → API → Redis timeline cache, falling back to the timeline service and DB on a miss
Strategy: Cache-aside with short TTL (stale feed acceptable for 5 min)
Example 2: YouTube Video Streaming
Architecture (roughly): Viewer → nearest CDN edge, which pulls video segments from origin storage on the first request
Strategy: Pull CDN with long TTL (videos rarely change)
Example 3: E-commerce Product Catalog
Architecture (roughly): Client → API → cache; catalog writes emit invalidation events that delete the affected keys
Strategy: Event-based invalidation (pricing must be accurate)
Decision Matrix
| Use case | Strategy | Eviction | TTL |
| --- | --- | --- | --- |
| User profiles | Cache-aside | LRU | 1 hour |
| Product catalog (read-heavy) | Read-through | LRU | 10 min |
| Real-time analytics | Write-behind | LFU | No TTL |
| Static assets (images, JS) | CDN (Pull) | LRU | 1 year |
| Session data | Write-through | TTL | 30 min |
| Leaderboard (hot data) | Refresh-ahead | LFU | 5 min |
Interview Tips
When asked about caching:
Clarify requirements:
Read vs write ratio?
Staleness tolerance?
Latency SLA?
Discuss trade-offs:
Write-through: Consistency vs Latency
Cache-aside: Simplicity vs Staleness
Mention eviction:
"I'd use LRU eviction for general purpose"
"TTL handles automatic cleanup"
Consider failure modes:
Cache failure → Fallback to DB
Thundering herd → Cache stampede protection
Example Answer:
"For user profiles, I'd use cache-aside with Redis. On read, check cache first (sub-ms latency). On miss, query PostgreSQL, cache result with 1-hour TTL. Use LRU eviction when cache is full. On user update, invalidate cache to prevent stale data. This balances simplicity, performance, and consistency."
Senior Engineer Insights
Design trade-offs: Cache-aside is simple and resilient (cache down → DB); write-through gives consistency at the cost of write latency. Write-behind is fast but risks data loss—use only for non-critical or replayable data.
Cost: Memory (cache) is more expensive per GB than disk; size cache for hot working set (e.g. 80/20). CDN egress is often cheaper than origin egress; push popular assets to edge to reduce origin load and cost.
Operational complexity: Cache invalidation is hard; prefer TTL + versioned URLs where possible. Distributed cache adds cluster management and failure modes (e.g. split brain); use proven solutions (Redis Cluster, Memcached pools).
Observability: Monitor hit rate, miss rate, eviction rate, and latency (P99). Low hit rate → wrong keys or TTL; high eviction → undersized or hot key problem. Alert on cache unavailability and fallback to DB.
Resilience: Cache stampede on a hot-key miss → use a single-flight lock or probabilistic early expiry (sketch below). Cache failure should degrade gracefully (slower, not broken); don't put the cache on the critical path for correctness.
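A sketch of single-flight stampede protection using Redis SET NX as a short-lived lock; `expensive_compute` is hypothetical:

```python
import json
import time

import redis

r = redis.Redis()

def get_protected(key: str, ttl: int = 300):
    while True:
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)
        # SET NX acts as a short-lived lock: only one caller recomputes.
        if r.set(f"lock:{key}", "1", nx=True, ex=10):
            try:
                value = expensive_compute(key)  # hypothetical slow DB query
                r.setex(key, ttl, json.dumps(value))
                return value
            finally:
                r.delete(f"lock:{key}")
        time.sleep(0.05)  # lost the race: back off, then re-check the cache
```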
Quick Revision
Strategies: Cache-aside (app loads on miss), read-through (cache loads on miss), write-through (write DB + cache), write-behind (write cache, async DB). Cache-aside most common; write-behind for high write throughput only when loss is acceptable.
Invalidation: TTL (simple, stale possible), event-based (delete/update on write), versioning (new key per version). "Two hard things: cache invalidation and naming things."
Eviction: LRU common; LFU for stable hot set; TTL for time-sensitive data.
CDN: Edge caches; push (you upload) vs pull (on first request); versioned URLs for immutable assets; purge for updates.
Interview talking points: "We use cache-aside with Redis; 1-hour TTL for user profiles; invalidate on update. We use LRU eviction. For static assets we use CDN with versioned URLs. If Redis is down we fall back to DB and accept higher latency."
Common mistakes: Caching without TTL or invalidation (stale forever); treating cache as source of truth; no fallback when cache is down; cache stampede on viral key.