DDIA

The "Bible" of System Design by Martin Kleppmann Must-read for SDE-3+ roles

Part I: Foundations of Data Systems

Chapter 1: Reliable, Scalable, and Maintainable Applications

Reliability: System continues to work correctly even when things go wrong (hardware faults, software bugs, human error).

Fault-tolerant: Anticipate faults and cope with them.
Resilience: Ability to recover from failure.

Scalability: Ability to cope with increased load.

Load parameters: Requests per second, read/write ratio, concurrent users.
Latency vs Response Time: Latency is duration of request waiting; Response time involves service time + network delay.
Percentiles: P50, P95, P99 (crucial for tail latency).

Maintainability:

Operability: Easy for ops team to keep running.
Simplicity: Easy for new engineers to understand (clean abstractions).
Evolvability: Easy to make changes in future (agile).

Chapter 2: Data Models and Query Languages

Relational Model (SQL):

Data organized in relations (tables).
Default for transactional apps (OLTP).
Schema-on-write: Strict structure enforced.

Document Model (NoSQL):

JSON-like documents (MongoDB).
Better for objects that are self-contained.
Schema-on-read: Structure interpretation happens at application level.
Locality: Fetch entire document at once (good for 1-to-many relationships).

Graph Model (Neo4j):

Nodes and edges.
Best for many-to-many relationships (social networks).

Chapter 3: Storage and Retrieval

Log-Structured Merge-Tree (LSM-Tree):

Used by: Cassandra, HBase, LevelDB.
Writes: Append to MemTable (in-memory sorted tree). When full, flush to SSTable on disk.
Reads: Check MemTable -> Check recent SSTables -> Check older SSTables.
Pros: High write throughput (sequential writes).
Cons: Compaction overhead, read amplification.

B-Trees:

Used by: MySQL, PostgreSQL.
Breakdown into pages (blocks).
Writes: Overwrite page in place (random I/O).
Pros: Good for reads (predictable performance).
Cons: Write amplification (write entire page even for small update), fragmentation.

Chapter 4: Encoding and Evolution

Formats:

JSON/XML: Human readable, verbose, no schema enforcement.
Binary (Protobuf, Thrift, Avro): Compact, schema-based, backward/forward compatibility.

Backward Compatibility: New code can read old data. Forward Compatibility: Old code can read new data.

Part II: Distributed Data

Chapter 5: Replication

Leaders and Followers:

Synchronous Replication: Leader waits for follower ack. Durable but slow (latency).
Asynchronous Replication: Leader confirms write immediately. Fast but potential data loss if leader fails.
Semi-Synchronous: Wait for 1 replica, others async.

Replication Lag Problems:

Read-after-write consistency: User posts comment, refresh page, comment missing. Fix: Read your own writes from leader.
Monotonic reads: User reads updated data, refresh, sees old data (replica lag). Fix: Sticker session (user always reads from same replica).
Consistent prefix reads: Causal dependencies violated. Fix: Causal dependency tracking.

Chapter 6: Partitioning (Sharding)

Strategies:

Key Range: Keys sorted (e.g., A-F, G-M). Good for range scans, bat for hotspots.
Hash of Key: Uniform distribution. Good for hotspots, bad for range scans.

Rebalancing:

Fixed number of partitions: Create 1000 partitions, assign to nodes. Move entire partition when adding node. (Best approach).
Dynamic partitioning: Split partition when it gets too big (HBase).

Chapter 7: Transactions

ACID:

Atomicity: All or nothing.
Consistency: Invariants preserved (app defined).
Isolation: Concurrent transactions don't interfere.
Durability: Committed data is saved.

Isolation Levels:

Read Uncommitted: Dirty reads allowed.
Read Committed: No dirty reads (default in many DBs).
Snapshot Isolation (MVCC): Readers don't block writers, writers don't block readers. Repeatable Read.
Serializable: Strictest. Executed as if serial. (2PL or Serializable Snapshot Isolation).

Chapter 8: The Trouble with Distributed Systems

Ordering and Clocks:

Physical Clocks: Device time (NTP can drift). Unreliable for ordering.
Logical Clocks: Lamport timestamps, Vector clocks. Capture causality.

Byzantine Faults: Nodes may lie/cheat (uncommon in trusted datacenters).

Chapter 9: Consistency and Consensus

Linearizability (Strong Consistency):

System behaves as if there is only ONE copy of data.
Head-of-line blocking: Slow.

CAP Theorem:

Consistency: Linearizability.
Availability: Every request receives a response (non-error).
Partition Tolerance: System continues despite network failure.
Trade-off: In partition (P), choose C (unavailable) or A (inconsistent).

Consensus Algorithms:

Paxos/Raft: Agree on value.
2-Phase Commit (2PC): Atomic commit across nodes (blocking).

Part III: Derived Data

Chapter 10: Batch Processing

MapReduce:

Map: Extract key-value.
Shuffle: Sort by key (distributed sort).
Reduce: Aggregate values for key.

Unix Philosophy: Small tools, composable, pipes. MapReduce is "Unix pipes distributed".

Chapter 11: Stream Processing

Event Sourcing:

Store immutable log of events (facts), not current state.
Derive state from event log.

Processing Guarantees:

At-most-once: Fire and forget.
At-least-once: Retry on failure (idempotency key needed).
Exactly-once: Atomic update of state + offset.

Chapter 12: The Future of Data Systems

Data Integration:

Integrating disparate systems (DB, Search Index, Cache, Analytics).
Change Data Capture (CDC): DB log -> Stream -> Downstream systems.
Unbundling the Database: Using specialized tools for each part (Kafka for log, Elasticsearch for search, Redis for cache) tied together via streams.

Key Takeaways for Interviews

Trade-offs everywhere: Consistency vs Availability, Latency vs Durability.
There is no magic: Every system has limitations.
Partitioning + Replication: The core of scalability.
Asynchronous/Event-driven: Decouples systems and improves resilience.

PreviousBook Summaries NextHead First Java

Last updated 1 month ago

hashtagPart I: Foundations of Data Systems

hashtagChapter 1: Reliable, Scalable, and Maintainable Applications

hashtagChapter 2: Data Models and Query Languages

hashtagChapter 3: Storage and Retrieval

hashtagChapter 4: Encoding and Evolution

hashtagPart II: Distributed Data

hashtagChapter 5: Replication

hashtagChapter 6: Partitioning (Sharding)

hashtagChapter 7: Transactions

hashtagChapter 8: The Trouble with Distributed Systems

hashtagChapter 9: Consistency and Consensus

hashtagPart III: Derived Data

hashtagChapter 10: Batch Processing

hashtagChapter 11: Stream Processing

hashtagChapter 12: The Future of Data Systems

hashtagKey Takeaways for Interviews