Notes
Basic Concepts
System design concept guide
Importance of system design
Provides high level of understanding of the components and their relationships.
Helps in making informed descisions about various tradeoffs.
What is system design
deals with designing the structure of a software system to ensure that it meets the desired functionality, performance, Scalability, reliability and other quality attributes.
It is the process of designing the artitecture, components, data etc. for a system to satisfy functional and non-functional requirements.
Overall, system design in an open ended discussion topic. That is why most tech firms prefer to include system design interview rounds.
Some key steps involved in designing a system
Requirement analysis: Understand functional and non functional requirements
High level design: Define the overall structure, Components and their interactions
Database design: Designing the structure of the database like defining tables, relationships, indexing strategies, data patterns etc.
API design: Designing APIs through which users or other systems interact with the system.
Component design: Designing individual components their responsibilities and interactions.
Scalability and performance: Ensuring that the system can handle its expected workload efficiency and can scale to accommodate increasing workload demand.
Reliability and fault tolerance: Designing the system to minimize the impact of failures and to recover gracefully from errors or faults.
Security: Protecting the system from unauthorized access, data breaches and other security threats.
Testing: Designing test to verify that the system meets its requirements and performs as expected.
Documentations: Documenting the design descisions, artitecture and other relevant information to facilitate understanding , maintainance and future development of the system.
Essential concepts used in system design
Availability
system should remain up and return the response of a client request.
can be quantified by measuring the percentage of time system remain operational in a given time window.
We usually define the system availability in terms of number of 9s
For designing a highly available system we need to consider factors like replication, load balancing, failover mechanism, disaster recovery strategies etc.

Throughput
measure the performance and capacity of a system handling tasks , transactions or operations within a given time interval.
High throughput is crucial when handling a large data or a high number of concurrent requests.
Latency
Metric related to system performance.
Measures the time taken by a request to receive a response from a server.
it can be affected by seped of processing, network congestion, distance between the source and destination and the efficiency of the system components.
Lower latency means quick response time, higher system performance and a smooth user experience.
reducing latency is important in applications involving real time responsiveness is critical like online gaming, video streaming and financial trading etc.
Strategies for reducing latency: optimizing algorithms, improving network infrastructure, using faster hardware components, minimizing unnecessary processing steps etc.
Network Protocols
Network acts as platform for communication between users and servers or different servers.
Protocols on other side are the rules governing how servers or machines communicate over the network. Some common protocols are HTTP, TCP/IP etc.

Load balancing
machines that balance the load among various servers, these are used to prevent single point of faliures.
Depending on how these machines are deployed these can be of different types.
these distribute traffic, prevent service from breakdown and distribute to the system reliability
Acts as traffic managers and help us to maintain system throughput and availability.
Proxies
Present between client and server. when a client sends a request it passes through the proxy and then reqches the servers.
these are of two types
forward proxy: acts as a mask for the clients and hides the clients iddentity from the server
Reverse Proxy: acts as mask for the servers and hides the servers iddentity from the response
Databases
associated with data that needs to be stored or later retrieval.
One way to classify database is based on their structure which can be relational or non relational.
Relational database include MySQL and PostgresSQL enforce strict relationships among data and have a higly structured organization
Non Relational Database such as cassandra and redis have flexible structure and can store both structured and unstructured data. Often used in system that are highley distributed and require high spped
Database partition
is a way of dividing database into smaller chunks to increase performance of the system.
If done properly can improve latency and throughput so that more and more request can be served
ACID vs BASE
Relational and non relational databse ensure different types of compliance.
Relational database are associated with ACID while non relation with BASE
ACID: Atomicity, Consistency, Isolation, Durability
This ensures that transactions are executed in a controlled manner, A transaction is an interaction with the database.
Atomicity: A group of operations are treated as a single unit of work, if any of these operation fail , entire transaction will fail. This ensures all or nothing is achieved.
Consistency: Data base state changes do not corrupt the data and that transactions can move from one valid state to another
Isolation: Transactions are executed independently of each other, without affecting other transactions. This ensures concurrency in database operations.
Durability: Data written to the database is persisted and will remain there even in the event of faliure.
BASE: Basically available soft state eventual consistency
maintains integrity of NoSQL databases and ensures their proper functioning. It is a key factor in the scalability of NoSQL databases.
Basically available: Ensures that system is always available
Soft state: provides flexibility to the system and allows it to change over time for faster access.
Eventual Consistency: ensures that the data in the system takes some time to reach a consistent state and eventually becomes consistent.
SQL vs NoSQL
one needs to be clear about type of storage according to the system requirements.
If system is distributed in nature and scalability is essential then NoSQL databases are the best choice to go with.
NoSQL databases when the amount of data is huge
SQL databases are favorable when data structure is more important .
It is preferred when queries are complex and databases require fewer updates but there is always a tradeoff between NoSQL and SQL databases.
Sometimes according to requirements a polyglot artitecture comprising both SQL and NoSQL databases is used to ensure application performance.
Scalability in distributed system
As service grows number of requests increases, system can become slow and performance may be affected.
Scaling is the best way to adress this issue. There are two main ways to scale
horizontal scaling: Adding more servers to distribute requests.
Vertical scaling: Increasing the capacity of a single machine by upgrading the resources like CPU, RAM etc.
Caching:
It helps to improve the performance of a system by reducing its latency by storing frequently used data in a cache so that it can be accessed quickly instead of querying the database
But implementing a cache also adds complexity because it is important to maintain consistency between data in the cache and data in the main database.
Cache is expensive, so its size is limited.
To ensure best performance, cache eviction algorithms like LIFO, FIFO, LRU and LFU
Consistent hashing:
Widely used concepts in distributed systems because it offers flexibility in scaling. It is an improvement over traditional hashing which is ineffective for handling requests over a network.
In this Users and servers are located in a virtual circular ring structure called hash ring. The ring is considered infinite and can accommodate any numbers of servers .
This allows for efficient distribution of requests or data among servers. It helps us achieve horizontal scaling which increases throughput and reduces latency.
CAP Theorm
Concepts for designing networked shared data systems.
It states that a distributed data system can provide Consistency, avalability and partition tolerance.
One can make tradeoffs between three properties based on the use cases of our DBMS Systems.
Consistency:
everything should go on in a very well coordinated manner and with proper synchronization.
It ensures that systems returns the results such that any read operation should give the most recent write operation
Availability:
System is always available whenever the request is made.
Availability:
Partition Tolerance:
necessary for any distributed system, so we always need to choose between availability and consistency.
It corresponds to the conditional that the system should work irrespective of any node faliure or network faliure.
Some other concepts
Server-sent events
Long polling
Client server architecture
Web sockets
Load balancing algorithms
Peer to Peer networks
Master Slave replication
Throttling and rate limiting
How to choose the right database
Leader election in distributed systems
System design interview guide
These are open ended conversations where candidates and interviewer discuss to reach a solution.
Step 1: Requirements Clarifications
Ask question and clarify any ambiguities early in the interview process.
We can ask
who will be using the system?
How will they be using it?
How many users are there?
What does the system do?
What are the inputs and outputs of the system?
How much data is expected to be handled?
How many requests per second are expected?
What is the expected read-to-write ratio?
Step 2: Back of the envelope estimation
It is important to understand the system scale.
One need to consider the factors like
read-to-write ratio,
number of concurrent requests,
number of monthly users,
number of daily active users ,
required storage capacity,
bandwith requiremtns etc
Step 3: Create a high level design
identify all critical components and attempt to draw a digram representing the interactions of these components to fulfill both functional and non functional requirements.
these components include
different client
backend services
data stores
cache systems
load balancers
CDNs
message queues
batch processing systems
By out linging the interaction of key components one can better explain the overall structure of the system and how various components will collaborate to achieve the desired results.
Step 4: Database design
defining how data will be processed is a crucial
one need to understand type of data stored, relationships between different types of data , expected workload ( read-heavy, write heavy or balanced)
Step 5: Design Core components
Once all high level structure has been discussed then we discuss the major components in more detail.
At this stage we get the we might get some inputs from the interviewer.
For ex-- if we were asked to design a tiny URL then
Generating and storing a hash of the full URL
MD5 and Base62
Hash Collisions
SQL and NoSQL
Database Schema
Translating a hashed URL to full URL
Database lookup
API and Object-oriented design
Step 6: Scale the dsign
to meet the scale requiremnts we can use ideas like
load balancing
horizontal scaling
caching
database sharding
replication etc
Step 7: Identifying and resolving bottlenecks
One can discuss following bottlenecks
Is there any single point of faliure? what steps can be taken to mitigate this?
Are there enough replicas of data to ensure that users can still be served if a few servers are down?
Are there enough copies of different services running to ensure that a few faliures will not cause a total system shutdown?
How is the performance of the service being monitored? are there alerts in place to notify the team when a critical components fail or their performance degrades?
System design questions
Key value store
typehead system design
Pastebin system design
Designing API rate limiter
designing web crawler
Youtube system design
Designing Google Docs
Network Protocols
Group of computers connected through communication channels for resource sharing, data exchange and various types of communication.
these devices communicate with each other following rules known as network protocols.
Why do we need protocols?
these define the format of data packets so that devices can process transmitted data correctly
Specify how devices can exchange messages: sequence of messages , types of message
provide a way to route data packets , how devices are identified how address are assigned how routing descisions are made t send data to desired destination.
safety in data transmission,
implement flow control to manage the rate of data transmission
features to detect and correct errors during data transmission
Protocols are designed to scale well with growth in the network size.
OSI Model
divides whole system into 7 different layers, where each layer has specific responsibilities and each layer interacts with adjecent layers to fecilitate communication.
Seven layers of OSI Model:
Application
Top layer
Fecilitate comm. b/w software application or services runnign on different devices over network
ex -- HTTP, SMTP, FTP
Presentation:
Responsible for data formatting, encrypting and compression.
Session:
Establish reliable and efficient data transfer.
It segments the data received from upper layers into smaller units (data packets) and provides error recovery, flow control and congestion control
Ex-- TCP and UDP
Network
determine the best path for the data packets to reach their destination across different networks, using routing protocols and ip adressing
Data link layer
Provide reliable and error free transmission of data over physical layer.
It is responsible for establishing and terminating connections between network nodes, performing error detection and correction and handling the flow of data frames between adjacent node.
Physical layer
deals with physical transmission.
defines the electrical, mechanical and physical aspects of the network like cables, connectors and signaling methods.
Types of netowk protools
each protocol has a different purpouse and operate in different layers of OSI model. These are:
Internet protocol(IP):
Network layer protocol
helps data packets to reach their destination.
when one machine wants to send data to another machine, it sends it in the form of IP packets
An IP packet is fundamental data unit that is transmitted over any network that uses IP.
It has 2 key parts:

IP header
contains info about the packet like source destination IP adress, size of packet, version of IP and other control info
IP data (payload)
It contains the actual data being transmitted.
What happens
When device sends a packet it first determines the IP adress of the destination device, Packet is then sent to the first hop on the network i.e. a router.
The router examines the destination IP address in the header and determines the next hop on the network to which the packet should be forwarded.
This process continues till the packet reaches its final destination.
Here routing algos are used to determine best path
What is an IP adress?
digital address that helps us identify and locate a device on a network.
unique and assigned to each device or domain that connects to the internet.
Written as four numbers separated by periods, these represent numerical value of the address.
Types of IP address:
IPv4
First version of IP
Uses 32 bit address scheme, so it can accommodate 2^31 distinct IP address
Still carries major chunk of internet traffic
also provides benefits like encrypted data transmission and cost effective data routing.
IPv6
developed to overcome the limitation of IPv4
uses 128 bit address space
improved routing, packet processing and security features compared to IPv4.
But IPv6 is not compatible with IPv4 and upgrading can present a challenge.
Transmission control protocol (TCP)
Used to send large data. The data is divided into multiple packets.
But increase in packets increases the risk of packets not reaching the destination or arriving in the wrong order, TCP helps us in mitigating this
TCP is transport layer protocol built on top of the IP, it is used in conjunction with the IP to provide a complete suite of protocols for transmitting the data .
How it works
Breaks data into small packets and numbers them in sequence to ensure that they are received in the correct order at the receiver.
If there is any loss of packet TCP can retransmit it.
So IP is responsible for delivering the packet while TCP helps to put them back in correct order
when a device wants to communicate with another device using TCP, it establishes a connection with the destination device through a sequenced acknowledgement
Process management in OS
Latency of a system
Throughput of a system
Availability of a system
Distributed system Concepts
Client server architecture
Peer-to-peer Architecture
Advanced concepts
Load balancers
Load balancing algorithms
Forward and reverse proxy
What is a websocket
Server sent events
http long pooling
Consistent hashing
Throttling and rate limiting
Publisher subscriber pattern
Leader election
Database concepts
Fundamentals of caching
Storage and redundancy
Database partition
Master slave replication
SQL vs NoSQL database
Key-Value Database
Graph Database
SQL for Data Science
Basic interview concepts
Design LRU Cache
Design LFU Cache
Design Bloom filter
Design key value store
Paste bin system design
Typeahead System design
URL shortener system design
Design Notification Service
Design QR code generator
Design API rate limiter
Advanced interview concepts
Web crawler system design
You tube system design
WhatsApp system design
Google Docs system design
Instagram system design
Twitter system design
Dropbox system design
Yelp system design
Uber system design
Recommender system
Last updated
