
22. Design Download Manager

Difficulty: Hard

Topics: Concurrency, Multi-threading, HTTP Range Headers, File I/O

Key Concepts: Range Requests, RandomAccessFile, Thread Pooling

Phase 1: Requirements Gathering

Goals

  • Design a download manager that accelerates downloads using parallel connections.

  • Support pausing, resuming, and error recovery.

1. Who are the actors?

  • User: Starts downloads, pauses/resumes.

  • Worker Threads: Independent threads downloading chunks.

  • Server: HTTP Server hosting the file (must support Range requests).

2. What are the must-have features? (Core)

  • Parallelism: Split file into N segments.

  • Resumability: Save progress state to disk.

  • Assembly: Merge segments into final file.

3. What are the constraints?

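  • Disk I/O: Writing to the same file concurrently requires either locking or positional writes, with each thread writing at its own offset (for example through its own RandomAccessFile).

  • Network: The server may throttle or ban a client that opens too many connections.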


Phase 2: Use Cases

UC1: Start Download

Actor: User

Flow:

  1. User provides URL.

  2. Manager sends HEAD request to get Content-Length.

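  3. Manager checks the response for Accept-Ranges: bytes; without it, the download cannot be split into parallel range requests.

  4. Calculates segment size (Size / N); the last segment also takes the remainder.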

  5. Creates N Segment objects (Start, End, Downloaded=0).

  6. Starts N Worker threads.
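
The flow above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class name SegmentPlanner, its nested Segment class, and the convention of returning -1 when ranges are unsupported are all assumptions made for illustration. The probe uses the JDK's java.net.http client (Java 11+).

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: probes the server, then splits the file into segments.
class SegmentPlanner {
    /** Inclusive byte range [start, end] plus the bytes downloaded so far. */
    static final class Segment {
        final long start;
        final long end;
        long downloaded = 0;
        Segment(long start, long end) { this.start = start; this.end = end; }
    }

    /** Splits [0, size) into at most n ranges; the last one absorbs the remainder. */
    static List<Segment> plan(long size, int n) {
        if (size <= 0 || n <= 0) {
            throw new IllegalArgumentException("size and n must be positive");
        }
        int count = (int) Math.min(n, size);   // never create empty segments
        long base = size / count;
        List<Segment> segments = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            long start = i * base;
            long end = (i == count - 1) ? size - 1 : start + base - 1;
            segments.add(new Segment(start, end));
        }
        return segments;
    }

    /** HEAD probe: Content-Length if the server advertises byte ranges, else -1. */
    static long probe(HttpClient client, String url) throws Exception {
        HttpRequest head = HttpRequest.newBuilder(URI.create(url))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<Void> resp = client.send(head, HttpResponse.BodyHandlers.discarding());
        boolean ranges = resp.headers().firstValue("Accept-Ranges")
                .map(v -> v.trim().equalsIgnoreCase("bytes"))
                .orElse(false);
        long length = resp.headers().firstValueAsLong("Content-Length").orElse(-1);
        return ranges ? length : -1;
    }
}
```

For a 10-byte file and N = 3, plan(10, 3) yields the ranges 0-2, 3-5 and 6-9. A caller that gets -1 from probe would typically fall back to a single connection.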

UC2: Worker Progress

Actor: Worker Thread

Flow:

  1. Worker sends GET with header Range: bytes=Start-End.

  2. Server responds with 206 Partial Content.

  3. Worker streams bytes to temp_part_k file (or offsets in main file).

  4. Updates downloaded count in shared state.

  5. If interrupted, saves state.
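
As an illustration of steps 1 and 2, the sketch below builds the ranged GET and checks the status code. The class RangeRequests and its method names are hypothetical; only the java.net.http calls are real JDK APIs. Both ends of an HTTP byte range are inclusive, and a plain 200 response means the server ignored the Range header and is sending the whole file.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper for the worker's ranged GET.
class RangeRequests {
    /** Range header value for the bytes a segment still needs (both ends inclusive). */
    static String rangeValue(long start, long end, long downloaded) {
        long from = start + downloaded;
        if (from > end) {
            throw new IllegalStateException("segment already complete");
        }
        return "bytes=" + from + "-" + end;
    }

    /** Builds the GET request for the remaining part of a segment. */
    static HttpRequest build(String url, long start, long end, long downloaded) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Range", rangeValue(start, end, downloaded))
                .GET()
                .build();
    }

    /** Only 206 means the range was honored; the worker should abort on anything else. */
    static boolean rangeHonored(int statusCode) {
        return statusCode == 206;
    }
}
```

A worker that receives 200 instead of 206 should stop rather than write the full body into its segment, since the data would land at the wrong offsets.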


Phase 3: Class Diagram

Step 1: Core Entities

  • DownloadManager: Facade.

  • DownloadTask: State of one file download.

  • Segment: Metadata for a chunk.

  • Worker: Runnable.

UML Diagram


Phase 4: Design Patterns

1. Master-Worker Pattern (Parallel Processing)

  • Description: A controller (Master) distributes identical tasks to multiple worker threads and aggregates the results.

  • Why used: To saturate the available bandwidth, the Manager splits a large file into N segments and assigns each one to a Worker thread. It then waits at a barrier for all workers to finish before merging the parts.
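
A minimal sketch of the master-worker split and barrier, using an in-memory byte array as a stand-in for the remote file so that it runs without a network. The class MasterWorkerDemo and the method parallelCopy are hypothetical names; ExecutorService.invokeAll plays the role of the barrier because it returns only after every task has finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo: the "master" splits the work, the workers copy disjoint slices.
class MasterWorkerDemo {
    /** Copies source into a new array using up to `workers` tasks over disjoint slices. */
    static byte[] parallelCopy(byte[] source, int workers) throws Exception {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        byte[] dest = new byte[source.length];
        int count = Math.max(1, Math.min(workers, source.length));
        int base = source.length / count;

        List<Callable<Integer>> tasks = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            final int start = i * base;
            // End is exclusive here; the last slice absorbs the remainder.
            final int end = (i == count - 1) ? source.length : start + base;
            tasks.add(() -> {
                // Stand-in for "GET with Range, then stream the bytes to disk".
                System.arraycopy(source, start, dest, start, end - start);
                return end - start;
            });
        }

        ExecutorService pool = Executors.newFixedThreadPool(count);
        try {
            int total = 0;
            // invokeAll blocks until every task has completed: this is the barrier.
            for (Future<Integer> done : pool.invokeAll(tasks)) {
                total += done.get();   // rethrows a worker failure as ExecutionException
            }
            if (total != source.length) {
                throw new IllegalStateException("incomplete copy: " + total);
            }
            return dest;               // only now is it safe to "merge"
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the slices never overlap, the workers need no locking; the master only aggregates byte counts and surfaces failures.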

2. State Pattern

  • Description: Allows an object to alter its behavior when its internal state changes.

  • Why used: A DownloadTask moves through several states (PENDING, DOWNLOADING, PAUSED, FAILED, COMPLETED). What "Start/Resume" does depends entirely on the current state (e.g., Resume only works if the task is PAUSED).
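
One lightweight way to sketch this is an enum with a transition table, which is a simplified variant rather than the classic one-class-per-state form of the pattern. The class DownloadStateMachine and the exact set of allowed transitions (for example, retrying from FAILED) are assumptions for illustration; only the five state names come from the text above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical state holder for one download task.
class DownloadStateMachine {
    enum State {
        PENDING, DOWNLOADING, PAUSED, FAILED, COMPLETED;

        /** States reachable from this one (assumed transition table). */
        Set<State> next() {
            switch (this) {
                case PENDING:     return EnumSet.of(DOWNLOADING);
                case DOWNLOADING: return EnumSet.of(PAUSED, FAILED, COMPLETED);
                case PAUSED:      return EnumSet.of(DOWNLOADING);
                case FAILED:      return EnumSet.of(DOWNLOADING);     // retry
                default:          return EnumSet.noneOf(State.class); // COMPLETED is terminal
            }
        }
    }

    private State state = State.PENDING;

    synchronized State state() {
        return state;
    }

    /** Moves to target, or throws if the current state does not allow it. */
    synchronized void transitionTo(State target) {
        if (!state.next().contains(target)) {
            throw new IllegalStateException(state + " -> " + target + " is not allowed");
        }
        state = target;
    }

    void start()    { transitionTo(State.DOWNLOADING); }
    void pause()    { transitionTo(State.PAUSED); }
    void fail()     { transitionTo(State.FAILED); }
    void complete() { transitionTo(State.COMPLETED); }

    /** Resume is stricter than start: it is valid only from PAUSED. */
    synchronized void resume() {
        if (state != State.PAUSED) {
            throw new IllegalStateException("resume requires PAUSED, was " + state);
        }
        state = State.DOWNLOADING;
    }
}
```

Calling resume() on a COMPLETED or PENDING task throws instead of silently restarting it, which keeps the UI logic for the "Start/Resume" button out of the caller.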


Phase 5: Code Key Methods

Java Implementation
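
No code accompanies this heading in the text, so what follows is only a hedged sketch of how Segment and Worker from Phase 3 might look. The RangeSource interface is a hypothetical stand-in for "GET with a Range header" (a real version would wrap an HTTP client), and the sketch assumes one part file per segment whose size always equals the segment's downloaded count.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.atomic.AtomicLong;

/** Metadata for one chunk; start and end are inclusive offsets in the remote file. */
class Segment {
    final long start;
    final long end;
    final AtomicLong downloaded = new AtomicLong();   // read by the UI, written by the worker

    Segment(long start, long end) { this.start = start; this.end = end; }

    long length() { return end - start + 1; }
    boolean isComplete() { return downloaded.get() >= length(); }
}

/** Hypothetical abstraction over a ranged GET. */
interface RangeSource {
    InputStream open(long from, long toInclusive) throws IOException;
}

class Worker implements Runnable {
    private final Segment segment;
    private final RangeSource source;
    private final Path partFile;
    private volatile IOException failure;

    Worker(Segment segment, RangeSource source, Path partFile) {
        this.segment = segment;
        this.source = source;
        this.partFile = partFile;
    }

    @Override
    public void run() {
        if (segment.isComplete()) return;
        long from = segment.start + segment.downloaded.get();   // resume point
        try (InputStream in = source.open(from, segment.end);
             OutputStream out = Files.newOutputStream(partFile,
                     StandardOpenOption.CREATE, StandardOpenOption.APPEND)) {
            byte[] buf = new byte[8192];
            int n;
            // Interrupting the thread acts as "pause": progress so far stays in the part file.
            while (!Thread.currentThread().isInterrupted()
                    && !segment.isComplete()
                    && (n = in.read(buf)) != -1) {
                long remaining = segment.length() - segment.downloaded.get();
                int len = (int) Math.min(n, remaining);   // ignore bytes past the segment end
                out.write(buf, 0, len);
                segment.downloaded.addAndGet(len);
            }
        } catch (IOException e) {
            failure = e;   // the manager can inspect this and mark the task FAILED
        }
    }

    IOException failure() { return failure; }
}
```

Keeping the network behind RangeSource makes the worker easy to exercise with an in-memory stream, and running the same Worker again after an interruption continues from start + downloaded.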


Phase 6: Discussion

Resume Logic

Q: "How to handle crashes?"

  • A: "Serialize the DownloadTask object (including the segment list and the downloaded byte count of each segment) to a JSON file on disk. On restart, load the JSON, check the size of each .part file against the saved count, and request only the remainder with Range: bytes=(Start+Downloaded)-End."
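
The reconciliation step can be sketched as below. JSON loading is left out because it needs a third-party library; the sketch assumes the segment's Start and End have already been read back from the saved state. The class ResumePlanner and its methods are hypothetical, and treating the .part file size as the source of truth is an assumption that holds only when each part file is written strictly by appending.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: decides what a segment still needs after a restart.
class ResumePlanner {
    /** Bytes of the segment already on disk, clamped to the segment length. */
    static long recoveredBytes(Path partFile, long start, long end) throws IOException {
        long length = end - start + 1;
        long onDisk = Files.exists(partFile) ? Files.size(partFile) : 0;
        // A crash can land between a write and the state save, so the file size
        // (not the saved counter) is used. An oversized file is treated as
        // complete here; a stricter version might truncate or discard it.
        return Math.min(onDisk, length);
    }

    /** Range header value for the remainder, or null when nothing is left. */
    static String resumeRange(Path partFile, long start, long end) throws IOException {
        long done = recoveredBytes(partFile, start, end);
        if (done >= end - start + 1) {
            return null;
        }
        return "bytes=" + (start + done) + "-" + end;
    }
}
```

For a segment covering bytes 100-199 whose part file holds 30 bytes, resumeRange returns "bytes=130-199"; with no part file it returns the full "bytes=100-199".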

File I/O Optimization

Q: "Why separate temp files?"

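  • A: "Writing to a single file from multiple threads requires positional writes and careful seeking. A shared RandomAccessFile has only one file pointer, so seek-then-write calls from several threads can interleave; each thread needs its own instance (or FileChannel positional writes), and contention can still occur at the OS level. Separate files avoid contention entirely, and the final merge is sequential I/O, which is fast."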
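
A minimal sketch of the sequential merge step, assuming the part files are passed in segment order. The class PartMerger is a hypothetical name; FileChannel.transferTo is a real JDK call that may move fewer bytes than requested, which is why it sits in a loop.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical helper: concatenates finished part files into the final file.
class PartMerger {
    /** Appends each part, in order, to target. Returns the total bytes written. */
    static long merge(List<Path> partsInOrder, Path target) throws IOException {
        long total = 0;
        try (FileChannel out = FileChannel.open(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (Path part : partsInOrder) {
                try (FileChannel in = FileChannel.open(part, StandardOpenOption.READ)) {
                    long size = in.size();
                    long pos = 0;
                    while (pos < size) {
                        // Writes at out's current position and advances it.
                        pos += in.transferTo(pos, size - pos, out);
                    }
                    total += size;
                }
            }
        }
        return total;
    }
}
```

The merge runs on one thread after all workers have finished, so it needs no locking; the trade-off is that the file is briefly stored twice on disk until the parts are deleted.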

Concurrency

Q: "Optimal Thread Pool Size?"

  • A: "Downloads are network-bound, not CPU-bound, so the pool size is not tied to the number of cores. However, too many threads add overhead and risk a server ban. Usually 4-8 threads work well for consumer connections."


SOLID Principles Checklist

  • S (Single Responsibility): Worker downloads bytes, DownloadTask manages segments, DownloadManager starts tasks.

  • O (Open/Closed): Add an FTPDownloader by extending Worker, without modifying existing classes.

  • L (Liskov Substitution): N/A.

  • I (Interface Segregation): N/A.

  • D (Dependency Inversion): N/A.
