Server: HTTP Server hosting the file (must support Range requests).
2. What are the must-have features? (Core)
Parallelism: Split file into N segments.
Resumability: Save progress state to disk.
Assembly: Merge segments into final file.
3. What are the constraints?
Disk I/O: Concurrent writes to the same file require locking or positional writes (e.g., a separate RandomAccessFile per thread).
Network: The server may throttle or ban a client that opens too many connections.
Phase 2: Use Cases
UC1: Start Download
Actor: User
Flow:
User provides URL.
Manager sends HEAD request to get Content-Length.
Checks that the response includes Accept-Ranges: bytes.
Calculates segment size (Size / N); the last segment takes the remainder when Size is not divisible by N.
Creates N Segment objects (Start, End, Downloaded=0).
Starts N Worker threads.
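The segment calculation in the flow above can be sketched as follows. This is a minimal illustration, not the document's implementation: the class name SegmentPlanner, its plan method, and the field names are hypothetical, and it assumes inclusive byte offsets (as in an HTTP Range header) with the last segment absorbing the remainder.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits [0, contentLength) into up to n inclusive byte ranges. */
final class SegmentPlanner {

    /** Metadata for one chunk; start and end are inclusive offsets. */
    static final class Segment {
        final long start;
        final long end;
        long downloaded = 0;

        Segment(long start, long end) {
            this.start = start;
            this.end = end;
        }

        long length() {
            return end - start + 1;
        }
    }

    static List<Segment> plan(long contentLength, int n) {
        if (contentLength <= 0 || n <= 0) {
            throw new IllegalArgumentException("contentLength and n must be positive");
        }
        // Never create more segments than there are bytes.
        int count = (int) Math.min(n, contentLength);
        long base = contentLength / count;
        List<Segment> segments = new ArrayList<>(count);
        long start = 0;
        for (int i = 0; i < count; i++) {
            // The last segment absorbs the remainder of the integer division.
            long end = (i == count - 1) ? contentLength - 1 : start + base - 1;
            segments.add(new Segment(start, end));
            start = end + 1;
        }
        return segments;
    }
}
```

For example, plan(10, 3) yields the ranges 0-2, 3-5, and 6-9, so every byte is covered exactly once.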
UC2: Worker Progress
Actor: Worker Thread
Flow:
Worker sends GET with header Range: bytes=Start-End.
Server responds with 206 Partial Content.
Worker streams bytes to temp_part_k file (or offsets in main file).
Updates downloaded count in shared state.
If interrupted, saves state.
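The worker's copy loop might look like the sketch below. It deliberately omits the HTTP call: body stands for the stream of a 206 response and part for the temp_part_k file, and the names SegmentWorkerSketch, rangeHeader, and stream are hypothetical. It assumes the caller persists state after the loop returns early on interruption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicLong;

final class SegmentWorkerSketch {

    /** Builds the Range header value for an inclusive byte range. */
    static String rangeHeader(long start, long end) {
        return "bytes=" + start + "-" + end;
    }

    /**
     * Copies the response body to the part file, updating the shared counter
     * after every chunk. Stops early if the thread is interrupted, so the
     * caller can save state. Returns the number of bytes copied.
     */
    static long stream(InputStream body, OutputStream part, AtomicLong downloaded)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while (!Thread.currentThread().isInterrupted()
                && (read = body.read(buffer)) != -1) {
            part.write(buffer, 0, read);
            downloaded.addAndGet(read); // visible to the manager for progress display
            total += read;
        }
        return total;
    }
}
```

An AtomicLong keeps the progress update safe without a lock, since several workers and the manager may touch the shared state at once.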
Phase 3: Class Diagram
Step 1: Core Entities
DownloadManager: Facade.
DownloadTask: State of one file download.
Segment: Metadata for a chunk.
Worker: Runnable.
UML Diagram
Phase 4: Design Patterns
1. Master-Worker Pattern (Parallel Processing)
Description: A controller (Master) distributes identical tasks to multiple worker threads and aggregates the results.
Why used: To saturate the bandwidth, the Manager splits a large file into N segments. It assigns each segment to a Worker thread. The Manager then waits (Barrier) for all to finish before merging the parts.
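One way to express the master's "distribute, then wait" step is with a fixed thread pool. This is a sketch under assumptions: each segment download is modeled as a Callable that returns the bytes it fetched, and MasterWorkerSketch and runAll are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class MasterWorkerSketch {

    /** Runs one task per segment and returns the total bytes reported by all workers. */
    static long runAll(List<Callable<Long>> segmentTasks, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every task has completed: this is the barrier.
            List<Future<Long>> results = pool.invokeAll(segmentTasks);
            long total = 0;
            for (Future<Long> result : results) {
                // get() rethrows a worker failure wrapped in ExecutionException.
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Only after runAll returns would the manager move on to merging, and a failed worker surfaces as an exception rather than a silently short file.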
2. State Pattern
Description: Allows an object to alter its behavior when its internal state changes.
Why used: A Download Task has complex states (PENDING, DOWNLOADING, PAUSED, FAILED, COMPLETED). The behavior of clicking "Start/Resume" depends entirely on the current state (e.g., Resume only works if Paused).
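A compact way to sketch this in Java is an enum whose constants override only the actions they permit. The five state names come from the text above; the action names and the exact transitions (for instance, that FAILED may be restarted) are assumptions made for illustration.

```java
enum DownloadState {
    PENDING {
        @Override DownloadState start() { return DOWNLOADING; }
    },
    DOWNLOADING {
        @Override DownloadState pause() { return PAUSED; }
        @Override DownloadState fail() { return FAILED; }
        @Override DownloadState complete() { return COMPLETED; }
    },
    PAUSED {
        @Override DownloadState resume() { return DOWNLOADING; }
    },
    FAILED {
        // Assumption: a failed task may be retried from the start action.
        @Override DownloadState start() { return DOWNLOADING; }
    },
    COMPLETED;

    // Default behavior: every action is rejected unless a state overrides it.
    DownloadState start() { throw illegal("start"); }
    DownloadState pause() { throw illegal("pause"); }
    DownloadState resume() { throw illegal("resume"); }
    DownloadState fail() { throw illegal("fail"); }
    DownloadState complete() { throw illegal("complete"); }

    IllegalStateException illegal(String action) {
        return new IllegalStateException("Cannot " + action + " while " + this);
    }
}
```

With this shape, PAUSED.resume() returns DOWNLOADING, while PENDING.resume() throws an IllegalStateException, so the caller needs no if/else chain over the current state.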
Phase 5: Code Key Methods
Java Implementation
Phase 6: Discussion
Resume Logic
Q: "How to handle crashes?"
A: "Serialize the DownloadTask object (including the segment list and the downloaded byte count of each segment) to a JSON file on disk. On restart, load the JSON, verify each count against the size of the corresponding .part file, and request only the remainder with Range: bytes=(Start+Downloaded)-End."
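The range adjustment on restart can be sketched as below. JSON loading is left out because the Java standard library has no JSON parser. The names ResumeSketch, bytesOnDisk, and resumeRange are hypothetical, and the sketch assumes the .part file size is treated as the authoritative count, since the saved state may lag behind the last bytes written.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class ResumeSketch {

    /** Bytes already on disk for a segment, clamped to the segment length. */
    static long bytesOnDisk(Path partFile, long start, long end) throws IOException {
        long segmentLength = end - start + 1;
        long onDisk = Files.exists(partFile) ? Files.size(partFile) : 0L;
        return Math.min(onDisk, segmentLength);
    }

    /** Range header for the remaining bytes, or null if the segment is complete. */
    static String resumeRange(long start, long end, long downloaded) {
        if (downloaded >= end - start + 1) {
            return null; // nothing left to request
        }
        return "bytes=" + (start + downloaded) + "-" + end;
    }
}
```

For a segment covering bytes 100-199 with 40 bytes already in its .part file, resumeRange(100, 199, 40) gives bytes=140-199, and the worker appends to the existing file.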
File I/O Optimization
Q: "Why separate temp files?"
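A: "Writing to a single file from multiple threads requires RandomAccessFile and careful seeking. A single RandomAccessFile instance is not safe to share, because seek and write act on one shared file pointer; each thread needs its own instance (or positional writes through FileChannel), and contention can still occur at the OS level. Separate files avoid contention entirely, and merging them is sequential I/O, which is fast."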
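The sequential merge of separate temp files can be sketched as follows. PartMerger and merge are hypothetical names, and the sketch assumes the caller passes the part files in segment order.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

final class PartMerger {

    /** Appends each part file, in the order given, to the target file. */
    static void merge(List<Path> partsInOrder, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target,
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE)) {
            for (Path part : partsInOrder) {
                // Files.copy streams the whole part into the open output stream.
                Files.copy(part, out);
            }
        }
    }
}
```

Because a single thread reads each part front to back and writes the target front to back, no seeking or locking is involved.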
Concurrency
Q: "Optimal Thread Pool Size?"
A: "The work is network-bound, not CPU-bound, so the pool size need not match the core count. However, too many threads add overhead and risk a server ban. As a rule of thumb, 4-8 threads are usually enough for consumer connections."