Heap

Complete binary tree with heap property: parent ≥ children (max-heap) or parent ≤ children (min-heap). Used for priority queue, top-K, K-way merge, median maintenance.


1. Concept Overview

Problem space: Top K elements (min-heap of size K for largest; max-heap for smallest), K-way merge (min-heap of heads), median from stream (max-heap left half, min-heap right half), scheduling (earliest deadline / highest priority).

When to use: Need repeated "extract min/max" or "maintain K best"; merge K sorted; dynamic median; Dijkstra.


2. Core Algorithms

Array Representation (0-indexed)

  • Parent: (i-1)//2; Left: 2*i+1; Right: 2*i+2.

  • Insert: Append, then bubble up (compare with parent, swap if violates).

  • Extract: Swap root with last, pop last, bubble down (swap with smaller/larger child until heap property holds).

  • Heapify: From last non-leaf down to 0, bubble down. O(N) total.

Top K Largest

  • Min-heap of size K. For each x: if len(heap)<K, push x; else if x > heap[0], pop then push x. Result = heap (or pop all). O(N log K).

K-Way Merge

  • Min-heap stores (value, list_id, index). Push first element of each list. Pop min, push next from same list. O(N log K).

Two Heaps (Median)

  • Max-heap (left) holds lower half; min-heap (right) upper half. Keep balance (|left| - |right| ≤ 1). Median = top of larger or average of tops.


3. Advanced Variations

  • Lazy deletion: For "remove element" in heap (e.g., sliding window median), mark deleted and ignore when it surfaces at top. Count valid vs invalid at top.

  • Custom comparator: Objects by key; (priority, timestamp) for stable ordering.

  • Kth largest in stream: Same as top-K with min-heap of size K.

Edge Cases

  • Empty input; K=0 or K > n; duplicates; negative numbers; median with even count (average of two mids).


4. Common Interview Problems

Easy: Kth Largest in Stream. Medium: K Closest Points, Top K Frequent Elements, Reorganize String, Task Scheduler. Hard: Find Median from Data Stream, Merge K Sorted Lists, Sliding Window Median, IPO.


5. Pattern Recognition

  • Top K: Min-heap size K (for largest) or max-heap (for smallest); if stream, maintain size K.

  • K-way merge: Heap of heads; pop min, push next from same source.

  • Median / percentiles: Two heaps (max for lower, min for upper); rebalance on insert.

  • Scheduling: Priority by deadline or profit; heap for "next available" or "highest profit".


6. Code Implementations


7. SDE-3 Level Thinking

  • Trade-offs: Heap vs sort — heap O(N log K) for top-K without fully sorting. Two heaps for median: O(log N) add, O(1) median; alternative is sorted structure (e.g., balanced BST).

  • Memory: Top-K heap uses O(K). Median two heaps O(N). For distributed top-K, merge local top-K from each shard.

  • Concurrency: Thread-safe priority queue (locking or lock-free); multiple consumers with priority.


8. Interview Strategy

  • Identify: "K largest/smallest" → heap of size K. "Merge K sorted" → K-way merge. "Median" → two heaps.

  • Common mistakes: Wrong heap type (max vs min for "K largest"); forgetting to rebalance in two heaps; off-by-one in median (even vs odd).


9. Quick Revision

  • Formulas: Parent (i-1)//2, children 2*i+1, 2*i+2. Heapify O(N); insert/extract O(log N).

  • Tricks: K largest → min-heap K; K smallest → max-heap K (negate in Python). Median → lo (max) + hi (min), keep lo size ≥ hi.

  • Edge cases: K=0, K>n; empty stream; duplicate values.

  • Pattern tip: "K" + "largest/smallest" or "merge K" → heap.

Last updated