#3 Google Maps

Let's design a simplified Google Maps: map tiles, search, routing, POIs, and basic traffic. I'll keep it practical, covering scope, requirements, APIs, architecture, core data and algorithms, scaling and back-of-envelope numbers, trade-offs, and an evolution path. Treat this as the answer you'd give in a system-design interview.

1 — Scope & assumptions

Goal: User-facing map app that supports:

  • Map display (tiles) and pan/zoom

  • Place search & autocomplete (POIs)

  • Reverse geocoding (lat/lon → address)

  • Turn-by-turn routing for driving (fastest path)

  • Basic live traffic overlays (coarse)

  • Save/bookmark places

Assumptions (simple version):

  • Target: 1M monthly active users (MAU), peak 2k QPS for map tiles & 500 QPS for search/routing.

  • Global-ish coverage but tolerate coarser data quality.

  • Mobile-first; offline caching acceptable (basic).

  • Latency targets: tile load < 200 ms (from CDN), search P95 < 300 ms, route compute < 500 ms.

2 — Functional requirements

Must-haves

  • Serve map tiles at multiple zoom levels.

  • Search places by name/address with autocomplete.

  • Reverse geocode coordinates to human-readable address.

  • Compute driving route (start → end) with step-by-step directions.

  • Show estimated travel time (ETA) and distance.

  • Display basic traffic layer (color by speed: green/yellow/red).

  • Persist user bookmarks/favorites.

Nice-to-have (not for simple v1)

  • Transit routing, walking/biking, turn-by-turn voice guidance, live incident reports, offline full-region maps.

3 — Non-functional requirements

  • Performance: Tile load P95 < 200 ms (CDN hit), search P95 < 300 ms.

  • Availability: 99.9% for read operations (tiles/search); route compute 99% SLA.

  • Scalability: Handle bursts (events, navigation peaks) via autoscaling & CDN.

  • Accuracy: Routing accuracy adequate for driving; traffic precision coarse.

  • Privacy & Security: TLS, opt-in location sharing, minimal retention of location telemetry unless user consents.

  • Cost-awareness: Use CDNs and caching to reduce origin load.

4 — External APIs (high-level)
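
A minimal REST surface, inferred from the functional requirements and the example flow in section 14. Apart from /search and /route, which appear in that flow, the paths and parameters are illustrative:

```
GET  /tiles/{z}/{x}/{y}                     # map tile (vector or raster), served via CDN
GET  /search?q=<text>&bbox=<viewport>       # place search / autocomplete, viewport-biased
GET  /geocode/reverse?lat=<lat>&lon=<lon>   # coordinates -> formatted address
POST /route                                 # {origin, destination, mode} -> polyline, steps, ETA
GET  /traffic/{z}/{x}/{y}                   # coarse traffic overlay tile
POST /bookmarks                             # save a place for the signed-in user
```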

5 — High-level architecture
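
A rough sketch of the main components and request paths, assembled from the storage and serving pieces described in sections 6–8; boundaries are indicative rather than prescriptive:

```
Client (mobile/web)
  ├── map tiles ──► CDN ──(miss)──► tile origin (blob store) ◄── tile-generation pipeline
  ├── /search ────► search service ───► POI index (Elasticsearch) + query cache
  ├── /geocode ───► geocoding service ─► spatial index (PostGIS / R-tree)
  ├── /route ─────► routing service ───► in-memory road-graph partitions + traffic store
  └── /bookmarks ─► user service ──────► user DB (RDBMS)

Telemetry + third-party feeds ──► streaming pipeline ──► traffic store ──► overlay tiles ──► CDN
```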

6 — Data & storage components

  • Map geometry / road graph — the primary authoritative source (vector data): nodes (intersections), edges (road segments), and metadata (max speed, lanes, turn restrictions). Stored in a graph DB or a custom sharded store (on-disk protobufs or key-value per tile); a minimal record sketch follows this list.

  • Tiles

    • Pre-rendered raster tiles (PNG) or vector tiles (protobuf, e.g. the Mapbox Vector Tile format). For the simple version, use vector tiles with client-side rendering, or pre-rendered PNGs for a quicker implementation. Stored in a blob store (e.g., S3) and served via CDN.

  • POI database — places with name, address, category, location, popularity; indexed for fast text+geo lookups (Elasticsearch or custom trie + geospatial index).

  • Reverse geocoding index — address polygons/centroids, indexed by spatial lookup (PostGIS / R-tree).

  • Routing graph — compact representation of the road network, possibly partitioned by region; preprocessed for fast queries (see the algorithms section below).

  • Traffic feed — time-series datastore (recent speeds for edges), aggregated from telemetry and third-party feeds.

  • User data — auth, bookmarks, preferences (RDBMS or managed user store).
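
A minimal sketch of the records implied above (road-graph nodes/edges and POIs), in Python; the field names are illustrative rather than a fixed schema:

```python
from dataclasses import dataclass, field

@dataclass
class Node:                      # intersection
    node_id: int
    lat: float
    lon: float

@dataclass
class Edge:                      # directed road segment
    edge_id: int
    from_node: int
    to_node: int
    length_m: float
    max_speed_kmh: float
    lanes: int = 1
    turn_restrictions: list[int] = field(default_factory=list)  # forbidden next-edge ids

@dataclass
class Poi:                       # point of interest for search
    poi_id: int
    name: str
    category: str
    lat: float
    lon: float
    popularity: float = 0.0
```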

7 — Core algorithms & components

Map tiles

  • Pre-generate vector tiles for zoom levels 0–16+ using the map data pipeline. Store and serve via CDN. For dynamic overlays (traffic), we may generate separate overlay tiles that are composited client-side.
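
For concreteness, the standard Web Mercator ("slippy map") conversion from a coordinate to the tile containing it at a given zoom level, which is the usual basis for the z/x/y addressing assumed here:

```python
import math

def lat_lon_to_tile(lat_deg: float, lon_deg: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) indices of the tile containing a point at the given zoom level."""
    n = 2 ** zoom                                   # tiles per axis at this zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

# Example: which zoom-12 tile covers central Berlin?
print(lat_lon_to_tile(52.52, 13.405, 12))           # -> (2200, 1343)
```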

Search & Autocomplete

  • Autocomplete: prefix trie or Elasticsearch with n-gram / edge n-gram analysis for fast suggestions. Combine the text-match score with geographic proximity to rank results (a toy ranking sketch follows this list).

  • Geo-filtering: restrict suggestions by bounding box (user viewport) to be relevant.
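
A toy version of that ranking, blending a normalized text-match score with an exponential distance decay; the weights and decay constant are illustrative and would be tuned offline:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp, dl = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def rank_score(text_score: float, poi_lat: float, poi_lon: float,
               user_lat: float, user_lon: float,
               text_weight: float = 0.7, decay_km: float = 5.0) -> float:
    """Blend full-text relevance with proximity to the user's viewport centre."""
    d = haversine_km(user_lat, user_lon, poi_lat, poi_lon)
    proximity = math.exp(-d / decay_km)             # 1.0 nearby, decays with distance
    return text_weight * text_score + (1 - text_weight) * proximity
```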

Reverse Geocoding

  • Use spatial index (R-tree) on address centroids / polygons; perform nearest-neighbor lookup, then format address.
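
A brute-force stand-in for that lookup, sketched in Python; a real deployment would use PostGIS or an R-tree rather than a linear scan:

```python
import math

def nearest_address(lat: float, lon: float,
                    centroids: list[tuple[float, float, str]]) -> str:
    """Return the formatted address whose centroid is closest to (lat, lon).
    Stand-in for the R-tree / PostGIS nearest-neighbour query."""
    def approx_dist2(c):
        clat, clon, _ = c
        # equirectangular approximation is fine for picking the nearest candidate
        dx = (clon - lon) * math.cos(math.radians(lat))
        dy = clat - lat
        return dx * dx + dy * dy
    _, _, address = min(centroids, key=approx_dist2)
    return address

# Usage with hypothetical data:
# nearest_address(48.8584, 2.2945,
#                 [(48.8584, 2.2945, "5 Av. Anatole France, Paris"),
#                  (48.8606, 2.3376, "Rue de Rivoli, Paris")])
```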

Routing (driving)

  • Road graph = nodes + directed edges, edges have weights (travel time = length / speed).

  • Routing algorithm: A* with heuristic = straight-line distance / max_speed (a compact sketch follows this list).

  • Speed-ups:

    • Precompute contraction hierarchies (CH), or use goal-directed techniques such as ALT (A*, landmarks, triangle inequality), for very fast queries.

    • Partition graph by region and use multi-level search for long routes.

  • Traffic incorporation: modify edge weights with live speed factor; for faster response, use cached ETA for common origin-destination pairs and re-compute on demand.
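
A compact A* sketch over the node/edge model from section 6, using the admissible heuristic above (straight-line distance divided by the fastest speed on the network); graph loading, turn restrictions, and live traffic reweighting are omitted:

```python
import heapq
import math

def heuristic_s(node_a, node_b, max_speed_mps: float) -> float:
    """Admissible estimate: straight-line distance / fastest possible speed."""
    # crude planar distance in metres; good enough as a lower bound at city scale
    dx = (node_b.lon - node_a.lon) * 111_320 * math.cos(math.radians(node_a.lat))
    dy = (node_b.lat - node_a.lat) * 110_540
    return math.hypot(dx, dy) / max_speed_mps

def a_star(nodes, adjacency, start_id, goal_id, max_speed_mps=33.0):
    """nodes: {node_id: Node}; adjacency: {node_id: [(neighbour_id, travel_time_s), ...]}.
    Returns (total_seconds, [node ids along the route]) or None if unreachable."""
    goal = nodes[goal_id]
    open_heap = [(0.0, 0.0, start_id)]            # (f = g + h, g, node)
    best_g = {start_id: 0.0}
    parent = {start_id: None}
    while open_heap:
        _, g, nid = heapq.heappop(open_heap)
        if nid == goal_id:
            path = []
            while nid is not None:                # walk parents back to the start
                path.append(nid)
                nid = parent[nid]
            return g, path[::-1]
        if g > best_g.get(nid, math.inf):
            continue                              # stale heap entry
        for nbr, travel_s in adjacency.get(nid, []):
            ng = g + travel_s
            if ng < best_g.get(nbr, math.inf):
                best_g[nbr] = ng
                parent[nbr] = nid
                f = ng + heuristic_s(nodes[nbr], goal, max_speed_mps)
                heapq.heappush(open_heap, (f, ng, nbr))
    return None
```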

Traffic

  • Ingest telemetry (anonymized speed samples) and third-party traffic feeds into a streaming pipeline. Aggregate per edge (e.g., 1-minute windows) and update the traffic datastore (see the aggregation sketch after this list).

  • Traffic overlay tiles generated at low resolution and pushed to CDN.
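
A minimal version of that per-edge aggregation with 1-minute tumbling windows, plus the coarse green/yellow/red bucketing from section 2; in production this would run inside a streaming job rather than over an in-memory list:

```python
from collections import defaultdict

WINDOW_S = 60  # 1-minute tumbling windows

def aggregate_speeds(samples):
    """samples: iterable of (edge_id, epoch_seconds, speed_kmh).
    Returns {(edge_id, window_start): mean speed} for the traffic store."""
    sums = defaultdict(lambda: [0.0, 0])          # (edge, window) -> [sum, count]
    for edge_id, ts, speed_kmh in samples:
        window_start = int(ts) // WINDOW_S * WINDOW_S
        bucket = sums[(edge_id, window_start)]
        bucket[0] += speed_kmh
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}

def traffic_color(mean_speed_kmh: float, free_flow_kmh: float) -> str:
    """Coarse green/yellow/red bucketing relative to free-flow speed."""
    ratio = mean_speed_kmh / free_flow_kmh
    return "green" if ratio > 0.7 else "yellow" if ratio > 0.4 else "red"
```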

8 — Read/write paths & caching

  • Tile reads: client → CDN → origin (only on cache miss). Pre-generate tiles to minimize origin load.

  • Search reads: served by the search cluster; cache popular queries in Redis (a minimal TTL-cache sketch follows this list).

  • Routing: routing service queries in-memory graph partitions (hot in RAM); results cached for recent queries.

  • Writes: map updates, POI updates, and traffic feeds push into batch pipelines that regenerate tiles/indices offline and update the stores; user bookmarks are simple DB writes.
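
A sketch of the query cache in front of the search cluster; it uses a tiny in-process TTL cache so the example stays self-contained, where production would use Redis as noted above:

```python
import time

class TTLCache:
    """Tiny TTL cache standing in for the Redis layer in front of the search cluster."""
    def __init__(self, ttl_s: float = 60.0):
        self.ttl_s = ttl_s
        self._store = {}                          # key -> (expiry_ts, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        self._store.pop(key, None)                # expired or missing
        return None

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl_s, value)

search_cache = TTLCache(ttl_s=30.0)

def cached_search(query: str, bbox: tuple, backend_search) -> list:
    """Check the cache first; fall through to the search cluster on a miss."""
    key = (query.lower(), bbox)
    hit = search_cache.get(key)
    if hit is not None:
        return hit
    results = backend_search(query, bbox)
    search_cache.put(key, results)
    return results
```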

9 — Back-of-the-envelope (simple estimates)

(Assume simple v1: 1M MAU, peak 2k tile QPS, 500 search/routing QPS. The arithmetic is repeated as runnable code after this list.)

  • Tile size: vector tile ~ 10–40 KB; assume 20 KB average.

    • Peak outbound bandwidth for tiles: 2,000 req/sec × 20 KB ≈ 40 MB/s ≈ 320 Mbps.

  • Storage for tiles: pre-generate 300M tiles × 20 KB ≈ 6 TB (compressed).

  • Routing graph: the global road graph compacts to roughly tens of GB; regional partitions are small enough to keep the hot ones in memory (a few GB per region).

  • Search index: an ES cluster of a few dozen GB, depending on POI count.

  • Compute: routing uses A* on in-memory graphs; each query has a small CPU cost but needs the graph partition resident in memory; provision a few route servers behind a load balancer (scale with QPS).

  • Cache hit ratio: the CDN should absorb >90% of tile traffic, keeping origin load small.
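
The same estimates as runnable arithmetic, so they are easy to re-derive or adjust:

```python
# Peak outbound tile bandwidth
tile_qps = 2_000                       # peak tile requests/s
tile_kb = 20                           # average vector-tile size in KB
print(tile_qps * tile_kb / 1_000)      # 40.0  MB/s
print(tile_qps * tile_kb * 8 / 1_000)  # 320.0 Mbps

# Pre-generated tile storage
tiles = 300_000_000
print(tiles * tile_kb / 1e9)           # 6.0 TB

# Origin tile load at a 90% CDN hit ratio
print(tile_qps * 10 // 100)            # 200 req/s
```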

10 — Operational concerns & observability

  • Monitoring: QPS, P95/P99 latencies per API, cache hit ratios, tile generation job health, routing latency, error rates.

  • Logging/Tracing: distributed tracing for route generation (to debug slow graph lookups).

  • Alerts: surge in origin hits (CDN misses), index staleness, pipeline failures.

  • Deployment: blue/green for search and routing; careful migration for graph changes.

11 — Privacy & security

  • TLS for all traffic.

  • Store only necessary telemetry; anonymize and aggregate user location data; provide privacy controls and deletion (GDPR).

  • Rate limit APIs to prevent scraping.

12 — Trade-offs & choices

  • Vector vs raster tiles

    • Vector: smaller storage, flexible styling, client rendering cost.

    • Raster: no client-side rendering cost (friendlier to older clients), but larger storage and CDN usage.

    • Choice (simple v1): vector tiles if client can render; fallback raster for compatibility.

  • Routing approach

    • A* on raw graph: simpler, slower for long routes.

    • CH/ALT: more complex preprocessing, much faster queries.

    • Choice: start with A*/bidirectional A* on regional graphs; introduce CH for scale.

  • Traffic precision

    • Live per-edge speeds give better ETA but costlier; coarse overlay acceptable initially.

13 — Evolution & next steps

  • Add offline region downloads (vector tiles + graph).

  • Add multimodal routing (transit, walking, biking).

  • Improve ranking with ML for POIs.

  • Implement contraction hierarchies / commercial-grade routing for global scale.

  • Add user-contributed edits and moderation workflows.

14 — Example flow (user opens app, searches, gets route)

  1. App requests tiles for viewport → CDN serves vector tiles.

  2. User types “Starbucks” → client calls /search (autocomplete) → search service queries ES restricted to viewport → returns ranked POIs.

  3. User selects a POI and taps “Navigate” → app POSTs /route with coords → routing service loads the relevant graph partition(s), runs A*/bidirectional search (factoring in traffic), and returns polyline + steps + ETA (an illustrative response shape follows these steps).

  4. App renders route, overlays traffic tiles from CDN.
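
An illustrative shape for the /route response in step 3; the field names are assumptions, not a committed contract:

```python
route_response = {
    "distance_m": 4_250,
    "duration_s": 540,                        # ETA with current traffic factored in
    "polyline": "encoded-or-geojson-geometry",
    "steps": [
        {"instruction": "Head north on Main St", "distance_m": 300, "duration_s": 45},
        {"instruction": "Turn right onto 2nd Ave", "distance_m": 900, "duration_s": 110},
        # ...
    ],
}
```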


If you want, I can:

  • produce a one-page cheat-sheet for interview delivery (time-boxed talking points), or

  • sketch detailed routing algorithm pseudocode (A*/CH) and show exactly what pre-processing is needed.

Which follow-up would help you most?
