Designing a URL shortener like bit.ly or TinyURL is a classic high-level system design interview question. It tests your ability to handle read-heavy workloads and distributed ID generation.
1. Requirements
- Functional: Shorten long URLs; redirect short URLs to the original URL.
- Non-functional: High availability, low latency, and scalability for a read-heavy workload (roughly 100 reads per write).
2. Key Components
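Redirection: Use HTTP 301 (Moved Permanently) when you want browsers and search engines to cache the redirect, which helps SEO and takes load off your servers. Use HTTP 302 (Found) when every click should reach your server, for example to record redirection analytics, because clients do not cache a 302 by default.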
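The choice between the two status codes can be sketched as a small helper. This is a minimal illustration, not a full web handler: the function name `build_redirect` and the `track_clicks` flag are assumptions made up for this example.

```python
from http import HTTPStatus


def build_redirect(long_url, track_clicks):
    """Return (status_code, headers) for redirecting a short URL.

    `track_clicks` is a hypothetical flag: True means every click should
    reach our server so that it can be counted.
    """
    if track_clicks:
        # 302 Found: temporary, so clients normally ask the server again next time.
        return int(HTTPStatus.FOUND), {"Location": long_url}
    # 301 Moved Permanently: browsers may cache it and skip our server later.
    return int(HTTPStatus.MOVED_PERMANENTLY), {"Location": long_url}
```

In a real service the returned status and `Location` header would be handed to whatever web framework serves the request.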
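Storage: A NoSQL key-value store (like DynamoDB or Redis) is a good fit, since the only access pattern is a point lookup from short_code to long_url. Redis keeps data in memory, so enable persistence if it is the primary store rather than a cache.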
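The access pattern is small enough to sketch with a dictionary. The class below is a hypothetical in-memory stand-in for the key-value store, not the API of DynamoDB or Redis; the names `InMemoryUrlStore`, `put_if_absent`, and `get` are assumptions for illustration.

```python
class InMemoryUrlStore:
    """Dict-backed stand-in for a key-value store (hypothetical interface)."""

    def __init__(self):
        self._rows = {}

    def put_if_absent(self, short_code, long_url):
        """Store the mapping unless the code is already taken; return True on success."""
        if short_code in self._rows:
            return False
        self._rows[short_code] = long_url
        return True

    def get(self, short_code):
        """Return the long URL, or None when the code is unknown."""
        return self._rows.get(short_code)
```

Writing only when the key is absent keeps an existing short code from being silently overwritten; a production store would need an equivalent conditional write.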
3. The Shortening Algorithm
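Base62 Encoding: Use characters [a-z, A-Z, 0-9]. A 7-character string provides 62^7 ≈ 3.5 trillion possible short codes.
- Generate a unique numerical ID (e.g., using a distributed ID generator like Snowflake or a DB auto-increment). Note that a 64-bit Snowflake ID can encode to as many as 11 Base62 characters; a code stays within 7 characters only while the ID is below 62^7.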
- Encode that integer to Base62.
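The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the alphabet order (a-z, then A-Z, then 0-9) simply follows the order in the text, `itertools.count` is a single-process stand-in for a real ID generator, and the function names are made up for this example.

```python
import itertools
import string

# Alphabet order follows the text: a-z, then A-Z, then 0-9 (an assumption;
# any fixed order works as long as encode and decode agree).
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits
BASE = len(ALPHABET)  # 62


def encode_base62(number):
    """Convert a non-negative integer ID to its Base62 string."""
    if number < 0:
        raise ValueError("ID must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number > 0:
        number, remainder = divmod(number, BASE)
        chars.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code):
    """Convert a Base62 string back to the integer ID."""
    number = 0
    for char in code:
        number = number * BASE + ALPHABET.index(char)
    return number


# Stand-in for a DB auto-increment or Snowflake-style generator (single process only).
_next_id = itertools.count(1)


def shorten():
    """Allocate the next ID and return its short code."""
    return encode_base62(next(_next_id))
```

With this alphabet, `encode_base62(125)` returns `"cb"`, and every ID below 62^7 fits in at most 7 characters.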
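Avoid Collisions: Do not simply hash the URL (e.g., with MD5) and truncate the digest to 7 characters. Two different long URLs can then map to the same short code, and detecting and resolving collisions in a distributed system is complex. Sequential IDs plus Base62 encoding avoid the problem because every ID, and therefore every code, is unique by construction.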
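To get a feel for how quickly truncated hashes collide, the standard birthday approximation can be evaluated directly. This sketch assumes the truncated hash is uniformly distributed over the 62^7 possible codes; the function name `collision_probability` is made up for this example.

```python
import math

CODE_SPACE = 62 ** 7  # number of distinct 7-character Base62 codes


def collision_probability(num_urls, code_space=CODE_SPACE):
    """Approximate the chance that at least two URLs share a code.

    Uses the birthday approximation p ≈ 1 - exp(-n(n-1) / (2N)),
    assuming codes are drawn uniformly at random.
    """
    pairs = num_urls * (num_urls - 1) / 2
    # -expm1(-x) computes 1 - exp(-x) accurately for small x.
    return -math.expm1(-pairs / code_space)
```

Under this approximation, one million hashed URLs already give roughly a 13% chance of at least one collision, and ten million make a collision nearly certain, even though the code space holds about 3.5 trillion entries.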