Designing a real-time chat system like WhatsApp or Slack means handling many long-lived connections and providing message delivery guarantees.
1. Communication Protocol
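Plain HTTP request/response is a poor fit for real-time bi-directional messaging: the server cannot push on its own, so clients have to poll, which wastes requests and adds latency. Use WebSockets instead. The connection starts as an HTTP request, is upgraded, and from then on carries full-duplex traffic over a single TCP connection.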
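To make the HTTP-to-WebSocket upgrade concrete, here is a minimal sketch of the server side of the opening handshake as defined in RFC 6455, using only the Python standard library. The function names `websocket_accept_key` and `upgrade_response` are illustrative, not part of any library; a real server would normally rely on a WebSocket library rather than hand-rolling this.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept_key(client_key: str) -> str:
    """Derive Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response(client_key: str) -> str:
    """Build the 101 response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept_key(client_key)}\r\n"
        "\r\n"
    )


if __name__ == "__main__":
    # Sample key taken from the example in RFC 6455.
    print(upgrade_response("dGhlIHNhbXBsZSBub25jZQ=="))
```

After this response, both sides stop speaking HTTP and exchange WebSocket frames on the same TCP connection, which is what lets the server push messages without being asked.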
2. High-Level Components
- Chat Servers: Maintain WebSocket connections with clients.
- Presence Servers: Track who is online/offline (usually via periodic client heartbeats, with the last-seen state kept in Redis).
- Message Storage: Use a distributed database like Cassandra or HBase for high write throughput and scalability.
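Heartbeat-based presence can be sketched as follows. This is an in-memory stand-in, not a Redis client: the class `PresenceTracker`, the 30-second timeout, and the injectable `clock` parameter are all assumptions made for illustration. With Redis, the same idea is commonly expressed by writing a per-user key with an expiry on every heartbeat and treating a missing key as "offline".

```python
import time
from typing import Callable, Dict


class PresenceTracker:
    """Marks a user online while heartbeats keep arriving within a timeout."""

    def __init__(
        self,
        timeout_s: float = 30.0,  # assumed value, tune per deployment
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._timeout_s = timeout_s
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, user_id: str) -> None:
        # Called each time the client pings over its open connection.
        self._last_seen[user_id] = self._clock()

    def is_online(self, user_id: str) -> bool:
        last = self._last_seen.get(user_id)
        if last is None:
            return False  # never seen
        return self._clock() - last <= self._timeout_s


if __name__ == "__main__":
    presence = PresenceTracker(timeout_s=30.0)
    presence.heartbeat("alice")
    print(presence.is_online("alice"), presence.is_online("bob"))
```

Using a timeout rather than an explicit "going offline" message means a client that crashes or loses its network is still marked offline once its heartbeats stop.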
3. Message Delivery Workflow
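- User A sends a message to the Chat Server.
- The Chat Server acknowledges receipt to User A (Status: Sent).
- The Chat Server pushes the message to User B if B is online; if B is offline, it puts the message on a message queue (e.g., Kafka) for delivery once B reconnects.
- When User B receives the message, B's client sends an ACK back (Status: Delivered).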
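The workflow above can be sketched as a small in-memory model. Everything here is hypothetical scaffolding: `ChatServer`, `Message`, and `Status` are illustrative names, the per-user `deque` only stands in for the message queue (Kafka in the text), and the `push` callback stands in for a WebSocket write.

```python
import itertools
from collections import defaultdict, deque
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Deque, Dict


class Status(Enum):
    SENT = "sent"            # server has accepted the message
    DELIVERED = "delivered"  # recipient's client has ACKed it


@dataclass
class Message:
    msg_id: int
    sender: str
    recipient: str
    body: str
    status: Status = Status.SENT


class ChatServer:
    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.messages: Dict[int, Message] = {}
        # user -> callback that writes to that user's open connection
        self.online: Dict[str, Callable[[Message], None]] = {}
        # Stand-in for the offline message queue.
        self.offline_queue: Dict[str, Deque[Message]] = defaultdict(deque)

    def connect(self, user: str, push: Callable[[Message], None]) -> None:
        self.online[user] = push
        # Drain anything queued while the user was offline, oldest first.
        pending = self.offline_queue.pop(user, deque())
        while pending:
            push(pending.popleft())

    def disconnect(self, user: str) -> None:
        self.online.pop(user, None)

    def send(self, sender: str, recipient: str, body: str) -> Message:
        msg = Message(next(self._ids), sender, recipient, body)
        self.messages[msg.msg_id] = msg  # returning it acts as the "Sent" ack
        push = self.online.get(recipient)
        if push is not None:
            push(msg)
        else:
            self.offline_queue[recipient].append(msg)
        return msg

    def ack(self, msg_id: int) -> None:
        # Called when the recipient's client confirms receipt.
        self.messages[msg_id].status = Status.DELIVERED
```

A message stays in `Status.SENT` until the recipient's client calls `ack`, so the sender's UI can distinguish "the server has it" from "the other person's device has it".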
4. Scaling the System
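Use a Pub/Sub layer (e.g., Redis Pub/Sub) between Chat Servers so that users connected to different servers can reach each other. If User A is connected to Server 1 and User B is on Server 2, Server 1 publishes the message to a channel that Server 2 is subscribed to, and Server 2 forwards it over User B's WebSocket. Note that Redis Pub/Sub is fire-and-forget: a message published while no subscriber is listening is not retained, so durability has to come from the message store and the offline queue.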
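A minimal sketch of this cross-server routing, assuming one channel per user (the `user:<name>` naming is an assumption; a channel per server is another common layout). `InMemoryBroker` is a hypothetical stand-in for Redis Pub/Sub and `ChatNode` for a Chat Server; neither is a real client API.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class InMemoryBroker:
    """Stand-in for a Pub/Sub service: delivers only to current subscribers."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Callable[[dict], None]) -> None:
        self._subs[channel].append(handler)

    def publish(self, channel: str, message: dict) -> int:
        handlers = self._subs.get(channel, [])
        for handler in handlers:
            handler(message)
        # Number of subscribers reached; 0 means nobody was listening.
        return len(handlers)


class ChatNode:
    """One Chat Server holding the connections of some subset of users."""

    def __init__(self, name: str, broker: InMemoryBroker) -> None:
        self.name = name
        self.broker = broker
        # user -> messages pushed to that user's connection on this node
        self.local_inboxes: Dict[str, List[str]] = {}

    def connect(self, user: str) -> None:
        self.local_inboxes[user] = []
        # Subscribe on behalf of the user who is connected here.
        self.broker.subscribe(f"user:{user}", self._on_message)

    def _on_message(self, message: dict) -> None:
        self.local_inboxes[message["to"]].append(message["body"])

    def send(self, sender: str, recipient: str, body: str) -> int:
        payload = {"from": sender, "to": recipient, "body": body}
        return self.broker.publish(f"user:{recipient}", payload)


if __name__ == "__main__":
    broker = InMemoryBroker()
    server1, server2 = ChatNode("server-1", broker), ChatNode("server-2", broker)
    server1.connect("alice")
    server2.connect("bob")
    server1.send("alice", "bob", "hello")
    print(server2.local_inboxes["bob"])
```

The sending node never needs to know which node holds the recipient's connection; it only publishes to the recipient's channel. A return value of 0 from `send` signals that no node is currently subscribed for that user, which is the case the offline path has to cover.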