Problem Statement
Design a real-time messaging system similar to WhatsApp that supports one-on-one messaging, group chats, online/offline status, message delivery receipts, and media sharing. The system should handle billions of messages per day with low latency delivery.
Requirements
Functional Requirements
- One-on-one text messaging with delivery/read receipts
- Group messaging (up to 256 members)
- Online/offline presence indicators
- Media sharing (images, videos, documents)
- Message history and sync across devices
- End-to-end encryption
Non-Functional Requirements
- Latency: Message delivery in under 200ms for online users
- Availability: 99.99% uptime
- Scale: 2B users, 100B messages/day
- Ordering: Messages within a conversation must be ordered
- Durability: Messages must not be lost in transit
Back-of-the-Envelope Estimation
| Metric | Value |
|---|---|
| Messages per second | ~1.2M (100B / 86400) |
| Concurrent connections | ~500M (25% of 2B users) |
| Storage per message | ~100 bytes (text) |
| Daily storage (text only) | ~10 TB |
| WebSocket servers needed | ~50,000 (10K connections per server) |
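The figures in the table can be reproduced with a few lines of arithmetic. The sketch below uses only the inputs stated above; the constant names are illustrative, and the results are daily averages rather than peak values.

```python
# Inputs taken from the requirements and the estimation table.
TOTAL_USERS = 2_000_000_000
MESSAGES_PER_DAY = 100_000_000_000
CONCURRENT_FRACTION = 0.25        # share of users connected at the same time
BYTES_PER_TEXT_MESSAGE = 100
CONNECTIONS_PER_SERVER = 10_000
SECONDS_PER_DAY = 86_400

# Average message rate across a day (peak traffic will be higher).
messages_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY

# Simultaneously open WebSocket connections.
concurrent_connections = int(TOTAL_USERS * CONCURRENT_FRACTION)

# Text-only storage per day, in terabytes (1 TB = 10**12 bytes).
daily_text_storage_tb = MESSAGES_PER_DAY * BYTES_PER_TEXT_MESSAGE / 1e12

# Servers needed to hold every concurrent connection.
websocket_servers = concurrent_connections // CONNECTIONS_PER_SERVER

print(f"messages/second:        {messages_per_second:,.0f}")
print(f"concurrent connections: {concurrent_connections:,}")
print(f"daily text storage:     {daily_text_storage_tb:.0f} TB")
print(f"WebSocket servers:      {websocket_servers:,}")
```

The estimate covers text payloads only; media, replication, and indexes would add to the storage figure.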
High-Level Architecture
Core Components
- Chat Service: Handles message sending, delivery, and storage
- Connection Service: Manages WebSocket connections and routing
- Presence Service: Tracks online/offline status
- Media Service: Handles upload, storage, and delivery of media files
- Notification Service: Push notifications for offline users
- Group Service: Manages group membership and message fan-out
Message Delivery Flow
Online User Flow
- Sender's device sends message via WebSocket to Chat Service
- Chat Service persists message to message store
- Chat Service looks up recipient's connection server via Connection Registry
- Routes message to recipient's WebSocket connection
- Recipient's device sends delivery acknowledgment
- Chat Service updates delivery status and notifies sender
Offline User Flow
- Sender's device sends message via WebSocket
- Chat Service persists the message and marks it as "sent, not delivered"
- Message is queued in recipient's offline message queue
- Push notification sent via APNs/FCM
- When recipient comes online, offline queue is drained and messages are delivered
- Delivery receipts flow back to sender
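The two flows can be sketched with in-memory stand-ins. Everything below is hypothetical and for illustration only: the `ChatService` class, its method names, and the use of plain lists in place of WebSocket connections, the message store, and APNs/FCM.

```python
from collections import defaultdict, deque


class ChatService:
    """In-memory stand-in for the Chat Service (all names are hypothetical)."""

    def __init__(self):
        self.message_store = []                    # stand-in for the message store
        self.connections = {}                      # user_id -> inbox list (stand-in for a WebSocket)
        self.offline_queues = defaultdict(deque)   # user_id -> messages awaiting delivery
        self.status = {}                           # message_id -> delivery status
        self.pushes = []                           # stand-in for APNs/FCM requests

    def connect(self, user_id):
        """Register a connection and drain anything queued while the user was offline."""
        inbox = []
        self.connections[user_id] = inbox
        queue = self.offline_queues.pop(user_id, deque())
        while queue:
            inbox.append(queue.popleft())
        return inbox

    def disconnect(self, user_id):
        self.connections.pop(user_id, None)

    def send(self, message_id, sender_id, recipient_id, content):
        message = {"id": message_id, "from": sender_id, "to": recipient_id, "content": content}
        self.message_store.append(message)             # persist before attempting delivery
        self.status[message_id] = "sent, not delivered"
        inbox = self.connections.get(recipient_id)
        if inbox is not None:
            inbox.append(message)                      # online: route to the live connection
        else:
            self.offline_queues[recipient_id].append(message)
            self.pushes.append((recipient_id, message_id))

    def acknowledge(self, message_id):
        """Called when the recipient's device confirms receipt."""
        self.status[message_id] = "delivered"


service = ChatService()
alice_inbox = service.connect("alice")
service.send("m1", "bob", "alice", "hi Alice")     # Alice is online: delivered directly
service.acknowledge("m1")
service.send("m2", "alice", "carol", "hi Carol")   # Carol is offline: queued and pushed
carol_inbox = service.connect("carol")             # the queue drains on connect
```

Persisting before routing means a crash between the two steps leaves the message recoverable from the store rather than lost.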
Database Design
Message Storage
For message storage, use Cassandra — it's designed for high-throughput writes and time-series data:
CREATE TABLE messages (
    conversation_id UUID,
    message_id TIMEUUID,
    sender_id BIGINT,
    content TEXT,
    message_type TEXT,
    created_at TIMESTAMP,
    PRIMARY KEY (conversation_id, message_id)
) WITH CLUSTERING ORDER BY (message_id DESC);
Why Cassandra?
- Handles write-heavy workloads (100B messages/day)
- Natural time-series ordering via TIMEUUID clustering
- Linear horizontal scaling
- Tunable consistency (LOCAL_QUORUM for writes, ONE for reads)
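To illustrate why a time-based UUID gives newest-first ordering within a conversation, here is a small in-memory model of the table. It is a sketch, not Cassandra client code: `partitions`, `insert_message`, `latest_messages`, and the fixed `NODE` value are all hypothetical, and Python's `uuid.uuid1()` stands in for TIMEUUID generation.

```python
import uuid
from collections import defaultdict

# Hypothetical fixed node id and clock sequence. Passing them explicitly
# keeps ID generation on Python's own strictly increasing timestamp path.
NODE = 0x123456789ABC
CLOCK_SEQ = 0

# conversation_id -> rows; models one partition per conversation.
partitions = defaultdict(list)


def insert_message(conversation_id, sender_id, content):
    """Append a row whose message_id is a version 1 (time-based) UUID."""
    message_id = uuid.uuid1(node=NODE, clock_seq=CLOCK_SEQ)
    partitions[conversation_id].append(
        {"message_id": message_id, "sender_id": sender_id, "content": content}
    )
    return message_id


def latest_messages(conversation_id, limit):
    """Return the newest rows first, mirroring CLUSTERING ORDER BY (message_id DESC)."""
    rows = sorted(
        partitions[conversation_id],
        key=lambda row: row["message_id"].time,   # timestamp embedded in the UUID
        reverse=True,
    )
    return rows[:limit]


conversation = uuid.uuid4()
for text in ("first", "second", "third"):
    insert_message(conversation, sender_id=42, content=text)

newest_two = [row["content"] for row in latest_messages(conversation, limit=2)]
```

In the real table the rows are already stored in this order on disk, so fetching the most recent page of a conversation is a sequential read within a single partition.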
Connection Registry
Redis for mapping user IDs to WebSocket server instances:
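- Key: user:{userId} → Value: {serverId, connectionId}
- TTL: The entry is removed when the WebSocket disconnects, and a TTL expires it if the disconnect is never observed (for example, after a server crash)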
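The registry's behavior can be sketched with a small in-memory class that mimics a key-value store with expiry. This is not Redis client code: `ConnectionRegistry`, its methods, and the 60-second default TTL are assumptions chosen for illustration, and the clock is injectable so expiry can be shown without waiting.

```python
import time


class ConnectionRegistry:
    """In-memory stand-in for the Redis mapping (names and TTL are hypothetical)."""

    def __init__(self, ttl_seconds=60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # key -> (value, expires_at)

    @staticmethod
    def _key(user_id):
        return f"user:{user_id}"

    def register(self, user_id, server_id, connection_id):
        """Record where the user is connected; calling again refreshes the TTL."""
        value = {"serverId": server_id, "connectionId": connection_id}
        self._entries[self._key(user_id)] = (value, self._clock() + self._ttl)

    def unregister(self, user_id):
        """Explicit cleanup on a clean disconnect."""
        self._entries.pop(self._key(user_id), None)

    def lookup(self, user_id):
        """Return the routing entry, or None if the user is absent or expired."""
        key = self._key(user_id)
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # lazily drop stale entries
            return None
        return value


now = [0.0]                                        # fake clock for the example
registry = ConnectionRegistry(ttl_seconds=60, clock=lambda: now[0])
registry.register("alice", server_id="ws-17", connection_id="c-1")
route_while_fresh = registry.lookup("alice")
now[0] += 61                                       # no refresh arrives in time
route_after_expiry = registry.lookup("alice")
```

A lookup that returns `None` is the signal to fall back to the offline flow.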
Presence System
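Tracking online/offline status for 2B users is challenging. A naive approach (a heartbeat every 5 seconds per user) would generate roughly 100M heartbeats per second across ~500M concurrent connections, before any status changes are fanned out to contacts.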
Approach: Lazy Presence with Subscription
- Users only receive presence updates for contacts they're actively chatting with
- Presence is tracked via WebSocket connection lifecycle (connect = online, disconnect = offline)
- Heartbeat interval: 30 seconds (to detect zombie connections)
- Presence state is stored in Redis with 60-second TTL, refreshed by heartbeats
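A minimal sketch of this approach, using the 30-second heartbeat and 60-second TTL stated above. The `PresenceService` class, its method names, and the in-memory dictionaries standing in for Redis are assumptions for illustration; the clock is injected so the example runs instantly.

```python
from collections import defaultdict

HEARTBEAT_INTERVAL_SECONDS = 30
PRESENCE_TTL_SECONDS = 60


class PresenceService:
    """In-memory stand-in for subscription-based presence (names are hypothetical)."""

    def __init__(self, clock):
        self._clock = clock
        self._expires_at = {}               # user_id -> time at which presence lapses
        self._watchers = defaultdict(set)   # user_id -> users subscribed to them
        self.notifications = []             # (watcher_id, user_id, status)

    def subscribe(self, watcher_id, target_id):
        """Called when watcher_id opens a chat with target_id."""
        self._watchers[target_id].add(watcher_id)

    def is_online(self, user_id):
        expires_at = self._expires_at.get(user_id)
        return expires_at is not None and expires_at > self._clock()

    def heartbeat(self, user_id):
        """Refresh the TTL; notify watchers only on an offline-to-online change."""
        was_online = self.is_online(user_id)
        self._expires_at[user_id] = self._clock() + PRESENCE_TTL_SECONDS
        if not was_online:
            self._notify(user_id, "online")

    def disconnect(self, user_id):
        was_online = self.is_online(user_id)
        self._expires_at.pop(user_id, None)
        if was_online:
            self._notify(user_id, "offline")

    def _notify(self, user_id, status):
        # Only subscribers hear about the change, not the whole contact list.
        for watcher_id in sorted(self._watchers.get(user_id, ())):
            self.notifications.append((watcher_id, user_id, status))


now = [0.0]
presence = PresenceService(clock=lambda: now[0])
presence.subscribe("alice", "bob")
presence.heartbeat("bob")                    # Bob connects
presence.heartbeat("carol")                  # nobody is subscribed to Carol
now[0] += HEARTBEAT_INTERVAL_SECONDS
presence.heartbeat("bob")                    # refresh only, no new notification
now[0] += PRESENCE_TTL_SECONDS + 1           # heartbeats stop (zombie connection)
```

In this sketch a lapsed TTL is only observed on the next `is_online` check; it does not push an "offline" event on its own.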
Group Messaging
Fan-Out on Write
When a message is sent to a group:
- Chat Service receives the message
- Looks up group membership (cached in Redis)
- Creates a copy of the message for each member's conversation timeline
- Routes each copy to the member's connection server
For small groups (≤256 members), fan-out on write is acceptable. For large broadcast lists, consider fan-out on read instead.
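The fan-out steps can be sketched as a single function. The function name, the dictionary-based timelines, and the choice to store a copy for the sender but not route it back to them are assumptions made for this illustration.

```python
from collections import defaultdict


def fan_out_on_write(message, member_ids, timelines, online_inboxes):
    """Store one copy per member, then route to every other member who is online.

    Returns the ids of the members the message was routed to.
    """
    routed_to = []
    for member_id in member_ids:
        timelines[member_id].append(dict(message))   # write amplification: N copies
        if member_id == message["sender_id"]:
            continue                                  # the sender already has it
        inbox = online_inboxes.get(member_id)
        if inbox is not None:
            inbox.append(message["id"])               # stand-in for a WebSocket push
            routed_to.append(member_id)
    return routed_to


timelines = defaultdict(list)
online_inboxes = {"bob": [], "carol": []}             # Dave is offline
members = ["alice", "bob", "carol", "dave"]
group_message = {"id": "g1", "sender_id": "alice", "content": "hello group"}

routed = fan_out_on_write(group_message, members, timelines, online_inboxes)
copies_written = sum(len(rows) for rows in timelines.values())
```

With a 256-member cap, one send produces at most 256 stored copies, which is why write fan-out stays affordable for groups of this size.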
Scaling Strategies
Connection Layer
- Consistent hashing to route users to specific WebSocket servers
- Sticky sessions via load balancer to maintain WebSocket connections
- Horizontal scaling by adding more WebSocket servers
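A minimal consistent-hash ring shows how users can be mapped to WebSocket servers so that removing a server only remaps the users it was holding. The `HashRing` class, the server names, the virtual-node count, and the use of SHA-256 are assumptions for this sketch.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring with virtual nodes (an illustrative sketch)."""

    def __init__(self, servers, vnodes=100):
        self._vnodes = vnodes
        self._ring = []   # sorted list of (hash, server) points
        for server in servers:
            self.add(server)

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, server):
        # Several points per server smooth out the load distribution.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{server}#{i}"), server))

    def remove(self, server):
        self._ring = [point for point in self._ring if point[1] != server]

    def server_for(self, user_id):
        """Walk clockwise from the user's hash to the first server point."""
        if not self._ring:
            raise LookupError("no servers in the ring")
        index = bisect.bisect_left(self._ring, (self._hash(user_id),))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["ws-1", "ws-2", "ws-3"])
users = [f"user-{i}" for i in range(1000)]
before = {user: ring.server_for(user) for user in users}

ring.remove("ws-2")                                   # simulate a server leaving
after = {user: ring.server_for(user) for user in users}
moved = [user for user in users if before[user] != after[user]]
```

Every user in `moved` was previously on `ws-2`; users on the surviving servers keep their assignment, which limits reconnect storms when the fleet changes.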
Message Storage
- Sharding by conversation_id keeps all messages for a conversation in one Cassandra partition, so they live on the same set of replica nodes
- Compaction strategy: Time-window compaction for time-series message data
- Hot conversation handling: Viral group chats are detected and distributed across multiple replicas
Failure Handling
- WebSocket server crash: Client detects the disconnect and reconnects to a different server, which drains the offline queue and delivers any messages missed in the meantime
- Message store failure: Write-ahead log in the Chat Service buffers messages during brief Cassandra outages
- Split-brain in presence: Accept that presence may be stale for up to 60 seconds; clients should handle "last seen" gracefully
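The message-store buffering can be sketched as a writer that logs each message locally before attempting the store write, then replays the backlog in order once the store recovers. The class and exception names are hypothetical, and the in-memory deque stands in for what would need to be a durable on-disk log in practice.

```python
from collections import deque


class StoreUnavailable(Exception):
    """Raised by the store write function during an outage (hypothetical)."""


class BufferedMessageWriter:
    """Log first, then write through; replay the backlog in order on recovery."""

    def __init__(self, store_write):
        self._store_write = store_write
        self._pending = deque()   # stand-in for a durable write-ahead log

    @property
    def pending_count(self):
        return len(self._pending)

    def write(self, message):
        self._pending.append(message)   # record locally before touching the store
        return self.flush()

    def flush(self):
        """Drain the log oldest-first; stop at the first failure to keep ordering."""
        while self._pending:
            try:
                self._store_write(self._pending[0])
            except StoreUnavailable:
                return False
            self._pending.popleft()      # remove only after a successful write
        return True


stored = []
store_is_up = [False]                    # toggled to simulate an outage


def store_write(message):
    if not store_is_up[0]:
        raise StoreUnavailable()
    stored.append(message)


writer = BufferedMessageWriter(store_write)
writer.write("m1")                       # outage: buffered
writer.write("m2")                       # outage: buffered
buffered_during_outage = writer.pending_count
store_is_up[0] = True
writer.write("m3")                       # recovery: backlog replays first
```

Because a message is removed from the log only after the store accepts it, a crash mid-replay can produce duplicate writes; making the store write idempotent on `message_id` would be one way to absorb that.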
Tradeoffs
- Fan-out on write vs. read: Write amplification for groups (each message stored N times) vs. read amplification. We chose write for small groups since most messages are read by all members anyway.
- Cassandra vs. PostgreSQL: Cassandra's eventual consistency means a message might briefly appear on one device before another, but the write throughput is necessary at this scale.
- WebSocket vs. Long Polling: WebSockets have higher resource cost per connection but provide true real-time delivery. At WhatsApp's scale, the infrastructure cost is justified.