CHAPTER 14
Intermediate
Real-Time Systems Design
Updated: May 16, 2026
30 min read
1. Introduction
The standard HTTP request-response model has a major limitation: the server can only speak when spoken to. If a user is waiting for a crucial message, the server cannot reach out and hand it to them; the client must keep asking, "Do I have a new message yet?" In a modern application like WhatsApp, Twitch live chat, or a stock trading dashboard, forcing every client to re-request data each second is an architectural catastrophe: you flood your own servers with traffic, most of which returns nothing new. We need persistent, bi-directional communication instead. In this chapter, we will cover Real-Time Systems Design: moving away from traditional HTTP polling, deploying persistent WebSockets, and architecting the distributed infrastructure required to route millions of live messages with low latency.
2. Learning Objectives
By the end of this chapter, you will be able to:
- Explain the immense CPU and network waste of Short Polling.
- Differentiate between Long Polling, Server-Sent Events (SSE), and WebSockets.
- Architect a bi-directional WebSocket connection.
- Design a scalable Real-Time Chat architecture (e.g., WhatsApp/Discord clone).
- Understand the complexity of connection management and stateful servers.
3. The Problem with Polling
Before WebSockets, engineers had to use "hacks" to simulate real-time data.
- Short Polling: The browser is programmed with JavaScript to blindly send an HTTP `GET` request every 3 seconds asking for new messages. *The Failure:* If you have 100,000 users, your servers are hit with roughly 33,000 requests *every second*, and if 99% of those requests return empty, almost all of that work is wasted.
- Long Polling: The browser sends a request, but the server holds the connection open and *does not reply* until it actually has a message. *The Improvement:* Empty responses disappear, which saves bandwidth, but the client must still issue a brand-new HTTP request, with a full set of headers, after every message it receives.
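The short-polling figures above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `polling_load` and its parameters are illustrative, and the 99% empty-response rate is simply the figure used in the example.

```python
def polling_load(users: int, interval_seconds: float, empty_fraction: float) -> tuple[float, float]:
    """Return (requests per second, wasted requests per second) for short polling."""
    # Every user sends one request per polling interval.
    requests_per_second = users / interval_seconds
    # A request is "wasted" when the server has nothing new to return.
    wasted_per_second = requests_per_second * empty_fraction
    return requests_per_second, wasted_per_second


total, wasted = polling_load(users=100_000, interval_seconds=3, empty_fraction=0.99)
print(f"{total:,.0f} requests/s, of which about {wasted:,.0f} return nothing")
```

Halving the polling interval doubles the load without making a single extra message exist, which is why polling scales so poorly for chat.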
4. WebSockets (Persistent Bi-Directional Connection)
WebSockets are the modern standard for real-time architecture.
- The Handshake: It starts as a standard HTTP request, but it asks the server to "Upgrade" the connection to a WebSocket (`ws://`, or `wss://` when encrypted with TLS).
- The Persistence: If the server accepts, it replies with `101 Switching Protocols`, and the underlying TCP connection stays open until either side closes it.
- Bi-Directional Magic: The client can push data to the server at any time, and the server can push data to the client at any time. Each message carries only a few bytes of framing instead of a full set of HTTP headers, so per-message overhead is tiny and delivery is lightning fast.
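To make the handshake concrete, the sketch below shows the one computation a server performs to accept the upgrade, as defined in RFC 6455: it hashes the client's `Sec-WebSocket-Key` together with a fixed GUID and returns the result in `Sec-WebSocket-Accept`. The function names are illustrative, and a real server would also validate the request headers before replying.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept_key(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_response(sec_websocket_key: str) -> str:
    """Build the HTTP response that switches the connection to the WebSocket protocol."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {compute_accept_key(sec_websocket_key)}\r\n"
        "\r\n"
    )


# Sample key taken from the RFC 6455 handshake example.
print(build_upgrade_response("dGhlIHNhbXBsZSBub25jZQ=="))
```

After this response is sent, neither side speaks HTTP on that connection again; everything that follows is WebSocket frames.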
5. Server-Sent Events (SSE)
A simpler alternative to WebSockets.
- The Concept: WebSockets are two-way. Sometimes, you only need one-way data (e.g., a Live Stock Ticker pushing prices to a dashboard; the user never sends data back).
- The Tech: SSE operates over standard HTTP. The client opens an ordinary HTTP request, and the server keeps the response open and streams a continuous flow of events down it. Because it is plain HTTP, it is often easier to scale than WebSockets, and the browser's `EventSource` API reconnects automatically if the connection drops.
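On the wire, an SSE stream is just text sent with the `text/event-stream` content type: each event is a group of `field: value` lines ended by a blank line. The helper below is a minimal sketch of that framing; the name `format_sse`, the `price` event type, and the `ACME` ticker are made up for illustration.

```python
import json
from typing import Optional


def format_sse(data: str, event: Optional[str] = None, event_id: Optional[str] = None) -> str:
    """Serialize one Server-Sent Event in the text/event-stream wire format."""
    lines = []
    if event_id is not None:
        # The browser sends the last id back as Last-Event-ID when it reconnects.
        lines.append(f"id: {event_id}")
    if event is not None:
        lines.append(f"event: {event}")
    # A payload containing newlines must be split across several data: lines.
    for part in data.split("\n"):
        lines.append(f"data: {part}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


message = format_sse(json.dumps({"symbol": "ACME", "price": 101.25}), event="price", event_id="42")
print(message, end="")
```

A server would write strings like this to the open response whenever a price changes, and the `id` field is what lets a reconnecting client tell the server where it left off.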
6. Architecting a Scalable Chat System
Scaling WebSockets is incredibly difficult because they are Stateful: each open connection lives on one specific server.
- The Challenge: User A connects to Web Server 1. User B connects to Web Server 5. When User A sends a chat message to User B, how does Server 1 push the message to User B, whose socket is held by Server 5?
- The Pub/Sub Architecture (Redis):
- 1. User A sends a message to Server 1.
- 2. Server 1 looks up the recipient in the database and finds that it is User B, but it has no idea which server holds User B's connection.
- 3. Server 1 publishes the message to a central Redis Pub/Sub channel called `user_b_messages`.
- 4. Every web server keeps a subscription to Redis for the users it currently serves. Server 5, which holds the active WebSocket connection for User B, receives the message on the channel and pushes it down the socket to User B's phone.
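The four steps above can be simulated without any network. In this sketch, `InMemoryBroker` is a stand-in for Redis Pub/Sub and `ChatServer` is a stand-in for a WebSocket server; both classes are hypothetical, and the sketch assumes each server subscribes to the `<user>_messages` channel for every user connected to it.

```python
from collections import defaultdict
from typing import Callable


class InMemoryBroker:
    """Stand-in for Redis Pub/Sub: fans each message out to a channel's subscribers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, channel: str, callback: Callable[[str], None]) -> None:
        self._subscribers[channel].append(callback)

    def publish(self, channel: str, message: str) -> int:
        callbacks = list(self._subscribers.get(channel, []))
        for callback in callbacks:
            callback(message)
        return len(callbacks)  # number of subscribers that received the message


class ChatServer:
    """Stand-in for one WebSocket server; a list per user plays the role of the socket."""

    def __init__(self, name: str, broker: InMemoryBroker) -> None:
        self.name = name
        self.broker = broker
        self.sockets: dict[str, list[str]] = {}

    def connect(self, user_id: str) -> None:
        # Hold the "socket" and subscribe to this user's channel.
        self.sockets[user_id] = []
        self.broker.subscribe(f"{user_id}_messages", lambda msg, uid=user_id: self._deliver(uid, msg))

    def _deliver(self, user_id: str, message: str) -> None:
        if user_id in self.sockets:
            self.sockets[user_id].append(message)  # "push down the socket"

    def send(self, recipient_id: str, message: str) -> int:
        # The sending server does not need to know where the recipient is connected.
        return self.broker.publish(f"{recipient_id}_messages", message)


broker = InMemoryBroker()
server_1 = ChatServer("server-1", broker)
server_5 = ChatServer("server-5", broker)
server_1.connect("user_a")
server_5.connect("user_b")
receivers = server_1.send("user_b", "hello from A")
print(server_5.sockets["user_b"], receivers)
```

The key property is that Server 1 only knows a channel name. Which server, if any, is holding User B's socket is decided entirely by who subscribed.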
7. Diagrams/Visual Suggestions
*Architecture Diagram: Scalable WebSocket Chat via Pub/Sub*
8. Best Practices
- Connection Management (Heartbeats): WebSockets stay open indefinitely, but mobile networks will often silently drop the connection, for example when a user drives through a tunnel. The server never sees a close event, so it keeps the ghost socket open and wastes RAM on it. *Best Practice:* Implement a "Ping/Pong Heartbeat." Every 30 seconds, the server sends a tiny ping. If the client doesn't pong back within a timeout, the server ruthlessly kills the connection to free the memory.
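The bookkeeping behind a heartbeat is small: remember when each connection last answered, and periodically reap the ones that have been silent too long. The sketch below shows only that bookkeeping, with no real sockets; `HeartbeatMonitor` is a hypothetical class, and the clock is injected so the example can run instantly with a fake time source.

```python
import time
from typing import Callable


class HeartbeatMonitor:
    """Track the last pong per connection and report connections that went silent."""

    def __init__(self, timeout_seconds: float = 30.0, clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.last_pong: dict[str, float] = {}

    def register(self, connection_id: str) -> None:
        # A freshly opened connection counts as alive right now.
        self.last_pong[connection_id] = self.clock()

    def record_pong(self, connection_id: str) -> None:
        if connection_id in self.last_pong:
            self.last_pong[connection_id] = self.clock()

    def reap_dead(self) -> list[str]:
        """Remove and return every connection whose last pong is older than the timeout."""
        now = self.clock()
        dead = [cid for cid, seen in self.last_pong.items() if now - seen > self.timeout_seconds]
        for cid in dead:
            del self.last_pong[cid]  # the caller would also close the socket here
        return dead


# Simulated timeline using a fake clock instead of real waiting.
fake_now = [0.0]
monitor = HeartbeatMonitor(timeout_seconds=30.0, clock=lambda: fake_now[0])
monitor.register("phone")
monitor.register("laptop")
fake_now[0] = 25.0
monitor.record_pong("laptop")  # the laptop answers a ping; the phone is in a tunnel
fake_now[0] = 40.0
dead = monitor.reap_dead()
print(dead)
```

In a real server, `record_pong` would be called from the WebSocket library's pong handler and `reap_dead` from a periodic timer.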
9. Common Mistakes
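- Load Balancer Timeouts: By default, both Nginx (acting as a reverse proxy) and an AWS Application Load Balancer close any connection that has carried no traffic for 60 seconds, in order to reclaim resources from abandoned clients. *The Failure:* With this default, any WebSocket that goes quiet for a minute is cut off, so idle chat users get disconnected over and over. *The Fix:* You must explicitly configure your Load Balancers to support WebSocket upgrades and extend the idle timeout limits, and keep your heartbeat interval shorter than that timeout so healthy connections never look idle.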
10. Mini Project: Design a Live Collaboration Tool
Let's build the architecture for a Google Docs clone.
- 1. The Client: User A and User B open the same document. They both establish WebSocket connections.
- 2. The Gateway: The connections hit an API Gateway, which routes them to a specialized "Collaboration Microservice."
- 3. The State: User A types the word "Hello." The `[A, H, e, l, l, o]` keystrokes are sent over the socket as they happen.
- 4. The Conflict Resolution: The microservice uses an algorithm (Operational Transformation) to ensure both users end up with the same text even if they type at the exact same millisecond.
- 5. The Broadcast: The microservice pushes the update to a Redis Pub/Sub channel so that it reaches User B's screen, with a latency target of under 50 milliseconds.
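The conflict-resolution step is the least obvious one, so here is a deliberately tiny sketch of the core idea behind Operational Transformation. It handles only two concurrent insertions and uses a site ID as a tie-breaker; production algorithms also handle deletions and long operation histories. The `Insert` type and both functions are illustrative names, not part of any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    position: int  # index in the document where the text is inserted
    text: str
    site: str      # ID of the user who made the edit, used to break ties


def apply_insert(doc: str, op: Insert) -> str:
    return doc[:op.position] + op.text + doc[op.position:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite op so it can be applied after the concurrent op `other` has been applied."""
    other_goes_first = other.position < op.position or (
        other.position == op.position and other.site < op.site
    )
    if other_goes_first:
        # `other` added characters before our position, so shift right by its length.
        return Insert(op.position + len(other.text), op.text, op.site)
    return op


doc = "Hello"
op_a = Insert(position=5, text="!", site="A")  # User A appends "!"
op_b = Insert(position=5, text="?", site="B")  # User B appends "?" at the same instant

# Replica 1 sees A's edit first, replica 2 sees B's edit first.
replica_1 = apply_insert(apply_insert(doc, op_a), transform(op_b, op_a))
replica_2 = apply_insert(apply_insert(doc, op_b), transform(op_a, op_b))
print(replica_1, replica_2)
```

Both replicas apply the same two edits in different orders, and the transform step is what makes them converge on identical text.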
11. Practice Exercises
- 1. Compare the immense inefficiency of "Short Polling" against the persistent architecture of "WebSockets." Why is polling considered an architectural anti-pattern for live chat apps?
- 2. Explain the fundamental difference between WebSockets (Bi-directional) and Server-Sent Events (SSE). Give an example of a feature where SSE is the superior, simpler choice.
12. MCQs with Answers
Question 1
A system architect is building a live stock-trading dashboard where the server must continuously stream rapidly updating stock prices to 100,000 connected browsers. The users never need to send data back through this connection. What is the most efficient, standard protocol for this one-way real-time architecture?
Answer: Server-Sent Events (SSE), since the data flows only from server to client.
Question 2
Scaling WebSocket servers introduces extreme architectural complexity because the underlying TCP connections stay open, meaning the servers are now highly "Stateful." If User A is connected to Server 1, and wants to send a chat message to User B who is connected to Server 5, what centralized technology can the architect deploy to route the message between the independent servers?
Answer: A central Pub/Sub message broker, such as Redis Pub/Sub.
13. Interview Questions
- Q: Explain the specific architectural nightmare that occurs when you attempt to deploy horizontal Auto-Scaling to a cluster of WebSocket servers. How does the "Stateful" nature of WebSockets conflict with standard Load Balancer routing?
- Q: Walk me through the implementation of a "Heartbeat" (Ping/Pong) mechanism in a real-time system. Why is this critical for managing memory leaks on the backend servers?
- Q: You are tasked with building the notification system for Facebook. Millions of users need to receive a tiny red `[1]` alert when someone likes their photo. Would you use WebSockets, SSE, or Long Polling? Defend your architectural choice.