Chapter 13: Stock Exchange System Design
ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.
The established system scope requires support for placing and canceling limit orders, processing billions of orders daily, and serving tens of thousands of simultaneous users. The non-functional requirements are extremely demanding: highly reliable availability (at least 99.993%) and ultra-low, predictable latency, with a strict focus on the 99th-percentile response time.

Core business terminology is introduced: the differences between limit and market orders, the roles of brokers and institutional clients, and the tiered data visibility provided by market data levels (L1, L2, L3). Participants communicate using standardized protocols such as FIX (Financial Information eXchange).

The architecture is defined by three flows: the highly latency-sensitive critical trading path, which routes orders through the Client Gateway, Order Manager (where initial risk checks and fund verification occur), Sequencer, and Matching Engine; and the less latency-restrictive Market Data and Reporting flows.

To achieve sub-millisecond latency targets, advanced low-latency exchanges often consolidate all critical trading path components onto a single physical server, eliminating network communication time. Components on this single server communicate using shared memory via the mmap(2) system call over a memory-backed file system, bypassing costly disk and network access and reducing end-to-end latency to tens of microseconds. This is further optimized by running application loops as single threads pinned to dedicated CPU cores to eliminate context switching and contention.

The design relies heavily on event sourcing: the Sequencer assigns a unique sequence identifier to every order and execution, maintaining an immutable log of state-changing events.
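The shared-memory idea above can be illustrated with a minimal sketch: two components on the same machine exchange an order event through a memory-mapped file rather than a socket. The file path, message layout, and function names here are illustrative assumptions, not the chapter's actual wire format.

```python
# Sketch: two components on one server communicating through a
# memory-mapped file (in production this would live on a memory-backed
# file system such as tmpfs). Layout and names are assumptions.
import mmap
import os
import struct
import tempfile

MSG_FMT = "<QdI"                  # seq_id (u64), price (f64), quantity (u32)
MSG_SIZE = struct.calcsize(MSG_FMT)

def create_ring(path, slots=1024):
    """Create a fixed-size file to back the shared-memory message buffer."""
    with open(path, "wb") as f:
        f.truncate(slots * MSG_SIZE)

def map_view(path):
    """Open an independent read-write mapping of the backing file."""
    f = open(path, "r+b")
    return f, mmap.mmap(f.fileno(), 0)

def publish(buf, slot, seq_id, price, qty):
    """Producer side (e.g. Order Manager): write an event into a slot."""
    struct.pack_into(MSG_FMT, buf, slot * MSG_SIZE, seq_id, price, qty)

def consume(buf, slot):
    """Consumer side (e.g. Matching Engine): read the event back."""
    return struct.unpack_from(MSG_FMT, buf, slot * MSG_SIZE)

# Demo: two independent mappings of the same file observe each other's
# writes directly in memory, with no socket or disk I/O on the hot path.
path = os.path.join(tempfile.gettempdir(), "exchange_ring.bin")
create_ring(path)
f1, producer = map_view(path)
f2, consumer = map_view(path)
publish(producer, 0, seq_id=1, price=100.5, qty=300)
event = consume(consumer, 0)
```

A real exchange would add a lock-free ring-buffer protocol (write cursors, sequence barriers) on top of this raw mapping; the sketch only shows the zero-copy visibility that mmap provides.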
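The event-sourcing role of the Sequencer can be sketched as follows: every event receives a unique sequence ID and is appended to an immutable log, and replaying that log deterministically rebuilds the current state. Class and field names here are hypothetical, chosen only for illustration.

```python
# Sketch of event sourcing in the Sequencer: stamp each event with a
# unique, ever-increasing sequence id, append it to an immutable log,
# and recover state by replaying the log. Names are illustrative.
class Sequencer:
    def __init__(self):
        self._next_seq = 0
        self.log = []                       # append-only event log

    def stamp(self, event: dict) -> dict:
        """Assign the next sequence id and record the event."""
        stamped = dict(event, seq=self._next_seq)
        self._next_seq += 1
        self.log.append(stamped)
        return stamped

def replay(log):
    """Deterministic recovery: fold the log into open-order state."""
    open_orders = {}
    for ev in log:
        if ev["type"] == "new":
            open_orders[ev["order_id"]] = ev["qty"]
        elif ev["type"] == "cancel":
            open_orders.pop(ev["order_id"], None)
    return open_orders

seq = Sequencer()
seq.stamp({"type": "new", "order_id": "A", "qty": 100})
seq.stamp({"type": "new", "order_id": "B", "qty": 50})
seq.stamp({"type": "cancel", "order_id": "A"})
state = replay(seq.log)
```

Because the matching logic is a pure function of the ordered event stream, a warm standby that has consumed the same log arrives at the same state, which is what makes the fast failover described next possible.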
This inherent functional determinism enables accurate state recovery and fault tolerance, implemented using redundant instances (e.g., hot-warm matching engines) and consensus algorithms such as Raft to keep the cluster synchronized, achieving a near-zero Recovery Point Objective (minimal data loss) and a second-level Recovery Time Objective (rapid recovery).

The Matching Engine utilizes highly specialized data structures. The order book combines doubly linked lists and maps to guarantee that the primary operations (placing, matching, and canceling orders) complete in O(1) time, most commonly under the FIFO (first in, first out) matching algorithm.

Finally, the design addresses critical operational challenges, including mitigating distributed denial-of-service (DDoS) attacks through layered security and distributing market data fairly to all subscribers, often via reliable multicast protocols.
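The O(1) order-book claim can be made concrete with a minimal sketch of a single price level: a doubly-linked list holds resting orders in arrival order (FIFO), and a map from order ID to list node makes cancellation a constant-time unlink. The class and method names are assumptions for illustration, not the chapter's implementation.

```python
# Sketch of one price level in an order book: doubly-linked FIFO queue
# of resting orders plus an id->node map for O(1) cancel. Illustrative.
class Node:
    __slots__ = ("order_id", "qty", "prev", "next")
    def __init__(self, order_id, qty):
        self.order_id, self.qty = order_id, qty
        self.prev = self.next = None

class PriceLevel:
    def __init__(self):
        self.head = self.tail = None        # oldest order at head (FIFO)
        self.index = {}                     # order_id -> Node

    def add(self, order_id, qty):
        """O(1): append a new resting order at the tail."""
        node = Node(order_id, qty)
        if self.tail:
            self.tail.next, node.prev = node, self.tail
        else:
            self.head = node
        self.tail = node
        self.index[order_id] = node

    def cancel(self, order_id):
        """O(1): look up the node in the map and unlink it in place."""
        node = self.index.pop(order_id)
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev
        else:
            self.tail = node.prev

    def match(self, qty):
        """FIFO matching: fill the oldest resting orders first."""
        fills = []
        while qty and self.head:
            node = self.head
            take = min(qty, node.qty)
            fills.append((node.order_id, take))
            node.qty -= take
            qty -= take
            if node.qty == 0:
                self.cancel(node.order_id)
        return fills

book = PriceLevel()
book.add("A", 100)
book.add("B", 50)
book.cancel("A")                # O(1) removal via the index map
fills = book.match(30)          # FIFO: partially fills order B
```

A full order book would keep one such level per price, typically indexed by a sorted structure keyed on price; the constant-time guarantee applies to the per-level operations shown here.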