Chapter 10: Real-time Gaming Leaderboard Design
Loading audio…
ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.
Initial back-of-the-envelope calculations determine the system must handle a scale of 5 million daily active users (DAU), requiring allowance for a peak score update query per second (QPS) of 2,500. The high-level architecture separates the process into a Game Service, which validates wins, and a Leaderboard Service, which updates the scores via an internal API call (POST /v1/scores), ensuring security by setting the score on the server side to prevent client-side manipulation, such as a man-in-the-middle attack. While a simple relational database solution (MySQL) is viable only for small datasets, it is quickly rejected for high-scale, real-time leaderboards because calculating rank by sorting millions of continuously changing records is slow and unscalable, potentially taking tens of seconds. The superior solution utilizes Redis, an in-memory data store, specifically employing the sorted set data type, which is optimal for leaderboard operations, offering logarithmic time complexity for adding or finding elements, O(log(n)). This specialized structure is internally implemented using a hash table and a skip list. Core Redis operations needed are ZINCRBY to increment a user's score, ZREVRANGE to fetch the top players, and ZREVRANK to retrieve a specific user's position. Deployment can be achieved either by managing private infrastructure with MySQL for auxiliary user details and Redis for the leaderboard data or by building on the cloud using serverless architectures like AWS API Gateway and Lambda functions, which intrinsically supports auto-scaling. To scale the system dramatically (e.g., to 500 million DAU), data sharding is introduced, examining two approaches: fixed partition (sharding by score range, preferred for specific rank retrieval) and hash partition (using Redis Cluster, which distributes data across 16384 hash slots and requires a scatter-gather process to compile the overall top 10 list). As an alternative design, a NoSQL solution using Amazon DynamoDB is proposed, leveraging its Global Secondary Indexes (GSI) and applying write sharding to prevent the leaderboard for the current month from becoming a single overloaded partition. Finally, system concerns like tie-breaking are addressed by storing the timestamp of a user's most recent win using a Redis Hash, where an older timestamp grants a higher rank in the event of equal scores, and failure recovery is planned by using historical data stored in MySQL to rebuild the Redis leaderboard offline.