Chapter 1: Proximity Service System Design
ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.
The initial design phase establishes the system requirements: low-latency responses, compliance with data privacy regulations such as GDPR and CCPA, and capacity for a high traffic volume, estimated at thousands of search queries per second. The resulting high-level architecture separates two services. The location-based service (LBS) is read-heavy and stateless, so it scales horizontally. The Business Service handles infrequent update operations and requests for detailed business information. Because the workload is strongly read-oriented, the database scales with a primary-secondary replication setup; the resulting replication lag is acceptable because business updates only need to take effect by the following day.

A central technical challenge is efficiently indexing and querying two-dimensional geographic data. The design rejects a naive two-dimensional search as inefficient, and it rejects an evenly divided grid because businesses are distributed unevenly across the cells. Instead, it explores two geospatial indexing techniques: Geohash, which maps 2D coordinates to a one-dimensional string, and Quadtree, which partitions space with a tree. Geohash is simple and easy to update, but it needs extra handling for boundary cases in which two nearby locations share no common string prefix. Quadtree, in contrast, recursively subdivides space so that cell size adapts to population density, which makes it effective for returning the k nearest results, although its tree structure complicates index updates.

For persistent storage, the main business information table is sharded by business ID. The geospatial index table is relatively small, so it is scaled with read replicas, which absorb the high read load without the complexity of sharding the index itself.
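To make the Geohash idea above concrete, here is a minimal sketch of the standard encoding: longitude and latitude ranges are halved alternately, each decision contributes one bit, and every five bits become one base-32 character. The function name `geohash_encode` and the default precision of 6 are illustrative assumptions, not part of the chapter. The last lines show the boundary problem: two points a few metres apart on either side of the prime meridian get hashes with no shared prefix, which is why lookups usually also query the neighbouring cells.

```python
# Standard geohash base-32 alphabet (digits plus letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a latitude/longitude pair as a geohash string (illustrative sketch)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates the current 5-bit group
    bit_count = 0
    use_lon = True    # bits alternate, starting with longitude

    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)


print(geohash_encode(57.64911, 10.40744, precision=3))  # u4p

# Boundary case: two nearby points straddling longitude 0.
west = geohash_encode(51.5, -0.0001)
east = geohash_encode(51.5, 0.0001)
print(west[0], east[0])  # g u  (the hashes differ from the first character)
```

Because nearby points inside one cell share a prefix, a prefix match on the hash is a cheap first filter; the boundary case is the reason it cannot be the only one.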
Finally, a multi-layer caching strategy improves performance. Cache lookups are keyed by geohash, which is stable, rather than by raw coordinate pairs, which fluctuate from request to request. The caches hold both the list of business identifiers for each geohash and the fully processed business objects, all in memory. The service is deployed globally across multiple regions and availability zones to minimize cross-continent latency, distribute traffic load, and comply with local data storage requirements.
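The two cache layers described above can be sketched as a cache-aside lookup. This is only an illustration under stated assumptions: plain dictionaries stand in for the in-memory cache, and the names `geohash_cache`, `business_cache`, `get_business_ids`, `get_businesses`, and the two loader callbacks are hypothetical, not taken from the chapter.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the two in-memory cache layers.
geohash_cache: Dict[str, List[int]] = {}   # geohash -> business IDs in that cell
business_cache: Dict[int, dict] = {}       # business ID -> full business object


def get_business_ids(geohash: str,
                     load_ids: Callable[[str], List[int]]) -> List[int]:
    """Return the business IDs for a geohash cell, reading the database on a miss."""
    if geohash not in geohash_cache:
        geohash_cache[geohash] = load_ids(geohash)
    return geohash_cache[geohash]


def get_businesses(geohash: str,
                   load_ids: Callable[[str], List[int]],
                   load_business: Callable[[int], dict]) -> List[dict]:
    """Resolve a geohash cell to business objects using both cache layers."""
    result = []
    for business_id in get_business_ids(geohash, load_ids):
        if business_id not in business_cache:
            business_cache[business_id] = load_business(business_id)
        result.append(business_cache[business_id])
    return result
```

Every user standing in the same cell produces the same geohash key, so repeated searches hit the cache; raw latitude/longitude pairs would almost never repeat exactly. A real deployment would also need expiry or invalidation when business data changes, which this sketch omits.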