Chapter 1: Reliable, Scalable & Maintainable Applications

ⓘ This audio and summary are simplified educational interpretations and are not a substitute for the original text.

Reliable, Scalable & Maintainable Applications establishes the three core nonfunctional requirements for engineering modern data-intensive applications: reliability, scalability, and maintainability. Because today’s systems must often cope with large volumes of data, complex data, and rapidly changing data, they are built from specialized data systems (such as databases, caches, and stream processors), which application code must often stitch together to meet diverse needs.

Reliability means that a system continues to work correctly even when faults occur. A fault is a single component deviating from its specification; a failure is the system as a whole no longer providing the required service. Faults fall into three categories: hardware faults, often mitigated by redundancy or by software fault-tolerance techniques; systematic software errors, which are harder to anticipate and call for thorough testing and monitoring; and human errors, frequently configuration mistakes, which are minimized by well-designed interfaces, non-production sandbox environments, quick rollback mechanisms, and effective telemetry or monitoring.

Scalability is the system’s ability to cope with growing load. Load is quantified with load parameters specific to the application, such as request rates or the fan-out ratio in the Twitter timeline example. For online systems the key performance metric is response time, which is best reported as percentiles rather than as an average. High percentiles (p95, p99, p99.9), known as tail latencies, show how slow the worst requests are, and the users who experience those requests are often the most valuable ones. High percentiles matter even more when serving one user request requires multiple backend calls, because a single slow call delays the whole request; this effect is called tail latency amplification.
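The point about percentiles and tail latency amplification can be made concrete with a small calculation. The following is a minimal Python sketch, not a method taken from the chapter: the function names `percentile` and `chance_of_slow_request` and the sample data are hypothetical, the percentile uses the simple nearest-rank definition, and the amplification estimate assumes that backend calls are independent of each other, which real systems only approximate.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: a value that at least p% of the samples do not exceed."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))  # 1-based rank
    return ordered[rank - 1]


def chance_of_slow_request(backend_calls, p):
    """Probability that at least one backend call is slower than its p-th percentile.

    Assumption: the calls are independent, so each one stays within the
    p-th percentile with probability p / 100.
    """
    return 1 - (p / 100) ** backend_calls


# Hypothetical response times in milliseconds (made-up data for illustration).
response_times_ms = [12, 15, 14, 13, 250, 16, 15, 14, 13, 900]

print("mean:", statistics.mean(response_times_ms))
print("p50: ", percentile(response_times_ms, 50))
print("p95: ", percentile(response_times_ms, 95))
print("chance of a slow request with 100 backend calls at p99:",
      round(chance_of_slow_request(100, 99), 3))
```

With this made-up data the mean is 126.2 ms, a figure that describes no actual request: the median is 14 ms and the p95 is 900 ms. Under the independence assumption, a request that needs 100 backend calls has roughly a 63% chance of hitting at least one call slower than the p99, which is why a small fraction of slow backend calls can affect a large fraction of user requests.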
Coping with increasing load means choosing between scaling up (vertical scaling: moving to a more powerful machine) and scaling out (horizontal scaling: distributing the load across multiple machines in a shared-nothing architecture). No scalable architecture fits every case; a design must be specific to the application’s own load characteristics.

Finally, maintainability matters because the majority of software cost lies in ongoing operation and modification rather than in initial development. It rests on three design principles: operability, which means making the system easy for operations teams to keep running smoothly through automation support and visibility; simplicity, which means removing accidental complexity through effective abstraction; and evolvability, which means making the system easy to adapt to unforeseen use cases and changing requirements over time.