This is a lecture note from one of the courses I’m taking at Cornell University: Architecture of Large Scale Information System. Alan Demers, the professor, is very strong. Somebody said that he is 80 years old. The course explores three main properties of large-scale systems: scalability, availability, and elasticity.
Scalability is the ability of a system to grow when more computational power is required. For example, a web crawler may initially scrape only a few websites but grow to scrape the entire web, or a web server may see its query rate increase. Google, for instance, now processes between 50 and 100 trillion URLs and serves more than 50,000 queries per second. These are examples of increased data size and increased query rate, respectively.
Availability is the ability of a system to remain reliably responsive even in the face of faults. For example, an available server responds to clients in a timely manner whenever they issue queries. If one component of a fault-tolerant system fails, other components of the system can assume the failed node's responsibilities to preserve the availability of the system.
Elasticity is the ability of a system to dynamically grow and shrink in size. Roughly ten years ago, an organization would operate its own physical server farm, and changing the size of the farm could only happen as fast as the organization could buy more servers. Nowadays, with the advent of public clouds such as Amazon's AWS, organizations instead rent a small slice of computational power from a large cloud. In an elastic system, if the load on the system increases, the system can increase its size. This elasticity can also affect the design of the system.
There are two types of scalability: vertical scalability (scaling up) and horizontal scalability (scaling out).
In a vertically scalable system, more computational power is introduced by purchasing more powerful machines. Citing Moore's Law, you might argue that the number of transistors that fit on a chip rises exponentially, so vertical scalability is a promising form of scalability. However, performance, power, and cost do not scale as well: cost typically scales superlinearly with performance, which means that increased computational power is increasingly expensive. There is also a ceiling on the amount of computational power available, since some machine is the fastest on the market at any given time. Vertical scalability is also rather uninteresting.
Horizontal scalability is the more interesting form of scalability, in which computational power is introduced by adding more machines to a system. In the past, horizontally scaled systems were composed of commercial off-the-shelf (COTS) machines, but today's public clouds are typically built of machines specialized for the cloud. The benefit of horizontal scalability is the abundance of low-cost machines. Parallelizable applications can take advantage of this scalability, while less parallelizable applications cannot. We will talk mainly about horizontal scalability in this class.
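How much a parallelizable application benefits from adding machines is commonly estimated with Amdahl's law, which bounds speedup by the workload's serial fraction. This is a minimal sketch (the function name and the 95%-parallel example workload are my own, not from the lecture):

```python
def amdahl_speedup(parallel_fraction, machines):
    """Amdahl's law: estimated speedup when the parallelizable
    fraction of a workload is spread across `machines` machines."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / machines)

# A hypothetical job that is 95% parallelizable, run on 100 machines:
# speedup is roughly 16.8x, and no number of machines can push it
# past 1 / 0.05 = 20x, because the serial 5% does not scale out.
print(amdahl_speedup(0.95, 100))
```

This is why "less parallelizable applications cannot" take advantage of horizontal scaling: the serial fraction caps the benefit no matter how many low-cost machines are added.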
Along with scalability naturally comes the concept of availability. As the number of machines in a system increases, the likelihood of failure increases. Thus, our systems had better be fault tolerant in order to handle the increased likelihood and frequency of failure. Mathematically, if we have the mean time to failure (MTTF) and mean time to repair (MTTR) of a system, the average availability is MTTF/(MTTF + MTTR). As MTTF decreases, availability decreases.
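The availability formula is easy to check numerically. A small sketch (the example MTTF and MTTR figures are illustrative, not from the lecture):

```python
def availability(mttf_hours, mttr_hours):
    """Average availability = MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)

# A hypothetical server that fails on average every 1000 hours
# and takes 2 hours to repair is available about 99.8% of the time.
print(availability(1000, 2))
```

Note that both knobs matter: shrinking MTTR raises availability just as effectively as growing MTTF, which is why fast failover (as in the hot standby example below) is so valuable.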
A classic example of a system that uses fault tolerance to increase availability is a hot standby database. Clients connect to a database via a network switch. If the database ever goes down, the switch redirects traffic to a standby database that holds replicated data. This system tolerates the failure of a single database at the cost of purchasing two database servers.
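The failover logic of the switch can be sketched in a few lines. The classes below are purely illustrative (toy stand-ins for a real switch and database, with replication assumed to have already happened), not a real database API:

```python
class Database:
    """Toy database that fails loudly once marked dead."""
    def __init__(self, name):
        self.name = name
        self.alive = True

    def query(self, q):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} answered: {q}"

class Switch:
    """Routes client queries to the primary; on failure, redirects
    to the hot standby, which holds the replicated data."""
    def __init__(self, primary, standby):
        self.primary = primary
        self.standby = standby

    def query(self, q):
        try:
            return self.primary.query(q)
        except ConnectionError:
            return self.standby.query(q)

switch = Switch(Database("primary"), Database("standby"))
print(switch.query("SELECT 1"))  # served by the primary
switch.primary.alive = False     # primary fails
print(switch.query("SELECT 1"))  # switch redirects to the standby
```

Clients see the same interface before and after the failure; only the latency of the failed-over query changes, which is exactly the availability property the design buys with a second server.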
Below is an example of a large-scale information system built on Amazon Web Services (AWS).