System Design

When designing scalable distributed web architecture, some key features to consider are:

– Make service scale
– Make deployment consistent
– Understand all layers
– Monitor everything
– Plan for failure
– Break things in a controlled manner

The design of a system should specifically address at least the following:

– Availability
– Performance
– Reliability
– Scalability
– Manageability
– Cost

During operations keep track of:

– Capacity management and planning
– Traffic routing
– Multiple clusters
– Stage releases
During implementation consider some of the following strategies:

– Data partitions
– Cache
* Global
* Distributed
– Proxies
* Collapsing requests
– Indexes
– Load balancers
– Queues

Keep in mind that distributed systems are subject to the CAP theorem.
CAP theorem states that a shared-data system can have at most two of the three following:

– Consistency
– Availability
– Partition Tolerance

Consistency is “requests of the distributed shared memory to act as if they were executing on a single node, responding to operations one at a time”.
Availability is “every request received by a non failing node in the system must result in a response” .
Partition tolerance means that when the network is partitioned, it will continue to deliver the same results as when the network is whole.

Consistency is of different types:
– Linearizable
– Serializable
* Two phase commits
* Sync replication
* Memory access in multi-core
* Relational Databases

Consistency has trade off with latency and operational simplicity.

CAP Availability is different than high availability (HA). HA refers to the whole system, CAP refers to non-failing nodes.
CAP expects unbounded response time, HA has realistic response time.

PACELC (pass-elk) is another theorem that talks about distributed systems.
It tries to figure out:
– If there is a partition, how does the system trade-off availability and consistency
– If there is no partition, how does the system trade-off latency and consistency

C
|
A—–C
|
L

For additional information refer to:

Syed Ali

Random Thoughts from an SRE Manager

System Design

Share this: