CAP theorem


Before going deeper into the CAP theorem, let’s first get familiar with what a cluster is.

A cluster is a group of computers connected to each other so that they work as a single system. To make the CAP theorem easier to understand, let’s assume that we are working with a database cluster. This cluster contains multiple database servers, and each database server is a node. The nodes communicate with each other over the network to sync the latest writes, updates, and deletions of existing entries.

The CAP theorem describes three properties that we would like a distributed system to guarantee. Each letter in CAP stands for one property, as explained below.

Consistency:-

Suppose that, among the many databases in the cluster, one dedicated database takes responsibility for executing all write queries, and another database takes responsibility for executing all read queries. For the system to be consistent, all data written to the write DB must be available immediately on the read DB, so that every read returns the most recent write. Systems in which the data is available immediately are consistent systems. Otherwise, users may read stale data, which leads to a bad experience. Making sure that the system is consistent is very important in distributed systems.
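The write DB / read DB setup described above can be sketched in a few lines. This is only an illustration under stated assumptions: `Node`, `Cluster`, and the `synchronous` flag are hypothetical names invented for this example, and the "databases" are plain in-memory dictionaries rather than real servers.

```python
class Node:
    """A hypothetical in-memory database node."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """One write node and one read node (an assumed, simplified layout)."""

    def __init__(self, synchronous):
        self.write_db = Node("write-db")
        self.read_db = Node("read-db")
        self.synchronous = synchronous
        self.pending = []  # writes not yet copied to the read node

    def write(self, key, value):
        self.write_db.data[key] = value
        if self.synchronous:
            # Copy the write to the read node before returning.
            self.read_db.data[key] = value
        else:
            # Copy later; reads may be stale in the meantime.
            self.pending.append((key, value))

    def read(self, key):
        return self.read_db.data.get(key)

    def sync(self):
        """Apply all pending writes to the read node."""
        for key, value in self.pending:
            self.read_db.data[key] = value
        self.pending.clear()


consistent = Cluster(synchronous=True)
consistent.write("balance", 100)
fresh_read = consistent.read("balance")   # sees the write immediately

lagging = Cluster(synchronous=False)
lagging.write("balance", 100)
stale_read = lagging.read("balance")      # write has not reached the read node yet
lagging.sync()
read_after_sync = lagging.read("balance")
```

In the synchronous cluster the read DB always reflects the latest write, which is the behaviour the consistency property asks for. In the lagging cluster there is a window in which a user would read old (here, missing) data.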

 

Availability:-

The availability property states that an application should keep running normally even if one of the nodes (databases) fails.
We can achieve this by following a simple rule while creating the cluster: spread the nodes across the world. This is an important aspect.
For example, assume that you have created all the nodes of the cluster in the same region. Later, a natural disaster or some technical problem causes a power failure in that region. As a result, all the nodes in the cluster are down. In this case, the application cannot run, because it no longer has access to the users' existing data.
Database cluster spread across the globe.
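As a rough sketch of why spreading nodes across regions helps, the example below tries each node in turn and returns the first answer it gets. The region names, the `Node` class, and `read_with_failover` are assumptions made up for this illustration; a real cluster would use its database driver's own failover mechanism.

```python
class NodeDown(Exception):
    """Raised when a node cannot serve requests."""


class Node:
    """A hypothetical database node living in one region."""

    def __init__(self, region, data):
        self.region = region
        self.data = dict(data)  # each node holds its own copy
        self.online = True

    def get(self, key):
        if not self.online:
            raise NodeDown(self.region)
        return self.data[key]


def read_with_failover(nodes, key):
    """Return (value, region) from the first node that responds."""
    for node in nodes:
        try:
            return node.get(key), node.region
        except NodeDown:
            continue  # try the next region
    raise NodeDown("all nodes are down")


user_data = {"user:1": "Alice"}
nodes = [
    Node("us-east", user_data),
    Node("eu-west", user_data),
    Node("ap-south", user_data),
]

# Simulate a power failure in one region.
nodes[0].online = False
value, served_by = read_with_failover(nodes, "user:1")
```

Because the remaining nodes are in other regions, the request is still served. If all three nodes were in the failed region, `read_with_failover` would raise `NodeDown` and the application would stop working.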
 
Partition tolerance:-
In simple words, even if the network connection between two nodes breaks, the application should keep working normally. Such a network failure is known as a partition, and the ability to keep working through it is known as partition tolerance.
Observe the image below for a better understanding of the CAP theorem.
CAP theorem.
 

No distributed system can provide all 3 properties at the same time. Any system can guarantee at most two of them.

  • CA: Data is consistent between all nodes as long as all nodes are online, so you can read/write from any node and be sure that the data is the same. But if a partition ever develops between nodes, the data will be out of sync (and won’t re-sync once the partition is resolved).
  • CP: Data is consistent between all nodes, and the system maintains partition tolerance by becoming unavailable when a node goes down.
choosing consistency, availability.
  • AP: Nodes remain online even if they can’t communicate with each other and will re-sync data once the partition is resolved, but you aren’t guaranteed that all nodes will have the same data (either during or after the partition).
choosing availability, partition tolerance.
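The difference between the CP and AP choices can be shown with a small simulation. This is a minimal sketch, not how any particular database implements it: `TwoNodeCluster`, `Replica`, the `partitioned` flag, and the `Unavailable` exception are all hypothetical names invented here.

```python
class Unavailable(Exception):
    """Raised when the cluster refuses a request to stay consistent."""


class Replica:
    """A hypothetical in-memory node."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeCluster:
    """Two replicas that behave as either a CP or an AP system."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True when a and b cannot talk

    def write(self, replica, key, value):
        if self.partitioned:
            if self.mode == "CP":
                # Give up availability to keep both nodes identical.
                raise Unavailable("cannot reach the other node")
            # AP: accept the write locally; nodes may now disagree.
            replica.data[key] = value
            return
        # No partition: the write reaches both nodes.
        self.a.data[key] = value
        self.b.data[key] = value


# CP: writes are rejected during the partition, data stays in sync.
cp = TwoNodeCluster("CP")
cp.write(cp.a, "x", 0)
cp.partitioned = True
try:
    cp.write(cp.a, "x", 1)
    cp_result = "accepted"
except Unavailable:
    cp_result = "rejected"

# AP: both sides keep accepting writes and end up with different data.
ap = TwoNodeCluster("AP")
ap.partitioned = True
ap.write(ap.a, "x", 1)
ap.write(ap.b, "x", 2)
ap_views = (ap.a.data["x"], ap.b.data["x"])
```

The CP cluster stays consistent but stops answering writes while the partition lasts. The AP cluster keeps answering, at the cost of the two nodes holding different values for the same key until they are reconciled.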