To understand the CAP theorem, we first need to understand network partitioning. A network partition occurs when a cluster of interconnected nodes splits into two or more groups (partitions), and the nodes in each group cannot communicate with the nodes in the others. This happens because the network is unreliable. Why? Because it depends on physical cables, switches, and routers, all of which can fail or be damaged.
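To make the idea concrete, here is a minimal sketch that models a cluster as nodes joined by links and works out which groups of nodes can still reach each other. The node names (`n1` to `n4`), the link list, and the `partitions` helper are all made up for illustration; this is not how any particular database detects partitions.

```python
from collections import deque


def partitions(nodes, links):
    """Group nodes into partitions: sets of nodes that can still reach each other."""
    neighbours = {node: set() for node in nodes}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        group, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            group.add(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                queue.append(nxt)
        groups.append(group)
    return groups


# Hypothetical four-node cluster connected in a chain.
nodes = ["n1", "n2", "n3", "n4"]
healthy = [("n1", "n2"), ("n2", "n3"), ("n3", "n4")]
# Simulate a failure of the link between n2 and n3.
broken = [link for link in healthy if link != ("n2", "n3")]

print(len(partitions(nodes, healthy)), "partition(s) while all links are up")
print(len(partitions(nodes, broken)), "partition(s) after the n2-n3 link fails")
```

With every link up the cluster forms a single group. Once the `n2`-`n3` link is removed, `n1` and `n2` can no longer talk to `n3` and `n4`, which is exactly the situation the CAP theorem is concerned with.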
The CAP theorem, also known as Brewer's theorem, states that a distributed data store cannot simultaneously provide all three of the following guarantees: Consistency, Availability, and Partition Tolerance. Let's look at each of these in a bit more detail.
- Partition Tolerance: The ability of a system to keep working even when the network has partitioned.
- Consistency: If a distributed data store guarantees Consistency, then every read receives either the most recent write or an error. Note that this is not the C of the ACID properties.
- Availability: If a distributed data store guarantees Availability, then every request receives a response, but that response may not reflect the most recent write. Note that this is also not the A of the ACID properties.
Since the network is unreliable, most distributed data stores support partition tolerance by design, so the only choice they make is whether to stay available or stay consistent when a network partition occurs. Note that under normal circumstances, i.e. without a network partition, a system can provide both consistency and availability. The CAP theorem is not about "2 of 3" all the time.
As explained above, the real choice is between Consistency and Availability, because most distributed databases are designed to keep working even under network partitioning.
- CP - Consistency/Partition Tolerance - In this pattern, Consistency is favored when a partition occurs; that is, an error or timeout exception is returned because the system can no longer guarantee that data is consistent across partitions. Examples of databases commonly classified as CP are HBase, MongoDB, Redis, etc.
- AP - Availability/Partition Tolerance - In this pattern, Availability is favored when a partition occurs; i.e. the system continues to return responses even when its data is inconsistent across partitions. These databases are a good fit when the requirement is that the database remains available at all times. Examples of databases commonly classified as AP are CouchDB and Cassandra.
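The difference between the two patterns can be sketched with a toy model. The code below assumes a deliberately simplified cluster of two replicas that copy every write to each other; the `Replica` class, the `Unavailable` exception, and the `demo` helper are hypothetical names invented for this example and do not correspond to the API of any of the databases mentioned above.

```python
class Unavailable(Exception):
    """Raised by a CP replica that cannot guarantee it has the latest write."""


class Replica:
    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = None
        self.peer = None
        self.partitioned = False  # True while the link to the peer is down

    def write(self, value):
        if self.partitioned:
            if self.mode == "CP":
                # CP: refuse the write rather than let the replicas diverge.
                raise Unavailable("cannot replicate write during partition")
            # AP: accept the write locally; the peer is now stale.
            self.value = value
            return
        # No partition: the write reaches both replicas.
        self.value = value
        self.peer.value = value

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: return an error instead of a possibly stale value.
            raise Unavailable("cannot confirm latest write during partition")
        return self.value


def demo(mode):
    """Write v1, partition the pair, then try to write v2 on one side
    and read from the other side. Returns (write outcome, read outcome)."""
    a, b = Replica(mode), Replica(mode)
    a.peer, b.peer = b, a

    a.write("v1")  # replicated to both while the network is healthy
    a.partitioned = b.partitioned = True

    try:
        a.write("v2")
        write_outcome = "accepted"
    except Unavailable:
        write_outcome = "rejected"

    try:
        read_outcome = b.read()
    except Unavailable:
        read_outcome = "error"

    return write_outcome, read_outcome


print("CP:", demo("CP"))
print("AP:", demo("AP"))
```

In this sketch the CP pair rejects the write and answers the read with an error, while the AP pair accepts the write on one side and serves the older value `v1` from the other. Real systems are more nuanced (for example, many CP systems keep serving requests on the side of the partition that holds a majority of nodes), so treat this only as an illustration of the trade-off.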
The CAP theorem really boils down to a trade-off between Consistency and Availability when a network partition occurs, and you as an application developer can choose whichever pattern fits your requirements.