System Design - Consistency vs Availability
The CAP Theorem dictates that in a distributed system you can only ever have two of three core properties.
We have spent several posts focused on scaling: the art of taking one system and dividing it across multiple servers and multiple data centers. This is how we achieve the massive scale that modern internet applications require.
But distributing your data and your components creates a profound problem. Whenever you rely on a network to connect parts of your system, you introduce the possibility of the network failing or slowing down.
This reality forces the system designer to confront the most critical theoretical choice in the field: the CAP theorem. This theorem dictates that in a distributed system you can only ever have two of three core properties.
This guide will break down the three pillars of the CAP theorem with simple metaphors, explain why this choice is unavoidable, and show you how to decide between prioritizing real-time accuracy (C P) or constant access (A P) based on your application’s business needs.
The Three Pillars of Distributed Systems
CAP stands for Consistency, Availability, and Partition Tolerance. In a truly scalable system, these three elements are constantly fighting each other.
Consistency (C)
Consistency means that every client sees the same data at the same time, no matter which server they talk to.
Imagine checking your bank account on your phone and then immediately checking it on your desktop. If the balance is 100.00 on both screens, even milliseconds after a transaction, then the system is consistent. If your phone shows 100.00 but your desktop shows 90.00 for a few seconds, the system is inconsistent.
The goal: for a system to be consistent, all nodes or database copies must agree on the latest piece of data before any user is allowed to read it. This is usually achieved by slowing down writes or locking the data until all copies have been updated.
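The "all copies must agree" rule can be sketched as a write that commits only when every replica can accept it. This is a minimal illustration, not how any particular database implements it; the `Replica` class and `consistent_write` function are hypothetical names invented for this example.

```python
class Replica:
    """Hypothetical in-memory copy of the data (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.reachable = True


def consistent_write(replicas, key, value):
    """Apply the write to every replica, or to none of them."""
    # Refuse the write if any copy cannot be updated, so readers
    # never see two replicas that disagree.
    if not all(r.reachable for r in replicas):
        return False
    for r in replicas:
        r.data[key] = value
    return True


phone_server = Replica("phone-server")
desktop_server = Replica("desktop-server")
replicas = [phone_server, desktop_server]

ok = consistent_write(replicas, "balance", 100.00)       # accepted by both copies
desktop_server.reachable = False
blocked = consistent_write(replicas, "balance", 90.00)   # refused: one copy is unreachable
```

The second write is rejected rather than applied to only one server, which is exactly the cost of consistency: the data stays in agreement, but the write does not go through.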
Availability (A)
Availability means that every client request receives a non-error response. The system is always up and ready to serve data.
Imagine a store where you walk up to the cashier and they always process your transaction. Even if the store’s inventory system is struggling and cannot confirm the stock, they tell you “Yes, you can buy this” (even if they might regret it later).
The goal is to maximize uptime. If a server or data center fails, the remaining servers continue to take requests and return data without delay. The system prioritizes responding quickly over ensuring that the response is the very latest version of the data.
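As a rough sketch of "the remaining servers continue to take requests", a read can simply try each server in turn and return the first answer it gets, without checking whether that answer is the latest. The function names here (`available_read`, `failed_server`, `healthy_server`) are hypothetical, and each server is modeled as a plain Python callable.

```python
def available_read(replicas, key):
    """Return the first answer any replica can give, even if it is stale."""
    for replica in replicas:
        try:
            return replica(key)
        except ConnectionError:
            continue  # this server is down; try the next one
    raise ConnectionError("no replica could answer")


def failed_server(key):
    # Stand-in for a server or data center that has gone offline.
    raise ConnectionError("data center offline")


def healthy_server(key):
    # Stand-in for a surviving server; its data may be out of date.
    return {"stock": "probably available"}.get(key)


answer = available_read([failed_server, healthy_server], "stock")
```

The caller gets a response as long as at least one server is alive. Nothing in this sketch verifies freshness, which is the trade the availability-first approach accepts.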
Partition Tolerance (P)
A Partition occurs when the network connection between two or more system components fails. The two sides of the system can no longer communicate with each other. The system is split into separate islands.
Imagine your system has two databases, one in New York and one in London. A deep-sea cable is cut and they cannot talk for five minutes. This is a network partition.
For any large-scale distributed system (multiple data centers, multiple servers), Partition Tolerance is a non-negotiable requirement. Because networks are inherently unreliable, you must assume partitions will happen, and your system must be designed to survive them. A system that crashes when the network fails is not scalable.
The CAP Theorem
More precisely, the theorem states that in the presence of a network partition (P), you must choose between Consistency (C) and Availability (A). You cannot have all three at the same time.
Why the Choice is Forced
Let us use our New York and London databases to illustrate the forced choice.
A network partition occurs: the cable is cut. New York and London cannot talk. P is active.
A user in New York writes a new piece of data, e.g. they post a new photo. This write goes only to the New York database.
A user in London tries to read data.
Now the system must make a choice.
Choice A - Prioritize Consistency (C)
The London server knows it has not heard from New York since the network broke. It cannot guarantee the data it has is the latest copy. To be consistent, it must block the read request (or reject it with an error) until the network heals and the data syncs. The system is down for that user for that moment. You choose C P, but you sacrifice A.
Choice B - Prioritize Availability (A)
The London server decides to serve the data it currently has, even though it knows it might be missing the new photo posted in New York. The user gets a quick response and the application stays 100% available. The data is temporarily stale. You choose A P, but you sacrifice C.
This explains the theorem. Once the network fails, you must make a decision: wait, or serve old data.
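The New York and London scenario can be replayed in a few lines. This is a toy simulation under simplifying assumptions: the `Node` class, the `PartitionError` exception, and the `"CP"` / `"AP"` mode strings are all invented for illustration, and replication is reduced to copying a dictionary entry by hand.

```python
class PartitionError(Exception):
    """Raised when a C P node refuses to answer during a partition."""


class Node:
    """Hypothetical database node (illustration only)."""

    def __init__(self, name, mode):
        self.name = name
        self.mode = mode          # "CP" or "AP"
        self.data = {}
        self.partitioned = False  # True while the peer is unreachable

    def read(self, key):
        # A C P node cannot prove its copy is current, so it refuses.
        if self.partitioned and self.mode == "CP":
            raise PartitionError(f"{self.name} cannot confirm the latest data")
        # An A P node answers with whatever it has, stale or not.
        return self.data.get(key)


def simulate(mode):
    new_york = Node("new-york", mode)
    london = Node("london", mode)

    # Before the partition, both copies agree.
    for node in (new_york, london):
        node.data["latest_photo"] = "beach.jpg"

    # The cable is cut: P is active.
    new_york.partitioned = london.partitioned = True

    # As in the scenario above, the new write reaches only New York.
    new_york.data["latest_photo"] = "sunset.jpg"

    # A user in London tries to read.
    try:
        return london.read("latest_photo")
    except PartitionError:
        return "unavailable"


cp_result = simulate("CP")  # London refuses to answer
ap_result = simulate("AP")  # London answers with the old photo
```

With the same partition and the same write, the C P run yields no answer for the London user, while the A P run yields the stale `beach.jpg`. Neither run returns `sunset.jpg`, because that data cannot cross the broken link.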
C P versus A P Systems
The vast majority of modern distributed systems fall into one of these two camps.
C P Systems (Consistency and Partition Tolerance)
C P systems prioritize data accuracy and integrity above all else. They choose to stop working rather than risk giving out incorrect or stale information.
Characteristics
Strong Consistency - Data is always accurate.
Latency - Higher latency during a partition because the system is waiting for synchronization.
Downtime - Higher risk of partial downtime or blocking operations during a partition.
Primary Use Cases
Financial Transactions - Banking systems and ledgers where 100% accurate balances are non-negotiable.
User Authentication - Ensuring that a password change has propagated everywhere before allowing the user to log in.
Inventory - Ensuring you never sell a product you do not actually have in stock.
Examples - Traditional SQL databases when sharded or configured for cluster operations, and some NoSQL options like Apache HBase.
A P Systems (Availability and Partition Tolerance)
A P systems prioritize constant uptime and a fast response time. They are willing to return data that might be outdated or stale for a short period.
Characteristics
Eventual Consistency - Data will become consistent eventually, usually within milliseconds or seconds after the partition heals.
Latency - Low latency even during a partition because the system does not block requests.
Downtime - Extremely low risk of downtime.
Primary Use Cases
Social Media Feeds - It is okay if your friend’s latest post takes 5 seconds to show up. Users prefer seeing most of the feed now rather than waiting for 100% of the feed.
E-commerce Carts - If you add an item to your cart, it might take a few seconds to appear on all your devices.
Comments and Likes - A temporary lack of synchronization is acceptable.
Examples - Highly scalable NoSQL databases designed for distribution, like Cassandra and DynamoDB.
It is theoretically possible to have a C A system that is 100% Consistent and 100% Available, but this is only true for a single, non-distributed system: a monolith with no network involved. As soon as you scale horizontally and rely on a network, P becomes inevitable and C A is no longer possible.
Balancing the Trade Offs (Eventual Consistency)
The practical reality is that most applications are a hybrid. The designer isolates the critical parts that need C P from the non-critical parts that can tolerate A P.
The E-commerce Checkout
Payment Processing - This must be C P. The charge must be recorded accurately before the order is confirmed. You must wait for confirmation.
Product Recommendations - This can be A P. If the recommendation engine only has slightly old data, that data is served immediately so the user can keep browsing.
This is the power of a microservices architecture. You can use an SQL database (C P) for the payment service and a Document database (A P) for the user profile service, ensuring that each part of the business gets the right trade-off.
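The checkout split can be sketched as one function that treats its two dependencies differently. This is an illustrative outline only: `checkout`, `charge_card`, `fetch_recommendations`, and `cached_recommendations` are hypothetical names, and the services are passed in as plain callables rather than real network clients.

```python
def checkout(charge_card, fetch_recommendations, cached_recommendations):
    """Confirm payment strictly, but tolerate stale recommendations."""
    # Payment (C P): no fallback. If the charge cannot be confirmed,
    # the exception propagates and no order is returned.
    receipt = charge_card()

    # Recommendations (A P): stale data is acceptable, so fall back
    # to the cached list instead of making the user wait.
    try:
        recommendations = fetch_recommendations()
    except TimeoutError:
        recommendations = cached_recommendations

    return {"receipt": receipt, "recommendations": recommendations}


def slow_engine():
    # Stand-in for a recommendation service that is not responding in time.
    raise TimeoutError("recommendation engine is slow")


order = checkout(lambda: "receipt-001", slow_engine, ["socks", "mug"])
```

The asymmetry is the point: a failure in the payment call stops the whole checkout, while a failure in the recommendation call is absorbed silently.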
Eventual Consistency
A P systems rely on Eventual Consistency.
Eventual consistency is the guarantee that, if no new updates are made to a given data item, all accesses to that item will eventually return the last written value.
When a partition heals, the servers compare their data logs and fast-forward the updates to ensure all copies eventually match. It is a highly scalable solution, but it requires the business to accept a small window of data inaccuracy.
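One simple way to reconcile two copies after a partition heals is "last write wins": for each key, keep whichever version carries the newer timestamp. The sketch below assumes every value is stored as a `(timestamp, value)` pair, which is a choice made for this example; real systems use a variety of strategies and have to cope with clock skew and truly concurrent writes, which this toy version ignores. The function name `merge_last_write_wins` is hypothetical.

```python
def merge_last_write_wins(a, b):
    """Merge two replicas, keeping the newest version of each key.

    Each replica maps key -> (timestamp, value).
    """
    merged = dict(a)
    for key, (timestamp, value) in b.items():
        # Take b's version if a has no copy or b's copy is newer.
        if key not in merged or timestamp > merged[key][0]:
            merged[key] = (timestamp, value)
    return merged


# State of each side when the partition heals (timestamps are illustrative).
new_york = {"photo": (2, "sunset.jpg"), "bio": (1, "hello")}
london = {"photo": (1, "beach.jpg"), "likes": (3, 10)}

healed = merge_last_write_wins(new_york, london)
# Both sides adopt the merged state, so all copies match again.
new_york = london = healed
```

After the merge, both locations hold the newer photo plus the keys that only one side had seen during the partition.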
The Next Step
The CAP theorem is not a technical specification; it is a foundational truth of distributed systems. It forces you to choose your priority: do you value data truth (C) or guaranteed responsiveness (A) when the network breaks? For any scalable design, you must accept P.
The key takeaway is to design services with the right choice of C P or A P based on the severity of the business risk caused by stale data.
Now that we have covered how to store data and the critical trade-offs in data consistency, we can look at a more advanced way of designing how services talk to each other.
In the next post, “Event Driven and Asynchronous Design”, we will dive into a pattern that maximizes scale and availability (A P) by eliminating the need for services to talk to each other directly. We will explore message queues like Kafka and RabbitMQ and the power of decoupling systems.


