System Design - Understanding Requirements and Constraints

Without clear, measurable goals for functional and non functional needs, you are just writing code that might not meet the business objectives.

Nov 02, 2025

In our first post, we established that system design is about creating a great blueprint.

Now, before an architect can draw a single line, they need to sit down with the client and understand two things: the dream (the requirements) and the reality (the constraints).

If you skip this step, you are building a magnificent bridge to nowhere. The difference between a senior designer and a junior designer is that the senior designer spends 80% of their time on this requirements phase before drawing the first box on a diagram.

This guide will focus on how to interpret vague requests from product managers or clients and turn them into clear, measurable goals that will actually guide your technical decisions. This is the art of translating business needs into engineering specs.

The Two Faces of Requirements: Functional and Non Functional

Any complex software system has two types of needs. Think of a simple application like an online shopping site.

Functional Requirements (FRs)

These define what the system must do. They are the core actions or behaviors that the user interacts with directly. Simply put, they define the features and capabilities of the product.

For example

A user must be able to log in with an email and password.
The system must allow users to add items to a shopping cart.
The platform must send an order confirmation email after a purchase.

Functional requirements are generally easier to define and verify. You can simply test if the feature works or not.

Non Functional Requirements (NFRs)

These define how well the system must do it. These requirements are the true drivers of system architecture because they determine the choice of technology, complexity, and cost.

If Functional Requirements are the ingredients, NFRs are the quality of the cooking. NFRs are the constraints and qualities that define the system’s performance, reliability, and security.

For example

Scalability - The system must handle 10 million daily active users.
Latency - Search results must load in less than 200 milliseconds.
Security - All user passwords must be stored as salted hashes, and sensitive user data must be encrypted at rest.

Designing for NFRs is what separates an application that works on your laptop from one that runs a global service.

Understanding Constraints

While NFRs are often related to qualities like speed and availability, the term constraints refers to the hard limits imposed on the project. Ignoring these limits is a recipe for project failure.

The designer’s job is not to build the perfect system but to build the best possible system given the constraints.

Here are some key constraints every System Designer faces

Latency vs Throughput (Speed)

These two terms are often confused but they are fundamentally different and dictate your performance design.

Latency (The Delay)

This is the time it takes for a single request to travel from the user to the server and back. It’s measured in units of time, e.g. milliseconds.

Think of it as the time it takes for you to order one meal and receive it at your table. A 100 ms latency means waiting 1/10 of a second for a response.

High latency leads to poor user experience. Low latency usually requires geographical proximity to the user (CDNs, regional data centers).

Throughput (The Volume)

This is the amount of work the system can handle over a period of time. It’s measured in volume per unit time e.g. requests per second (RPS) or transactions per minute (TPM).

Think of it as the total number of meals the kitchen can deliver in one hour.

High throughput usually requires many servers (horizontal scaling) and efficient processing (caching, queues).
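Latency and throughput are different measurements, but they are linked through Little's Law: the average number of requests in flight equals throughput multiplied by average latency. The sketch below uses that relationship to estimate how many servers a target needs. The traffic figures and the per-server concurrency limit are illustrative assumptions, not numbers from a real system.

```python
import math


def in_flight_requests(throughput_rps: float, latency_ms: float) -> float:
    """Little's Law: average concurrent requests = arrival rate * time in system."""
    return throughput_rps * latency_ms / 1000


def servers_needed(throughput_rps: float, latency_ms: float,
                   per_server_concurrency: int) -> int:
    """Rough server count, assuming each server handles a fixed number
    of concurrent requests (a hypothetical, simplified capacity model)."""
    concurrent = in_flight_requests(throughput_rps, latency_ms)
    return math.ceil(concurrent / per_server_concurrency)


# Same throughput, different latency: slower requests occupy servers longer.
print(servers_needed(2000, 100, per_server_concurrency=50))  # 4
print(servers_needed(2000, 500, per_server_concurrency=50))  # 20
```

The same 2,000 RPS needs five times the capacity when latency rises from 100 ms to 500 ms, which is why the two numbers have to be specified together.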

User Base and Load (Scale)

You must define not just who is using the system now but who will be using it in the future.

Current User Base

The number of users logging in today.

Target Growth

How many users do we expect in 6 months? In 2 years? A system designed for 1000 users will collapse at 1 million.

Read vs Write Ratio

This is crucial. If you are building Twitter, users read far more often than they write, on the order of a 100:1 read to write ratio.

If you are building a banking ledger, the ratio is closer to 1:1. This ratio determines where you spend your engineering dollars: on reading speed (caching) or on writing capacity (database clusters).
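A back-of-the-envelope calculation turns user counts and a read to write ratio into queries per second (QPS). This is a minimal sketch: the requests-per-user figure is a made-up assumption, and real traffic has peaks well above the daily average.

```python
from typing import Tuple

SECONDS_PER_DAY = 86_400


def average_qps(daily_active_users: int, requests_per_user_per_day: int) -> float:
    """Average requests per second, spread evenly over a day."""
    return daily_active_users * requests_per_user_per_day / SECONDS_PER_DAY


def split_by_ratio(total_qps: float, reads_per_write: float) -> Tuple[float, float]:
    """Split total QPS into (read_qps, write_qps) for an N:1 read to write ratio."""
    write_qps = total_qps / (reads_per_write + 1)
    return total_qps - write_qps, write_qps


# 10 million daily active users, assuming 50 requests per user per day.
total = average_qps(10_000_000, 50)
reads, writes = split_by_ratio(total, reads_per_write=100)
print(f"total={total:.0f} qps, reads={reads:.0f} qps, writes={writes:.0f} qps")
```

With a 100:1 ratio almost all of the load is reads, which points toward caching. With a 1:1 ratio the same total splits evenly and write capacity becomes the concern.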

Budget and Cost

This is the non-technical constraint that kills many beautiful designs.

Constraint

How much money can we spend on hosting, licensing, and development time?

Trade Off

Using a managed service like Amazon RDS is easier to operate (higher maintainability) but usually costs more in hosting fees. Running your own database on a virtual machine (VM) is cheaper to host (lower budget) but requires more developer time (higher maintenance).

Insight

Senior designers always include a preliminary cost estimate in their design proposals. A design that costs 1 million a month to run is useless if the company only budgets 100,000.
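A preliminary cost estimate can be as simple as a list of line items checked against the budget. In the sketch below, every unit price and quantity is a placeholder invented for illustration. Real figures come from your provider's pricing pages.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LineItem:
    name: str
    unit_cost_per_month: int  # hypothetical price, in whatever currency the budget uses
    quantity: int

    @property
    def monthly_cost(self) -> int:
        return self.unit_cost_per_month * self.quantity


def monthly_total(items: List[LineItem]) -> int:
    return sum(item.monthly_cost for item in items)


def fits_budget(items: List[LineItem], monthly_budget: int) -> bool:
    return monthly_total(items) <= monthly_budget


# Placeholder design: all prices are assumptions, not real quotes.
design = [
    LineItem("application servers", 400, 50),
    LineItem("managed database nodes", 2_500, 3),
    LineItem("CDN and bandwidth", 5_000, 1),
]
print(monthly_total(design))         # 32500
print(fits_budget(design, 100_000))  # True
```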

Compliance and Legal Rules

If your system handles sensitive data, these are non-negotiable.

GDPR (General Data Protection Regulation)

If you serve users in the European Union, you must protect their data privacy, allow them to request data deletion (the “right to be forgotten”), and have a lawful basis for processing their data. This affects how you store and manage user data globally.

SOC 2 (System and Organization Controls 2)

This is an auditing framework, defined by the AICPA, for service organizations that store or process customer data. If you handle customer data (financial data, for example) and want to pass a SOC 2 audit, you must design your system with strict security controls for access, change management, and data integrity. This forces you to add components like audit logging and robust security layers.

The Difference Between Good and Bad Requirements

A vague requirement leads to a vague design. A good requirement is always measurable and testable. Let’s look at some examples.

“The website should be fast.”

This is a textbook example of a vague requirement: it does not clarify what fast actually means. A better version would be

“The homepage must load in under 1.5 seconds on a 3G network for 90% of users.”

This clarifies what fast actually means and also helps identify that the design may need a Content Delivery Network (CDN) and heavy caching.

“The system must be secure.”

As with the first example, this does not define what secure means. A better version would be

“Users must sign in through OpenID Connect (built on OAuth 2.0), and all passwords must be hashed using a modern algorithm like Argon2.”

This helps in selecting specific security libraries and authentication flow components.
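To show what a hashing requirement like this looks like in code, here is a minimal sketch. Argon2 is not in Python's standard library, so this example swaps in scrypt (`hashlib.scrypt`), another memory-hard function, purely to stay self-contained. The cost parameters are illustrative assumptions, not a tuned recommendation.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Illustrative scrypt cost parameters (assumed, not tuned for production).
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt, n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32,
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))                   # False
```

Only the salt and digest are stored. The plain-text password never needs to be kept, which is what makes the requirement testable.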

Some more examples:

“The database should be big.” can be redefined as “The system must ingest up to 500 new data records per second during peak hours.” This informs the choice of database (e.g. whether a write-optimized distributed database like Cassandra is warranted over a traditional SQL server).

“The chat feature should work.” should be “Messages must be delivered to recipients within 500 ms of being sent.” This helps us identify the need for a real time messaging architecture (WebSockets) rather than simple API polling.

Here is a tip: always ask for the numbers. If a product manager says “fast,” ask “How fast exactly? What is the 95th percentile latency target?”
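Once a percentile target exists, it can be checked mechanically. The sketch below computes a percentile with the nearest-rank method and compares it to a target. The sample latencies are invented for illustration, and production monitoring tools may use a different interpolation method.

```python
import math
from typing import List


def percentile(samples: List[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def meets_target(samples: List[float], p: int, target_ms: float) -> bool:
    """True when the p-th percentile latency is below the target."""
    return percentile(samples, p) < target_ms


# Hypothetical measurements: most requests are quick, two are very slow.
latencies_ms = [100] * 18 + [3000] * 2
print(percentile(latencies_ms, 50))         # 100
print(percentile(latencies_ms, 95))         # 3000
print(meets_target(latencies_ms, 95, 200))  # False
```

The median looks healthy here while the 95th percentile misses the target badly, which is why the tip asks for a percentile rather than an average.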

Prioritizing Trade Offs

Once you have all your requirements and constraints, you will immediately notice a problem: they conflict. You cannot maximize everything.

Speed vs Cost

To achieve 100 ms latency (speed), you could deploy 50 servers around the world (geo distribution).

This solution maximizes speed but also maximizes hosting cost. A cheaper solution might use fewer servers but result in 500 ms latency. You must decide which is more important to the business.

Consistency vs Availability

To ensure every user sees the absolute latest data (consistency), you might slow down the whole system waiting for all databases to update.

You gain perfect consistency but sacrifice speed (latency) and availability (if the system has to wait, it can’t serve the request). We will dedicate an entire blog post to this crucial trade off: the CAP theorem.

The CAP theorem

The CAP theorem is a fundamental concept in distributed systems. It states that a distributed data store cannot simultaneously provide more than two of the following three guarantees: Consistency, Availability, and Partition Tolerance.

Consistency means every read receives the most recent write or an error.

Availability means every request receives a non-error response, without a guarantee that it reflects the most recent write.

Partition Tolerance means the system continues to operate even if communication between parts of the system is lost. Since network failures are inevitable, every distributed system must tolerate partitions (P) and then choose between C and A while a partition lasts.
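The choice between C and A during a partition can be illustrated with a toy model. This is not a real replication protocol: `TwoReplicaStore` and its `mode` flag are hypothetical names, and there is no recovery or conflict resolution once the partition heals.

```python
class Unavailable(Exception):
    """Raised when a CP store refuses to answer during a partition."""


class TwoReplicaStore:
    """Toy store with two replicas and a simulated network partition."""

    def __init__(self, mode: str):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [{}, {}]
        self.partitioned = False

    def write(self, replica: int, key: str, value: str) -> None:
        if not self.partitioned:
            for data in self.replicas:  # healthy network: update both replicas
                data[key] = value
        elif self.mode == "CP":
            raise Unavailable("cannot reach the other replica")
        else:
            self.replicas[replica][key] = value  # AP: accept locally, replicas diverge

    def read(self, replica: int, key: str):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm this is the latest value")
        return self.replicas[replica].get(key)


ap = TwoReplicaStore("AP")
ap.write(0, "balance", "100")
ap.partitioned = True
ap.write(0, "balance", "80")
print(ap.read(0, "balance"), ap.read(1, "balance"))  # 80 100
```

The AP store keeps answering but replica 1 serves a stale value. The same sequence against `TwoReplicaStore("CP")` raises `Unavailable` instead, giving up availability to avoid returning stale data.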

How to Prioritize

A common way senior teams prioritize NFRs is by understanding the business impact.

Non Negotiable Constraints (Must Have)

These are often legal (GDPR) or core to the product (e.g. if you are a payment processor, security and strong consistency are non-negotiable).

Highly Desired Qualities (Should Have)

These make the product excellent but won’t kill it if they are slightly compromised. E.g. loading time of 1.5 seconds is great, but 2.0 seconds is acceptable for now.

Future Enhancements (Could Have)

These are nice to haves that you design the architecture to allow for later, but you do not build or pay for them today. E.g. multi-region failover.

The design decision is not which technology to use but which constraint you are willing to relax. Always document the trade offs you make and why they were made.
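One lightweight way to document these decisions is to record each requirement with its tier and whether the proposed design meets it. In this minimal sketch, the `Requirement` and `Priority` names are hypothetical, and the example entries reuse the illustrations from the three tiers above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Priority(Enum):
    MUST = "must have"
    SHOULD = "should have"
    COULD = "could have"


@dataclass(frozen=True)
class Requirement:
    name: str
    priority: Priority
    satisfied: bool  # does the proposed design meet it?


def evaluate(requirements: List[Requirement]) -> Tuple[bool, List[str]]:
    """Return (accepted, relaxed). A design is accepted only if every
    must-have is met; unmet lower tiers are listed as documented trade offs."""
    failed_musts = [r for r in requirements
                    if r.priority is Priority.MUST and not r.satisfied]
    relaxed = [r.name for r in requirements
               if r.priority is not Priority.MUST and not r.satisfied]
    return (not failed_musts, relaxed)


proposal = [
    Requirement("GDPR data deletion", Priority.MUST, satisfied=True),
    Requirement("homepage loads in 1.5 s", Priority.SHOULD, satisfied=False),
    Requirement("multi-region failover", Priority.COULD, satisfied=False),
]
accepted, relaxed = evaluate(proposal)
print(accepted)  # True
print(relaxed)   # ['homepage loads in 1.5 s', 'multi-region failover']
```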

Case Study - The Photo Upload Service

Let’s apply this to a real world example. Imagine you are designing a service that lets users upload a photo and get a cropped version back.

Initial Vague Requirements

Users can upload photos.
The photo needs to be processed.
The service should handle a lot of people.

Refined requirements and constraints

FR - User can upload a JPEG or PNG file up to 10 MB. Requires a file storage service (like AWS S3).
NFR Latency - The user must get a response confirming the upload was received in under 500 ms. Requires a fast, lightweight API endpoint to handle the initial receipt immediately.
NFR Throughput - The system must handle 100 photo uploads per second during peak hours. Requires a queue (like Kafka or SQS) to decouple the fast API receipt from the slow, heavy image processing task.
Constraint Budget - We must keep monthly server costs under 500. Forces the use of cheaper serverless compute (AWS Lambda) for the processing, rather than expensive always-on VMs.
Constraint Security - Photos must be scanned for viruses before processing. Requires an integration with an antivirus service in the processing pipeline.

Notice how the specific numbers and constraints immediately guide the design away from a simple single server toward a complex, yet highly scalable, architecture involving S3, a queue, and serverless workers.
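The core of that architecture is a fast upload endpoint decoupled from slow processing by a queue. The sketch below shows that shape using only Python's standard library. An in-process `queue.Queue` stands in for Kafka or SQS, a thread stands in for a serverless worker, and the virus scan and crop functions are placeholders. All function names here are hypothetical.

```python
import queue
import threading
import uuid

MAX_BYTES = 10 * 1024 * 1024  # "10 MB", assumed here to mean 10 MiB
ALLOWED_TYPES = {"image/jpeg", "image/png"}

jobs = queue.Queue()  # stand-in for Kafka or SQS
results = {}          # stand-in for processed-file storage such as S3


def accept_upload(content_type: str, data: bytes) -> str:
    """Fast path: validate, enqueue, and return an id without processing."""
    if content_type not in ALLOWED_TYPES:
        raise ValueError("only JPEG and PNG are accepted")
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds the size limit")
    upload_id = str(uuid.uuid4())
    jobs.put((upload_id, data))
    return upload_id


def looks_infected(data: bytes) -> bool:
    """Placeholder for a call to an antivirus service."""
    return False


def crop(data: bytes) -> bytes:
    """Placeholder for real image cropping: keeps the first half of the bytes."""
    return data[: len(data) // 2]


def worker() -> None:
    """Slow path: scan, then process, one queued job at a time."""
    while True:
        upload_id, data = jobs.get()
        results[upload_id] = "rejected" if looks_infected(data) else crop(data)
        jobs.task_done()


threading.Thread(target=worker, daemon=True).start()

uid = accept_upload("image/png", b"\x89PNG" + b"0" * 96)
jobs.join()               # wait here only so the demo can print a result
print(len(results[uid]))  # 50
```

`accept_upload` returns as soon as the job is queued, which is what makes the 500 ms acknowledgement target realistic even when processing takes much longer.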

This is the power of good requirements gathering.

The Next Step

Understanding requirements and constraints is the foundation of system design. Without clear, measurable goals for functional and non functional needs, you are just writing code that might not meet the business objectives. Your role as a designer is to ask “how fast,” “how much,” and “how secure,” and then use those answers to make intelligent trade offs.

Now that we know what the house needs to look like and what budget we have, we can pick the tools.

The next post is “Building Blocks of a System.” We will break down the essential components like databases, APIs, queues, and caches and learn how they connect to form the core of any modern application.
