Deep Dive - What It Actually Cost to Scale to 1 Million Requests per Second
Every CTO has a slide in their pitch deck that says: “We will just auto-scale.”
It is the lie we tell investors to hide the fact that we haven’t thought about physics.
To test the limits of modern cloud infrastructure, I recently provisioned a monster. I spun up an AWS c8g.48xlarge, the “Beast.”
192 vCPUs (ARM-based Graviton).
384 GB of DDR5 RAM.
50 Gbps baseline bandwidth.
The goal?
1,000,000 Requests Per Second (RPS) on a single instance.
The result? We hit it.
But the bottleneck wasn’t the code logic. It wasn’t the database.
It was the AWS billing department.
The Compute Tax (Node.js vs. C++)
We started with the industry standard, Node.js.
Node is famous for its non-blocking I/O. It’s supposed to be fast.
But when you throw 600,000 concurrent connections at it, the V8 engine starts to cough.
The Node.js Ceiling (~500k RPS)
Even with 192 cores, our Node cluster capped out at roughly 500,000 requests per second.
Why? Overhead.
At this volume, the cost of creating a JavaScript object, managing the closure scope, and running the Garbage Collector becomes heavier than the actual request logic. The CPU wasn’t processing user data; it was managing V8 memory pointers.
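The overhead is easy to reproduce in miniature. Below is a hypothetical micro-benchmark (illustrative only; real numbers depend on your V8 version, flags, and hardware) that allocates a fresh context object and closure per "request", the way typical Node.js handlers do for req/res wrappers and middleware state:

```javascript
// Illustrative sketch: at 1M RPS, per-request object churn and the GC work
// it creates can dominate the actual handler logic.
function simulateRequests(n) {
  let sink = 0;
  for (let i = 0; i < n; i++) {
    // Each "request" allocates an object plus a closure capturing it,
    // all of which V8's garbage collector must later sweep.
    const ctx = { id: i, headers: { 'x-trace': String(i) } };
    const respond = () => ctx.id;
    sink += respond();
  }
  return sink;
}

const t0 = process.hrtime.bigint();
simulateRequests(1_000_000);
const t1 = process.hrtime.bigint();
console.log(`1M simulated requests: ${Number(t1 - t0) / 1e6} ms of pure allocation overhead`);
```

Run it with `--trace-gc` and you can watch the collector eat wall-clock time that never touches user data.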
The C++ Fix (1M RPS)
To unlock the hardware, we had to strip away the runtime.
We rewrote the service in C++ using the Drogon framework.
Drogon is brutal. It maps threads directly to cores. It has zero garbage collection overhead.
The result: 1.2 million RPS at 80% CPU utilization.
This forces a massive economic trade-off.
Option A (Node.js) - You pay for 2x the hardware (two c8g.48xlarge servers), but you can hire JavaScript developers for $120k/year.
Option B (C++) - You pay for 1x the hardware, but you need C++ systems engineers who cost $200k/year and take 6 months to hire.
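Here is a back-of-envelope break-even, using the $10.89/hour on-demand rate from the receipt later in this post and assuming one engineer in either scenario (salaries and rates are rough figures, not quotes):

```javascript
// Rough break-even for Option A vs Option B; all inputs are the article's
// estimates, assuming a single engineer in both cases.
const HOURLY = 10.89;                             // c8g.48xlarge on-demand rate
const HOURS = 730;                                // hours in a billing month
const serverMonthly = HOURLY * HOURS;             // ≈ $7,950/month per server

const optionA = 2 * serverMonthly + 120_000 / 12; // Node.js: 2x hardware + $120k dev
const optionB = 1 * serverMonthly + 200_000 / 12; // C++: 1x hardware + $200k engineer

console.log(optionA.toFixed(0), optionB.toFixed(0)); // → 25899 24616
```

At this instance size the extra server (~$7,950/month) already costs more than the ~$6,700/month salary premium; on smaller instances the math flips, which is exactly the "pick your poison" trade-off.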
At small scale, hardware is cheap.
At massive scale, hardware is expensive.
Pick your poison.
Disk vs. RAM
Hitting the endpoint is easy. Saving the data is a suicide mission.
Try executing INSERT INTO logs 1,000,000 times per second on Postgres.
We tried writing to a Provisioned IOPS SSD (EBS io2 Block Express).
The disk latency spiked immediately.
The queue depth exploded. The SSD simply couldn't physically accept the write signals fast enough.
The database locked up.
To provision the 250,000+ IOPS required to even attempt this would cost roughly $30,000/month for a single volume.
The only way to absorb 1M RPS is to never touch the disk. We switched to a Redis Cluster.
RAM is orders of magnitude faster than NVMe. We could ingest the traffic easily.
The major trade-off was durability.
If the power fails on that rack in the data center, the last few seconds of data, potentially millions of transactions, are gone forever.
You are trading $30,000/month in EBS costs for the risk of data loss.
Is your data worth that much?
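If you take the RAM path, the loss window is at least tunable. These are real redis.conf directives; the right settings for a production cluster depend on your replication and failover setup:

```conf
# redis.conf - append-only file persistence
appendonly yes          # log every write to the AOF, replayed on restart
appendfsync everysec    # fsync once per second: bounds loss to ~1s of writes
# appendfsync always    # fsync on every write: durable, but would throttle 1M RPS ingest
```

With `everysec` you cap the blast radius at roughly one second of writes per node, instead of "everything since the last snapshot."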
Bandwidth
We fixed the CPU. We fixed the Storage.
Then we got the bill for the network cable.
During a 30-minute stress test, the “Beast” moved roughly 54 terabytes of data
(1M RPS × 30 KB payload × 1,800 seconds ≈ 54 TB), which we round up to 60 TB (60,000 GB) in the billing math below.
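The envelope math is worth a quick sanity check in decimal units:

```javascript
// Traffic volume for the 30-minute stress test, in decimal (billing) units.
const RPS = 1_000_000;
const PAYLOAD_KB = 30;
const SECONDS = 30 * 60;                               // 1,800 seconds
const totalTB = (RPS * PAYLOAD_KB * SECONDS) / 1e9;    // KB → TB
console.log(totalTB); // 54
```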
Public Traffic vs. Internal Traffic
The cost depends entirely on where the data goes.
Internal VPC (Inter-AZ, the test)
60,000 GB × $0.02 = $1,200 for the 30-minute test.
Public Internet (The Reality)
If these were real users downloading data over the internet, you pay AWS standard egress rates of roughly $0.09 per GB.
60,000 GB × $0.09 = $5,400.
That is $5,400 for a single 30-minute test.
If you sustain 1M RPS for a month?
The cost will be ~$7.9 Million/month (based on 730 hours).
And that is for bandwidth alone.
You can optimize your C++ code all you want.
You cannot optimize the size of a byte. If your business model involves high-throughput data transfer (video, logging, telemetry), the CPU cost is a rounding error.
The bandwidth will bankrupt you.
The Price of 1 Million Req/Sec
Let’s look at the actual receipt for this “experiment.”
The “Beast” Server (c8g.48xlarge)
$10.89 per hour. (Cheap!)
The Load Generator Fleet
It took 60 c5.large instances just to generate enough traffic to crash the Beast.
$40.00 per hour.
The Failed Storage
We provisioned a high-IOPS EBS volume for 2 hours before giving up. $200 (pro-rated).
The Bandwidth
Internal Scenario (Inter-AZ)
Even moving data between zones costs money.
$1,200 (30 mins) × 2 = $2,400/hr.
Public Scenario (Internet)
$5,400 (30 mins) × 2 = $10,800/hr.
Total Cost Summary (1 Hour)
Total Cost (Internal Test) - $11 + $40 + $200 + $2,400 = ~$2,651.
Total Cost (Public Internet) - $11 + $40 + $200 + $10,800 = ~$11,051.
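Reconstructing the receipt from the line items above:

```javascript
// One-hour cost summary, summing the article's line items.
const beast = 10.89;        // c8g.48xlarge, rounded to $11 in the summary
const loadGen = 40;         // 60 load-generator instances
const failedEbs = 200;      // pro-rated high-IOPS EBS volume
const internal = 2 * 1200;  // inter-AZ egress, $1,200 per 30 minutes
const publicNet = 2 * 5400; // internet egress, $5,400 per 30 minutes

const internalTotal = Math.round(beast + loadGen + failedEbs + internal);
const publicTotal = Math.round(beast + loadGen + failedEbs + publicNet);
console.log(internalTotal, publicTotal); // → 2651 11051
```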
These figures are estimates; we did not have that kind of money lying around to run the full test.
To estimate a sustained monthly cost, we use:
(Hourly Bandwidth Cost + Hourly Compute Cost) × 730 hours
Sustained Monthly Cost (Internet)
AWS - ($10,800 bandwidth + $51 compute) × 730 hours = ~$7,921,000/month.
GCP - (~$9,600 bandwidth + $51 compute) × 730 hours = ~$7,045,000/month. (Est. $0.08/GB egress)
Azure - (~$9,600 bandwidth + $51 compute) × 730 hours = ~$7,045,000/month. (Est. $0.08/GB egress)
Bare Metal - ~$140,000/month. (See breakdown below)
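The cloud rows follow directly from the formula; here is a sketch plugging in the article's estimated rates (not quoted prices):

```javascript
// Sustained-month estimate: (hourly bandwidth + hourly compute) × 730 hours.
// Rates are the article's estimates.
const HOURS = 730;
const monthly = (bandwidthPerHr, computePerHr) =>
  (bandwidthPerHr + computePerHr) * HOURS;

const aws = monthly(10_800, 51); // ≈ $7,921,230/month
const gcp = monthly(9_600, 51);  // ≈ $7,045,230/month (est. $0.08/GB egress)
console.log(aws, gcp);
```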
Bare Metal consideration
To achieve this without the “Cloud Tax,” you move to wholesale IP Transit.
Traffic Load
120 TB/hour converts to roughly 266 Gbps sustained throughput.
Bandwidth Cost
~$0.50 per Mbps - Wholesale IP Transit (e.g., Cogent, Hurricane Electric) rough cost at this volume.
266,000 Mbps × $0.50 = $133,000/month.
Hardware (CapEx)
1x High-End Server (Dual AMD EPYC 96-core, 384GB RAM, 2x 100GbE NICs): ~$25,000 (one-time).
~$700/month - Spread over 3 years.
Colocation Fee
~$2,000/month - Rack space, power (2kW), and cross-connects.
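Putting the bare-metal pieces together, with the server amortized over 36 months (again, the article's rough rates, not quotes):

```javascript
// Bare-metal envelope: hourly volume → sustained Gbps → wholesale transit
// priced per Mbps-month, plus amortized hardware and colocation.
const TB_PER_HOUR = 120;
const gbps = (TB_PER_HOUR * 1000 * 8) / 3600;  // ≈ 266.7 Gbps sustained
const transitMonthly = gbps * 1000 * 0.50;     // $0.50/Mbps ≈ $133,333/month
const capexMonthly = 25_000 / 36;              // server over 3 years ≈ $694/month
const colo = 2_000;                            // rack, power, cross-connects

const total = Math.round(transitMonthly + capexMonthly + colo);
console.log(total); // → 136028 (the article rounds this to ~$140k/month)
```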
Conclusion
1 Million Requests Per Second is a vanity metric.
Technically, it is solvable with C++, RAM, and big iron.
Economically, it is a different beast entirely.
If you are building a system, do not ask “Can we handle 1M RPS?”
Ask, “Is the user paying us enough to cover the $0.09/GB egress tax on 1M RPS?”
If the answer is no, stop optimizing your code and start optimizing your business model.