Database - How Over-Indexing Cost Us Performance
We love indexes; they are the first tool we reach for when a query is slow. But every index you add to speed up reads has a performance cost on every write.
We love indexes. They are the first tool we reach for when a query is slow.
A SELECT takes 4 seconds?
Add an index. Problem solved. Response time drops to 12 milliseconds.
You feel like a hero.
So you add another. And another.
A composite index for that dashboard query.
A partial index for that admin report.
A unique index on that email column.
Before you know it, your table has 12 indexes and your insert latency has quietly tripled.
Nobody notices. Not yet.
This article is about the other side of the indexing story, the side that almost nobody talks about.
Not "how to create the right index" but "how your indexes are silently destroying your write performance."
What happens when you INSERT a row?
To understand why over-indexing is dangerous, you need to understand what the database actually does when you insert a single row.
On a table with zero indexes (a heap table), an INSERT is almost trivially cheap. The database finds a page with free space, writes the row, and it is done. This is mostly a sequential memory operation. It is fast.
Now add a Primary Key.
That key is backed by a B-Tree index. The database must now also navigate that B-Tree, find the correct leaf page, and insert the key. If the leaf page is full, the database performs a page split: it allocates a new page, redistributes half the entries, and updates the parent node pointers.
These are the same mechanics I described in my UUID v4 article.
Now add a second index. And a third.
For every single INSERT, the database must repeat this B-Tree navigation and insertion process once per index. Five indexes means five separate B-Tree insertions for every row you write.
The critical insight is this: the database cannot skip any of them. Even if an index exists only to serve a query that runs once a month, it must be updated on every single write.
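If you want to see the cost for yourself, a minimal psql experiment along these lines works (the table and index names are made up for illustration):

-- Baseline: a plain heap table with no indexes.
CREATE TABLE insert_test (id bigint, payload text, created_at timestamptz);

\timing on

-- No indexes: the insert is mostly sequential page writes.
INSERT INTO insert_test
SELECT g, md5(g::text), now()
FROM generate_series(1, 1000000) AS g;

-- Add two indexes, then repeat the exact same insert.
CREATE INDEX idx_insert_test_payload ON insert_test (payload);
CREATE INDEX idx_insert_test_created ON insert_test (created_at);

TRUNCATE insert_test;

-- Same data, but now every row also pays for two extra B-Tree insertions.
INSERT INTO insert_test
SELECT g, md5(g::text), now()
FROM generate_series(1, 1000000) AS g;

The absolute numbers depend on your hardware, but the gap between the two runs is the point.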
The Multiplier Effect
Markus Winand, author of SQL Performance Explained and the site use-the-index-luke.com, published a benchmark that should be pinned to every engineer's wall:
“A table with zero indexes can be 100x faster to insert into than a table with a single index. And each additional index degrades insert performance further in a roughly linear fashion.”
Think about that.
You go from zero indexes to one, and your insert speed drops by two orders of magnitude. The actual data write is trivial. The index maintenance is the expensive part.
Now consider what happens in production. You have a table with 8 indexes. You deploy 10 application instances.
Each instance is doing 500 writes per second to that table. Every write triggers 8 separate B-Tree insertions.
That is 40,000 index operations per second hitting your database.
And when you wonder why your RDS CPU is at 85%, nobody thinks to look at the index count.
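If you want to know where this multiplication hits you hardest, a rough starting point is to cross-reference write counters with index counts. A sketch using the standard statistics views:

SELECT
    st.relname,
    st.n_tup_ins + st.n_tup_upd + st.n_tup_del AS total_writes,
    count(i.indexrelid)                        AS index_count
FROM pg_stat_user_tables st
LEFT JOIN pg_index i ON i.indrelid = st.relid
GROUP BY st.relname, st.n_tup_ins, st.n_tup_upd, st.n_tup_del
ORDER BY total_writes DESC
LIMIT 20;

Tables that appear near the top of both columns are the ones paying the biggest tax.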
The Costs Beyond Write Latency
The write speed penalty is only the beginning. Over-indexing creates a cascade of secondary problems that are even harder to diagnose.
WAL Amplification
In PostgreSQL, every change to an index page generates Write-Ahead Log (WAL) entries. More indexes means more WAL data.
That WAL is shipped to your replicas. It is archived for point-in-time recovery. It is processed by your backup systems.
Unnecessary indexes do not just slow down your primary, they increase replication lag, inflate your backup storage, and slow down disaster recovery.
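You can put a number on this. On PostgreSQL 13 and newer, pg_stat_statements tracks WAL volume per statement, which makes index-heavy writes easy to spot (a sketch, assuming the extension is installed):

SELECT
    calls,
    pg_size_pretty(wal_bytes) AS wal_generated,
    left(query, 80)           AS query
FROM pg_stat_statements
ORDER BY wal_bytes DESC
LIMIT 10;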
Buffer Pool Pressure
Every index occupies pages in the shared buffer pool (PostgreSQL) or InnoDB buffer pool (MySQL).
Indexes you never query still compete for memory with the indexes you actually need.
When your buffer pool is full of cold, unused index pages, the hot pages that serve your real queries get evicted.
This causes more disk I/O for the queries you are trying to make fast.
The irony: an unused index can make your other indexes slower.
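The pg_buffercache extension lets you see which relations are actually occupying shared buffers. A sketch (it assumes the extension is available and the default 8 kB page size):

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT
    c.relname,
    pg_size_pretty(count(*) * 8192) AS buffered   -- 8192 = default page size
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database WHERE datname = current_database()))
  AND c.relkind = 'i'                             -- indexes only
GROUP BY c.relname
ORDER BY count(*) DESC
LIMIT 20;

Cold indexes sitting high on this list are stealing memory from the pages you actually need.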
Vacuum and Maintenance Overhead
In PostgreSQL, VACUUM must process every index on a table to clean up dead tuples.
A table with 2 indexes vacuums far faster than a table with 8, because the index cleanup phase scales with the number of indexes.
If autovacuum falls behind because it is spending all its time maintaining unnecessary indexes, you get table bloat: dead rows pile up, the table grows on disk, and sequential scans (which ignore indexes entirely) get slower and slower.
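A quick way to check whether this is already happening is to look at dead-tuple counts and the last autovacuum timestamps (a sketch):

SELECT
    relname,
    n_live_tup,
    n_dead_tup,
    last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;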
Query Planner Confusion
Every index gives the query planner another option to consider.
With many indexes, the planner spends more time evaluating plans, and the chance of it choosing a suboptimal plan increases.
PostgreSQL's planning time grows measurably when a table has a large number of indexes, especially when queries are not using prepared statements.
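You can watch this directly, since EXPLAIN ANALYZE reports planning time separately from execution time (the table and column names here are illustrative):

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM events WHERE user_id = 42;
-- Compare the "Planning Time" line before and after dropping unused indexes.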
How Over-Indexing happens
We do not do it on purpose.
Nobody sets out to over-index a table. It happens gradually, through a series of individually reasonable decisions.
The "Just Add an Index" Reflex
A query is slow.
Someone runs EXPLAIN ANALYZE, sees a sequential scan, and creates an index.
The query gets faster.
Nobody checks if an existing index already covers the query, or if the query itself could be restructured.
Composite Index Duplication
You have an index on (user_id, created_at).
Someone creates a new index on (user_id) because their query only filters by user_id.
But a composite index on (user_id, created_at) already supports lookups by user_id alone, because user_id is its leftmost prefix.
The new index is pure waste.
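In SQL terms, the pattern looks like this (the table and index names are illustrative):

CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);
CREATE INDEX idx_orders_user         ON orders (user_id);  -- leftmost prefix of the index above

-- A query that filters only on user_id can already use the composite index:
EXPLAIN SELECT * FROM orders WHERE user_id = 42;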
Abandoned Feature Indexes
A feature was built six months ago.
It had specific queries that needed indexes. The feature was removed or rewritten.
The indexes stayed. Nobody went back to clean them up.
The codebase forgot. The database remembers.
ORM-Generated Indexes
ORMs and frameworks like Sequelize, Django, Rails, and Hibernate often generate indexes automatically: on every foreign key, on every column used in a unique_together constraint, sometimes on columns that are barely queried.
Engineers trust the ORM defaults without auditing them.
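It is worth dumping what the ORM actually created and reading it line by line. A minimal check (with an illustrative table name):

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'events'
ORDER BY indexname;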
Let me tell you a story.
The story (and it is not fiction)
I once worked with a SaaS company that had an events table tracking user activity.
The table received about 3,000 inserts per second during peak hours.
Over two years, different teams had added indexes for various reporting queries, admin dashboards, and one-off data exports.
The table had 14 indexes.
The symptom: insert latency would spike to 800ms during peak traffic, triggering timeouts in the application layer.
They had already upgraded the RDS instance twice. They were looking at sharding.
I ran a simple query against pg_stat_user_indexes:
SELECT
schemaname || '.' || relname AS table,
indexrelname AS index,
idx_scan AS times_used,
pg_size_pretty(pg_relation_size(indexrelid)) AS size
FROM pg_stat_user_indexes
WHERE relname = 'events'
ORDER BY idx_scan ASC;

Six of the fourteen indexes had zero scans since the last statistics reset (which was three months prior).
Two more had fewer than 10 scans total; they were backing queries that had been removed from the codebase months ago.
The total size of those 8 unused indexes was 11 GB.
Sitting in memory. Being updated 3,000 times per second.
Generating WAL. Slowing down vacuum.
Contributing nothing.
We dropped them. Insert latency fell from 800ms peaks to under 50ms.
CPU dropped from 80% to 35%. They cancelled the sharding project.
The fix was not adding something. It was removing something.
How to Find and Kill Unused Indexes
Here is a practical process for auditing your indexes in PostgreSQL.
Step 1 - Check When Stats Were Last Reset
Before trusting scan counts, verify that your statistics have been accumulating long enough to be meaningful.
SELECT
stats_reset,
age(now(), stats_reset) AS stats_age
FROM pg_stat_database
WHERE datname = current_database();

If the stats were reset yesterday, the scan counts are not reliable. Wait at least one full business cycle (ideally a month) before making decisions.
Step 2 - List All Indexes by Usage
SELECT
schemaname || '.' || relname AS table,
indexrelname AS index,
idx_scan AS scans,
pg_size_pretty(pg_relation_size(indexrelid)) AS size,
pg_size_pretty(pg_total_relation_size(relid)) AS table_total_size
FROM pg_stat_user_indexes
ORDER BY idx_scan ASC;

Anything with idx_scan = 0 over a multi-week window is a candidate for removal.
Anything with very low scan counts (single digits) should be investigated.
Step 3 - Check for Duplicate and Overlapping Indexes
A composite index on (a, b) makes a standalone index on (a) redundant for most queries. Look for these overlaps.
SELECT
pg_size_pretty(sum(pg_relation_size(idx))::bigint) AS total_size,
(array_agg(idx))[1] AS index_1,
(array_agg(idx))[2] AS index_2
FROM (
SELECT
indexrelid::regclass AS idx,
(indrelid::regclass)::text AS table_name,
(indkey::text) AS columns
FROM pg_index
) sub
GROUP BY table_name, columns
HAVING count(*) > 1;
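That query only catches exact duplicates (identical column lists). For the prefix case, here is a rough sketch that compares indkey column lists on the same table; treat the output as candidates to review rather than an automatic kill list, since partial and expression indexes need a closer look:

SELECT
    a.indexrelid::regclass AS possibly_redundant,
    b.indexrelid::regclass AS covered_by,
    a.indrelid::regclass   AS table_name
FROM pg_index a
JOIN pg_index b
  ON a.indrelid = b.indrelid            -- same table
 AND a.indexrelid <> b.indexrelid       -- different indexes
WHERE (b.indkey::text || ' ') LIKE (a.indkey::text || ' %')  -- a's columns are a leading prefix of b's
  AND NOT a.indisunique;                -- unique indexes enforce constraints, keep them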
Step 4 - Drop with a Safety Net

Do not drop indexes blindly in production. PostgreSQL (as of version 17) has no built-in way to hide an index from the planner the way MySQL 8's invisible indexes do, so the safest approach is: keep the CREATE INDEX statement saved, drop the index, monitor for one to two weeks, and recreate it if a query regresses.
-- Save the DDL first
SELECT pg_get_indexdef(indexrelid)
FROM pg_index
WHERE indexrelid = 'idx_events_legacy'::regclass;
-- Then drop
DROP INDEX CONCURRENTLY idx_events_legacy;

Always use DROP INDEX CONCURRENTLY to avoid locking the table.
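And if a query does regress, the saved definition lets you put the index back just as safely. The column list below is illustrative; use whatever pg_get_indexdef returned, adding CONCURRENTLY:

CREATE INDEX CONCURRENTLY idx_events_legacy ON events (legacy_column);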
The Rule of Thumb
There is no magic number for how many indexes a table should have. But there is a principle:
Every index must justify its existence against the write cost it imposes.
If you cannot point to a specific, frequently-executed query that needs an index, that index should not exist.
"We might need it someday" is not a justification.
Creating an index takes seconds. The cost of maintaining a useless one runs 24 hours a day, 365 days a year.
For write-heavy tables (logging, events, analytics, time-series), aim for the absolute minimum, a primary key and perhaps one or two carefully chosen indexes.
For read-heavy tables with infrequent writes (reference data, configuration, product catalogs), you can afford more indexes.
The question is never "Will this index make my reads faster?" The answer is almost always yes.
The real question is "Is the read improvement worth the write cost across every insert, update, and delete this table will ever process?"
Conclusion
Indexes are not free.
Every index you create is a standing instruction to the database: "Every time any row in this table changes, do extra work."
Most engineers treat indexes as read-side optimizations and forget they are write-side penalties.
The next time you reach for CREATE INDEX, stop and ask three questions:
Is there already an index that covers this query? Check for leftmost-prefix matches on existing composite indexes.
How often does this query actually run? If it runs once a day, you probably do not need an index. If it runs 10,000 times a second, you probably do.
How write-heavy is this table? A table receiving thousands of inserts per second pays a much higher tax per index than a table that changes once an hour.
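For the second question, pg_stat_statements can tell you how often a query pattern actually runs (a sketch; the column names assume PostgreSQL 13+ and the filter is illustrative):

SELECT calls, mean_exec_time, left(query, 80) AS query
FROM pg_stat_statements
WHERE query ILIKE '%FROM events%'
ORDER BY calls DESC
LIMIT 10;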
And once a quarter, run the audit queries above.
Find your dead indexes.
Drop them.
Your database will thank you with lower CPU, faster writes, less replication lag, and a smaller storage bill.
Stop hoarding indexes. Start earning them.


