Newsletter

Every Database Index Is a Bet on Your Read/Write Ratio

Stackshala

24 Mar 2026 — 2 min read

An index is a redundant, sorted copy of a subset of your data, maintained by the database on every write. When you CREATE INDEX ON orders(customer_id), the database builds a B-tree over that column. Every insert into orders now writes twice: once to the table, once to the index. Every update to customer_id writes twice. Every delete writes twice. The index doesn’t come free at write time — it doubles the write cost for that column.

For one index, that’s tolerable. Most tables in production have three to eight indexes. A table with six indexes turns every write into seven writes — the row plus six index entries. At 10,000 writes per second, that’s 70,000 write operations. Your disk throughput, WAL size, and replication lag all scale with the number of indexes, not the number of rows you’re inserting.

This is a bet. You’re betting that the read speedup (turning a full table scan into a B-tree lookup, O(n) → O(log n)) is worth the write amplification. For a table with a 100:1 read/write ratio, that bet almost always pays off. For a table with a 1:1 ratio, it might not. For a write-heavy event log with a 1:100 ratio, every index is a direct drag on your most performance-critical path.

Covering indexes raise the stakes. A covering index includes all columns needed by a query, so the database never touches the table. Reads become a single B-tree scan — fast. But now every write updates a larger index entry with more columns. The bet is higher: bigger write amplification in exchange for even faster reads.

Partial indexes are the sophisticated play. CREATE INDEX ON orders(customer_id) WHERE status = 'active' indexes only active orders. If 90% of your orders are completed and queries only touch active ones, the index is 10% the size — faster reads, smaller write amplification. You’re making a more precise bet: this specific read pattern justifies this specific write cost.

Composite indexes have ordering constraints most engineers ignore. An index on (customer_id, created_at) supports queries filtering on customer_id alone, or on customer_id and created_at together. It does not efficiently support queries filtering on created_at alone — the B-tree is sorted by customer_id first. The leftmost prefix rule. An engineer who creates (A, B, C) thinking it covers queries on B has created write amplification with no read benefit for that query pattern.

The signal that the bet is wrong: replication lag. In a primary-replica setup, every index on the primary must also be maintained on each replica. Write-heavy tables with many indexes cause replicas to fall behind. If your replica lag is growing and your read latency on the primary is fine, you may have too many indexes — your write amplification is outrunning your replication throughput.

Before adding an index, know the table’s read/write ratio. Know which queries will use it (check EXPLAIN). Know the write amplification cost. An index that saves 50ms on a query that runs 10 times per second while adding 2ms to a write that runs 5,000 times per second is a net loss of 9.5 seconds of compute per second. The arithmetic isn’t complicated. It’s just rarely done.

Cache Locality Beats Big O - What Asymptotic Notation Can't Tell You

Consistency Isn’t Binary — There Are At Least Six Levels Between “Strong” and “Eventual”

Hybrid Logical Clocks: Causality and Wall Time Without Coordination

Load Balancing Doesn’t Balance Load

Read more

Cache Locality Beats Big O - What Asymptotic Notation Can't Tell You

Consistency Isn’t Binary — There Are At Least Six Levels Between “Strong” and “Eventual”

Hybrid Logical Clocks: Causality and Wall Time Without Coordination

Load Balancing Doesn’t Balance Load