Architect Your Database: Query Optimization Strategies

by Syntax Void Databases

Slow queries are symptoms, not root causes. Every slow query reflects a gap between what the query optimizer can infer and what it needs to know to execute efficiently. Fixing that gap requires understanding how query planners work — not just how to add indexes.

Reading an Execution Plan

Start here. Every query optimization journey begins with EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT):

EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT u.id, u.email, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
WHERE u.created_at > NOW() - INTERVAL '30 days'
GROUP BY u.id, u.email
ORDER BY order_count DESC
LIMIT 20;

The output shows: the operation type (Seq Scan, Index Scan, Hash Join), the estimated vs actual row counts, the actual time per node, and the buffer hits/misses. The key signal is the gap between estimated and actual rows: when the planner thinks a node returns 100 rows but it returns 10,000, the join strategy chosen may be completely wrong.
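An abbreviated, illustrative plan fragment (the numbers are invented) showing the kind of mismatch to look for:

```
->  Hash Join  (cost=310.00..9100.00 rows=100 width=40) (actual time=2.1..184.0 rows=10000 loops=1)
      Buffers: shared hit=1220 read=480
```

Here the planner estimated 100 rows but the node produced 10,000 — a 100x miss that can cascade into a bad join strategy for everything above it. Stale statistics are a common cause; running ANALYZE on the affected tables is the first thing to try.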

Seq Scan vs Index Scan: When Each is Correct

The presence of a Seq Scan is not always a problem. PostgreSQL’s planner performs a cost-based analysis: if a query is expected to touch more than roughly 10-15% of a table’s rows, a full sequential scan is often cheaper than an index scan, because sequential I/O is dramatically faster than random I/O.

This means adding an index to fix a slow query sometimes does nothing — or makes it slower. The fix is a more selective index, a partial index, or a query rewrite.
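One way to test whether the planner is making the right call is to disable sequential scans for your session and compare plans. This is a diagnostic technique only, never a production setting; the query below is a placeholder for your own slow query:

```
SET enable_seqscan = off;  -- session-local planner setting, diagnostic only
EXPLAIN ANALYZE SELECT * FROM orders WHERE status = 'pending';
RESET enable_seqscan;
```

If the forced index plan is slower than the original sequential scan, the planner was right, and the fix lies in the query or the schema rather than in a new index.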

Partial indexes are the most underused optimization in PostgreSQL:

-- Instead of an index on all orders:
CREATE INDEX orders_status_idx ON orders(status);

-- Index only the subset you actually query:
CREATE INDEX orders_pending_created_idx
ON orders(created_at)
WHERE status = 'pending';

If 95% of your orders are in completed status and you only ever query pending orders by created_at, this partial index is tiny and hyper-fast.

Index Types: Choosing the Right Tool

PostgreSQL supports multiple index types. Using the wrong one is a hidden performance tax.

B-Tree — Default. Efficient for equality, range queries, sorting, and pattern matching with left-anchored LIKE 'prefix%'. Use this for the vast majority of cases.

GIN (Generalized Inverted Index) — For jsonb, arrays, and full-text search. Indexes the individual elements inside compound values (keys and values in jsonb documents, elements in arrays, lexemes in tsvectors) as an inverted index. Fast for containment and match operators (@>, ?, @@) but slower to update than B-Tree.
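As a sketch (assuming a products table with a jsonb attributes column, names invented for illustration):

```
-- Index every key/value pair inside the jsonb documents:
CREATE INDEX products_attrs_gin ON products USING gin(attributes);

-- A containment query this index can serve:
SELECT id FROM products WHERE attributes @> '{"color": "red"}';
```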

GiST (Generalized Search Tree) — For geometric data, IP ranges (with ip4r), and full-text with tsvector. Supports nearest-neighbor searches. Slower than GIN for most text search workloads.

BRIN (Block Range Index) — For append-only, naturally ordered data (timestamps, sequential IDs). Stores min/max values per block range, not per row. Tiny size, fast to scan, only useful when data is correlated with its physical storage order.

-- For a time-series table where rows are appended in time order:
CREATE INDEX events_time_brin ON events USING brin(created_at);
-- This index is ~1000x smaller than a B-Tree and still fast for range scans

N+1 Queries: The Silent Killer

The N+1 problem is responsible for more production performance incidents than any other query pattern. It manifests when application code fetches a list of N records, then executes N additional queries to fetch related data:

# This code executes 1 + N queries
users = db.query("SELECT * FROM users LIMIT 100")  # 1 query
for user in users:
    orders = db.query(f"SELECT * FROM orders WHERE user_id = {user.id}")  # N queries (and an injection risk)

The fix is almost always a join or a batch fetch:

-- One query for everything
SELECT u.*, array_agg(o.id) as order_ids
FROM users u
LEFT JOIN orders o ON o.user_id = u.id
GROUP BY u.id
LIMIT 100;

In ORMs, this manifests as missing includes/eager_load calls. Instrument your queries in staging — any endpoint whose query count scales with the number of records it returns has an N+1 problem.
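When you need full related rows rather than an aggregate, the batch-fetch variant of the fix is a single parameterized ANY query. A minimal sketch, assuming a psycopg-style driver with %s placeholders (the function and table names are illustrative):

```python
def batch_fetch_orders(user_ids):
    """Build one parameterized query that replaces N per-user lookups.

    Returns the SQL string and its parameter tuple; execution is left
    to your driver, e.g. cursor.execute(sql, params).
    """
    # ANY(%s) lets a single round trip cover every user in the batch,
    # and parameterization avoids the injection risk of f-string SQL.
    sql = "SELECT * FROM orders WHERE user_id = ANY(%s)"
    return sql, (list(user_ids),)

sql, params = batch_fetch_orders([1, 2, 3])
# One query for three users; group the rows by user_id in application code.
```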

Connection Pooling: The Invisible Bottleneck

PostgreSQL spawns a dedicated backend process for every connection. At high concurrency, the memory and scheduling overhead of maintaining thousands of mostly idle connections degrades performance significantly, before you even start executing queries.

PgBouncer in transaction-pooling mode solves this:

[pgbouncer]
pool_mode = transaction
max_client_conn = 5000
default_pool_size = 25
server_idle_timeout = 600

With transaction pooling, a database connection is only held for the duration of a transaction. 5,000 application clients can share 25 real PostgreSQL connections. Your P99 query time drops significantly, not because the query changed, but because the connection overhead was eliminated.
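To check whether connection churn is actually your bottleneck, inspect connection states directly. One possible query against the pg_stat_activity view:

```
SELECT state, count(*)
FROM pg_stat_activity
GROUP BY state;
-- A large 'idle' count at peak load suggests pooling will help.
```

One caveat: transaction pooling breaks session-level state such as prepared statements, advisory locks, and SET commands, so verify your driver and ORM are compatible before enabling it.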

Conclusion

Query optimization is schema design, index strategy, and query structure working together. No single technique is universally applicable. Master EXPLAIN ANALYZE, understand your access patterns before you build your indexes, and instrument your application to detect N+1 problems before they reach production. The database is the foundation — build it to carry the load you are about to deploy.