Advanced PostgreSQL Tricks for Production Systems
By Oleksandr Andrushchenko — Published on
PostgreSQL is easy to start with, but production PostgreSQL has many behaviors that developers usually learn only after slow queries, lock contention, failed migrations, or painful bulk imports.
This article covers practical PostgreSQL tricks around indexes, locks, transactions, bulk inserts, query performance, and production-safe operations.
Table of Contents
- Indexes
- Locks and Concurrency
- Transactions and Isolation
- Query Performance
- Bulk Operations
- PostgreSQL Features Developers Ignore
- Real Production Examples
- Production Checklist
- Conclusion
Indexes
Indexes Are Not Free
Indexes speed up reads, but they slow down writes. Every time PostgreSQL inserts, updates, or deletes a row, it may also need to update every related index. This means indexes improve query performance but increase write cost, storage usage, and maintenance overhead.
For example, an index on users.email makes login lookup fast. But if a table has 12 indexes, every insert into that table may need to update 12 separate index structures. On write-heavy tables, unnecessary indexes can become a serious bottleneck.
CREATE INDEX idx_users_email ON users (email);
| Index Benefit | Index Cost |
|---|---|
| Faster reads | Slower inserts, updates, and deletes |
| Faster sorting and filtering | More disk usage |
| Better join performance | More vacuum and maintenance work |
| Can support uniqueness | Can be unused or redundant |
Rule of thumb: do not create indexes because a column “looks important.” Create indexes for real query patterns.
Drop Indexes Before Massive Inserts
For very large imports, maintaining indexes during every insert can be slower than loading data first and rebuilding indexes afterward. This is especially relevant for staging tables, historical imports, analytics tables, and one-time migrations.
DROP INDEX IF EXISTS idx_events_created_at;
COPY events (account_id, event_type, created_at, payload)
FROM '/tmp/events.csv'
WITH (FORMAT csv, HEADER true);
CREATE INDEX idx_events_created_at ON events (created_at);
This does not mean you should drop indexes before every insert. It is useful when loading large amounts of data into a controlled table where you can tolerate the rebuild time.
| Situation | Better Choice |
|---|---|
| Normal application writes | Keep indexes |
| Small batch insert | Keep indexes |
| Large one-time historical import | Consider dropping and rebuilding non-critical indexes |
| Production table receiving live writes | Be very careful; prefer safer migration strategy |
Partial Indexes
A partial index indexes only rows that match a condition. This is extremely useful when most queries filter on the same condition.
For example, if most orders are completed but your application frequently queries only pending orders, indexing the entire table may be wasteful.
CREATE INDEX idx_orders_pending
ON orders (created_at)
WHERE status = 'pending';
Now PostgreSQL can use a much smaller index for this query:
SELECT *
FROM orders
WHERE status = 'pending'
ORDER BY created_at
LIMIT 50;
Key benefit: partial indexes are smaller, cheaper to maintain, and faster to scan when the query matches the predicate.
Covering Indexes with INCLUDE
A covering index can satisfy a query using only the index without reading the full table row. In PostgreSQL, you can use INCLUDE columns to store extra values in the index that are not part of the search key.
CREATE INDEX idx_users_email_include_name
ON users (email)
INCLUDE (id, name);
This can help queries like:
SELECT id, name
FROM users
WHERE email = 'alex@example.com';
The indexed search key is email, but id and name are available from the index. This can reduce table lookups for read-heavy queries.
CREATE INDEX CONCURRENTLY
Creating an index on a large production table can block writes if done with regular CREATE INDEX. PostgreSQL provides CREATE INDEX CONCURRENTLY to build an index while allowing reads and writes to continue.
CREATE INDEX CONCURRENTLY idx_users_created_at
ON users (created_at);
Important: CREATE INDEX CONCURRENTLY is usually safer for production tables, but it takes longer and cannot run inside a normal transaction block.
-- This will fail:
BEGIN;
CREATE INDEX CONCURRENTLY idx_users_email
ON users (email);
COMMIT;
| Command | Production Impact | Use Case |
|---|---|---|
CREATE INDEX |
Faster but can block writes | Small tables, maintenance windows |
CREATE INDEX CONCURRENTLY |
Slower but allows writes | Large production tables |
GIN Indexes for JSONB
PostgreSQL supports JSONB, but querying JSONB without the right index can become slow. For containment queries, a GIN index is often useful.
CREATE TABLE events (
id bigserial PRIMARY KEY,
payload jsonb NOT NULL
);
CREATE INDEX idx_events_payload_gin
ON events
USING gin (payload);
Example query:
SELECT *
FROM events
WHERE payload @> '{"type": "signup"}';
Trade-off: JSONB gives flexibility, but relational columns are still better when the field is frequently filtered, joined, validated, or indexed.
Locks and Concurrency
Row Locks
PostgreSQL uses locks to protect data consistency. A row lock prevents conflicting updates to the same row. Most developers do not notice row locks until requests start waiting in production.
BEGIN;
UPDATE accounts
SET balance = balance - 100
WHERE id = 1;
-- Row is locked until COMMIT or ROLLBACK.
COMMIT;
While this transaction is open, another transaction trying to update the same account row may wait.
SELECT FOR UPDATE
SELECT FOR UPDATE reads rows and locks them for update. This is useful when you need to read something, make a decision, and then update it safely.
BEGIN;
SELECT *
FROM accounts
WHERE id = 1
FOR UPDATE;
UPDATE accounts
SET balance = balance - 100
WHERE id = 1;
COMMIT;
This prevents two transactions from reading the same row and making conflicting updates based on stale information.
FOR UPDATE SKIP LOCKED
FOR UPDATE SKIP LOCKED is one of the most useful PostgreSQL tricks for building job queues. It lets workers skip rows already locked by other workers and pick available work.
BEGIN;
SELECT id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 10
FOR UPDATE SKIP LOCKED;
UPDATE jobs
SET status = 'processing'
WHERE id IN (1, 2, 3);
COMMIT;
This allows multiple workers to process jobs without fighting over the same rows.
Worker 1 -> locks jobs 1-10
Worker 2 -> skips locked jobs and gets 11-20
Worker 3 -> skips locked jobs and gets 21-30
Important: this works well for simple queues, but for complex distributed job systems, dedicated queues like SQS, RabbitMQ, or Kafka may be better.
Deadlocks
A deadlock happens when two transactions wait for each other forever. PostgreSQL detects deadlocks and aborts one transaction.
Transaction A locks row 1
Transaction B locks row 2
Transaction A tries to lock row 2
Transaction B tries to lock row 1
Deadlock.
A common solution is to lock rows in a consistent order.
-- Good: always lock accounts by ascending id.
SELECT *
FROM accounts
WHERE id IN (1, 2)
ORDER BY id
FOR UPDATE;
Long Transactions Kill Performance
Long transactions are dangerous in PostgreSQL. They keep locks open, delay cleanup, hold old row versions visible, and can cause vacuum problems.
For example, opening a transaction, running a query, then doing slow network work before committing is a common production mistake.
BEGIN;
UPDATE orders
SET status = 'processing'
WHERE id = 123;
-- Bad idea:
-- application waits for external payment API here
COMMIT;
Rule of thumb: keep transactions short. Do database work, commit, then call slow external systems when possible.
Transactions and Isolation
READ COMMITTED
READ COMMITTED is the default isolation level in PostgreSQL. Each statement sees data committed before that statement begins.
BEGIN ISOLATION LEVEL READ COMMITTED;
SELECT balance FROM accounts WHERE id = 1;
-- Another transaction commits a change here.
SELECT balance FROM accounts WHERE id = 1;
COMMIT;
The second query may see different data than the first query. This is often fine for normal application behavior.
REPEATABLE READ
REPEATABLE READ gives the transaction a stable snapshot. If you read the same row twice, you see the same version inside the transaction.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;
-- Another transaction commits a change here.
SELECT balance FROM accounts WHERE id = 1;
COMMIT;
The second query sees the same snapshot as the first query. This is useful for reports or workflows where consistent reads matter.
SERIALIZABLE
SERIALIZABLE is the strictest isolation level. PostgreSQL tries to make concurrent transactions behave as if they ran one at a time.
BEGIN ISOLATION LEVEL SERIALIZABLE;
-- Critical transaction logic here.
COMMIT;
This can prevent subtle concurrency bugs, but it may produce serialization failures. The application must be ready to retry transactions.
Phantom Reads
A phantom read happens when a transaction repeats a query and sees new rows inserted by another transaction. Isolation level determines whether this can happen.
| Isolation Level | Behavior | Trade-off |
|---|---|---|
| READ COMMITTED | Each statement sees latest committed data | Simple and common |
| REPEATABLE READ | Transaction sees stable snapshot | More consistency, more conflict risk |
| SERIALIZABLE | Strongest isolation | May require retries |
Query Performance
EXPLAIN ANALYZE
EXPLAIN shows the query plan. EXPLAIN ANALYZE actually runs the query and shows real execution timing. This is one of the most important PostgreSQL tools for performance debugging.
EXPLAIN ANALYZE
SELECT *
FROM orders
WHERE account_id = 123
ORDER BY created_at DESC
LIMIT 50;
Look for sequential scans on large tables, wrong row estimates, expensive sorts, nested loops over many rows, and missing indexes.
Avoid SELECT *
SELECT * is convenient but often wasteful. It returns columns the application may not need, increases network transfer, prevents some index-only scans, and makes API behavior more fragile when columns are added.
-- Avoid:
SELECT *
FROM users
WHERE id = 123;
-- Better:
SELECT id, email, name
FROM users
WHERE id = 123;
OFFSET Pagination Is Slow
OFFSET pagination becomes slower as the page number grows because PostgreSQL still has to walk through skipped rows.
SELECT id, title, created_at
FROM posts
ORDER BY created_at DESC
LIMIT 20 OFFSET 100000;
This query may skip 100,000 rows before returning 20. It gets worse as users go deeper into pagination.
Keyset Pagination
Keyset pagination uses the last seen value instead of OFFSET. It is usually much faster for infinite scroll, feeds, timelines, and large ordered lists.
SELECT id, title, created_at
FROM posts
WHERE created_at < '2026-06-01 10:00:00'
ORDER BY created_at DESC
LIMIT 20;
With a matching index, PostgreSQL can jump directly to the next page range.
CREATE INDEX idx_posts_created_at
ON posts (created_at DESC);
N+1 Queries
The N+1 query problem happens when the application runs one query to fetch parent rows, then one additional query for each parent row.
1 query: get 100 posts
100 queries: get author for each post
Total: 101 queries
Use joins, batching, or preloading instead.
SELECT posts.id, posts.title, users.name AS author_name
FROM posts
JOIN users ON users.id = posts.user_id
WHERE posts.created_at > now() - interval '7 days';
COUNT(*) Can Be Expensive
COUNT(*) on a large filtered table can be expensive because PostgreSQL may need to scan many rows to produce an exact count.
SELECT COUNT(*)
FROM events
WHERE created_at > now() - interval '30 days';
For user interfaces, consider whether you really need an exact count. Approximate counts, cached counts, materialized views, or pagination without total count may be better.
Bulk Operations
COPY Is Faster Than INSERT
For large imports, PostgreSQL COPY is usually much faster than many individual INSERT statements.
COPY users (id, email, name, created_at)
FROM '/tmp/users.csv'
WITH (FORMAT csv, HEADER true);
Use COPY for data imports, backfills, analytics loads, and staging-table workflows.
Batch Inserts
If COPY is not practical, batch inserts are usually better than inserting one row at a time.
INSERT INTO users (email, name)
VALUES
('a@example.com', 'Alice'),
('b@example.com', 'Bob'),
('c@example.com', 'Charlie');
Batching reduces round trips and transaction overhead.
INSERT ... ON CONFLICT
PostgreSQL supports upserts with INSERT ... ON CONFLICT. This is useful when inserting data that may already exist.
INSERT INTO users (email, name)
VALUES ('alex@example.com', 'Alex')
ON CONFLICT (email)
DO UPDATE SET name = EXCLUDED.name;
This is cleaner and safer than manually checking whether the row exists before inserting.
Staging Tables
A staging table is a temporary or intermediate table used to load, validate, deduplicate, and transform data before moving it into production tables.
CREATE TEMP TABLE staging_users (
email text,
name text
);
COPY staging_users (email, name)
FROM '/tmp/users.csv'
WITH (FORMAT csv, HEADER true);
INSERT INTO users (email, name)
SELECT DISTINCT email, name
FROM staging_users
ON CONFLICT (email)
DO UPDATE SET name = EXCLUDED.name;
This pattern is much safer than importing unvalidated data directly into production tables.
PostgreSQL Features Developers Ignore
Materialized Views
A materialized view stores the result of a query physically. It is useful for expensive reports, dashboard summaries, and repeated aggregations.
CREATE MATERIALIZED VIEW daily_sales AS
SELECT
date_trunc('day', created_at) AS day,
sum(total_amount) AS revenue
FROM orders
GROUP BY 1;
Refresh it when needed:
REFRESH MATERIALIZED VIEW daily_sales;
Trade-off: materialized views are fast to query but can become stale until refreshed.
JSONB
JSONB is useful when data is semi-structured or changes frequently. It works well for metadata, event payloads, feature configuration, and integrations.
CREATE TABLE webhooks (
id bigserial PRIMARY KEY,
provider text NOT NULL,
payload jsonb NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
But do not use JSONB as a replacement for every relational table. If a field is important for filtering, joining, validation, or reporting, a normal column is usually better.
Table Partitioning
Partitioning splits a large table into smaller physical parts while keeping one logical table. It is useful for large time-series tables, logs, events, and multi-tenant data.
CREATE TABLE events (
id bigserial,
created_at timestamptz NOT NULL,
payload jsonb NOT NULL
) PARTITION BY RANGE (created_at);
Example monthly partition:
CREATE TABLE events_2026_06
PARTITION OF events
FOR VALUES FROM ('2026-06-01') TO ('2026-07-01');
Partitioning helps with pruning, retention, archiving, and maintenance. It also adds operational complexity, so do not use it too early.
Generated Columns
Generated columns store values computed from other columns. They are useful when a derived value is frequently queried or indexed.
CREATE TABLE users (
id bigserial PRIMARY KEY,
email text NOT NULL,
email_lower text GENERATED ALWAYS AS (lower(email)) STORED
);
CREATE INDEX idx_users_email_lower
ON users (email_lower);
This can make queries simpler and more consistent.
Advisory Locks
Advisory locks are application-controlled locks. PostgreSQL does not attach them to a specific row automatically. They are useful for distributed coordination, scheduled jobs, and preventing duplicate work.
SELECT pg_try_advisory_lock(12345);
If it returns true, the process acquired the lock. If false, another process already holds it.
SELECT pg_advisory_unlock(12345);
Use carefully: advisory locks are powerful, but the application must manage lock IDs and release behavior correctly.
Real Production Examples
Building a Job Queue with SKIP LOCKED
For a simple PostgreSQL-backed queue, use FOR UPDATE SKIP LOCKED so multiple workers do not process the same job.
WITH picked AS (
SELECT id
FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 10
FOR UPDATE SKIP LOCKED
)
UPDATE jobs
SET status = 'processing'
WHERE id IN (SELECT id FROM picked)
RETURNING *;
This is practical for moderate workloads. For very large queues, dedicated message systems may be better.
Scaling an Event Log Table
Event tables grow quickly. A common design is to partition by time, index only the most important access patterns, and archive old partitions.
CREATE INDEX idx_events_account_created_at
ON events (account_id, created_at DESC);
For analytics-heavy workloads, PostgreSQL may eventually need help from specialized analytical databases, but good partitioning and indexes can go a long way.
Handling Billions of Rows
For very large reporting tables, avoid running expensive aggregations on every request. Use precomputed tables, materialized views, partitioning, and scheduled refreshes.
CREATE MATERIALIZED VIEW monthly_account_usage AS
SELECT
account_id,
date_trunc('month', created_at) AS month,
count(*) AS events_count
FROM events
GROUP BY account_id, date_trunc('month', created_at);
This turns repeated expensive aggregation into a cheaper read.
Production Checklist
- Use EXPLAIN ANALYZE before guessing. Measure the real query plan.
- Create indexes for query patterns, not for every column.
- Use CREATE INDEX CONCURRENTLY on large production tables.
- Use partial indexes for frequently filtered subsets.
- Use covering indexes when queries need only a few selected columns.
- Keep transactions short. Avoid external API calls inside open transactions.
- Lock rows in consistent order. This reduces deadlock risk.
- Use SKIP LOCKED for simple PostgreSQL-backed workers.
- Prefer COPY for large imports.
- Use staging tables for validation and deduplication.
- Avoid deep OFFSET pagination. Prefer keyset pagination for large lists.
- Be careful with COUNT(*) on large tables. Cache, approximate, or precompute when exact counts are not required.
- Use JSONB for flexible data, not for everything.
- Partition only when there is a real data-size, retention, or query-pruning need.
Conclusion
Advanced PostgreSQL performance is usually not about one magic trick. It is about understanding indexes, locks, transactions, query plans, and write patterns together.
The biggest production wins often come from simple but disciplined practices: use the right index, keep transactions short, avoid unnecessary writes, batch large operations, inspect query plans, and choose the right pagination and locking strategy.
Key takeaway: PostgreSQL is powerful, but it rewards engineers who understand its trade-offs. Every index, lock, transaction, and query plan has a cost. Production-ready PostgreSQL design means choosing those costs intentionally.
Comments (0)