Speeding Up and Monitoring Index Creation in PostgreSQL May 26, 2025 • Fahad Siddiqui Creating indexes on large datasets in PostgreSQL can feel like watching paint dry—especially when you’re dealing with millions of rows. But with the right system settings and live monitoring, you can speed things up and see what’s happening under the hood. Here’s how. 🛠️ Problem: Index Creation Is Slow When working with a large table that has 100M+ rows (in our case it’s all_contributions just for the sake of example), a simple command like: CREATE INDEX idx_recipient_state ON all_contributions (recipient_state); can take several minutes—or worse, timeout if PostgreSQL’s default settings aren’t tuned. Step 1: Tune PostgreSQL for Faster Indexing Before creating indexes, tweak some Postgres configuration variables for better performance. a. Increase maintenance_work_mem SET maintenance_work_mem = '8GB'; This gives PostgreSQL more RAM to use during index creation. By default, this is often set to a few MBs, which is not enough for large tables. b. Boost Parallel Workers SET max_parallel_maintenance_workers = 8; This tells PostgreSQL to use up to 8 CPU cores for index creation. On machines with multiple cores, this can significantly cut down build time. 💡 Note: These are session-level settings, so you’ll need to run them each time or include them in your script. Step 2: Monitor Index Progress in Real Time PostgreSQL provides a system view called pg_stat_progress_create_index that shows you what’s happening behind the scenes during index creation. Use this query to monitor index creation: SELECT age(clock_timestamp(), a.query_start) AS duration, a.query, p.phase, round(p.blocks_done / NULLIF(p.blocks_total::numeric, 0) * 100, 2) AS "% done", p.blocks_total, p.blocks_done, round(p.tuples_done / NULLIF(p.tuples_total::numeric, 0) * 100, 2) AS "% tuples done", p.tuples_total, p.tuples_done FROM pg_stat_progress_create_index p JOIN pg_stat_activity a ON p.pid = a.pid LEFT JOIN pg_stat_all_indexes ai ON ai.relid = p.relid AND ai.indexrelid = p.index_relid; This gives you a live view of progress, including which phase it’s in (scanning, sorting, building), and how far along it is. Wrap-Up By combining smart configuration settings with real-time insight, you can: Speed up the creation of large indexes Avoid unnecessary timeouts Track progress live and make better decisions Indexing doesn’t have to be a black box anymore. Need help debugging slow queries or managing schema changes at scale? Reach out to the team at Datum Brain. Please enable JavaScript to view the comments powered by Disqus.