PostgreSQL Interview Questions
Core Concepts
Section titled “Core Concepts”What makes PostgreSQL different from MySQL?
| Feature | PostgreSQL | MySQL |
|---|---|---|
| ACID compliance | Full | InnoDB only |
| JSON support | JSONB (binary, indexed) | JSON (text) |
| Window functions | Full support | 8.0+ |
| CTEs | Yes (including writable) | 8.0+ |
| Partial indexes | Yes | No |
| Table inheritance | Yes | No |
| Custom types | Yes | Limited |
PostgreSQL is more standards-compliant and feature-rich; MySQL is simpler and historically faster for simple read-heavy workloads.
What is MVCC?
Multi-Version Concurrency Control — PostgreSQL keeps multiple versions of each row so readers never block writers and writers never block readers.
- Each transaction sees a snapshot of the database at a consistent point in time
- Old row versions are cleaned up by
VACUUM - Avoids dirty reads without locking
What is the difference between TRUNCATE and DELETE?
| DELETE | TRUNCATE | |
|---|---|---|
| Speed | Slower (row by row) | Fast (deallocates pages) |
| WHERE clause | Yes | No |
| Triggers | Fires per row | Fires once (statement-level) |
| Rollback | Yes | Yes (PostgreSQL only) |
| Resets sequences | No | Yes (with RESTART IDENTITY) |
Indexes
Section titled “Indexes”What index types does PostgreSQL support?
-- B-tree (default) — equality and range queriesCREATE INDEX idx_orders_date ON orders(created_at);
-- Hash — equality only, faster than B-tree for =CREATE INDEX idx_users_email ON users USING HASH (email);
-- GIN — full-text search, JSONB, arraysCREATE INDEX idx_docs_body ON documents USING GIN (to_tsvector('english', body));CREATE INDEX idx_tags ON posts USING GIN (tags);
-- GiST — geometric types, full-text, nearest-neighborCREATE INDEX idx_locations ON places USING GIST (coordinates);
-- BRIN — very large, naturally ordered tables (e.g., time-series)CREATE INDEX idx_logs_time ON logs USING BRIN (created_at);What is a partial index?
An index on a subset of rows — smaller, faster, only covers the rows you actually query:
-- Index only active users (not the millions of inactive ones)CREATE INDEX idx_active_users ON users(email)WHERE status = 'active';
-- Query uses the index automatically when WHERE matchesSELECT * FROM users WHERE email = 'user@example.com' AND status = 'active';What is an index on expression?
-- Case-insensitive email lookupsCREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';CTEs and Window Functions
Section titled “CTEs and Window Functions”What is a CTE and when would you use one?
A Common Table Expression (WITH clause) is a named temporary result set:
-- Basic CTEWITH monthly_sales AS ( SELECT DATE_TRUNC('month', created_at) AS month, SUM(amount) AS total FROM orders GROUP BY 1)SELECT month, total, total - LAG(total) OVER (ORDER BY month) AS changeFROM monthly_salesORDER BY month;
-- Recursive CTE — for hierarchical dataWITH RECURSIVE category_tree AS ( -- Base case: root categories SELECT id, name, parent_id, 0 AS depth FROM categories WHERE parent_id IS NULL
UNION ALL
-- Recursive step SELECT c.id, c.name, c.parent_id, ct.depth + 1 FROM categories c JOIN category_tree ct ON c.parent_id = ct.id)SELECT * FROM category_tree ORDER BY depth, name;What are window functions?
Window functions compute values across a set of rows related to the current row, without collapsing them into groups:
SELECT employee_id, department, salary,
-- Rank within department RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
-- Running total SUM(salary) OVER (PARTITION BY department ORDER BY salary) AS running_total,
-- Compare to department average salary - AVG(salary) OVER (PARTITION BY department) AS vs_avg,
-- Previous row's value LAG(salary) OVER (ORDER BY salary) AS prev_salary,
-- Row number (no gaps in sequence) ROW_NUMBER() OVER (ORDER BY salary DESC) AS overall_rank
FROM employees;Transactions and Locking
Section titled “Transactions and Locking”What isolation levels does PostgreSQL support?
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;| Level | Dirty Read | Non-Repeatable Read | Phantom Read |
|---|---|---|---|
| Read Uncommitted | Not possible* | Yes | Yes |
| Read Committed (default) | No | Yes | Yes |
| Repeatable Read | No | No | No** |
| Serializable | No | No | No |
*PostgreSQL never allows dirty reads even at READ UNCOMMITTED
**PostgreSQL’s MVCC prevents phantom reads at REPEATABLE READ
What is SELECT FOR UPDATE?
Locks selected rows for the duration of the transaction, preventing other transactions from modifying them:
BEGIN;SELECT * FROM inventory WHERE product_id = 42 FOR UPDATE;-- row is now lockedUPDATE inventory SET quantity = quantity - 1 WHERE product_id = 42;COMMIT;
-- SKIP LOCKED — skip rows that are already locked (queue processing pattern)SELECT * FROM jobs WHERE status = 'pending'ORDER BY created_atLIMIT 1FOR UPDATE SKIP LOCKED;Performance
Section titled “Performance”How do you analyze a slow query?
-- See execution planEXPLAIN SELECT * FROM orders WHERE customer_id = 5;
-- See actual execution time and row countsEXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 5;
-- Full verbose outputEXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT ...;Key things to check:
Seq Scanon large tables — suggests missing indexRowsestimate vs actual — large gaps indicate stale statistics; runANALYZENested Loopwith large tables — might need aHash JoinSortstep — consider an index matching the ORDER BY
What is VACUUM and why does it matter?
PostgreSQL’s MVCC leaves dead row versions after updates/deletes. VACUUM reclaims that space:
-- Manual vacuumVACUUM orders;
-- Vacuum and update statisticsVACUUM ANALYZE orders;
-- Full vacuum — rewrites table, reclaims disk space (locks table)VACUUM FULL orders;
-- Check table bloatSELECT relname, n_dead_tup, n_live_tup, ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pctFROM pg_stat_user_tablesORDER BY n_dead_tup DESC;autovacuum handles this automatically. Tune it for high-traffic tables:
ALTER TABLE orders SET ( autovacuum_vacuum_scale_factor = 0.01, autovacuum_analyze_scale_factor = 0.005);Quick Reference
Section titled “Quick Reference”-- Check running queriesSELECT pid, now() - pg_stat_activity.query_start AS duration, queryFROM pg_stat_activityWHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';
-- Kill a slow querySELECT pg_terminate_backend(pid);
-- Table and index sizesSELECT relname AS table, pg_size_pretty(pg_total_relation_size(relid)) AS total_size, pg_size_pretty(pg_indexes_size(relid)) AS index_sizeFROM pg_catalog.pg_statio_user_tablesORDER BY pg_total_relation_size(relid) DESC;
-- Missing indexes (high sequential scans)SELECT relname, seq_scan, idx_scanFROM pg_stat_user_tablesWHERE seq_scan > idx_scanORDER BY seq_scan DESC;