Skip to content

PostgreSQL Interview Questions

What makes PostgreSQL different from MySQL?

FeaturePostgreSQLMySQL
ACID complianceFullInnoDB only
JSON supportJSONB (binary, indexed)JSON (text)
Window functionsFull support8.0+
CTEsYes (including writable)8.0+
Partial indexesYesNo
Table inheritanceYesNo
Custom typesYesLimited

PostgreSQL is more standards-compliant and feature-rich; MySQL is simpler and historically faster for simple read-heavy workloads.


What is MVCC?

Multi-Version Concurrency Control — PostgreSQL keeps multiple versions of each row so readers never block writers and writers never block readers.

  • Each transaction sees a snapshot of the database at a consistent point in time
  • Old row versions are cleaned up by VACUUM
  • Avoids dirty reads without locking

What is the difference between TRUNCATE and DELETE?

DELETETRUNCATE
SpeedSlower (row by row)Fast (deallocates pages)
WHERE clauseYesNo
TriggersFires per rowFires once (statement-level)
RollbackYesYes (PostgreSQL only)
Resets sequencesNoYes (with RESTART IDENTITY)

What index types does PostgreSQL support?

-- B-tree (default) — equality and range queries
CREATE INDEX idx_orders_date ON orders(created_at);
-- Hash — equality only, faster than B-tree for =
CREATE INDEX idx_users_email ON users USING HASH (email);
-- GIN — full-text search, JSONB, arrays
CREATE INDEX idx_docs_body ON documents USING GIN (to_tsvector('english', body));
CREATE INDEX idx_tags ON posts USING GIN (tags);
-- GiST — geometric types, full-text, nearest-neighbor
CREATE INDEX idx_locations ON places USING GIST (coordinates);
-- BRIN — very large, naturally ordered tables (e.g., time-series)
CREATE INDEX idx_logs_time ON logs USING BRIN (created_at);

What is a partial index?

An index on a subset of rows — smaller, faster, only covers the rows you actually query:

-- Index only active users (not the millions of inactive ones)
CREATE INDEX idx_active_users ON users(email)
WHERE status = 'active';
-- Query uses the index automatically when WHERE matches
SELECT * FROM users WHERE email = 'user@example.com' AND status = 'active';

What is an index on expression?

-- Case-insensitive email lookups
CREATE INDEX idx_users_email_lower ON users(LOWER(email));
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

What is a CTE and when would you use one?

A Common Table Expression (WITH clause) is a named temporary result set:

-- Basic CTE
WITH monthly_sales AS (
SELECT DATE_TRUNC('month', created_at) AS month,
SUM(amount) AS total
FROM orders
GROUP BY 1
)
SELECT month, total,
total - LAG(total) OVER (ORDER BY month) AS change
FROM monthly_sales
ORDER BY month;
-- Recursive CTE — for hierarchical data
WITH RECURSIVE category_tree AS (
-- Base case: root categories
SELECT id, name, parent_id, 0 AS depth
FROM categories
WHERE parent_id IS NULL
UNION ALL
-- Recursive step
SELECT c.id, c.name, c.parent_id, ct.depth + 1
FROM categories c
JOIN category_tree ct ON c.parent_id = ct.id
)
SELECT * FROM category_tree ORDER BY depth, name;

What are window functions?

Window functions compute values across a set of rows related to the current row, without collapsing them into groups:

SELECT
employee_id,
department,
salary,
-- Rank within department
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_rank,
-- Running total
SUM(salary) OVER (PARTITION BY department ORDER BY salary) AS running_total,
-- Compare to department average
salary - AVG(salary) OVER (PARTITION BY department) AS vs_avg,
-- Previous row's value
LAG(salary) OVER (ORDER BY salary) AS prev_salary,
-- Row number (no gaps in sequence)
ROW_NUMBER() OVER (ORDER BY salary DESC) AS overall_rank
FROM employees;

What isolation levels does PostgreSQL support?

BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
LevelDirty ReadNon-Repeatable ReadPhantom Read
Read UncommittedNot possible*YesYes
Read Committed (default)NoYesYes
Repeatable ReadNoNoNo**
SerializableNoNoNo

*PostgreSQL never allows dirty reads even at READ UNCOMMITTED
**PostgreSQL’s MVCC prevents phantom reads at REPEATABLE READ


What is SELECT FOR UPDATE?

Locks selected rows for the duration of the transaction, preventing other transactions from modifying them:

BEGIN;
SELECT * FROM inventory WHERE product_id = 42 FOR UPDATE;
-- row is now locked
UPDATE inventory SET quantity = quantity - 1 WHERE product_id = 42;
COMMIT;
-- SKIP LOCKED — skip rows that are already locked (queue processing pattern)
SELECT * FROM jobs WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;

How do you analyze a slow query?

-- See execution plan
EXPLAIN SELECT * FROM orders WHERE customer_id = 5;
-- See actual execution time and row counts
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 5;
-- Full verbose output
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT) SELECT ...;

Key things to check:

  • Seq Scan on large tables — suggests missing index
  • Rows estimate vs actual — large gaps indicate stale statistics; run ANALYZE
  • Nested Loop with large tables — might need a Hash Join
  • Sort step — consider an index matching the ORDER BY

What is VACUUM and why does it matter?

PostgreSQL’s MVCC leaves dead row versions after updates/deletes. VACUUM reclaims that space:

-- Manual vacuum
VACUUM orders;
-- Vacuum and update statistics
VACUUM ANALYZE orders;
-- Full vacuum — rewrites table, reclaims disk space (locks table)
VACUUM FULL orders;
-- Check table bloat
SELECT relname, n_dead_tup, n_live_tup,
ROUND(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 1) AS dead_pct
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

autovacuum handles this automatically. Tune it for high-traffic tables:

ALTER TABLE orders SET (
autovacuum_vacuum_scale_factor = 0.01,
autovacuum_analyze_scale_factor = 0.005
);

-- Check running queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';
-- Kill a slow query
SELECT pg_terminate_backend(pid);
-- Table and index sizes
SELECT relname AS table,
pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
pg_size_pretty(pg_indexes_size(relid)) AS index_size
FROM pg_catalog.pg_statio_user_tables
ORDER BY pg_total_relation_size(relid) DESC;
-- Missing indexes (high sequential scans)
SELECT relname, seq_scan, idx_scan
FROM pg_stat_user_tables
WHERE seq_scan > idx_scan
ORDER BY seq_scan DESC;