Deep EVM #25: PostgreSQL Table Partitioning — When Your Table Hits 10M+ Rows
Engineering Team
When to Partition
You have a transactions table that started small and is now at 34 million rows. Queries that used to take 50ms now take 5 seconds. VACUUM runs for hours and ties up autovacuum workers that other tables need. Index rebuilds take the table offline for minutes. Your database is not slow; your table is too big for a single heap file.
Table partitioning splits a logical table into multiple physical tables (partitions). PostgreSQL’s query planner automatically routes queries to the correct partitions, scanning only the data needed.
Rules of Thumb for Partitioning
- Partition when: Table exceeds 10M rows, or queries consistently scan >20% of the table, or VACUUM cannot keep up with dead tuples
- Do NOT partition when: Table is under 1M rows (overhead exceeds benefit), queries always hit an index (partition pruning adds nothing), write patterns are random (no natural partition key)
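A quick way to check these numbers on your own database is the standard pg_stat_user_tables statistics view (a sketch; thresholds are the rules of thumb above, not hard limits):

```sql
-- Row count, dead tuples, and on-disk size for the biggest tables
SELECT relname,
       n_live_tup,
       n_dead_tup,
       pg_size_pretty(pg_total_relation_size(relid)) AS total_size
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC
LIMIT 10;
```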
Partition Strategies
PostgreSQL supports three native partitioning strategies:
Range Partitioning
Split by value ranges. Ideal for time-series data:
CREATE TABLE transactions (
    id BIGINT GENERATED ALWAYS AS IDENTITY,
    block_number BIGINT NOT NULL,
    tx_hash BYTEA NOT NULL,
    from_addr BYTEA NOT NULL,
    to_addr BYTEA,
    value_wei NUMERIC(78, 0) NOT NULL,
    gas_used BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY RANGE (block_number);
-- Create partitions for every 1M blocks
CREATE TABLE transactions_0_1m
    PARTITION OF transactions
    FOR VALUES FROM (0) TO (1000000);

CREATE TABLE transactions_1m_2m
    PARTITION OF transactions
    FOR VALUES FROM (1000000) TO (2000000);

CREATE TABLE transactions_2m_3m
    PARTITION OF transactions
    FOR VALUES FROM (2000000) TO (3000000);

-- Continue for each range...
Queries that filter on block_number automatically prune irrelevant partitions:
-- Only scans transactions_18m_19m partition
SELECT * FROM transactions
WHERE block_number BETWEEN 18000000 AND 18100000;
-- EXPLAIN shows partition pruning:
-- Append
--   ->  Index Scan on transactions_18m_19m
--         Index Cond: ((block_number >= 18000000) AND (block_number <= 18100000))
List Partitioning
Split by discrete values. Ideal for categorical data:
CREATE TABLE events (
    id BIGINT GENERATED ALWAYS AS IDENTITY,
    chain_id INT NOT NULL,
    contract BYTEA NOT NULL,
    event_name TEXT NOT NULL,
    data JSONB NOT NULL,
    block_number BIGINT NOT NULL,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
) PARTITION BY LIST (chain_id);

CREATE TABLE events_ethereum
    PARTITION OF events FOR VALUES IN (1);
CREATE TABLE events_polygon
    PARTITION OF events FOR VALUES IN (137);
CREATE TABLE events_arbitrum
    PARTITION OF events FOR VALUES IN (42161);
CREATE TABLE events_optimism
    PARTITION OF events FOR VALUES IN (10);
CREATE TABLE events_base
    PARTITION OF events FOR VALUES IN (8453);
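As with range partitioning, a filter on the list key prunes to a single partition (a sketch; the exact plan shape depends on your indexes):

```sql
-- Only scans events_polygon
SELECT count(*) FROM events
WHERE chain_id = 137
  AND block_number > 50000000;
```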
Hash Partitioning
Split by hash of a column. Ensures even distribution when there is no natural range or list:
CREATE TABLE addresses (
    address BYTEA PRIMARY KEY,
    first_seen BIGINT NOT NULL,
    tx_count BIGINT NOT NULL DEFAULT 0,
    balance_wei NUMERIC(78, 0) NOT NULL DEFAULT 0
) PARTITION BY HASH (address);

-- Create 16 partitions
CREATE TABLE addresses_p0 PARTITION OF addresses
    FOR VALUES WITH (MODULUS 16, REMAINDER 0);
CREATE TABLE addresses_p1 PARTITION OF addresses
    FOR VALUES WITH (MODULUS 16, REMAINDER 1);
-- ... through p15
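Spelling out 16 near-identical statements is tedious; a generated loop like this sketch covers p0 through p15 in one pass:

```sql
-- Create all 16 hash partitions in a loop
DO $$
BEGIN
    FOR i IN 0..15 LOOP
        EXECUTE format(
            'CREATE TABLE IF NOT EXISTS addresses_p%s PARTITION OF addresses
             FOR VALUES WITH (MODULUS 16, REMAINDER %s)',
            i, i
        );
    END LOOP;
END $$;
```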
Real Example: 34M Rows to 20 Partitions
Let us walk through partitioning an existing transactions table with 34 million rows.
Step 1: Create the Partitioned Table
-- Create the new partitioned table. Copy column definitions,
-- defaults, identity, and CHECK constraints, but not indexes:
-- those are built in Step 4, after the data load.
CREATE TABLE transactions_partitioned (
    LIKE transactions INCLUDING DEFAULTS INCLUDING IDENTITY INCLUDING CONSTRAINTS
) PARTITION BY RANGE (block_number);

-- Create 20 partitions of ~1.7M rows each
DO $$
DECLARE
    start_block BIGINT;
BEGIN
    FOR i IN 0..19 LOOP
        start_block := i * 1000000;
        EXECUTE format(
            'CREATE TABLE transactions_p%s PARTITION OF transactions_partitioned
             FOR VALUES FROM (%s) TO (%s)',
            i, start_block, start_block + 1000000
        );
    END LOOP;
END $$;

-- Add a default partition for future data
CREATE TABLE transactions_default
    PARTITION OF transactions_partitioned DEFAULT;
Step 2: Migrate Data
-- Copy data in batches to keep each transaction short.
-- COMMIT inside a DO block requires PostgreSQL 11+ and must not be
-- run inside an outer transaction block.
DO $$
DECLARE
    batch_size BIGINT := 100000;
    max_id     BIGINT;
    current_id BIGINT := 0;
BEGIN
    SELECT MAX(id) INTO max_id FROM transactions;
    WHILE current_id < max_id LOOP
        -- OVERRIDING SYSTEM VALUE lets us copy the GENERATED ALWAYS
        -- identity column as-is instead of generating new ids
        INSERT INTO transactions_partitioned OVERRIDING SYSTEM VALUE
        SELECT * FROM transactions
        WHERE id > current_id AND id <= current_id + batch_size;
        current_id := current_id + batch_size;
        RAISE NOTICE 'Migrated up to id %', current_id;
        COMMIT;
    END LOOP;
    -- Resync the identity sequence so new inserts continue after max_id
    PERFORM setval(pg_get_serial_sequence('transactions_partitioned', 'id'), max_id);
END $$;
Step 3: Swap Tables
-- Atomic swap
BEGIN;
ALTER TABLE transactions RENAME TO transactions_old;
ALTER TABLE transactions_partitioned RENAME TO transactions;
COMMIT;
-- Verify, then drop old table
-- DROP TABLE transactions_old;
Step 4: Create Indexes on Partitions
-- Creating an index on the parent table automatically creates
-- identical indexes on all partitions
CREATE INDEX CONCURRENTLY idx_transactions_block
ON transactions (block_number);
CREATE INDEX CONCURRENTLY idx_transactions_from
ON transactions (from_addr, block_number);
CREATE INDEX CONCURRENTLY idx_transactions_to
ON transactions (to_addr, block_number);
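If even a per-partition lock is too disruptive, PostgreSQL's documented workaround is to create an invalid index on the parent with ON ONLY, build each partition's index concurrently, and attach them one by one. A sketch using a hypothetical tx_hash index and a single partition:

```sql
-- 1. Invalid parent index; does not touch the partitions
CREATE INDEX idx_transactions_hash ON ONLY transactions (tx_hash);

-- 2. Build the partition's index without blocking writes
CREATE INDEX CONCURRENTLY transactions_p18_hash_idx
    ON transactions_p18 (tx_hash);

-- 3. Attach it (repeat steps 2-3 for every partition; the parent
--    index becomes valid once all partitions are attached)
ALTER INDEX idx_transactions_hash
    ATTACH PARTITION transactions_p18_hash_idx;
```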
Partition Pruning in EXPLAIN
Verify that the query planner prunes partitions:
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM transactions
WHERE block_number BETWEEN 18000000 AND 18500000;
Good output (only relevant partitions scanned):
Append  (cost=0.43..1234.56 rows=50000 width=120)
  ->  Index Scan using transactions_p18_block_number_idx
        on transactions_p18 (actual time=0.1..12.3 rows=50000)
        Index Cond: ((block_number >= 18000000) AND (block_number <= 18500000))
        Buffers: shared hit=423
Planning Time: 0.5 ms
Execution Time: 15.2 ms
Bad output (all partitions scanned):
Append  (cost=0.00..999999.99 rows=34000000 width=120)
  ->  Seq Scan on transactions_p0 ...
  ->  Seq Scan on transactions_p1 ...
  ->  Seq Scan on transactions_p2 ...
  ... (all 20 partitions)
If you see all partitions being scanned, the WHERE clause does not match the partition key. Fix the query or add the partition key to the filter.
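A common way to defeat pruning is to bury the partition key inside an expression or a cast; the planner can only prune when the bare key is compared to constants or parameters. A sketch:

```sql
-- Prunes: bare partition key compared to a constant
SELECT * FROM transactions WHERE block_number >= 18000000;

-- Does NOT prune: the key is hidden inside an expression,
-- so every partition is scanned
SELECT * FROM transactions WHERE block_number / 1000000 = 18;
```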
Automating Partition Creation
For time-series or block-number-based partitions, automate creation with a cron job or PostgreSQL function:
CREATE OR REPLACE FUNCTION create_next_partition()
RETURNS void AS $$
DECLARE
    max_block      BIGINT;
    next_start     BIGINT;
    next_end       BIGINT;
    partition_name TEXT;
BEGIN
    SELECT MAX(block_number) INTO max_block FROM transactions;
    next_start := (max_block / 1000000 + 1) * 1000000;
    next_end   := next_start + 1000000;
    partition_name := format('transactions_p%s', next_start / 1000000);
    -- NOTE: this fails if the default partition already holds rows in
    -- [next_start, next_end); move those rows out first
    EXECUTE format(
        'CREATE TABLE IF NOT EXISTS %I PARTITION OF transactions
         FOR VALUES FROM (%s) TO (%s)',
        partition_name, next_start, next_end
    );
    RAISE NOTICE 'Created partition % for blocks % to %',
        partition_name, next_start, next_end;
END;
$$ LANGUAGE plpgsql;
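Calling the function on a schedule needs some external trigger. Assuming the pg_cron extension is installed (an assumption; system cron plus psql works just as well), something like:

```sql
-- Check for a needed partition at the top of every hour
-- (requires the pg_cron extension)
SELECT cron.schedule('create-next-partition', '0 * * * *',
                     'SELECT create_next_partition()');
```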
Performance Comparison
| Query | Unpartitioned (34M) | Partitioned (20 x 1.7M) | Speedup |
|---|---|---|---|
| Point lookup by block | 230ms | 12ms | 19x |
| Range scan (500K blocks) | 4.8s | 180ms | 27x |
| COUNT(*) full table | 45s | 45s | 1x |
| VACUUM | 2.1 hours | 6.3 min/partition | Parallel |
| Index rebuild | 12 min (locks table) | 36s/partition | No lock |
Partitioning dramatically improves queries that filter on the partition key. Full-table scans see no improvement (all partitions are scanned). The biggest operational win is VACUUM and index maintenance, which can now run on individual partitions without affecting the others.
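The maintenance win comes from targeting individual partitions directly, for example:

```sql
-- Vacuum and rebuild indexes on a single hot partition;
-- the other 19 partitions stay untouched
-- (REINDEX ... CONCURRENTLY requires PostgreSQL 12+)
VACUUM (ANALYZE) transactions_p18;
REINDEX TABLE CONCURRENTLY transactions_p18;
```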
Common Pitfalls
1. Missing Partition Key in WHERE Clause
If your query does not filter on the partition key, PostgreSQL scans all partitions. Always include the partition key in WHERE clauses.
2. Too Many Partitions
Each partition has overhead (file descriptors, planner time). More than 100 partitions can slow down query planning. Aim for partitions of 1-10M rows each.
3. Forgetting the Default Partition
Without a default partition, an insert whose partition key falls outside every defined range fails with an error. Create a default partition as a safety net, but keep it empty: rows sitting in the default partition will block the creation of any new partition whose range overlaps them.
4. Cross-Partition Foreign Keys
Before PostgreSQL 12, partitioned tables could not be the target of a foreign key at all. PostgreSQL 12 lifted that restriction, but the referenced unique constraint must include the partition key; schemas that cannot satisfy this still fall back to application-level referential integrity.
Conclusion
PostgreSQL table partitioning is a powerful tool for managing large tables. Range partitioning is ideal for time-series and block-based data, list partitioning for categorical splits, and hash partitioning for even distribution. Start partitioning when your table exceeds 10M rows and queries consistently scan large portions. The key to success: choose a partition key that matches your most common query patterns, keep partition counts reasonable (10-50), and always verify partition pruning with EXPLAIN ANALYZE.