Deep EVM #26: Sharding vs Partitioning — Architecture for Massive Tables
Engineering Team
Partitioning vs Sharding: The Fundamental Difference
Both partitioning and sharding split data into smaller pieces, but they operate at different levels:
- Partitioning splits a table into multiple physical tables on the SAME database server. The database manages routing transparently.
- Sharding splits data across MULTIPLE database servers. The application manages routing.
Partitioning (single server):
+-----------------------------------+
|         PostgreSQL Server         |
|  +----------+  +----------+       |
|  | Part. 1  |  | Part. 2  |  ...  |
|  +----------+  +----------+       |
+-----------------------------------+
Sharding (multiple servers):
+-------------+ +-------------+ +-------------+
| PG Server 1 | | PG Server 2 | | PG Server 3 |
|  (Shard A)  | |  (Shard B)  | |  (Shard C)  |
+-------------+ +-------------+ +-------------+
Partitioning scales storage and query performance on a single machine. Sharding scales beyond the limits of a single machine — CPU, memory, disk IOPS, and network bandwidth.
When to Partition vs When to Shard
| Criterion | Partition | Shard |
|---|---|---|
| Data size < 1TB | Yes | Overkill |
| Data size > 1TB | Maybe | Yes |
| Write throughput < 10K/sec | Yes | Overkill |
| Write throughput > 50K/sec | Not enough | Yes |
| Read pattern is locality-based | Yes | Yes |
| Need joins across the full dataset | Easy | Difficult |
| Operational complexity budget | Low | High |
| Single-machine CPU is bottleneck | No help | Yes |
The rule: start with partitioning, graduate to sharding only when a single server is insufficient.
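For reference, native partitioning is a few lines of DDL. The sketch below uses the transactions schema from later in this article with illustrative range bounds; PostgreSQL routes rows to partitions transparently:

```sql
-- Parent table: PostgreSQL routes each row to a partition by block_number.
CREATE TABLE transactions (
    id           bigint NOT NULL,
    from_addr    bytea  NOT NULL,
    to_addr      bytea,
    value_wei    numeric,
    block_number bigint NOT NULL
) PARTITION BY RANGE (block_number);

-- One partition per block range; create new ones as the chain grows.
CREATE TABLE transactions_p0 PARTITION OF transactions
    FOR VALUES FROM (0) TO (10000000);
CREATE TABLE transactions_p1 PARTITION OF transactions
    FOR VALUES FROM (10000000) TO (20000000);
```

Queries against `transactions` need no changes; the planner prunes partitions that cannot match the WHERE clause.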
Application-Level Sharding
The most common sharding approach puts routing logic in the application layer:
use sqlx::PgPool;
use std::collections::HashMap;

struct ShardRouter {
    shards: HashMap<u32, PgPool>,
    shard_count: u32,
}

impl ShardRouter {
    async fn new(shard_urls: Vec<String>) -> anyhow::Result<Self> {
        let mut shards = HashMap::new();
        for (i, url) in shard_urls.iter().enumerate() {
            let pool = PgPool::connect(url).await?;
            shards.insert(i as u32, pool);
        }
        Ok(Self {
            shard_count: shards.len() as u32,
            shards,
        })
    }

    /// Map a shard key to its owning pool: hash, then modulo shard count.
    fn get_shard(&self, shard_key: &[u8]) -> &PgPool {
        let hash = xxhash_rust::xxh3::xxh3_64(shard_key);
        let shard_id = (hash % self.shard_count as u64) as u32;
        &self.shards[&shard_id]
    }

    /// Fetch at most one row from the shard that owns `shard_key`.
    /// Parameters are bound by the caller on the SQL before execution,
    /// as in the usage example below.
    async fn query_single_shard<T>(
        &self,
        shard_key: &[u8],
        query: &str,
    ) -> anyhow::Result<Option<T>>
    where
        T: for<'r> sqlx::FromRow<'r, sqlx::postgres::PgRow> + Send + Unpin,
    {
        let pool = self.get_shard(shard_key);
        let row = sqlx::query_as::<_, T>(query)
            .fetch_optional(pool)
            .await?;
        Ok(row)
    }
}
Usage
let router = ShardRouter::new(vec![
    "postgres://shard1:5432/db".into(),
    "postgres://shard2:5432/db".into(),
    "postgres://shard3:5432/db".into(),
]).await?;

// Route by user address
let user_address: &[u8] = b"0x742d35Cc6634C0532925a3b844Bc9e7595f2bD38";
let pool = router.get_shard(user_address);
let txs = sqlx::query_as::<_, Transaction>(
    "SELECT * FROM transactions WHERE from_addr = $1 ORDER BY block_number DESC LIMIT 100"
)
.bind(user_address)
.fetch_all(pool)
.await?;
Consistent Hashing
Simple modulo hashing (hash % shard_count) breaks when you add or remove shards — almost all keys remap. Consistent hashing minimizes key redistribution:
use sqlx::PgPool;
use std::collections::{BTreeMap, HashMap};

struct ConsistentHashRing {
    ring: BTreeMap<u64, u32>, // hash -> shard_id
    virtual_nodes: u32,
    shards: HashMap<u32, PgPool>,
}

impl ConsistentHashRing {
    fn new(virtual_nodes: u32) -> Self {
        Self {
            ring: BTreeMap::new(),
            virtual_nodes,
            shards: HashMap::new(),
        }
    }

    /// Place `virtual_nodes` points on the ring for this shard.
    fn add_shard(&mut self, shard_id: u32, pool: PgPool) {
        for i in 0..self.virtual_nodes {
            let key = format!("shard-{}-vnode-{}", shard_id, i);
            let hash = xxhash_rust::xxh3::xxh3_64(key.as_bytes());
            self.ring.insert(hash, shard_id);
        }
        self.shards.insert(shard_id, pool);
    }

    fn remove_shard(&mut self, shard_id: u32) {
        for i in 0..self.virtual_nodes {
            let key = format!("shard-{}-vnode-{}", shard_id, i);
            let hash = xxhash_rust::xxh3::xxh3_64(key.as_bytes());
            self.ring.remove(&hash);
        }
        self.shards.remove(&shard_id);
    }

    fn get_shard(&self, key: &[u8]) -> &PgPool {
        let hash = xxhash_rust::xxh3::xxh3_64(key);
        // Find the first virtual node clockwise from the hash
        let shard_id = self.ring
            .range(hash..)
            .next()
            .or_else(|| self.ring.iter().next()) // Wrap around
            .map(|(_, &id)| id)
            .expect("ring is empty");
        &self.shards[&shard_id]
    }
}
With consistent hashing and 256 virtual nodes per shard, adding one shard to an N-shard cluster remaps only about 1/(N+1) of the keys — the share the new shard takes over — instead of nearly all keys with modulo hashing.
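The remap behavior is easy to verify offline. The sketch below is stdlib-only, so it substitutes a small FNV-1a hash for xxh3 and plain shard IDs for connection pools; the shape of the result matters, not the exact fractions.

```rust
use std::collections::BTreeMap;

// Stand-in hash (FNV-1a); the real code uses xxh3, but any well-mixed
// 64-bit hash shows the same remapping behavior.
fn fnv1a(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

// Build a ring with `vnodes` virtual nodes per shard.
fn build_ring(shard_count: u32, vnodes: u32) -> BTreeMap<u64, u32> {
    let mut ring = BTreeMap::new();
    for shard_id in 0..shard_count {
        for i in 0..vnodes {
            let key = format!("shard-{}-vnode-{}", shard_id, i);
            ring.insert(fnv1a(key.as_bytes()), shard_id);
        }
    }
    ring
}

// First virtual node clockwise from the key's hash, wrapping around.
fn lookup(ring: &BTreeMap<u64, u32>, key: &[u8]) -> u32 {
    let hash = fnv1a(key);
    ring.range(hash..)
        .next()
        .or_else(|| ring.iter().next())
        .map(|(_, &id)| id)
        .expect("ring is empty")
}

// Fraction of keys that change shards when growing from `old` to `old + 1`
// shards under plain modulo routing.
fn modulo_moved(keys: u64, old: u64) -> f64 {
    let moved = (0..keys)
        .filter(|i| {
            let h = fnv1a(format!("key-{}", i).as_bytes());
            h % old != h % (old + 1)
        })
        .count();
    moved as f64 / keys as f64
}

// Same measurement for the consistent-hash ring.
fn ring_moved(keys: u64, old: u32, vnodes: u32) -> f64 {
    let before = build_ring(old, vnodes);
    let after = build_ring(old + 1, vnodes);
    let moved = (0..keys)
        .filter(|i| {
            let k = format!("key-{}", i);
            lookup(&before, k.as_bytes()) != lookup(&after, k.as_bytes())
        })
        .count();
    moved as f64 / keys as f64
}

fn main() {
    // Going from 4 to 5 shards: modulo remaps roughly 80% of keys;
    // the ring remaps roughly the ~20% the new shard takes over.
    println!("modulo: {:.2}", modulo_moved(10_000, 4));
    println!("ring:   {:.2}", ring_moved(10_000, 4, 256));
}
```

The ring only moves keys that the new shard captures, which is why its remap fraction tracks the new shard's ownership share.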
Cross-Shard Queries
The hardest problem in sharding is queries that span multiple shards:
impl ShardRouter {
    /// Query all shards in parallel and merge results
    async fn query_all_shards<T>(
        &self,
        query: &str,
    ) -> anyhow::Result<Vec<T>>
    where
        T: for<'r> sqlx::FromRow<'r, sqlx::postgres::PgRow> + Send + Unpin + 'static,
    {
        let mut handles = Vec::new();
        for pool in self.shards.values() {
            let pool = pool.clone();
            let query = query.to_string();
            let handle = tokio::spawn(async move {
                sqlx::query_as::<_, T>(&query)
                    .fetch_all(&pool)
                    .await
            });
            handles.push(handle);
        }
        let mut results = Vec::new();
        for handle in handles {
            let shard_results = handle.await??;
            results.extend(shard_results);
        }
        Ok(results)
    }

    /// Cross-shard aggregation: sum per-shard COUNT(*) results
    async fn count_all_shards(
        &self,
        query: &str,
    ) -> anyhow::Result<i64> {
        let mut handles = Vec::new();
        for pool in self.shards.values() {
            let pool = pool.clone();
            let query = query.to_string();
            let handle = tokio::spawn(async move {
                let row: (i64,) = sqlx::query_as(&query)
                    .fetch_one(&pool)
                    .await?;
                Ok::<i64, anyhow::Error>(row.0)
            });
            handles.push(handle);
        }
        let mut total: i64 = 0;
        for handle in handles {
            total += handle.await??;
        }
        Ok(total)
    }
}
Cross-shard queries are expensive: they hit every shard, transfer data over the network, and merge results in memory. Design your shard key to minimize cross-shard queries.
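For ordered, limited cross-shard reads (e.g. `ORDER BY block_number DESC LIMIT k`), one common mitigation is to push the same `LIMIT k` down to every shard and re-sort only the merged candidates. A minimal sketch of the merge step, on plain integers rather than rows:

```rust
// Each shard returns its own top-k, e.g. via
// "... ORDER BY block_number DESC LIMIT k" executed per shard.
// The router merges the candidates and truncates to the global top-k.
// This is correct because a globally top-k row is necessarily in the
// top-k of its own shard.
fn merge_top_k(per_shard: Vec<Vec<u64>>, k: usize) -> Vec<u64> {
    let mut all: Vec<u64> = per_shard.into_iter().flatten().collect();
    all.sort_unstable_by(|a, b| b.cmp(a)); // descending, like the SQL ORDER BY
    all.truncate(k);
    all
}

fn main() {
    let shard_a = vec![90, 75, 10]; // each already sorted descending
    let shard_b = vec![88, 80, 5];
    let top = merge_top_k(vec![shard_a, shard_b], 3);
    println!("{:?}", top); // [90, 88, 80]
}
```

This bounds the data transferred to `k` rows per shard instead of full result sets, though the router still pays one round trip per shard.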
Choosing a Shard Key
| Shard Key | Pros | Cons |
|---|---|---|
| User address | User queries are single-shard | Cross-user analytics hit all shards |
| Block number | Block range queries are single-shard | User history spans all shards |
| Chain ID | Chain-specific queries are fast | Multi-chain aggregations are slow |
| Transaction hash | Even distribution | Only lookups by hash are single-shard; everything else hits all shards |
The best shard key matches your most common access pattern. For a blockchain explorer, sharding by address makes sense: most queries are “show me transactions for this address.”
Resharding Strategies
When you need to add shards to an existing cluster:
Double-Write Migration
async fn resharding_migration(
    old_router: &ShardRouter,
    new_router: &ShardRouter,
) -> anyhow::Result<()> {
    // Phase 1: Start double-writing to both old and new shards
    tracing::info!("Phase 1: Double-write enabled");

    // Phase 2: Backfill old data to new shards
    for (shard_id, pool) in &old_router.shards {
        let mut cursor: i64 = 0;
        loop {
            let batch: Vec<Transaction> = sqlx::query_as(
                "SELECT * FROM transactions WHERE id > $1 ORDER BY id LIMIT 10000"
            )
            .bind(cursor)
            .fetch_all(pool)
            .await?;
            if batch.is_empty() { break; }
            cursor = batch.last().unwrap().id;
            for tx in &batch {
                let new_pool = new_router.get_shard(&tx.from_addr);
                sqlx::query(
                    "INSERT INTO transactions (id, from_addr, to_addr, value_wei, block_number)
                     VALUES ($1, $2, $3, $4, $5)
                     ON CONFLICT (id) DO NOTHING"
                )
                .bind(tx.id)
                .bind(&tx.from_addr)
                .bind(&tx.to_addr)
                .bind(&tx.value_wei)
                .bind(tx.block_number)
                .execute(new_pool)
                .await?;
            }
            tracing::info!(shard = shard_id, cursor, "Backfill progress");
        }
    }

    // Phase 3: Switch reads to new shards
    tracing::info!("Phase 3: Reads switched to new cluster");

    // Phase 4: Stop writes to old shards
    tracing::info!("Phase 4: Old cluster decommissioned");
    Ok(())
}
PostgreSQL Native Partitioning vs Application Sharding
| Feature | PG Partitioning | Application Sharding |
|---|---|---|
| Routing | Transparent (PG handles it) | Application code |
| Cross-partition joins | Native SQL | Manual merge |
| Transactions | Full ACID | Distributed (2PC or saga) |
| Max data size | Single server limits | Unlimited |
| Operational complexity | Low | High |
| Foreign keys | Limited | None |
| Scaling dimension | Storage, query performance | Everything |
Hybrid Approach: Partition Within Shards
The most scalable architecture combines both:
Application Sharding (by address prefix)
+------------------+ +------------------+ +------------------+
| Shard 0 (0x0-5)  | | Shard 1 (0x6-a)  | | Shard 2 (0xb-f)  |
|  +---------+     | |  +---------+     | |  +---------+     |
|  | Part Q1 |     | |  | Part Q1 |     | |  | Part Q1 |     |
|  | Part Q2 |     | |  | Part Q2 |     | |  | Part Q2 |     |
|  | Part Q3 |     | |  | Part Q3 |     | |  | Part Q3 |     |
|  | Part Q4 |     | |  | Part Q4 |     | |  | Part Q4 |     |
|  +---------+     | |  +---------+     | |  +---------+     |
+------------------+ +------------------+ +------------------+
Sharding handles horizontal scaling across machines. Within each shard, partitioning handles query performance and maintenance efficiency.
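The combined routing decision can be sketched in a few lines, using the nibble ranges from the diagram and quarterly partitions; `shard_for_address` and `partition_for` are illustrative helpers, not part of any library:

```rust
// Route by the first hex nibble of the address (after "0x"):
// 0-5 -> shard 0, 6-a -> shard 1, b-f -> shard 2, as in the diagram.
fn shard_for_address(addr: &str) -> Option<u32> {
    let nibble = addr.strip_prefix("0x")?.chars().next()?.to_digit(16)?;
    Some(match nibble {
        0..=5 => 0,
        6..=10 => 1,
        _ => 2,
    })
}

// Inside the chosen shard, rows land in a quarterly partition,
// e.g. transactions_2024_q3 for July-September 2024.
fn partition_for(table: &str, year: u32, month: u32) -> String {
    format!("{}_{}_q{}", table, year, (month - 1) / 3 + 1)
}

fn main() {
    let addr = "0x742d35Cc6634C0532925a3b844Bc9e7595f2bD38";
    println!("shard: {:?}", shard_for_address(addr));       // first nibble 7 -> Some(1)
    println!("{}", partition_for("transactions", 2024, 8)); // transactions_2024_q3
}
```

The application only ever picks the shard; within the shard, PostgreSQL's partition routing picks the partition, so each layer stays simple.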
Conclusion
Start with PostgreSQL native partitioning — it is transparent, supports joins, and has minimal operational overhead. Graduate to application-level sharding only when a single server cannot handle your write throughput, data volume, or query load. Use consistent hashing to minimize data redistribution when adding shards, design your shard key to match your primary access pattern, and expect cross-shard queries to be expensive. For maximum scale, combine sharding across servers with partitioning within each shard.