[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-deep-evm-26-sharding-vs-partitioning-massive-tables":3},{"article":4,"author":59},{"id":5,"category_id":6,"title":7,"slug":8,"excerpt":9,"content_md":10,"content_html":11,"locale":12,"author_id":13,"published":14,"published_at":15,"meta_title":16,"meta_description":17,"focus_keyword":18,"og_image":19,"canonical_url":19,"robots_meta":20,"created_at":15,"updated_at":15,"tags":21,"category_name":24,"related_articles":39},"d0000000-0000-0000-0000-000000000126","a0000000-0000-0000-0000-000000000005","Deep EVM #26: Sharding vs Partitioning — Architecture for Massive Tables","deep-evm-26-sharding-vs-partitioning-massive-tables","Compare database sharding and partitioning strategies for horizontal scaling. Covers consistent hashing, cross-shard queries, resharding, and when to choose each approach.","## Partitioning vs Sharding: The Fundamental Difference\n\nBoth partitioning and sharding split data into smaller pieces, but they operate at different levels:\n\n- **Partitioning** splits a table into multiple physical tables on the SAME database server. The database manages routing transparently.\n- **Sharding** splits data across MULTIPLE database servers. The application manages routing.\n\n```\nPartitioning (single server):\n+-----------------------------------+\n| PostgreSQL Server                 |\n| +----------+ +----------+        |\n| | Part. 1  | | Part. 2  | ...    |\n| +----------+ +----------+        |\n+-----------------------------------+\n\nSharding (multiple servers):\n+-------------+  +-------------+  +-------------+\n| PG Server 1 |  | PG Server 2 |  | PG Server 3 |\n| (Shard A)   |  | (Shard B)   |  | (Shard C)   |\n+-------------+  +-------------+  +-------------+\n```\n\nPartitioning scales storage and query performance on a single machine. 
Sharding scales beyond the limits of a single machine — CPU, memory, disk IOPS, and network bandwidth.\n\n## When to Partition vs When to Shard\n\n| Criterion | Partition | Shard |\n|-----------|-----------|-------|\n| Data size \u003C 1TB | Yes | Overkill |\n| Data size > 1TB | Maybe | Yes |\n| Write throughput \u003C 10K\u002Fsec | Yes | Overkill |\n| Write throughput > 50K\u002Fsec | Not enough | Yes |\n| Read pattern is locality-based | Yes | Yes |\n| Need joins across partitions\u002Fshards | Easy | Difficult |\n| Operational complexity budget | Low | High |\n| Single-machine CPU is bottleneck | No help | Yes |\n\nThe rule: **start with partitioning, graduate to sharding only when a single server is insufficient**.\n\n## Application-Level Sharding\n\nThe most common sharding approach puts routing logic in the application layer:\n\n```rust\nuse sqlx::PgPool;\nuse std::collections::HashMap;\n\nstruct ShardRouter {\n    shards: HashMap\u003Cu32, PgPool>,\n    shard_count: u32,\n}\n\nimpl ShardRouter {\n    async fn new(shard_urls: Vec\u003CString>) -> anyhow::Result\u003CSelf> {\n        let mut shards = HashMap::new();\n        for (i, url) in shard_urls.iter().enumerate() {\n            let pool = PgPool::connect(url).await?;\n            shards.insert(i as u32, pool);\n        }\n        Ok(Self {\n            shard_count: shards.len() as u32,\n            shards,\n        })\n    }\n\n    fn get_shard(&self, shard_key: &[u8]) -> &PgPool {\n        let hash = xxhash_rust::xxh3::xxh3_64(shard_key);\n        let shard_id = (hash % self.shard_count as u64) as u32;\n        &self.shards[&shard_id]\n    }\n\n    \u002F\u002F\u002F Run a query on the shard owning `shard_key`; bind parameters at the call site\n    async fn query_single_shard\u003CT: for\u003C'r> sqlx::FromRow\u003C'r, sqlx::postgres::PgRow> + Send + Unpin>(\n        &self,\n        shard_key: &[u8],\n        query: &str,\n    ) -> anyhow::Result\u003COption\u003CT>> {\n        let pool = self.get_shard(shard_key);\n        let row = sqlx::query_as::\u003C_, T>(query)\n  
          .fetch_optional(pool)\n            .await?;\n        Ok(row)\n    }\n}\n```\n\n### Usage\n\n```rust\nlet router = ShardRouter::new(vec![\n    \"postgres:\u002F\u002Fshard1:5432\u002Fdb\".into(),\n    \"postgres:\u002F\u002Fshard2:5432\u002Fdb\".into(),\n    \"postgres:\u002F\u002Fshard3:5432\u002Fdb\".into(),\n]).await?;\n\n\u002F\u002F Route by user address\nlet user_address = b\"0x742d35Cc6634C0532925a3b844Bc9e7595f2bD38\";\nlet pool = router.get_shard(user_address);\nlet txs = sqlx::query_as::\u003C_, Transaction>(\n    \"SELECT * FROM transactions WHERE from_addr = $1 ORDER BY block_number DESC LIMIT 100\"\n)\n.bind(user_address)\n.fetch_all(pool)\n.await?;\n```\n\n## Consistent Hashing\n\nSimple modulo hashing (`hash % shard_count`) breaks when you add or remove shards — almost all keys remap. Consistent hashing minimizes key redistribution:\n\n```rust\nuse std::collections::BTreeMap;\n\nstruct ConsistentHashRing {\n    ring: BTreeMap\u003Cu64, u32>,  \u002F\u002F hash -> shard_id\n    virtual_nodes: u32,\n    shards: HashMap\u003Cu32, PgPool>,\n}\n\nimpl ConsistentHashRing {\n    fn new(virtual_nodes: u32) -> Self {\n        Self {\n            ring: BTreeMap::new(),\n            virtual_nodes,\n            shards: HashMap::new(),\n        }\n    }\n\n    fn add_shard(&mut self, shard_id: u32, pool: PgPool) {\n        for i in 0..self.virtual_nodes {\n            let key = format!(\"shard-{}-vnode-{}\", shard_id, i);\n            let hash = xxhash_rust::xxh3::xxh3_64(key.as_bytes());\n            self.ring.insert(hash, shard_id);\n        }\n        self.shards.insert(shard_id, pool);\n    }\n\n    fn remove_shard(&mut self, shard_id: u32) {\n        for i in 0..self.virtual_nodes {\n            let key = format!(\"shard-{}-vnode-{}\", shard_id, i);\n            let hash = xxhash_rust::xxh3::xxh3_64(key.as_bytes());\n            self.ring.remove(&hash);\n        }\n        self.shards.remove(&shard_id);\n    }\n\n    fn get_shard(&self, key: &[u8]) 
-> &PgPool {\n        let hash = xxhash_rust::xxh3::xxh3_64(key);\n\n        \u002F\u002F Find the first node clockwise from the hash\n        let shard_id = self.ring\n            .range(hash..)\n            .next()\n            .or_else(|| self.ring.iter().next())  \u002F\u002F Wrap around\n            .map(|(_, &id)| id)\n            .expect(\"Ring is empty\");\n\n        &self.shards[&shard_id]\n    }\n}\n```\n\nWith consistent hashing and 256 virtual nodes per shard, adding a new shard only remaps approximately `1\u002FN` of the keys (where N is the total number of shards), instead of nearly all keys with modulo hashing.\n\n## Cross-Shard Queries\n\nThe hardest problem in sharding is queries that span multiple shards:\n\n```rust\nimpl ShardRouter {\n    \u002F\u002F\u002F Query all shards in parallel and merge results\n    async fn query_all_shards\u003CT>(\n        &self,\n        query: &str,\n    ) -> anyhow::Result\u003CVec\u003CT>>\n    where\n        T: for\u003C'r> sqlx::FromRow\u003C'r, sqlx::postgres::PgRow> + Send + Unpin + 'static,\n    {\n        let mut handles = Vec::new();\n\n        for pool in self.shards.values() {\n            let pool = pool.clone();\n            let query = query.to_string();\n\n            let handle = tokio::spawn(async move {\n                sqlx::query_as::\u003C_, T>(&query)\n                    .fetch_all(&pool)\n                    .await\n            });\n            handles.push(handle);\n        }\n\n        let mut results = Vec::new();\n        for handle in handles {\n            let shard_results = handle.await??;\n            results.extend(shard_results);\n        }\n\n        Ok(results)\n    }\n\n    \u002F\u002F\u002F Cross-shard aggregation\n    async fn count_all_shards(\n        &self,\n        query: &str,\n    ) -> anyhow::Result\u003Ci64> {\n        let mut handles = Vec::new();\n\n        for pool in self.shards.values() {\n            let pool = pool.clone();\n            let query = query.to_string();\n\n      
      let handle = tokio::spawn(async move {\n                let row: (i64,) = sqlx::query_as(&query)\n                    .fetch_one(&pool)\n                    .await?;\n                Ok::\u003Ci64, anyhow::Error>(row.0)\n            });\n            handles.push(handle);\n        }\n\n        let mut total: i64 = 0;\n        for handle in handles {\n            total += handle.await??;\n        }\n\n        Ok(total)\n    }\n}\n```\n\nCross-shard queries are expensive: they hit every shard, transfer data over the network, and merge results in memory. Design your shard key to minimize cross-shard queries.\n\n### Choosing a Shard Key\n\n| Shard Key | Pros | Cons |\n|-----------|------|------|\n| User address | User queries are single-shard | Cross-user analytics hit all shards |\n| Block number | Block range queries are single-shard | User history spans all shards |\n| Chain ID | Chain-specific queries are fast | Multi-chain aggregations are slow |\n| Hash of TX | Even distribution | Every query hits a random shard |\n\nThe best shard key matches your most common access pattern. 
For a blockchain explorer, sharding by address makes sense: most queries are \"show me transactions for this address.\"\n\n## Resharding Strategies\n\nWhen you need to add shards to an existing cluster:\n\n### Double-Write Migration\n\n```rust\nasync fn resharding_migration(\n    old_router: &ShardRouter,\n    new_router: &ShardRouter,\n) -> anyhow::Result\u003C()> {\n    \u002F\u002F Phase 1: Start double-writing to both old and new shards\n    tracing::info!(\"Phase 1: Double-write enabled\");\n\n    \u002F\u002F Phase 2: Backfill old data to new shards\n    for (shard_id, pool) in &old_router.shards {\n        let mut cursor: i64 = 0;\n        loop {\n            let batch: Vec\u003CTransaction> = sqlx::query_as(\n                \"SELECT * FROM transactions WHERE id > $1 ORDER BY id LIMIT 10000\"\n            )\n            .bind(cursor)\n            .fetch_all(pool)\n            .await?;\n\n            if batch.is_empty() { break; }\n            cursor = batch.last().unwrap().id;\n\n            for tx in &batch {\n                let new_pool = new_router.get_shard(&tx.from_addr);\n                sqlx::query(\n                    \"INSERT INTO transactions (id, from_addr, to_addr, value_wei, block_number)\n                     VALUES ($1, $2, $3, $4, $5)\n                     ON CONFLICT (id) DO NOTHING\"\n                )\n                .bind(tx.id)\n                .bind(&tx.from_addr)\n                .bind(&tx.to_addr)\n                .bind(&tx.value_wei)\n                .bind(tx.block_number)\n                .execute(new_pool)\n                .await?;\n            }\n\n            tracing::info!(shard = shard_id, cursor, \"Backfill progress\");\n        }\n    }\n\n    \u002F\u002F Phase 3: Switch reads to new shards\n    tracing::info!(\"Phase 3: Reads switched to new cluster\");\n\n    \u002F\u002F Phase 4: Stop writes to old shards\n    tracing::info!(\"Phase 4: Old cluster decommissioned\");\n\n    Ok(())\n}\n```\n\n## PostgreSQL Native 
Partitioning vs Application Sharding\n\n| Feature | PG Partitioning | Application Sharding |\n|---------|----------------|---------------------|\n| Routing | Transparent (PG handles it) | Application code |\n| Cross-partition joins | Native SQL | Manual merge |\n| Transactions | Full ACID | Distributed (2PC or saga) |\n| Max data size | Single server limits | Unlimited |\n| Operational complexity | Low | High |\n| Foreign keys | Limited | None |\n| Scaling dimension | Storage, query performance | Everything |\n\n## Hybrid Approach: Partition Within Shards\n\nThe most scalable architecture combines both:\n\n```\nApplication Sharding (by address prefix)\n+------------------+  +------------------+  +------------------+\n| Shard 0 (0x0-5)  |  | Shard 1 (0x6-a)  |  | Shard 2 (0xb-f)  |\n| +---------+      |  | +---------+      |  | +---------+      |\n| | Part Q1 |      |  | | Part Q1 |      |  | | Part Q1 |      |\n| | Part Q2 |      |  | | Part Q2 |      |  | | Part Q2 |      |\n| | Part Q3 |      |  | | Part Q3 |      |  | | Part Q3 |      |\n| | Part Q4 |      |  | | Part Q4 |      |  | | Part Q4 |      |\n| +---------+      |  | +---------+      |  | +---------+      |\n+------------------+  +------------------+  +------------------+\n```\n\nSharding handles horizontal scaling across machines. Within each shard, partitioning handles query performance and maintenance efficiency.\n\n## Conclusion\n\nStart with PostgreSQL native partitioning — it is transparent, supports joins, and has minimal operational overhead. Graduate to application-level sharding only when a single server cannot handle your write throughput, data volume, or query load. Use consistent hashing to minimize data redistribution when adding shards, design your shard key to match your primary access pattern, and expect cross-shard queries to be expensive. 
For maximum scale, combine sharding across servers with partitioning within each shard.","\u003Ch2 id=\"partitioning-vs-sharding-the-fundamental-difference\">Partitioning vs Sharding: The Fundamental Difference\u003C\u002Fh2>\n\u003Cp>Both partitioning and sharding split data into smaller pieces, but they operate at different levels:\u003C\u002Fp>\n\u003Cul>\n\u003Cli>\u003Cstrong>Partitioning\u003C\u002Fstrong> splits a table into multiple physical tables on the SAME database server. The database manages routing transparently.\u003C\u002Fli>\n\u003Cli>\u003Cstrong>Sharding\u003C\u002Fstrong> splits data across MULTIPLE database servers. The application manages routing.\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Cpre>\u003Ccode>Partitioning (single server):\n+-----------------------------------+\n| PostgreSQL Server                 |\n| +----------+ +----------+        |\n| | Part. 1  | | Part. 2  | ...    |\n| +----------+ +----------+        |\n+-----------------------------------+\n\nSharding (multiple servers):\n+-------------+  +-------------+  +-------------+\n| PG Server 1 |  | PG Server 2 |  | PG Server 3 |\n| (Shard A)   |  | (Shard B)   |  | (Shard C)   |\n+-------------+  +-------------+  +-------------+\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Cp>Partitioning scales storage and query performance on a single machine. 
Sharding scales beyond the limits of a single machine — CPU, memory, disk IOPS, and network bandwidth.\u003C\u002Fp>\n\u003Ch2 id=\"when-to-partition-vs-when-to-shard\">When to Partition vs When to Shard\u003C\u002Fh2>\n\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Criterion\u003C\u002Fth>\u003Cth>Partition\u003C\u002Fth>\u003Cth>Shard\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\n\u003Ctr>\u003Ctd>Data size &lt; 1TB\u003C\u002Ftd>\u003Ctd>Yes\u003C\u002Ftd>\u003Ctd>Overkill\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Data size &gt; 1TB\u003C\u002Ftd>\u003Ctd>Maybe\u003C\u002Ftd>\u003Ctd>Yes\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Write throughput &lt; 10K\u002Fsec\u003C\u002Ftd>\u003Ctd>Yes\u003C\u002Ftd>\u003Ctd>Overkill\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Write throughput &gt; 50K\u002Fsec\u003C\u002Ftd>\u003Ctd>Not enough\u003C\u002Ftd>\u003Ctd>Yes\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Read pattern is locality-based\u003C\u002Ftd>\u003Ctd>Yes\u003C\u002Ftd>\u003Ctd>Yes\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Need joins across partitions\u002Fshards\u003C\u002Ftd>\u003Ctd>Easy\u003C\u002Ftd>\u003Ctd>Difficult\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Operational complexity budget\u003C\u002Ftd>\u003Ctd>Low\u003C\u002Ftd>\u003Ctd>High\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Single-machine CPU is bottleneck\u003C\u002Ftd>\u003Ctd>No help\u003C\u002Ftd>\u003Ctd>Yes\u003C\u002Ftd>\u003C\u002Ftr>\n\u003C\u002Ftbody>\u003C\u002Ftable>\n\u003Cp>The rule: \u003Cstrong>start with partitioning, graduate to sharding only when a single server is insufficient\u003C\u002Fstrong>.\u003C\u002Fp>\n\u003Ch2 id=\"application-level-sharding\">Application-Level Sharding\u003C\u002Fh2>\n\u003Cp>The most common sharding approach puts routing logic in the application layer:\u003C\u002Fp>\n\u003Cpre>\u003Ccode class=\"language-rust\">use sqlx::PgPool;\nuse std::collections::HashMap;\n\nstruct ShardRouter {\n    shards: 
HashMap&lt;u32, PgPool&gt;,\n    shard_count: u32,\n}\n\nimpl ShardRouter {\n    async fn new(shard_urls: Vec&lt;String&gt;) -&gt; anyhow::Result&lt;Self&gt; {\n        let mut shards = HashMap::new();\n        for (i, url) in shard_urls.iter().enumerate() {\n            let pool = PgPool::connect(url).await?;\n            shards.insert(i as u32, pool);\n        }\n        Ok(Self {\n            shard_count: shards.len() as u32,\n            shards,\n        })\n    }\n\n    fn get_shard(&amp;self, shard_key: &amp;[u8]) -&gt; &amp;PgPool {\n        let hash = xxhash_rust::xxh3::xxh3_64(shard_key);\n        let shard_id = (hash % self.shard_count as u64) as u32;\n        &amp;self.shards[&amp;shard_id]\n    }\n\n    \u002F\u002F\u002F Run a query on the shard owning `shard_key`; bind parameters at the call site\n    async fn query_single_shard&lt;T: for&lt;'r&gt; sqlx::FromRow&lt;'r, sqlx::postgres::PgRow&gt; + Send + Unpin&gt;(\n        &amp;self,\n        shard_key: &amp;[u8],\n        query: &amp;str,\n    ) -&gt; anyhow::Result&lt;Option&lt;T&gt;&gt; {\n        let pool = self.get_shard(shard_key);\n        let row = sqlx::query_as::&lt;_, T&gt;(query)\n            .fetch_optional(pool)\n            .await?;\n        Ok(row)\n    }\n}\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ch3>Usage\u003C\u002Fh3>\n\u003Cpre>\u003Ccode class=\"language-rust\">let router = ShardRouter::new(vec![\n    \"postgres:\u002F\u002Fshard1:5432\u002Fdb\".into(),\n    \"postgres:\u002F\u002Fshard2:5432\u002Fdb\".into(),\n    \"postgres:\u002F\u002Fshard3:5432\u002Fdb\".into(),\n]).await?;\n\n\u002F\u002F Route by user address\nlet user_address = b\"0x742d35Cc6634C0532925a3b844Bc9e7595f2bD38\";\nlet pool = router.get_shard(user_address);\nlet txs = sqlx::query_as::&lt;_, Transaction&gt;(\n    \"SELECT * FROM transactions WHERE from_addr = $1 ORDER BY block_number DESC LIMIT 100\"\n)\n.bind(user_address)\n.fetch_all(pool)\n.await?;\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ch2 id=\"consistent-hashing\">Consistent 
Hashing\u003C\u002Fh2>\n\u003Cp>Simple modulo hashing (\u003Ccode>hash % shard_count\u003C\u002Fcode>) breaks when you add or remove shards — almost all keys remap. Consistent hashing minimizes key redistribution:\u003C\u002Fp>\n\u003Cpre>\u003Ccode class=\"language-rust\">use std::collections::BTreeMap;\n\nstruct ConsistentHashRing {\n    ring: BTreeMap&lt;u64, u32&gt;,  \u002F\u002F hash -&gt; shard_id\n    virtual_nodes: u32,\n    shards: HashMap&lt;u32, PgPool&gt;,\n}\n\nimpl ConsistentHashRing {\n    fn new(virtual_nodes: u32) -&gt; Self {\n        Self {\n            ring: BTreeMap::new(),\n            virtual_nodes,\n            shards: HashMap::new(),\n        }\n    }\n\n    fn add_shard(&amp;mut self, shard_id: u32, pool: PgPool) {\n        for i in 0..self.virtual_nodes {\n            let key = format!(\"shard-{}-vnode-{}\", shard_id, i);\n            let hash = xxhash_rust::xxh3::xxh3_64(key.as_bytes());\n            self.ring.insert(hash, shard_id);\n        }\n        self.shards.insert(shard_id, pool);\n    }\n\n    fn remove_shard(&amp;mut self, shard_id: u32) {\n        for i in 0..self.virtual_nodes {\n            let key = format!(\"shard-{}-vnode-{}\", shard_id, i);\n            let hash = xxhash_rust::xxh3::xxh3_64(key.as_bytes());\n            self.ring.remove(&amp;hash);\n        }\n        self.shards.remove(&amp;shard_id);\n    }\n\n    fn get_shard(&amp;self, key: &amp;[u8]) -&gt; &amp;PgPool {\n        let hash = xxhash_rust::xxh3::xxh3_64(key);\n\n        \u002F\u002F Find the first node clockwise from the hash\n        let shard_id = self.ring\n            .range(hash..)\n            .next()\n            .or_else(|| self.ring.iter().next())  \u002F\u002F Wrap around\n            .map(|(_, &amp;id)| id)\n            .expect(\"Ring is empty\");\n\n        &amp;self.shards[&amp;shard_id]\n    }\n}\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Cp>With consistent hashing and 256 virtual nodes per shard, adding a new shard only remaps 
approximately \u003Ccode>1\u002FN\u003C\u002Fcode> of the keys (where N is the total number of shards), instead of nearly all keys with modulo hashing.\u003C\u002Fp>\n\u003Ch2 id=\"cross-shard-queries\">Cross-Shard Queries\u003C\u002Fh2>\n\u003Cp>The hardest problem in sharding is queries that span multiple shards:\u003C\u002Fp>\n\u003Cpre>\u003Ccode class=\"language-rust\">impl ShardRouter {\n    \u002F\u002F\u002F Query all shards in parallel and merge results\n    async fn query_all_shards&lt;T&gt;(\n        &amp;self,\n        query: &amp;str,\n    ) -&gt; anyhow::Result&lt;Vec&lt;T&gt;&gt;\n    where\n        T: for&lt;'r&gt; sqlx::FromRow&lt;'r, sqlx::postgres::PgRow&gt; + Send + Unpin + 'static,\n    {\n        let mut handles = Vec::new();\n\n        for pool in self.shards.values() {\n            let pool = pool.clone();\n            let query = query.to_string();\n\n            let handle = tokio::spawn(async move {\n                sqlx::query_as::&lt;_, T&gt;(&amp;query)\n                    .fetch_all(&amp;pool)\n                    .await\n            });\n            handles.push(handle);\n        }\n\n        let mut results = Vec::new();\n        for handle in handles {\n            let shard_results = handle.await??;\n            results.extend(shard_results);\n        }\n\n        Ok(results)\n    }\n\n    \u002F\u002F\u002F Cross-shard aggregation\n    async fn count_all_shards(\n        &amp;self,\n        query: &amp;str,\n    ) -&gt; anyhow::Result&lt;i64&gt; {\n        let mut handles = Vec::new();\n\n        for pool in self.shards.values() {\n            let pool = pool.clone();\n            let query = query.to_string();\n\n            let handle = tokio::spawn(async move {\n                let row: (i64,) = sqlx::query_as(&amp;query)\n                    .fetch_one(&amp;pool)\n                    .await?;\n                Ok::&lt;i64, anyhow::Error&gt;(row.0)\n            });\n            handles.push(handle);\n        }\n\n        let mut total: i64 = 
0;\n        for handle in handles {\n            total += handle.await??;\n        }\n\n        Ok(total)\n    }\n}\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Cp>Cross-shard queries are expensive: they hit every shard, transfer data over the network, and merge results in memory. Design your shard key to minimize cross-shard queries.\u003C\u002Fp>\n\u003Ch3>Choosing a Shard Key\u003C\u002Fh3>\n\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Shard Key\u003C\u002Fth>\u003Cth>Pros\u003C\u002Fth>\u003Cth>Cons\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\n\u003Ctr>\u003Ctd>User address\u003C\u002Ftd>\u003Ctd>User queries are single-shard\u003C\u002Ftd>\u003Ctd>Cross-user analytics hit all shards\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Block number\u003C\u002Ftd>\u003Ctd>Block range queries are single-shard\u003C\u002Ftd>\u003Ctd>User history spans all shards\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Chain ID\u003C\u002Ftd>\u003Ctd>Chain-specific queries are fast\u003C\u002Ftd>\u003Ctd>Multi-chain aggregations are slow\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Hash of TX\u003C\u002Ftd>\u003Ctd>Even distribution\u003C\u002Ftd>\u003Ctd>Every query hits a random shard\u003C\u002Ftd>\u003C\u002Ftr>\n\u003C\u002Ftbody>\u003C\u002Ftable>\n\u003Cp>The best shard key matches your most common access pattern. 
For a blockchain explorer, sharding by address makes sense: most queries are “show me transactions for this address.”\u003C\u002Fp>\n\u003Ch2 id=\"resharding-strategies\">Resharding Strategies\u003C\u002Fh2>\n\u003Cp>When you need to add shards to an existing cluster:\u003C\u002Fp>\n\u003Ch3>Double-Write Migration\u003C\u002Fh3>\n\u003Cpre>\u003Ccode class=\"language-rust\">async fn resharding_migration(\n    old_router: &amp;ShardRouter,\n    new_router: &amp;ShardRouter,\n) -&gt; anyhow::Result&lt;()&gt; {\n    \u002F\u002F Phase 1: Start double-writing to both old and new shards\n    tracing::info!(\"Phase 1: Double-write enabled\");\n\n    \u002F\u002F Phase 2: Backfill old data to new shards\n    for (shard_id, pool) in &amp;old_router.shards {\n        let mut cursor: i64 = 0;\n        loop {\n            let batch: Vec&lt;Transaction&gt; = sqlx::query_as(\n                \"SELECT * FROM transactions WHERE id &gt; $1 ORDER BY id LIMIT 10000\"\n            )\n            .bind(cursor)\n            .fetch_all(pool)\n            .await?;\n\n            if batch.is_empty() { break; }\n            cursor = batch.last().unwrap().id;\n\n            for tx in &amp;batch {\n                let new_pool = new_router.get_shard(&amp;tx.from_addr);\n                sqlx::query(\n                    \"INSERT INTO transactions (id, from_addr, to_addr, value_wei, block_number)\n                     VALUES ($1, $2, $3, $4, $5)\n                     ON CONFLICT (id) DO NOTHING\"\n                )\n                .bind(tx.id)\n                .bind(&amp;tx.from_addr)\n                .bind(&amp;tx.to_addr)\n                .bind(&amp;tx.value_wei)\n                .bind(tx.block_number)\n                .execute(new_pool)\n                .await?;\n            }\n\n            tracing::info!(shard = shard_id, cursor, \"Backfill progress\");\n        }\n    }\n\n    \u002F\u002F Phase 3: Switch reads to new shards\n    tracing::info!(\"Phase 3: Reads switched to new 
cluster\");\n\n    \u002F\u002F Phase 4: Stop writes to old shards\n    tracing::info!(\"Phase 4: Old cluster decommissioned\");\n\n    Ok(())\n}\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Ch2 id=\"postgresql-native-partitioning-vs-application-sharding\">PostgreSQL Native Partitioning vs Application Sharding\u003C\u002Fh2>\n\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Feature\u003C\u002Fth>\u003Cth>PG Partitioning\u003C\u002Fth>\u003Cth>Application Sharding\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\n\u003Ctr>\u003Ctd>Routing\u003C\u002Ftd>\u003Ctd>Transparent (PG handles it)\u003C\u002Ftd>\u003Ctd>Application code\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Cross-partition joins\u003C\u002Ftd>\u003Ctd>Native SQL\u003C\u002Ftd>\u003Ctd>Manual merge\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Transactions\u003C\u002Ftd>\u003Ctd>Full ACID\u003C\u002Ftd>\u003Ctd>Distributed (2PC or saga)\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Max data size\u003C\u002Ftd>\u003Ctd>Single server limits\u003C\u002Ftd>\u003Ctd>Unlimited\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Operational complexity\u003C\u002Ftd>\u003Ctd>Low\u003C\u002Ftd>\u003Ctd>High\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Foreign keys\u003C\u002Ftd>\u003Ctd>Limited\u003C\u002Ftd>\u003Ctd>None\u003C\u002Ftd>\u003C\u002Ftr>\n\u003Ctr>\u003Ctd>Scaling dimension\u003C\u002Ftd>\u003Ctd>Storage, query performance\u003C\u002Ftd>\u003Ctd>Everything\u003C\u002Ftd>\u003C\u002Ftr>\n\u003C\u002Ftbody>\u003C\u002Ftable>\n\u003Ch2 id=\"hybrid-approach-partition-within-shards\">Hybrid Approach: Partition Within Shards\u003C\u002Fh2>\n\u003Cp>The most scalable architecture combines both:\u003C\u002Fp>\n\u003Cpre>\u003Ccode>Application Sharding (by address prefix)\n+------------------+  +------------------+  +------------------+\n| Shard 0 (0x0-5)  |  | Shard 1 (0x6-a)  |  | Shard 2 (0xb-f)  |\n| +---------+      |  | +---------+      |  | +---------+      |\n| | Part Q1 |      |  | | 
Part Q1 |      |  | | Part Q1 |      |\n| | Part Q2 |      |  | | Part Q2 |      |  | | Part Q2 |      |\n| | Part Q3 |      |  | | Part Q3 |      |  | | Part Q3 |      |\n| | Part Q4 |      |  | | Part Q4 |      |  | | Part Q4 |      |\n| +---------+      |  | +---------+      |  | +---------+      |\n+------------------+  +------------------+  +------------------+\n\u003C\u002Fcode>\u003C\u002Fpre>\n\u003Cp>Sharding handles horizontal scaling across machines. Within each shard, partitioning handles query performance and maintenance efficiency.\u003C\u002Fp>\n\u003Ch2 id=\"conclusion\">Conclusion\u003C\u002Fh2>\n\u003Cp>Start with PostgreSQL native partitioning — it is transparent, supports joins, and has minimal operational overhead. Graduate to application-level sharding only when a single server cannot handle your write throughput, data volume, or query load. Use consistent hashing to minimize data redistribution when adding shards, design your shard key to match your primary access pattern, and expect cross-shard queries to be expensive. For maximum scale, combine sharding across servers with partitioning within each shard.\u003C\u002Fp>\n","en","b0000000-0000-0000-0000-000000000001",true,"2026-03-28T10:44:23.179014Z","Sharding vs Partitioning — Architecture for Massive Tables","Compare database sharding and partitioning for horizontal scaling. 
Covers consistent hashing, cross-shard queries, resharding, and PostgreSQL native partitioning.","database sharding vs partitioning",null,"index, follow",[22,27,31,35],{"id":23,"name":24,"slug":25,"created_at":26},"c0000000-0000-0000-0000-000000000012","DevOps","devops","2026-03-28T10:44:21.513630Z",{"id":28,"name":29,"slug":30,"created_at":26},"c0000000-0000-0000-0000-000000000022","Performance","performance",{"id":32,"name":33,"slug":34,"created_at":26},"c0000000-0000-0000-0000-000000000005","PostgreSQL","postgresql",{"id":36,"name":37,"slug":38,"created_at":26},"c0000000-0000-0000-0000-000000000001","Rust","rust",[40,47,53],{"id":41,"title":42,"slug":43,"excerpt":44,"locale":12,"category_name":45,"published_at":46},"d0200000-0000-0000-0000-000000000003","Why Bali Is Becoming Southeast Asia's Impact-Tech Hub in 2026","why-bali-becoming-southeast-asia-impact-tech-hub-2026","Bali ranks #16 among Southeast Asian startup ecosystems. With a growing concentration of Web3 builders, AI sustainability startups, and eco-travel tech companies, the island is carving a niche as the region's impact-tech capital.","Engineering","2026-03-28T10:44:37.748283Z",{"id":48,"title":49,"slug":50,"excerpt":51,"locale":12,"category_name":45,"published_at":52},"d0200000-0000-0000-0000-000000000002","ASEAN Data Protection Patchwork: A Developer's Compliance Checklist","asean-data-protection-patchwork-developer-compliance-checklist","Seven ASEAN countries now have comprehensive data protection laws, each with different consent models, localization requirements, and penalty structures. 
Here is a practical compliance checklist for developers building multi-country applications.","2026-03-28T10:44:37.374741Z",{"id":54,"title":55,"slug":56,"excerpt":57,"locale":12,"category_name":45,"published_at":58},"d0200000-0000-0000-0000-000000000001","Indonesia's $29 Billion Digital Transformation: Opportunities for Software Companies","indonesia-29-billion-digital-transformation-opportunities-software-companies","Indonesia's IT services market is projected to reach $29.03 billion in 2026, up from $24.37 billion in 2025. Cloud infrastructure, AI, e-commerce, and data centers are driving the fastest growth in Southeast Asia.","2026-03-28T10:44:37.349311Z",{"id":13,"name":60,"slug":61,"bio":62,"photo_url":19,"linkedin":19,"role":63,"created_at":64,"updated_at":64},"Open Soft Team","open-soft-team","The engineering team at Open Soft, building premium software solutions from Bali, Indonesia.","Engineering Team","2026-03-28T08:31:22.226811Z"]