Engineering · Mar 28, 2026

Deep EVM #23: Performance Debugging — When Database Reads Kill Your Latency

Open Soft Team

Engineering Team

The Symptom: 500 Errors and 114-Second Latency

Your monitoring dashboard lights up red. Response times spike from 50ms to 114 seconds. 500 errors cascade across all endpoints. The CPU is calm at 20%, memory is stable, and network bandwidth is nowhere near capacity. Something is choking the system, and it is not an obvious resource bottleneck.

This article walks through a real debugging session that traced the problem to database read amplification — a pattern where N logical operations trigger N x M physical reads, turning linear workloads into quadratic disasters.

Understanding Read Amplification

Read amplification occurs when a single logical operation requires multiple physical reads from the storage layer. In key-value stores like MDBX or RocksDB, this manifests at multiple levels:

LSM-Tree Read Amplification (RocksDB)

RocksDB uses a Log-Structured Merge Tree (LSM). A single get() call may need to check:

  1. The MemTable (in-memory)
  2. Level 0 SST files (unsorted)
  3. Level 1 SST files
  4. Level 2 SST files
  5. … up to Level N

// Looks simple, but triggers up to N level reads internally
let value = db.get(key)?;

With 5 levels and a bloom filter false positive rate of 1%, a missing key query touches approximately 1.05 levels on average. But a scan over a range touches every level:

// This iterates across ALL levels for each key in range
let iter = db.iterator(IteratorMode::From(start_key, Direction::Forward));
for item in iter {
    let (key, value) = item?;
    if key.as_ref() > end_key { break; }
    process(key, value);
}

Application-Level Read Amplification

The more insidious form happens at the application layer. Consider an EVM node processing a route update:

async fn process_route_update(
    &self,
    db: &Database,
    affected_routes: &[Route],
) -> Result<()> {
    for route in affected_routes {
        // For EACH route, re-evaluate ALL route evaluators
        for evaluator in &self.evaluators {
            // Each evaluator reads multiple DB keys
            let state = evaluator.load_state(db, route).await?;
            let result = evaluator.evaluate(&state)?;
            evaluator.store_result(db, route, &result).await?;
        }
    }
    Ok(())
}

If there are 180,000 affected routes and 12 evaluators each reading 5 database keys, that is:

180,000 routes x 12 evaluators x 5 reads = 10,800,000 database reads

At 10 microseconds per read (optimistic for MDBX with warm cache), that is 108 seconds — explaining our 114-second latency.
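The arithmetic is worth sanity-checking, since everything downstream depends on it. A minimal standalone check (the 10 microsecond figure is the assumed per-read cost, not a measurement):

```rust
fn main() {
    let routes: u64 = 180_000;
    let evaluators: u64 = 12;
    let keys_per_evaluator: u64 = 5;

    // N logical operations x M reads each
    let total_reads = routes * evaluators * keys_per_evaluator;
    assert_eq!(total_reads, 10_800_000);

    // At an assumed 10 microseconds per read
    let seconds = total_reads as f64 * 10.0 / 1_000_000.0;
    assert!((seconds - 108.0).abs() < f64::EPSILON);
}
```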

Profiling the Problem

Before optimizing, measure. Use tracing with spans to identify where time is spent:

use tracing::{info_span, instrument, Instrument};

#[instrument(skip(self, db), fields(route_count = affected_routes.len()))]
async fn process_route_update(
    &self,
    db: &Database,
    affected_routes: &[Route],
) -> Result<()> {
    let start = std::time::Instant::now();
    let mut read_count: u64 = 0;

    for route in affected_routes {
        let span = info_span!("route", id = %route.id);
        // Bind the block's output to a typed Result: `?` inside an async
        // block requires the block itself to evaluate to a Result.
        let result: Result<()> = async {
            for evaluator in &self.evaluators {
                let eval_span = info_span!("evaluator", name = %evaluator.name());
                // Instrument the DB load future with the evaluator span
                let state = evaluator
                    .load_state(db, route)
                    .instrument(eval_span)
                    .await?;
                read_count += state.reads_performed;
                evaluator.evaluate(&state)?;
            }
            Ok(())
        }
        .instrument(span)
        .await;
        result?;
    }

    tracing::info!(
        elapsed_ms = start.elapsed().as_millis(),
        total_reads = read_count,
        reads_per_sec = read_count as f64 / start.elapsed().as_secs_f64(),
        "Route update complete"
    );
    Ok(())
}

Output reveals the problem:

INFO process_route_update: elapsed_ms=114230 total_reads=10843291
     reads_per_sec=94892 route_count=180412

94,892 reads per second sounds fast, but 10.8 million reads at that rate takes 114 seconds.

Solution 1: O(affected) Instead of O(N)

The most impactful fix is to only process what actually changed, not everything:

async fn process_route_update(
    &self,
    db: &Database,
    changed_routes: &HashSet<RouteId>,
    all_routes: &[Route],
) -> Result<()> {
    // Only process routes that are actually affected
    let affected: Vec<&Route> = all_routes
        .iter()
        .filter(|r| changed_routes.contains(&r.id))
        .collect();

    tracing::info!(
        total_routes = all_routes.len(),
        affected_routes = affected.len(),
        "Processing only affected routes"
    );

    for route in affected {
        for evaluator in &self.evaluators {
            let state = evaluator.load_state(db, route).await?;
            evaluator.evaluate(&state)?;
        }
    }
    Ok(())
}

If a typical update affects 500 routes instead of 180,000, that is a 360x reduction in database reads:

500 routes x 12 evaluators x 5 reads = 30,000 reads
30,000 / 94,892 reads_per_sec = 0.316 seconds

From 114 seconds to 316 milliseconds.

Solution 2: The CacheDB Pattern

For evaluators that share common reads, use a write-through cache that sits in front of the database:

use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use parking_lot::RwLock;

struct CacheDB {
    inner: Database,
    cache: RwLock<HashMap<Vec<u8>, Vec<u8>>>,
    hits: AtomicU64,
    misses: AtomicU64,
}

impl CacheDB {
    fn new(inner: Database) -> Self {
        Self {
            inner,
            cache: RwLock::new(HashMap::new()),
            hits: AtomicU64::new(0),
            misses: AtomicU64::new(0),
        }
    }

    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>> {
        // Check cache first
        if let Some(value) = self.cache.read().get(key) {
            self.hits.fetch_add(1, Ordering::Relaxed);
            return Ok(Some(value.clone()));
        }

        self.misses.fetch_add(1, Ordering::Relaxed);

        // Cache miss — read from DB and populate cache
        let value = self.inner.get(key)?;
        if let Some(ref v) = value {
            self.cache.write().insert(key.to_vec(), v.clone());
        }
        Ok(value)
    }

    fn put(&self, key: &[u8], value: &[u8]) -> Result<()> {
        self.inner.put(key, value)?;
        self.cache.write().insert(key.to_vec(), value.to_vec());
        Ok(())
    }

    fn stats(&self) -> (u64, u64) {
        let hits = self.hits.load(Ordering::Relaxed);
        let misses = self.misses.load(Ordering::Relaxed);
        (hits, misses)
    }
}

With 12 evaluators reading overlapping keys, the cache hit rate is typically 60-80%, reducing physical reads by that same percentage.
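The effect on physical read volume is straightforward to model. A sketch under assumed numbers (the hit rates and the 30,000-read workload are illustrative, not measurements from the incident):

```rust
// Physical reads left over after a cache absorbs a given hit rate.
fn physical_reads(logical_reads: u64, hit_rate: f64) -> u64 {
    ((logical_reads as f64) * (1.0 - hit_rate)).round() as u64
}

fn main() {
    // 500 routes x 12 evaluators x 5 keys = 30,000 logical reads
    assert_eq!(physical_reads(30_000, 0.60), 12_000); // low end of 60-80%
    assert_eq!(physical_reads(30_000, 0.72), 8_400);  // observed hit rate
    assert_eq!(physical_reads(30_000, 0.80), 6_000);  // high end
}
```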

Solution 3: Pre-Populated State

Instead of each evaluator loading state individually, batch-load all needed state upfront:

async fn process_route_update(
    &self,
    db: &Database,
    affected_routes: &[Route],
) -> Result<()> {
    // Phase 1: Collect all keys needed by all evaluators
    let mut needed_keys: HashSet<Vec<u8>> = HashSet::new();
    for route in affected_routes {
        for evaluator in &self.evaluators {
            needed_keys.extend(evaluator.required_keys(route));
        }
    }

    // Phase 2: Batch read all keys in a single DB transaction
    let state_map = db.multi_get(&needed_keys.into_iter().collect::<Vec<_>>())?;

    // Phase 3: Evaluate with pre-loaded state (zero DB reads)
    for route in affected_routes {
        for evaluator in &self.evaluators {
            evaluator.evaluate_with_state(&state_map, route)?;
        }
    }
    Ok(())
}

Batch reads are dramatically faster because they can be served in a single database transaction, amortizing the overhead of transaction setup, lock acquisition, and B-tree traversal.
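Part of the win also comes from deduplication: overlapping keys requested by different evaluators collapse in the `HashSet` before the batch read ever hits the database. A self-contained sketch of that effect (the key scheme and evaluator names are invented for illustration):

```rust
use std::collections::HashSet;

fn main() {
    // Hypothetical: every evaluator reads the route's shared base-state key
    // plus one evaluator-specific key.
    let mut needed: HashSet<(u64, &'static str)> = HashSet::new();
    for route_id in 0u64..500 {
        for evaluator in ["risk", "latency", "cost"] {
            needed.insert((route_id, "base_state")); // shared read, deduped
            needed.insert((route_id, evaluator));    // per-evaluator read
        }
    }
    // Naively: 500 routes x 3 evaluators x 2 reads = 3,000 reads.
    // After dedup: 500 x (1 shared + 3 specific) = 2,000 unique keys.
    assert_eq!(needed.len(), 2_000);
}
```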

Solution 4: Connection Pooling and Async Reads

If you must perform many reads, parallelize them:

use futures::stream::{self, StreamExt};

async fn parallel_evaluate(
    db: &Database,
    routes: &[Route],
    concurrency: usize,
) -> Result<Vec<EvalResult>> {
    let results = stream::iter(routes)
        .map(|route| async move {
            let state = load_route_state(db, route).await?;
            evaluate_route(&state)
        })
        .buffer_unordered(concurrency)
        .collect::<Vec<_>>()
        .await;

    results.into_iter().collect()
}

With concurrency = 64, you can sustain 64 parallel database reads, fully utilizing the disk’s IO bandwidth.

Real Numbers: Before and After

Metric              | Before              | After (all fixes)
--------------------|---------------------|------------------
Affected routes     | 180,000 (full scan) | 500 (indexed)
DB reads per update | 10,800,000          | 12,500
Cache hit rate      | 0%                  | 72%
Physical reads      | 10,800,000          | 3,500
Latency             | 114 seconds         | 42 ms
500 errors          | Cascading           | Zero

The key insight: it was never a database performance problem. MDBX and RocksDB are fast. The problem was asking for 10 million reads when only 3,500 were needed.

Monitoring Database Read Patterns

Add metrics to catch read amplification before it becomes a production incident:

use metrics::{counter, histogram};

impl CacheDB {
    fn get_instrumented(&self, key: &[u8]) -> Result<Option<Vec<u8>>> {
        let start = std::time::Instant::now();
        let result = self.get(key);
        histogram!("db.read.duration_us").record(
            start.elapsed().as_micros() as f64
        );
        counter!("db.reads.total").increment(1);
        result
    }
}

Alert on:

  • db.reads.total rate exceeding 100K/sec (unusual volume)
  • db.read.duration_us p99 exceeding 1ms (storage bottleneck)
  • Read-to-write ratio exceeding 1000:1 (potential amplification)
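The three rules above can be expressed as a simple check over a metrics snapshot. This is an illustrative sketch, not a real alerting API; the `Snapshot` struct and field names are assumptions:

```rust
// Hypothetical snapshot of the metrics emitted by the instrumented CacheDB.
struct Snapshot {
    reads_per_sec: f64,
    p99_read_duration_us: f64,
    total_reads: u64,
    total_writes: u64,
}

fn firing_alerts(s: &Snapshot) -> Vec<&'static str> {
    let mut alerts = Vec::new();
    if s.reads_per_sec > 100_000.0 {
        alerts.push("unusual read volume");
    }
    if s.p99_read_duration_us > 1_000.0 {
        alerts.push("storage bottleneck");
    }
    if s.total_writes > 0 && s.total_reads / s.total_writes > 1_000 {
        alerts.push("potential read amplification");
    }
    alerts
}

fn main() {
    // A read-amplified workload trips only the ratio rule:
    // 10.8M reads against 2,000 writes is a 5,400:1 ratio.
    let amplified = Snapshot {
        reads_per_sec: 95_000.0,
        p99_read_duration_us: 800.0,
        total_reads: 10_800_000,
        total_writes: 2_000,
    };
    assert_eq!(firing_alerts(&amplified), vec!["potential read amplification"]);
}
```

Catching the ratio rule matters most here: in the incident above, read volume and per-read latency both looked individually plausible while the read-to-write ratio did not.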

Conclusion

Database read amplification is a performance killer that hides behind innocent-looking code. A for loop over 180,000 items with 12 evaluators each reading 5 keys produces 10.8 million physical reads — enough to turn a fast MDBX database into a bottleneck. The fix is architectural: process only affected items (O(affected) not O(N)), cache shared reads, batch-load state, and parallelize when needed. Always instrument your database access layer so you can see read patterns before they become production incidents.