Blockchain · Mar 28, 2026

Deep EVM #7: Gas-Efficient Loops and Conditionals in Yul

Open Soft Team

Engineering Team

Why Loops Are Where Gas Goes to Die

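In most smart contracts, the single largest gas consumer is iteration. A function that loops over an array of 100 elements executes its body 100 times, so every opcode inside that body is paid for 100 times. A single unnecessary SLOAD inside such a loop costs 10,000 gas over 100 iterations if the slot is warm (100 gas per read) and up to 210,000 gas if every iteration touches a different cold slot (2,100 gas per read). The loop mechanics alone (counter, condition check, and jumps) add roughly 4,000 gas over 100 iterations.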

This article breaks down the exact gas cost of loop constructs in Yul and Solidity, and shows patterns that cut per-iteration loop gas by roughly 30-50%.

Anatomy of a Yul For-Loop

for { let i := 0 } lt(i, 10) { i := add(i, 1) } {
    // body
}

This compiles to roughly the following opcode sequence (exact DUP/SWAP positions depend on the stack layout):

// Initialization: let i := 0
PUSH1 0x00          // 3 gas (executed once)

// Condition check: lt(i, 10)
JUMPDEST            // 1 gas (loop entry point)
PUSH1 0x0a          // 3 gas
DUP2                // 3 gas (copy i above the bound, so LT computes i < 10)
LT                  // 3 gas
ISZERO              // 3 gas (invert: leave the loop when the condition is false)
PUSH2 <exit>        // 3 gas
JUMPI               // 10 gas (conditional jump)

// Body: (varies)
...

// Post-iteration: i := add(i, 1)
PUSH1 0x01          // 3 gas
ADD                 // 3 gas

// Jump back to condition
PUSH2 <loop_start>  // 3 gas
JUMP                // 8 gas

Per iteration, the loop overhead is:

  • Condition check: 1 + 3 + 3 + 3 + 3 + 3 + 10 = 26 gas
  • Post-iteration: 3 + 3 = 6 gas
  • Jump back: 3 + 8 = 11 gas
  • Total overhead per iteration: 43 gas

For 100 iterations, that is about 4,300 gas just in loop mechanics, before any work in the body. The rest of this article rounds the overhead to about 40 gas per iteration.
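The tally above is easy to reproduce mechanically. The following Python sketch sums static opcode prices for the loop skeleton; the grouping into CONDITION, POST and JUMP_BACK and the with_iszero flag are assumptions of this sketch, not compiler output. The flag toggles whether the condition-inverting ISZERO that compilers typically emit before the exit JUMPI is counted.

```python
# Static gas prices for the opcodes in the loop skeleton (no memory or storage access).
GAS = {"JUMPDEST": 1, "PUSH": 3, "DUP": 3, "LT": 3, "ISZERO": 3,
       "ADD": 3, "JUMP": 8, "JUMPI": 10}

# Hypothetical grouping of the per-iteration opcodes, mirroring the listing above.
CONDITION = ["JUMPDEST", "PUSH", "DUP", "LT", "ISZERO", "PUSH", "JUMPI"]
POST = ["PUSH", "ADD"]
JUMP_BACK = ["PUSH", "JUMP"]


def cost(opcodes):
    """Sum the static gas of a sequence of opcode names."""
    return sum(GAS[op] for op in opcodes)


def loop_overhead(iterations, with_iszero=True):
    """Gas spent on loop mechanics alone, excluding the body.

    The single failing condition check that ends the loop is ignored.
    """
    condition = [op for op in CONDITION if with_iszero or op != "ISZERO"]
    per_iteration = cost(condition) + cost(POST) + cost(JUMP_BACK)
    return per_iteration * iterations
```

Calling loop_overhead(100) and loop_overhead(100, with_iszero=False) gives the two ends of the "about 40 gas per iteration" range. Swapping in a different opcode list is a quick way to estimate the overhead of a restructured loop before benchmarking it.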

Solidity Loop vs Yul Loop: A Benchmark

Let us compare summing a uint256[] calldata array:

Solidity (Checked Arithmetic)

function sumSolidity(uint256[] calldata arr) external pure returns (uint256 total) {
    for (uint256 i = 0; i < arr.length; i++) {
        total += arr[i];
    }
}

Gas per iteration:

  • Loop overhead: ~60 gas (Solidity adds overflow checks on total += and, in compilers before 0.8.22, on i++)
  • CALLDATALOAD: 3 gas
  • Offset computation: ~10 gas
  • Total: ~73 gas/iteration

Solidity (Unchecked)

function sumUnchecked(uint256[] calldata arr) external pure returns (uint256 total) {
    uint256 len = arr.length;
    for (uint256 i = 0; i < len;) {
        total += arr[i];
        unchecked { ++i; }
    }
}

Gas per iteration:

  • Loop overhead: ~40 gas (no overflow check on the counter; total += is still checked)
  • CALLDATALOAD: 3 gas
  • Offset computation: ~10 gas
  • Total: ~53 gas/iteration

Yul

function sumYul(offset, length) -> total {
    let end := add(offset, mul(length, 0x20))
    for { } lt(offset, end) { offset := add(offset, 0x20) } {
        total := add(total, calldataload(offset))
    }
}

Gas per iteration:

  • Condition (lt): 1 + 3 + 3 + 3 + 10 = 20 gas
  • Body (add + calldataload): 3 + 3 = 6 gas
  • Post (add offset): 3 + 3 = 6 gas
  • Jump back: 11 gas
  • Total: ~43 gas/iteration

Benchmark Results (100 elements)

| Method | Gas per iteration | Total (100 elements) | Savings vs Solidity |
|---|---|---|---|
| Solidity (checked) | ~73 | 7,300 | baseline |
| Solidity (unchecked) | ~53 | 5,300 | 27% |
| Yul | ~43 | 4,300 | 41% |

The savings come from three sources: no overflow checks, direct calldata offset arithmetic (instead of Solidity’s bounds checking), and tighter loop structure.

Switch vs If: Choosing the Right Conditional

Yul provides two conditional constructs. Use the right one.

If (Single Condition)

if iszero(value) {
    revert(0, 0)
}

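Compiles to roughly:

DUP1            // 3 gas
ISZERO          // 3 gas (the condition itself)
ISZERO          // 3 gas (inverted, so the jump skips the body when the condition is false)
PUSH2 <skip>    // 3 gas
JUMPI           // 10 gas
// ... revert ...
JUMPDEST        // 1 gas

Total: 23 gas for the condition check on the non-reverting path. With the optimizer enabled, the double ISZERO in front of JUMPI can be folded away, leaving 17 gas.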

Switch (Multiple Conditions)

switch selector
case 0xa9059cbb { transferImpl() }   // transfer
case 0x70a08231 { balanceOfImpl() }  // balanceOf
case 0x18160ddd { totalSupplyImpl() } // totalSupply
default { revert(0, 0) }

The Yul compiler generates sequential comparisons:

DUP1 PUSH4 0xa9059cbb EQ PUSH2 <case1> JUMPI   // 22 gas
DUP1 PUSH4 0x70a08231 EQ PUSH2 <case2> JUMPI   // 22 gas
DUP1 PUSH4 0x18160ddd EQ PUSH2 <case3> JUMPI   // 22 gas
// default: revert

Each case costs 22 gas to check. For N cases, the worst case is N * 22 gas. Solidity optimizes this with a binary search tree for large function counts, but in Yul you must implement that yourself if needed.

Optimization: Order cases by frequency. Put the most-called function selector first:

// If 80% of calls are transfers, check it first
switch selector
case 0xa9059cbb { transferImpl() }    // Most common: checked first
case 0x70a08231 { balanceOfImpl() }   // Second most common
default { revert(0, 0) }
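To see how much ordering matters, the expected cost of the comparison chain can be estimated from call shares. This is a minimal Python sketch; the 80/15/5 traffic mix is a made-up example, and the model assumes every call matches one of the listed cases.

```python
CHECK_GAS = 22  # DUP1 + PUSH4 + EQ + PUSH2 + JUMPI per case, as tallied above


def expected_dispatch_gas(shares):
    """Average gas spent in the comparison chain.

    shares[k] is the fraction of calls that hit the case in position k (0-based).
    A call matching position k has paid for k + 1 comparisons.
    """
    return sum(share * CHECK_GAS * (position + 1)
               for position, share in enumerate(shares))


# Hypothetical traffic mix: 80% transfer, 15% balanceOf, 5% totalSupply.
most_common_first = expected_dispatch_gas([0.80, 0.15, 0.05])
most_common_last = expected_dispatch_gas([0.05, 0.15, 0.80])
```

Under this assumed mix, putting the hot selector first roughly halves the average dispatch cost. The absolute saving is a few dozen gas per call, so it matters mainly for contracts that are called very often.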

Unchecked Arithmetic in Yul

All arithmetic in Yul is unchecked by default. There are no overflow or underflow reverts. This is both a feature (performance) and a danger (bugs).

// These all silently wrap:
let a := add(0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff, 1)  // a = 0
let b := sub(0, 1)     // b = 2^256 - 1
let c := mul(0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff, 2)  // c = 2^256 - 2

When is unchecked arithmetic safe?

  1. Loop counters: if the bound is known to be < 2^256 (always true in practice)
  2. Array index calculation: offset + i * 32 where i < array.length and array.length fits in uint256
  3. Token amounts with bounded totals: if totalSupply <= 2^128, no addition of two balances can overflow
  4. Subtraction after comparison: if gt(a, b) { let diff := sub(a, b) } is always safe

When you need overflow checking in Yul:

function safeAdd(a, b) -> c {
    c := add(a, b)
    if lt(c, a) { revert(0, 0) }  // overflow check
}

function safeSub(a, b) -> c {
    if lt(a, b) { revert(0, 0) }  // underflow check
    c := sub(a, b)
}

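function safeMul(a, b) -> c {
    // and() is bitwise, so normalise b to 0 or 1 first; div by zero yields 0 in the EVM
    if and(iszero(iszero(b)), gt(a, div(not(0), b))) { revert(0, 0) }  // overflow check
    c := mul(a, b)
}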
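The overflow tests used by these helpers can be checked against exact integer arithmetic. The Python sketch below models 256-bit wrapping with a mask; the function names are illustrative and not part of any library. It also shows why the non-zero test on b must be reduced to 0 or 1 before a bitwise and: with an even b, the raw bitwise form evaluates to 0 and the overflow goes unnoticed.

```python
MAX = 2**256 - 1  # largest EVM word, i.e. not(0)


def evm_add(a, b):
    return (a + b) & MAX


def evm_mul(a, b):
    return (a * b) & MAX


def evm_div(a, b):
    return 0 if b == 0 else a // b  # EVM DIV returns 0 on division by zero


def add_overflows(a, b):
    """Mirror of lt(add(a, b), a): the wrapped sum is smaller than an operand."""
    return evm_add(a, b) < a


def mul_overflows(a, b):
    """Mirror of gt(a, div(not(0), b)), guarded so b == 0 never reports overflow."""
    return b != 0 and a > evm_div(MAX, b)


def naive_bitwise_guard(a, b):
    """Pitfall: bitwise and of raw b with a 0/1 flag is 0 whenever b is even."""
    return b & int(a > evm_div(MAX, b))
```

For example, naive_bitwise_guard(MAX, 2) evaluates to 0 even though MAX * 2 does not fit in 256 bits, while mul_overflows(MAX, 2) reports the overflow.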

Loop Unrolling

Loop unrolling reduces the overhead per element by processing multiple elements per iteration:

// Standard loop: 40 gas overhead per iteration
function sum(offset, length) -> total {
    let end := add(offset, mul(length, 0x20))
    for { } lt(offset, end) { offset := add(offset, 0x20) } {
        total := add(total, calldataload(offset))
    }
}

// Unrolled 4x: ~13 gas overhead per element
function sumUnrolled(offset, length) -> total {
    let end := add(offset, mul(length, 0x20))

    // Process 4 elements at a time
    let endAligned := sub(end, mod(mul(length, 0x20), 0x80))
    for { } lt(offset, endAligned) { offset := add(offset, 0x80) } {
        total := add(total, calldataload(offset))
        total := add(total, calldataload(add(offset, 0x20)))
        total := add(total, calldataload(add(offset, 0x40)))
        total := add(total, calldataload(add(offset, 0x60)))
    }

    // Handle remaining elements
    for { } lt(offset, end) { offset := add(offset, 0x20) } {
        total := add(total, calldataload(offset))
    }
}

The unrolled version amortizes the loop overhead (condition check + jump) across 4 elements. In practice, this saves 10-20% gas for large arrays.
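The alignment arithmetic is the part of unrolling that is easiest to get wrong, so it is worth checking for every remainder. Here is a Python model of the unrolled loop, assuming a list of integers stands in for calldata words and a hypothetical starting offset; it returns both the sum and the number of loop-body executions.

```python
WORD = 0x20  # bytes per uint256 element


def sum_unrolled(words, offset=0):
    """Model of the 4x-unrolled loop over a list standing in for calldata.

    Returns (total, iterations), where iterations counts body executions
    in the unrolled loop and the remainder loop combined.
    """
    base = offset
    length = len(words)

    def load(off):
        # Stand-in for calldataload(off)
        return words[(off - base) // WORD]

    end = offset + length * WORD
    end_aligned = end - (length * WORD) % (4 * WORD)
    total, iterations = 0, 0
    while offset < end_aligned:  # four elements per iteration
        total += load(offset) + load(offset + 0x20)
        total += load(offset + 0x40) + load(offset + 0x60)
        offset += 4 * WORD
        iterations += 1
    while offset < end:  # at most three leftover elements
        total += load(offset)
        offset += WORD
        iterations += 1
    return total & (2**256 - 1), iterations
```

For n elements the model runs n // 4 + n % 4 iterations instead of n, which is where the amortized overhead comes from. The saving is largest when n is a multiple of 4 and shrinks for short arrays dominated by the remainder loop.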

Minimizing Stack Depth in Loops

The EVM stack holds up to 1,024 words, but DUP and SWAP reach only the top 16, and every extra live variable means more DUP/SWAP shuffling. Keep loop bodies shallow:

// BAD: many live variables in loop body
for { let i := 0 } lt(i, len) { i := add(i, 1) } {
    let a := calldataload(add(offset, mul(i, 0x20)))
    let b := sload(add(baseSlot, i))
    let c := add(a, b)
    let d := mul(c, price)
    let e := div(d, 1000)
    total := add(total, e)
}

// BETTER: use helper function to reduce stack depth
function processElement(offset, slot, price) -> result {
    let a := calldataload(offset)
    let b := sload(slot)
    result := div(mul(add(a, b), price), 1000)
}

for { let i := 0 } lt(i, len) { i := add(i, 1) } {
    total := add(total, processElement(
        add(offset, mul(i, 0x20)),
        add(baseSlot, i),
        price
    ))
}

Early Exit Patterns

When searching for an element, exit the loop as soon as you find it:

// Find the index of 'target' in a calldata array
function indexOf(offset, length, target) -> index, found {
    let end := add(offset, mul(length, 0x20))
    for { index := 0 } lt(offset, end) { offset := add(offset, 0x20) } {
        if eq(calldataload(offset), target) {
            found := 1
            // 'leave' exits the whole function; 'break' would exit only the loop
            leave
        }
        index := add(index, 1)
    }
}

The leave statement is Yul’s equivalent of return: it exits the current function immediately. Use it for early exits in search loops when nothing needs to run after the loop; use break when the function must continue past the loop.

Bit Manipulation as Loop Alternative

Sometimes you can replace a loop with bitwise operations:

// Count the number of set bits (population count)
// Loop approach: up to 256 iterations
function popcountLoop(x) -> count {
    for { } x { x := shr(1, x) } {
        count := add(count, and(x, 1))
    }
}

// Bitwise approach: one iteration per set bit (Brian Kernighan's algorithm)
function popcountFast(x) -> count {
    for { } x { } {
        x := and(x, sub(x, 1))  // clear lowest set bit
        count := add(count, 1)
    }
}

// Even faster: parallel bit counting (no loop)
function popcountParallel(x) -> count {
    let allOnes := iszero(not(x))  // all 256 bits set: the byte-wide sum below would wrap to 0
    x := sub(x, and(shr(1, x), 0x5555555555555555555555555555555555555555555555555555555555555555))
    x := add(and(x, 0x3333333333333333333333333333333333333333333333333333333333333333),
             and(shr(2, x), 0x3333333333333333333333333333333333333333333333333333333333333333))
    x := and(add(x, shr(4, x)), 0x0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f0f)
    // The multiplication sums all 32 byte counts into the top byte; shift it down
    count := or(shl(8, allOnes), shr(248, mul(x, 0x0101010101010101010101010101010101010101010101010101010101010101)))
}
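Parallel bit counting is compact but easy to get subtly wrong, so a reference model helps. The Python sketch below follows the same steps on a 256-bit word; rep is a helper introduced here for building the repeated-byte masks. Two details deserve attention: the final count sits in the top byte of the product, so it is extracted with a right shift by 248, and a word with all 256 bits set needs special handling because 256 does not fit in one byte.

```python
MASK = 2**256 - 1


def rep(byte):
    """Repeat one byte across a 256-bit word, e.g. rep(0x55) is 0x5555...55."""
    return int.from_bytes(bytes([byte]) * 32, "big")


def popcount_parallel(x):
    """Count the set bits of a 256-bit word without looping over bits."""
    all_ones = x == MASK                              # byte-wide sum below would wrap to 0
    x = (x - ((x >> 1) & rep(0x55))) & MASK           # 2-bit partial counts
    x = (x & rep(0x33)) + ((x >> 2) & rep(0x33))      # 4-bit partial counts
    x = (x + (x >> 4)) & rep(0x0F)                    # one count (0..8) per byte
    top_byte = ((x * rep(0x01)) & MASK) >> 248        # sum of the 32 byte counts, mod 256
    return 256 if all_ones else top_byte
```

Comparing the result with bin(x).count("1") over a range of inputs, including 0 and 2^256 - 1, is a cheap way to validate a Yul port of the same steps.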

Practical Example: Gas-Efficient Array Comparison

Compare two calldata arrays for equality:

function arraysEqual(offset1, offset2, length) -> equal {
    equal := 1
    let end := add(offset1, mul(length, 0x20))
    for { } lt(offset1, end) { } {
        if iszero(eq(calldataload(offset1), calldataload(offset2))) {
            equal := 0
            leave
        }
        offset1 := add(offset1, 0x20)
        offset2 := add(offset2, 0x20)
    }
}

This exits on the first mismatch. For MEV applications, this pattern is used to verify expected pool states before executing a trade.

Gas Cost Summary: Loop Patterns

| Pattern | Gas per iteration (approximate) |
|---|---|
| Solidity for (checked) | 73 |
| Solidity for (unchecked) | 53 |
| Yul for | 43 |
| Yul for (unrolled 4x) | 35 |
| Yul while (sentinel) | 40 |

The absolute numbers depend on the loop body, but the relative savings are consistent: Yul loops save 30-50% over standard Solidity.

Conclusion

Loop optimization in Yul is not about clever tricks. It is about understanding the exact gas cost of every opcode in your loop body and systematically eliminating waste. Cache storage reads, unroll tight loops, exit early, and prefer bitwise operations over iteration where possible. These patterns compound: a 30% saving on a loop that runs 1,000 times per block is gas an MEV bot no longer pays on any of those runs, and that goes straight to its margin.

In the final article, we put everything together and build a complete token swap contract in pure Yul.