Blockchain · Mar 28, 2026

Deep EVM #18: Debugging EVM Bytecode — Traces, Stack Dumps, and cast run

Engineering Team

The Debugging Challenge with Low-Level EVM Code

When a Solidity transaction reverts, you typically get a descriptive error message like ERC20: transfer amount exceeds balance. When a Huff or Yul transaction reverts, you get 0x — an empty revert payload with zero context. The contract simply hit a REVERT opcode, and it is up to you to figure out why.

Debugging at the bytecode level requires different tools and mental models. You need to think in terms of the stack machine, track memory and storage changes opcode by opcode, and understand how the EVM executes control flow through JUMP and JUMPI instructions.

This article covers the essential debugging toolkit: cast run for replaying historical transactions, forge debug for interactive step-through debugging, and manual trace analysis for understanding exactly what happened inside the EVM.

cast run: Replaying Transactions

cast run is the fastest way to debug a failed transaction. It replays the transaction against the historical state and shows you exactly what happened:

cast run 0xYOUR_TX_HASH --rpc-url https://eth-mainnet.g.alchemy.com/v2/KEY

The output shows a structured trace with call depth, gas usage, and return data:

Traces:
  [328439] 0xContractAddr::transfer(0xRecipient, 1000000000000000000)
    +- [2604] 0xContractAddr::balanceOf(0xSender) [staticcall]
    |   +- <- 500000000000000000
    +- <- revert: EvmError: Revert

This immediately tells you why the transfer failed: the sender held 500000000000000000 base units (0.5 tokens, assuming 18 decimals) but tried to transfer 1000000000000000000 (1.0 token). For Huff contracts, function names are not decoded and appear as raw selectors, but the call structure and revert points are still visible.
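The raw integers in a trace are easier to compare once they are scaled to whole-token units. The following is a minimal sketch of that arithmetic; the helper name to_token_units and the default of 18 decimals are assumptions for illustration, so check the token's own decimals() value before relying on it.

```python
from decimal import Decimal


def to_token_units(raw: int, decimals: int = 18) -> Decimal:
    """Convert an integer amount in base units to whole-token units."""
    return Decimal(raw) / (Decimal(10) ** decimals)


# Values copied from the trace above.
balance = to_token_units(500000000000000000)      # returned by balanceOf
requested = to_token_units(1000000000000000000)   # amount passed to transfer

# The transfer can only succeed if the balance covers the requested amount.
print(balance, requested, balance >= requested)
```

Using Decimal instead of float keeps the conversion exact, which matters when two amounts differ by only a few base units.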

Decoding Raw Selectors

When working with Huff contracts, cast run shows raw function selectors. Decode them manually:

# Compute selector for balanceOf(address)
cast sig "balanceOf(address)"
# Output: 0x70a08231

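# Or decode full calldata (4-byte selector followed by 32-byte ABI words)
cast 4byte-decode 0x70a082310000000000000000000000000000000000000000000000000000042069abcdef
# Output: balanceOf(address) followed by the decoded address argument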

Keep a reference table of your contract’s selectors when debugging Huff:

0x70a08231 -> balanceOf(address)
0xa9059cbb -> transfer(address,uint256)
0x23b872dd -> transferFrom(address,address,uint256)
0x095ea7b3 -> approve(address,uint256)
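A reference table like this can also be applied offline. The sketch below splits raw calldata into its 4-byte selector and 32-byte argument words and looks the selector up in a local table. The SELECTORS dictionary and the split_calldata helper are hypothetical names, and the table only contains the four ERC-20 selectors listed above.

```python
# Hypothetical lookup table; extend it with your own contract's selectors.
SELECTORS = {
    "70a08231": "balanceOf(address)",
    "a9059cbb": "transfer(address,uint256)",
    "23b872dd": "transferFrom(address,address,uint256)",
    "095ea7b3": "approve(address,uint256)",
}


def split_calldata(calldata: str):
    """Return (signature or raw selector, list of 32-byte argument words as hex)."""
    data = calldata.lower()
    if data.startswith("0x"):
        data = data[2:]
    if len(data) < 8:
        raise ValueError("calldata is shorter than a 4-byte selector")
    selector, body = data[:8], data[8:]
    # Each ABI word is 32 bytes, i.e. 64 hex characters.
    words = [body[i:i + 64] for i in range(0, len(body), 64)]
    name = SELECTORS.get(selector, "0x" + selector)
    return name, words


# Example: transfer(<dummy address>, 1e18), built by hand for illustration.
example = "0xa9059cbb" + "00" * 12 + "11" * 20 + format(10**18, "064x")
name, words = split_calldata(example)
print(name, [int(w, 16) for w in words])
```

Unknown selectors fall through as raw hex, which mirrors what cast run prints for an undecoded Huff contract.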

forge debug: Interactive Step-Through

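forge debug provides a TUI (terminal user interface) for stepping through EVM execution opcode by opcode. Recent Foundry releases may expose the same debugger through forge test --debug instead, so check forge --help for your version: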

forge debug --debug test/SimpleToken.t.sol \
  --sig "test_transfer()" -vvvv

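The interface shows four panels:

Opcodes: The instructions being executed, with a cursor on the current one
Stack: The current stack state with all 32-byte words
Source: The source view for the current contract, when a source map is available
Memory: Raw memory contents in hex

Storage has no panel of its own. To follow storage changes, step to each SLOAD or SSTORE and read the slot and value from the stack.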

Navigation keys:

j/k: Step forward/backward
g/G: Jump to start/end
c/C: Move to the previous/next call-type instruction (CALL, STATICCALL, DELEGATECALL, CALLCODE)
q: Quit

Reading the Stack During Debugging

The EVM stack is last-in-first-out with a maximum depth of 1024. When debugging Huff, you must track the stack mentally to understand what each opcode consumes and produces.

Consider this Huff snippet:

0x04 calldataload   // Stack: [address]
[BALANCES_SLOT]     // Stack: [slot, address]

After calldataload, the stack holds the address parameter. After pushing the storage pointer (Huff references a constant with square brackets), it holds [slot, address], with the top of the stack written first. If you see the wrong value at position 0 (the top), the bug is in how the storage slot is computed.

Understanding Opcode Traces

For production debugging (when you cannot reproduce the issue locally), raw opcode traces from archive nodes are your primary tool. Services like Tenderly, Etherscan, and Alchemy provide trace APIs:

# Get trace via cast
cast run TX_HASH --rpc-url $RPC -vvvvv 2>&1 | head -200

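An opcode-level trace, shown here in simplified form, lists each opcode with its program counter, the gas remaining, and the stack before the opcode executes (top of stack on the right):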

[0] PUSH1 0x00          gas: 29234  stack: []
[2] CALLDATALOAD        gas: 29231  stack: [0x00]
[3] PUSH1 0xe0          gas: 29228  stack: [0xa9059cbb...]
[5] SHR                 gas: 29225  stack: [0xa9059cbb..., 0xe0]
[6] DUP1                gas: 29222  stack: [0xa9059cbb]
[7] PUSH4 0x70a08231    gas: 29219  stack: [0xa9059cbb, 0xa9059cbb]
[12] EQ                 gas: 29216  stack: [0xa9059cbb, 0xa9059cbb, 0x70a08231]
[13] PUSH2 0x0040       gas: 29213  stack: [0xa9059cbb, 0x00]
[16] JUMPI              gas: 29210  stack: [0xa9059cbb, 0x00, 0x0040]

This trace shows the function dispatcher checking if the selector matches balanceOf(address). The EQ produces 0x00 (false) because the actual selector is 0xa9059cbb (transfer), so JUMPI does not jump.
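The dispatcher logic in this trace can be replayed outside the EVM to confirm the reading. The sketch below mirrors the SHR, EQ, and JUMPI steps with plain integer arithmetic. The function names are hypothetical, and the calldata word is a hand-built transfer selector followed by zero bytes rather than data from a real transaction.

```python
def selector_from_word(word: int) -> int:
    """Mimic PUSH1 0xe0, SHR: keep the top 4 bytes of a 32-byte calldata word."""
    return word >> 0xE0


def dispatch(word: int, candidate: int, dest: int):
    """Return the jump target if the selector matches, else None (fall through)."""
    condition = int(selector_from_word(word) == candidate)  # EQ pushes 1 or 0
    return dest if condition else None  # JUMPI jumps only on a nonzero condition


# First calldata word of a transfer call: selector plus 28 bytes of arguments.
word = int("a9059cbb" + "00" * 28, 16)

print(hex(selector_from_word(word)))      # 0xa9059cbb
print(dispatch(word, 0x70A08231, 0x40))   # None: balanceOf branch not taken
print(dispatch(word, 0xA9059CBB, 0x40))   # 64: a transfer branch would be taken
```

Shifting right by 0xe0 (224) bits discards the lower 28 bytes of the word, which is why only the selector remains on the stack after SHR.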

Common Debugging Patterns for Huff and Yul

Pattern 1: Stack Underflow

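If execution fails at a seemingly cheap opcode, with all remaining gas consumed and no revert data, you likely have a stack underflow. A stack underflow is an exceptional halt: the call frame aborts and burns all of its remaining gas, so on-chain it looks the same as running out of gas. Tracers typically label it as a stack underflow, which is the quickest way to tell the two apart.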

// Bug: pop when stack is empty
#define macro BROKEN() = takes (0) returns (0) {
    pop  // Stack underflow! No items to pop
}

Detection: In forge debug, watch the stack panel. If it shows fewer items than the next opcode consumes, that is your bug.
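The same check can be run mechanically over a straight-line opcode sequence before it ever reaches the debugger. The following is a minimal sketch under clear assumptions: the STACK_EFFECT table covers only a handful of opcodes, the helper name first_underflow is hypothetical, and jumps are not followed, so it only applies to code without branches.

```python
# (items popped, items pushed) for a small subset of opcodes; extend as needed.
STACK_EFFECT = {
    "PUSH1": (0, 1), "POP": (1, 0), "ADD": (2, 1), "DUP1": (1, 2),
    "SWAP1": (2, 2), "CALLDATALOAD": (1, 1), "MSTORE": (2, 0),
    "SLOAD": (1, 1), "SSTORE": (2, 0), "RETURN": (2, 0),
}


def first_underflow(ops, depth=0):
    """Return the index of the first opcode that needs more stack items
    than are available, or None if the sequence never underflows."""
    for index, op in enumerate(ops):
        pops, pushes = STACK_EFFECT[op]
        if depth < pops:
            return index
        depth += pushes - pops
    return None


print(first_underflow(["POP"]))                               # 0
print(first_underflow(["PUSH1", "PUSH1", "MSTORE", "POP"]))   # 3
print(first_underflow(["PUSH1", "CALLDATALOAD", "PUSH1", "MSTORE"]))  # None
```

The depth argument lets you model a macro declared with takes (n) by starting from n items instead of an empty stack.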

Pattern 2: Incorrect JUMP Destination

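Huff uses labels for jump destinations and emits a JUMPDEST at each label, so jumps to labels are safe. The failure appears when the destination is computed or hard-coded and does not land on a JUMPDEST: the EVM halts with an invalid-jump exception and consumes all remaining gas. A correct label-based jump looks like this:

#define macro MAIN() = takes (0) returns (0) {
    0x01 success jumpi          // condition first, then destination on top
    0x00 0x00 revert
    success:                    // Huff emits a JUMPDEST here
        0x00 0x00 return
}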

Detection: In the trace, look for JUMP or JUMPI followed by immediate gas exhaustion. The target PC is on top of the stack before the jump.
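Whether a target PC is a legal destination can be checked directly against the deployed bytecode. The sketch below performs the usual JUMPDEST analysis: it walks the code, skips PUSH immediates, and records every 0x5b byte that is a real opcode. The helper name valid_jumpdests is hypothetical, and the sample bytecode is hand-assembled from the macro above for illustration, not taken from compiler output.

```python
def valid_jumpdests(bytecode: bytes) -> set:
    """Collect offsets of JUMPDEST (0x5b) bytes that are opcodes, not PUSH data."""
    dests, pc = set(), 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == 0x5B:
            dests.add(pc)
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 carry 1..32 immediate bytes that must be skipped.
            pc += op - 0x5F
        pc += 1
    return dests


# PUSH1 01, PUSH2 000b, JUMPI, PUSH1 00, PUSH1 00, REVERT,
# JUMPDEST, PUSH1 00, PUSH1 00, RETURN
code = bytes.fromhex("600161000b5760006000fd5b60006000f3")
print(sorted(valid_jumpdests(code)))  # [11]

# A 0x5b byte inside PUSH data is not a valid destination.
print(sorted(valid_jumpdests(bytes.fromhex("605b5b"))))  # [2]
```

If the value on top of the stack before a JUMP or JUMPI is not in this set, the jump is invalid regardless of what byte sits at that offset.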

Pattern 3: Incorrect ABI Encoding

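Huff does not auto-encode return values. A static type such as uint256 is returned as a single 32-byte word, but a dynamic type (bytes, string, arrays) needs an offset word and a length word before the data. If you return the raw bytes alone, the calling contract’s decoder will revert:

// Static type (uint256): one 32-byte word is already valid ABI encoding
0x00 mstore          // value is on the stack; store it at memory 0x00
0x20 0x00 return     // return 32 bytes starting at 0x00

// Dynamic type (bytes, string): include offset and length
0x20 0x00 mstore     // offset to the data section
0x05 0x20 mstore     // length in bytes
// ... data at 0x40, right-padded to 32 bytes
0x60 0x00 return     // return offset + length + one padded data word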

Detection: The calling contract’s abi.decode reverts. The trace shows a successful return from your contract but a revert in the parent context.
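To compare what your contract returned against what the caller expects, it helps to build the expected payload by hand. The following is a minimal sketch of the ABI layout for a single bytes return value: an offset word, a length word, and the data right-padded to a multiple of 32 bytes. The helper name encode_bytes_return is hypothetical, and it covers only this one-value case.

```python
def encode_bytes_return(data: bytes) -> bytes:
    """ABI-encode a single `bytes` return value: offset, length, padded data."""
    offset = (0x20).to_bytes(32, "big")        # data section starts after this word
    length = len(data).to_bytes(32, "big")     # length in bytes, not in words
    padding = b"\x00" * (-len(data) % 32)      # right-pad to a 32-byte boundary
    return offset + length + data + padding


payload = encode_bytes_return(b"hello")
print(len(payload))          # 96
print(payload.hex())
```

Diffing this hex string against the return data shown in the trace usually pinpoints a missing offset or an unpadded tail within a few seconds.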

Pattern 4: Storage Collision

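Huff uses FREE_STORAGE_POINTER() to allocate storage slots, handing out sequential slots starting at 0. Collisions appear when a hard-coded slot number overlaps one of these, or when two mappings derive their keys from the same base slot. The colliding values then overwrite each other: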

#define constant BALANCES_SLOT = FREE_STORAGE_POINTER()  // slot 0
#define constant ALLOWANCES_SLOT = FREE_STORAGE_POINTER() // slot 1
#define constant TOTAL_SUPPLY_SLOT = FREE_STORAGE_POINTER() // slot 2

Detection: In forge debug, step to each SSTORE and read the slot from the top of the stack. If writing to one mapping changes another variable, you have a collision.
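A simple guard against the hard-coded case is to list every slot constant in one place and check for duplicates. The sketch below does that for a hand-written layout. The helper name find_slot_collisions and the OWNER_SLOT entry are hypothetical, added only to demonstrate a collision with slot 1.

```python
from collections import defaultdict


def find_slot_collisions(layout: dict) -> dict:
    """Map each slot used by more than one name to the sorted list of names."""
    by_slot = defaultdict(list)
    for name, slot in layout.items():
        by_slot[slot].append(name)
    return {slot: sorted(names) for slot, names in by_slot.items() if len(names) > 1}


layout = {
    "BALANCES_SLOT": 0,
    "ALLOWANCES_SLOT": 1,
    "TOTAL_SUPPLY_SLOT": 2,
    "OWNER_SLOT": 1,  # hypothetical hard-coded slot that overlaps ALLOWANCES_SLOT
}
print(find_slot_collisions(layout))  # {1: ['ALLOWANCES_SLOT', 'OWNER_SLOT']}
```

This only catches overlapping base slots. Collisions between derived mapping keys still need to be confirmed in the debugger.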

Building a Debugging Workflow

Here is a systematic approach to debugging Huff contracts:

Reproduce: Write a failing test in Foundry that triggers the bug
Trace: Run with -vvvvv to get the full call trace, including setup, then step through the opcodes in the debugger
Narrow: Identify the exact opcode where behavior diverges from expectation
Compare: Run the same scenario against your Solidity reference implementation
Fix: Correct the Huff macro and verify the differential test passes
Regress: Add the failing case to your permanent test suite

# Step 1: Run the failing test
forge test --match-test test_brokenTransfer -vvvvv

# Step 2: Interactive debugging
forge debug --debug test/Token.t.sol --sig "test_brokenTransfer()"

# Step 3: After fix, verify
forge test --match-contract DifferentialTest
forge snapshot --check

Production Debugging with Tenderly

For contracts already deployed, Tenderly provides a visual debugger that shows the execution trace with decoded function calls, state changes, and gas usage:

# Export transaction for analysis
cast run TX_HASH --rpc-url $RPC --json > trace.json

# Or use Tenderly's API directly
curl -X POST "https://api.tenderly.co/api/v1/account/YOU/project/PROJ/simulate" \
  -H "X-Access-Key: $TENDERLY_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "network_id": "1", "from": "0x...", "to": "0x...", "input": "0x..." }'

Tenderly’s visual debugger is especially useful for Huff because it annotates each opcode with its effect on the stack, letting you spot errors without manually tracking stack state.

Conclusion

Debugging EVM bytecode is a skill that separates hobbyist Huff developers from production-ready ones. Master cast run for quick transaction replay, forge debug for interactive analysis, and manual trace reading for production incidents. Build a systematic workflow: reproduce, trace, narrow, compare, fix, regress. The lower you go in the EVM stack, the more disciplined your debugging process must be.

Tags

EVM Foundry Huff