#38502 [BC-Low] Pending pool subtraction overflow causes node halt/shutdown

Submitted on Jan 5th 2025 at 04:36:41 UTC by @Blobism for Attackathon | Ethereum Protocol

  • Report ID: #38502

  • Report Type: Blockchain/DLT

  • Report severity: Low

  • Target: https://github.com/paradigmxyz/reth

  • Impacts:

    • Causing less than 25% of network processing nodes to process transactions from the mempool beyond set parameters (e.g. prevents processing transactions from the mempool)

    • Shutdown of less than 10% of network processing nodes without brute force actions, but does not shut down the network

Description

Brief/Intro

The latest reth release (v1.1.4) contains a subtraction overflow vulnerability in the pending transaction pool which can lead to node halt/shutdown, given the right set of transactions in the pool. A crafted set of transaction inputs can lead to an infinite loop in the pending pool for release builds, as a result of this subtraction overflow. The node will continue to run but will be unable to process transactions further.

Vulnerability Details

The vulnerability is found in crates/transaction-pool/src/pool/pending.rs, in the function remove_to_limit. The actual line where the subtraction overflow can occur is: non_local_senders -= unique_removed.

Consider the case where this function receives the argument remove_locals=false. The logic is flawed in how non_local_senders is decremented when the inner loop encounters a local transaction (non_local_senders -= 1).

The desired behavior should be that when a local sender is encountered, the non_local_senders variable is decremented only ONCE for this particular sender. This allows the function to return when non_local_senders == 0.
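For illustration only, one way to express that once-per-sender rule (a hypothetical sketch, not reth's actual code or a proposed patch) is to remember which local senders have already been discounted:

```rust
use std::collections::HashSet;

// Hypothetical sketch, not reth's actual code: discount a local sender from
// the non-local count at most once, no matter how many outer-loop passes
// encounter it again.
fn discount_local_sender(
    non_local_senders: &mut usize,
    discounted_locals: &mut HashSet<u64>, // sender ids already discounted
    sender_id: u64,
) {
    // `insert` returns true only the first time this sender id is seen.
    if discounted_locals.insert(sender_id) {
        *non_local_senders -= 1;
    }
}
```

With that bookkeeping, repeated outer-loop passes over the same local sender cannot push non_local_senders below the true number of non-local senders.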

The current behavior is that a single local sender can cause this non_local_senders variable to be decremented multiple times over the course of multiple iterations of the outer loop. This leads to incorrect tracking of the non_local_senders.

Now consider a case where a local sender has been double-counted due to this flawed counting. This can lead to a situation where two external senders have all of their transactions removed by the end of one outer loop iteration, but non_local_senders = 1 at the end of that iteration, so the loop does not exit (assuming the pool still exceeds its limits). At the start of the next iteration, unique_removed = 2, so non_local_senders -= 2 is executed and non_local_senders overflows.

If the local transactions are enough to exceed the limits of the pending pool, we are now stuck in an infinite outer loop for a release build, because the exit conditions of the loop will never be met. Transaction processing will thus halt.

If this is a debug build, the subtraction overflow will result in a panic, shutting down the node.
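The two failure modes follow directly from Rust's integer-overflow behavior. The standalone snippet below (unrelated to reth's types, and assuming the default overflow-checks settings of the debug and release profiles) shows what the subtraction does in each case:

```rust
fn main() {
    let mut non_local_senders: usize = 1;
    let unique_removed: usize = 2;
    // With the default profiles, a debug build panics here with
    // "attempt to subtract with overflow"; a release build (overflow checks
    // disabled) wraps to usize::MAX, so an exit condition of
    // `non_local_senders == 0` can never be reached.
    non_local_senders -= unique_removed;
    println!("{non_local_senders}");
}
```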

The relevant code is the remove_to_limit loop in the pending.rs file linked under References; the proof of concept below walks through, step by step, how the overflow occurs.

Impact Details

This exploit has the ability to silently halt production reth nodes via a crafted set of transactions. The conditions could also be met by accident, halting a node. The exploit appears to require the presence of local transactions. According to clientdiversity.org, reth accounts for roughly 2% of execution-layer nodes, so this vulnerability falls best under the scope of preventing less than 25% of processing nodes from processing transactions from the mempool.

This may also fall under the scope of shutdown of less than 10% of network processing nodes without brute force actions, given that debug build nodes can be crashed by the subtraction overflow.

This vulnerability also has the potential to increase the resource consumption of less than 25% of network processing nodes by at least 30% without brute force actions. This concern comes from the fact that remove_to_limit is responsible for keeping pool memory consumption under the configured limits, but the function's tracking logic is flawed. I have not found a concrete way to trigger undesired memory growth with this bug.

References

https://github.com/paradigmxyz/reth/blob/v1.1.4/crates/transaction-pool/src/pool/pending.rs

https://gist.github.com/knagaitsev/c4e91f828f2e32c33987dc481cafbf73

Proof of Concept


The simplest example of how the infinite loop can be induced in production reth nodes is when the following senders and transactions are in the pending pool:

3 senders (1 local, 2 external):

  • sender A (local): enough transactions to exceed pool limits on their own

  • sender B (external): 2 transactions

  • sender C (external): 2 transactions

non_local_senders=3 at the start. The above transaction set will lead to the following non_local_senders decrements within the loop:

Iteration 0: unique_removed = 0, so nothing happens at the top of the loop. The inner loop encounters local sender A and decrements non_local_senders from 3 to 2, then removes the highest-nonce transaction of B and of C. Both external senders still have one transaction left, so no sender drops out of the pool.

Iteration 1: unique_removed = 0 again. Sender A is encountered a second time, decrementing non_local_senders from 2 to 1 (the double count). The remaining transactions of B and C are removed, so both senders drop out of the pool.

Iteration 2: unique_removed = 2, because B and C disappeared in the previous iteration. The loop executes non_local_senders -= 2 with non_local_senders = 1, and the subtraction overflows.

Now we enter an infinite outer loop, since limit.is_exceeded(...) remains true (the local transactions alone exceed the limit) and non_local_senders never reaches zero after the overflow.

Unit test PoC

Adding the following unit test in crates/transaction-pool/src/pool/pending.rs can demonstrate the overflow:
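The original test listing is not reproduced here. As a stand-in, the self-contained test below models only the non_local_senders bookkeeping described above using plain standard-library types (the Sender struct, sender names, and limits are hypothetical, not reth's pool types), yet it exhibits the same underflow for the A/B/C scenario from the PoC and can be dropped into any Rust test module:

```rust
// Hypothetical stand-in for the elided listing: models only the
// `non_local_senders` bookkeeping, not reth's real pool types.
#[test]
fn pending_pool_local_sender_underflow() {
    use std::collections::BTreeMap;

    struct Sender {
        local: bool,
        txs: usize, // remaining transactions for this sender
    }

    fn total(s: &BTreeMap<&str, Sender>) -> usize {
        s.values().map(|x| x.txs).sum()
    }

    // Scenario from the PoC: local sender A exceeds the limit on its own,
    // external senders B and C hold two transactions each.
    let mut senders: BTreeMap<&str, Sender> = BTreeMap::new();
    senders.insert("A", Sender { local: true, txs: 20 });
    senders.insert("B", Sender { local: false, txs: 2 });
    senders.insert("C", Sender { local: false, txs: 2 });

    let max_txs = 10; // the pool stays over this limit even after B and C are drained
    let remove_locals = false;

    let mut unique_senders = senders.len();
    let mut non_local_senders = unique_senders;

    loop {
        // Senders that ran out of transactions during the previous iteration.
        let unique_removed = unique_senders - senders.len();
        unique_senders = senders.len();

        // Underflows on the third iteration (1 - 2): a debug build panics here,
        // a release build wraps to usize::MAX and effectively never exits.
        non_local_senders -= unique_removed;

        let names: Vec<&str> = senders.keys().copied().collect();
        for name in names {
            if total(&senders) <= max_txs || non_local_senders == 0 {
                return;
            }
            let sender = senders.get_mut(name).unwrap();
            if !remove_locals && sender.local {
                // Double-counts the same local sender on every outer iteration.
                non_local_senders -= 1;
                continue;
            }
            // Drop this sender's highest-nonce transaction.
            sender.txs -= 1;
        }
        // Senders with no transactions left drop out of the pool.
        senders.retain(|_, sender| sender.txs > 0);

        if total(&senders) <= max_txs || non_local_senders == 0 {
            return;
        }
    }
}
```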

You can run it as a debug build (a plain cargo test) to see the overflow panic, or as a release build (cargo test --release, where overflow checks are disabled by default) to see the infinite loop.

Private testnet PoC

To further confirm that this exploit could actually occur, a test with a private Kurtosis testnet is provided below.

Use the following network_params.yaml:

Note that we have substantially reduced the pending max count to make the demonstration clearer. Attacks against a larger/default configuration can still be accomplished, and I can supplement with such a PoC if needed.

Run with:

Now run the following script with Node.js (be sure to run npm init and npm install web3 beforehand):

There are some timing assumptions involved in this script, but if it runs successfully, it should force the first release-build reth node into the pending pool infinite loop condition.
