(Specifications) A bug in specifications with no direct impact on client implementations
Description
Brief/Intro
RLP serialization is widely used in the Ethereum clients, which provides a standard format for the data transfer between nodes.
In Erigon (https://github.com/erigontech/erigon/ ), the decode RLP of legacy tx misses check the tailing bytes, which allows extra tailing bytes during the RLP decoding. This could possibly be a consensus issue as other Ethereum clients reject such legacy transactions.
func UnmarshalTransactionFromBinary(data []byte, blobTxnsAreWrappedWithBlobs bool) (Transaction, error) {
if len(data) <= 1 {
return nil, fmt.Errorf("short input: %v", len(data))
}
s := rlp.NewStream(bytes.NewReader(data[1:]), uint64(len(data)-1))
var t Transaction
switch data[0] {
case AccessListTxType:
t = &AccessListTx{}
case DynamicFeeTxType:
t = &DynamicFeeTransaction{}
case BlobTxType:
if blobTxnsAreWrappedWithBlobs {
t = &BlobTxWrapper{}
} else {
t = &BlobTx{}
}
case SetCodeTxType:
t = &SetCodeTransaction{}
default:
if data[0] >= 0x80 {
// Tx is type legacy which is RLP encoded
return DecodeTransaction(data)
}
return nil, ErrTxTypeNotSupported
}
if err := t.DecodeRLP(s); err != nil {
return nil, err
}
if s.Remaining() != 0 {
return nil, fmt.Errorf("trailing bytes after rlp encoded transaction")
}
return t, nil
}
However, if it’s the legacy transaction, it calls tx.DecodeRLP(s) to directly decode the legacy transaction, in which the tailing bytes is unchecked. This oversight would allow tailing bytes to sneak into the legacy transaction when calling DecodeRLPTransaction() to decode the legacy transaction.
func (tx *LegacyTx) DecodeRLP(s *rlp.Stream) error {
_, err := s.List()
if err != nil {
return fmt.Errorf("legacy tx must be a list: %w", err)
}
if tx.Nonce, err = s.Uint(); err != nil {
return fmt.Errorf("read Nonce: %w", err)
}
var b []byte
if b, err = s.Uint256Bytes(); err != nil {
return fmt.Errorf("read GasPrice: %w", err)
}
tx.GasPrice = new(uint256.Int).SetBytes(b)
if tx.Gas, err = s.Uint(); err != nil {
return fmt.Errorf("read Gas: %w", err)
}
if b, err = s.Bytes(); err != nil {
return fmt.Errorf("read To: %w", err)
}
if len(b) > 0 && len(b) != 20 {
return fmt.Errorf("wrong size for To: %d", len(b))
}
if len(b) > 0 {
tx.To = &libcommon.Address{}
copy((*tx.To)[:], b)
}
if b, err = s.Uint256Bytes(); err != nil {
return fmt.Errorf("read Value: %w", err)
}
tx.Value = new(uint256.Int).SetBytes(b)
if tx.Data, err = s.Bytes(); err != nil {
return fmt.Errorf("read Data: %w", err)
}
if b, err = s.Uint256Bytes(); err != nil {
return fmt.Errorf("read V: %w", err)
}
tx.V.SetBytes(b)
if b, err = s.Uint256Bytes(); err != nil {
return fmt.Errorf("read R: %w", err)
}
tx.R.SetBytes(b)
if b, err = s.Uint256Bytes(); err != nil {
return fmt.Errorf("read S: %w", err)
}
tx.S.SetBytes(b)
if err = s.ListEnd(); err != nil {
return fmt.Errorf("close tx struct: %w", err)
}
return nil
}
Impact Details
The function DecodeRLPTransaction() has been widely utilized in the Erigon codebase to decode the transactions from block, block body and user’s submitted raw transaction. Since tailing bytes of a transaction is disallowed in other Ethereum clients, it could introduce consensus issues with this flawed decoding function.
References
https://github.com/erigontech/erigon/tree/v2.61.0
Proof of Concept
Proof of Concept
For simplicity, we provide the following unit test to compare two transaction decoding functions, DecodeRLPTransaction() vs UnmarshalTransactionFromBinary().
The test results show that the decoding of this legacy transaction with tailing bytes succeeds with DecodeRLPTransaction() but fails with UnmarshalTransactionFromBinary().
=== RUN TestDecodeRLPTransactionTailing
Unmarshal Tx with UnmarshalTransactionFromBinary() fails: trailing bytes after rlp encoded transaction
Decode RLP of Tx DecodeRLPTransaction() succeeds: 0x74487f78cbc2fd511acd00d2d5ec65d9a19fb8eaad38c33b4e5737d780c7cc60
--- PASS: TestDecodeRLPTransactionTailing (0.00s)
PASS
Process finished with the exit code 0