Submitted on Dec 9th 2024 at 20:30:36 UTC by @anatomist for
Report ID: #37583
Report Type: Smart Contract
Report severity: Low
Target: https://github.com/vyperlang/vyper
Impacts:
(Compiler) Unexpected behavior
Description
Brief/Intro
For loop extracts annotation parsing and constructs ast separately for it. However, insufficient validation allows otherwise illegal code to compile, and create ambiguity in the intended behavior of contracts.
Vulnerability Details
Python for loops does not support type annotation for iterator variable. However, since vyper is a strongly typed language, it requires the type to be specified. This introduces the need for pre-parsing of for loops to extract out the type annotation, "fix" the code.
class ForParser:
def __init__(self, code):
self._code = code
self.annotations = {}
self._current_annotation = None
self._state = ParserState.NOT_RUNNING
self._current_for_loop = None
def consume(self, token):
# state machine: we can start slurping tokens soon
if token.type == NAME and token.string == "for":
# note: self._state should be NOT_RUNNING here, but we don't sanity
# check here as that should be an error the parser will handle.
self._state = ParserState.START_SOON
self._current_for_loop = token.start
if self._state == ParserState.NOT_RUNNING:
return False
# state machine: start slurping tokens
if token.type == OP and token.string == ":":
self._state = ParserState.RUNNING
# sanity check -- this should never really happen, but if it does,
# try to raise an exception which pinpoints the source.
if self._current_annotation is not None:
raise SyntaxException(
"for loop parse error", self._code, token.start[0], token.start[1]
)
self._current_annotation = []
return True # do not add ":" to tokens.
# state machine: end slurping tokens
if token.type == NAME and token.string == "in":
self._state = ParserState.NOT_RUNNING
self.annotations[self._current_for_loop] = self._current_annotation or []
self._current_annotation = None
return False
if self._state != ParserState.RUNNING:
return False
# slurp the token
self._current_annotation.append(token)
return True
The extracted types are later manually injected back into the python parsed ast.
However, the implementation does not check that the extracted code is purely an annotation. For example, the following code extracts uint256 = 1 as the annotation, and when parsed along with dummy_target: uint256 = 1, it also appears to be valid python syntax. However, only the type is used in code generation, and the value associated with it will be discarded.
for i: uint256 = 1 in range(1, 2):
pass
This means users may specify arbitrary "legal looking" python code as part of the annotation and the code will still compile, which may create ambiguity in what the contract is intended to do. For example, the following code will confuse users on whether the internal function is called.
x: uint256
@deploy
def __init__(init_x: uint256):
#set_x won't be called since it's ignored in codegen
for i: uint256 = self.set_x(init_x) in range(5):
pass
@internal
def set_x(init_x: uint256) -> uint256:
self.x = init_x
return init_x
To make things worse, while the value is ignored in codegen, it is actually respected in certain semantic analysis checks (such as self recursion), which adds to the ambiguity on what the code does.
@internal
def recursion(a: uint256) -> uint256:
for i: uint256 = self.recursion(a) in range(5):
pass