chore[venom]: expand venom docs #4314
Merged
Changes from 6 commits
12 commits
- 9da540d describe structures representing venom program (sandbubbles)
- e0fee99 briefly describe jump instructions (sandbubbles)
- 6d79eac describe more instructions (sandbubbles)
- 22ca43d Merge branch 'master' into docs/venom (sandbubbles)
- 2b88a32 add some updates (charles-cooper)
- 7c9bcf6 describe log, phi.. (sandbubbles)
- 64f6410 describe dbname, db and sha (sandbubbles)
- 8de9a93 add some translations into asm (sandbubbles)
- 1f6dbdc fix typos (sandbubbles)
- f8aff7e formatting, clarifications (charles-cooper)
- 50084c3 add offset instruction (charles-cooper)
- fa942b1 Merge branch 'master' into docs/venom (charles-cooper)
@@ -160,3 +160,257 @@ A number of passes that are planned to be implemented, or are implemented for im | |
### Function inlining | ||
|
||
### Load-store elimination | ||
|
||
--- | ||
|
||
## Structure of a venom program | ||
|
||
### IRContext | ||
An `IRContext` consists of multiple `IRFunctions`, with one designated as the main entry point of the program. | ||
Additionally, the `IRContext` maintains its own representation of the data segment. | ||
|
||
### IRFunction | ||
An `IRFunction` is composed of a name and multiple `IRBasicBlocks`, with one marked as the entry point to the function. | ||
|
||
### IRBasicBlock | ||
An `IRBasicBlock` contains a label and a sequence of `IRInstructions`. | ||
Each `IRBasicBlock` has a single entry point and exit point. | ||
The exit point must be one of the following terminator instructions: | ||
- `jmp` | ||
- `djmp` | ||
- `jnz` | ||
- `ret` | ||
- `return` | ||
- `stop` | ||
- `exit` | ||
|
||
A normalized basic block cannot have both multiple predecessors and multiple successors: it has either at most one predecessor and potentially multiple successors, or potentially multiple predecessors and at most one successor.
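The normalization rule can be checked mechanically. The sketch below assumes the control-flow graph is available as a plain dictionary from block label to successor labels; `is_normalized` and the example labels are hypothetical names for illustration, not part of Vyper's API.

```python
from collections import defaultdict


def is_normalized(successors: dict[str, list[str]]) -> bool:
    """Return True if no block has both several predecessors and several successors."""
    # Derive the number of distinct predecessor blocks from the successor map.
    pred_count: dict[str, int] = defaultdict(int)
    for outs in successors.values():
        for target in set(outs):
            pred_count[target] += 1

    for label, outs in successors.items():
        if pred_count[label] > 1 and len(set(outs)) > 1:
            return False
    return True


# "join" has two predecessors and two successors, so this CFG is not normalized.
bad_cfg = {"a": ["join"], "b": ["join"], "join": ["x", "y"], "x": [], "y": []}
# Separating the merge point from the branch point satisfies the rule.
good_cfg = {
    "a": ["join"],
    "b": ["join"],
    "join": ["split"],
    "split": ["x", "y"],
    "x": [],
    "y": [],
}
```

Here `is_normalized(bad_cfg)` is false and `is_normalized(good_cfg)` is true; a real normalization pass would insert the intermediate block itself rather than only report the violation.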
|
||
### IRInstruction | ||
An `IRInstruction` consists of an opcode, a list of operands, and an optional return value. | ||
An operand can be a label, a variable, or a literal. | ||
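As a rough mental model of the four structures above, the following sketch mirrors the nesting with plain Python dataclasses. The class and field names here are assumptions chosen for illustration and are deliberately not the real `IRContext`, `IRFunction`, `IRBasicBlock` and `IRInstruction` classes, whose fields may differ.

```python
from dataclasses import dataclass, field
from typing import Optional, Union

# Terminator opcodes as listed in the IRBasicBlock section.
TERMINATORS = {"jmp", "djmp", "jnz", "ret", "return", "stop", "exit"}


@dataclass
class Label:
    name: str


@dataclass
class Variable:
    name: str


@dataclass
class Literal:
    value: int


# An operand can be a label, a variable, or a literal.
Operand = Union[Label, Variable, Literal]


@dataclass
class Instruction:
    opcode: str
    operands: list[Operand] = field(default_factory=list)
    output: Optional[Variable] = None  # optional return value


@dataclass
class BasicBlock:
    label: Label
    instructions: list[Instruction] = field(default_factory=list)

    def is_terminated(self) -> bool:
        # A terminated block ends with one of the terminator opcodes.
        return bool(self.instructions) and self.instructions[-1].opcode in TERMINATORS


@dataclass
class Function:
    name: str
    entry: str  # label of the entry basic block
    blocks: dict[str, BasicBlock] = field(default_factory=dict)


@dataclass
class Context:
    entry_function: str  # name of the main entry point
    functions: dict[str, Function] = field(default_factory=dict)
    data_segment: list[bytes] = field(default_factory=list)


# Build a one-block function: %1 = add 1, 2 followed by stop.
block = BasicBlock(Label("entry"))
block.instructions.append(Instruction("add", [Literal(1), Literal(2)], Variable("%1")))
terminated_before_stop = block.is_terminated()
block.instructions.append(Instruction("stop"))
ctx = Context("main", {"main": Function("main", "entry", {"entry": block})})
```

The block only counts as terminated once the `stop` instruction has been appended, which matches the requirement that every basic block ends in a terminator.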
|
||
## Instructions | ||
|
||
### Special instructions | ||
|
||
- `invoke` | ||
- Causes control flow to jump to the function denoted by the label.
- Return values are passed in the return buffer at the offset address. | ||
- Practically only used for internal functions. | ||
- Effectively translates to `JUMP` and therefore changes the program counter value. | ||
- ``` | ||
invoke offset, label | ||
``` | ||
- `alloca` | ||
- Allocates memory of a given size at a given offset in memory. | ||
- The output is the offset itself. | ||
- Because the SSA form does not allow changing values of registers, handling mutable variables can be tricky. The `alloca` instruction is meant to simplify that. | ||
- ``` | ||
out = alloca size, offset | ||
``` | ||
- `palloca` | ||
- Like the `alloca` instruction but only used for parameters of internal functions. | ||
- ``` | ||
out = palloca size, offset | ||
``` | ||
- `iload` | ||
- Loads the value at `offset` in the immutable section of memory into the `out` variable.
- The operand can be either a literal, which is a statically computed offset, or a variable.
- ``` | ||
out = iload offset | ||
``` | ||
- `istore` | ||
- Represents a store into the immutable section of memory.
- Like in `iload`, the offset operand can be a literal. | ||
- ``` | ||
istore offset, value
``` | ||
- `phi` | ||
- Because each variable is assigned exactly once in SSA form, it is tricky to represent a variable whose value depends on which program path was taken.
- `phi` instructions solve this. They are placed in basic blocks where control flow paths merge.
- The `out` variable is set to `var_a` if the program entered this block from `label_a`, or to `var_b` if it entered from `label_b`.
- ``` | ||
out = phi var_a, label_a, var_b, label_b | ||
``` | ||
- `offset` | ||
- Statically computes an offset. Useful for `mstore`, `mload` and similar instructions.
- Equivalent to `label` + `op`.
- ``` | ||
ret = offset label, op | ||
``` | ||
- `param` | ||
- The `param` instruction represents a function argument passed on the stack.
- The argument is assumed to be on the stack already; `param` binds it to the `out` variable.
- ``` | ||
out = param | ||
``` | ||
- `store` | ||
- Stores a variable's value or a literal into the `out` variable.
- ``` | ||
out = op | ||
``` | ||
- `dbname`
- Creates and marks a data segment. (Open question: the context has only one data segment, so this may really mark a section within it.)
- `db`
- Stores a value into the data segment. (Open question: possibly a label.)
- `dloadbytes` | ||
- Alias for `codecopy` for legacy reasons. May be removed in future versions. | ||
- `ret` | ||
- Represents a return from an internal call. | ||
- Jumps to a location given by `op`, hence modifies the program counter. | ||
- ``` | ||
ret op | ||
``` | ||
- `exit` | ||
- Similar to `stop`, but used for constructor exit. The assembler is expected to jump to a special initcode sequence which returns the runtime code. | ||
- ``` | ||
exit | ||
``` | ||
- `sha3_64`
- `assert` | ||
- Asserts that `op` is nonzero. If it is zero, reverts.
- Calls that terminate this way receive a refund of the remaining gas.
- ``` | ||
assert op | ||
``` | ||
- `assert_unreachable` | ||
- Checks that `op` is nonzero. If it is zero, terminates with `0xFE` (the `INVALID` opcode).
- Calls that end this way do not receive a gas refund.
- ``` | ||
assert_unreachable op | ||
``` | ||
- `log` | ||
- Similar to the `LOGX` instructions in the EVM.
- Depending on the `topic_count` value (which must be between 0 and 4), translates to one of `LOG0` ... `LOG4`.
- The remaining operands correspond to the operands of the `LOGX` instructions.
- ``` | ||
log offset, size, [topic] * topic_count, topic_count
``` | ||
- For example:
```
log %53, 32, 64, %56, 2
```
would translate to:
```
LOG2 %53, 32, 64, %56
```
- `nop` | ||
- No operation, does nothing. | ||
- ``` | ||
nop | ||
``` | ||
|
||
### Jump instructions | ||
|
||
- `jmp` | ||
- Unconditional jump to code denoted by given `label`. | ||
- ``` | ||
jmp label | ||
``` | ||
- `jnz` | ||
- A conditional jump depending on the value of `op`.
- Jumps to `label2` when `op` is not zero, otherwise jumps to `label1`. | ||
- ``` | ||
jnz label1, label2, op | ||
``` | ||
- `djmp` | ||
- Dynamic jump to an address specified by the variable operand. | ||
- The target is not a fixed label but rather a value stored in a variable, making the jump dynamic. | ||
- ``` | ||
djmp var | ||
``` | ||
|
||
### EVM instructions | ||
The following instructions map one-to-one to [EVM instructions](https://www.evm.codes/).
Operands correspond to the stack inputs in the same order, and the stack output, if any, becomes the instruction's output.
Each instruction has the same effects as its EVM counterpart.
- `return` | ||
- `revert` | ||
- `coinbase` | ||
- `calldatasize` | ||
- `calldatacopy` | ||
- `mcopy` | ||
- `calldataload` | ||
- `gas` | ||
- `gasprice` | ||
- `gaslimit` | ||
- `chainid` | ||
- `address` | ||
- `origin` | ||
- `number` | ||
- `extcodesize` | ||
- `extcodehash` | ||
- `extcodecopy` | ||
- `returndatasize` | ||
- `returndatacopy` | ||
- `callvalue` | ||
- `selfbalance` | ||
- `sload` | ||
- `sstore` | ||
- `mload` | ||
- `mstore` | ||
- `tload` | ||
- `tstore` | ||
- `timestamp` | ||
- `caller` | ||
- `blockhash` | ||
- `selfdestruct` | ||
- `signextend` | ||
- `stop` | ||
- `shr` | ||
- `shl` | ||
- `sar` | ||
- `and` | ||
- `xor` | ||
- `or` | ||
- `add` | ||
- `sub` | ||
- `mul` | ||
- `div` | ||
- `smul` | ||
- `sdiv` | ||
- `mod` | ||
- `smod` | ||
- `exp` | ||
- `addmod` | ||
- `mulmod` | ||
- `eq` | ||
- `iszero` | ||
- `not` | ||
- `lt` | ||
- `gt` | ||
- `slt` | ||
- `sgt` | ||
- `create` | ||
- `create2` | ||
- `msize` | ||
- `balance` | ||
- `call` | ||
- `staticcall` | ||
- `delegatecall` | ||
- `codesize` | ||
- `basefee` | ||
- `blobhash` | ||
- `blobbasefee` | ||
- `prevrandao` | ||
- `difficulty` | ||
- `invalid` | ||
- `sha3` | ||
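To make "same operands, same effects" concrete for a few pure arithmetic and comparison instructions, the sketch below evaluates them with the EVM's 256-bit unsigned semantics. `eval_pure` is a hypothetical helper, and it takes the statement above at face value: the first operand is treated as the first (top) stack input and the second operand as the next one.

```python
WORD = 2**256  # EVM words are 256 bits wide


def eval_pure(opcode: str, a: int, b: int) -> int:
    """Evaluate a binary instruction; `a` is the first stack input, `b` the second.

    Both inputs are expected to be in the range [0, 2**256).
    """
    if opcode == "add":
        return (a + b) % WORD
    if opcode == "sub":
        return (a - b) % WORD
    if opcode == "mul":
        return (a * b) % WORD
    if opcode == "div":
        return 0 if b == 0 else a // b  # the EVM defines division by zero as 0
    if opcode == "mod":
        return 0 if b == 0 else a % b  # likewise for modulo by zero
    if opcode == "lt":
        return int(a < b)
    if opcode == "gt":
        return int(a > b)
    if opcode == "eq":
        return int(a == b)
    raise ValueError(f"unsupported opcode: {opcode}")


# `out = sub 1, 2` wraps around instead of going negative.
wrapped = eval_pure("sub", 1, 2)
```

Here `wrapped` equals `2**256 - 1`, the usual modular result of subtracting past zero.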
--- | ||
|
||
### TODO | ||
- Describe the architecture of analyses and passes a bit more. Mention the distinction between an analysis and a pass (optimisation or transformation).
- Mention how to compile into it: `bb` (deploy), `bb_runtime`.
- Perhaps add some flag to skip the store expansion pass, for readers of the code?
- If it is meant for using venom, then I should mention the API for passes and analyses. Should I do that?
- analysis by `ir_analysis_cache` - request, invalidate, force - type of analysis and additional params
- pass - `run_pass`
|
||
Perhaps mention that functions: | ||
- each function starts as if with empty stack | ||
Review comment: not exactly, they take an (optional) output buffer and a return pc.
||
- alloca and palloca(interf) for some args | ||
- param for args by stack | ||
|
||
ask harry or someone: | ||
- _mem_deploy_end is it immutable after that?? |
Do you know why it may be in fallback? there is the revert before it anyway, but it confused me as it doesn't seem to do much with constructor exit.
I don't think it should be in runtime code...
I think it is:
--experimental-codegen -f bb_runtime
yeah i see what you mean. dead code! but it shouldn't be there, i would consider that a bug in our venom generation.