intops: core integer primitives #187
Conversation
This, together with bitops2 and endians2, forms the core primitive offering for working with integers as the computer sees them. The focus of intops is to expose a number of common integer operations typically used to build more complex abstractions such as bigints, mp-ints etc., while having access to the best performance the compiler and CPU can offer. There is more to do here, but this provides an outline of what the module could look like. There are deliberately no exceptions or defects around - the point of these utilities is to stay as close as possible to bare metal. They could however be used to implement such checked features (similar to how `system/integerops` works).
# https://doc.rust-lang.org/std/primitive.u32.html#implementations

func addOverflow*(x, y: SomeUnsignedInt):
    tuple[result: SomeUnsignedInt, overflow: bool] =
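For reference, a portable fallback along these lines could be written in plain Nim (a sketch only, not necessarily the exact body in this PR), relying on the fact that unsigned arithmetic wraps:

```nim
# Sketch of a portable (VM-friendly) fallback; the body in the PR may differ.
func addOverflow*[T: SomeUnsignedInt](x, y: T): tuple[result: T, overflow: bool] =
  # Unsigned addition wraps, so overflow happened exactly when the
  # wrapped sum is smaller than one of the operands.
  let r = x + y
  (result: r, overflow: r < x)
```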
Does this really simplify the implementation of BigInt libraries? If I try to imagine the loop that will be used there, it seems to me that it will be more complicated and less performant when based on this helper function.
In particular, the reliance on a tuple that gets translated to a C struct is what makes me nervous. If the carry is communicated with an output parameter, the compiler is a bit more free to perform register allocations in more optimal ways.
Also, ultimately, the carry should probably be obtained from the CPU itself, but I guess your plan is to replace the bodies of these functions in the future?
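To make the comparison concrete, a limb-addition loop built on this helper could look roughly like the following (limb type and helper names are hypothetical, and `addOverflow` is assumed to be the primitive from this PR):

```nim
type Limb = uint64  # hypothetical limb type, for illustration only

func addCarry(x, y: Limb, carryIn: bool): tuple[sum: Limb, carryOut: bool] =
  # 3-operand addition composed from the 2-operand primitive: x + y + carryIn.
  let (s1, o1) = addOverflow(x, y)
  let (s2, o2) = addOverflow(s1, Limb(carryIn))
  (sum: s2, carryOut: o1 or o2)

func addLimbs(a: var openArray[Limb], b: openArray[Limb]) =
  # a += b, with both operands as little-endian limb sequences of equal length.
  var carry = false
  for i in 0 ..< a.len:
    let (s, c) = addCarry(a[i], b[i], carry)
    a[i] = s
    carry = c
```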
> If the carry is communicated with an output parameter, the compiler is a bit more free to perform register allocations in more optimal ways.
Typically, modern compilers are able to deal with this. Also, the fact that it's a return value and not a pointer gives the compiler some freedoms it otherwise doesn't have: in LLVM, for example, this is typically handled by the SROA pass, which decomposes structs into individual elements and then assigns registers based on the lifetimes of the fields themselves (which in this case are trivial).
See https://gcc.godbolt.org/z/Ex8P76fWr for an example of how it works with a struct return type.
These implementations are meant mainly for the VM - the actual (future) implementations would use compiler builtins, which unfortunately differ in their API between platforms and compilers. But yes, the ideal is that the compiler maps a function like this onto the CPU's ADC instruction, which performs a 3-operand addition and returns the carry in a flag.
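As an illustration of what such a native path could look like on the C backend with GCC/Clang (names and pragmas here are mine, not the proposed intops API), something along these lines is possible:

```nim
# Sketch only: mapping the primitive onto a GCC/Clang builtin when using
# the C backend - not part of the proposed API.
when defined(gcc) or defined(clang):
  func builtinAddOverflow(x, y: uint64, res: var uint64): bool
    {.importc: "__builtin_add_overflow", nodecl.}

  func addOverflowNative(x, y: uint64): tuple[result: uint64, overflow: bool] =
    var r: uint64
    let o = builtinAddOverflow(x, y, r)
    (result: r, overflow: o)
```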
For bigints, the 3-parameter carry form in particular is interesting - for saturating arithmetic, the 2-parameter version without carry is more appropriate.
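For instance, a saturating add on top of the 2-parameter form is a one-liner (a sketch, assuming the `addOverflow` from this PR is in scope):

```nim
# Sketch: saturating addition built on the 2-parameter overflow form.
func saturatingAdd(x, y: uint64): uint64 =
  let (s, o) = addOverflow(x, y)
  if o: uint64.high else: s
```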
Fixed a bug and improved the example: the code now shows that using the builtin actually has no advantage over the no-builtin code in this particular case - both end up using the right ADC instruction for a 192-bit integer, for example.
This PR makes bncurve less slow by reusing stint integer primitives and unrolling a few loops and arrays to avoid array length checks and the like. To give an idea, it brings down processing 8k nimbus-eth1 blocks around the 18M block height mark from 24 to 16 minutes. This is quite significant given that a lot of time in eth1 is spent reading the database - it amounts to at least an order of magnitude of bncurve improvement, and probably quite a lot more. How much doesn't greatly matter, but now there's at least a decent baseline for any future performance work ;) Of course, reusing private primitives from `stint` is not pretty - the plan is to extract them to a separate library, work started in status-im/nim-stew#187.