intops: core integer primitives #187

Open · wants to merge 7 commits into base: master

Conversation

arnetheduck (Member) commented:

This, together with bitops2 and endians2, forms the core primitive offering for working with integers as the computer sees them.

The focus of intops is to expose a number of common integer operations typically used to build more complex abstractions such as bigints, mp-ints, etc., while having access to the best performance the compiler and CPU can offer.

There is more to do here, but this provides an outline of what this module could look like.

Obviously, there are no exceptions or defects here; the point of these utilities is to stay as close as possible to the bare metal. They could, however, be used to implement such features (similar to how `system/integerops` works).
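
For illustration, here is a minimal portable sketch of what such a primitive could look like, using the `addOverflow` signature discussed in the review below; the body is an assumption for the sake of the example, not necessarily the code in this PR:

```nim
# Sketch only: portable, VM-friendly overflow-reporting addition.
# The real module may map this to compiler builtins instead.
func addOverflow*(x, y: SomeUnsignedInt):
    tuple[result: SomeUnsignedInt, overflow: bool] =
  # Unsigned arithmetic wraps in Nim, so the sum overflowed exactly when the
  # wrapped result is smaller than one of the operands.
  let sum = x + y
  (sum, sum < x)
```

A caller would typically destructure the tuple, e.g. `let (sum, carried) = addOverflow(a, b)`.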

# https://doc.rust-lang.org/std/primitive.u32.html#implementations

func addOverflow*(x, y: SomeUnsignedInt):
    tuple[result: SomeUnsignedInt, overflow: bool] =
@zah (Contributor) commented on May 10, 2023:

Does this really simplify the implementation of BigInt libraries? If I try to imagine the loop that will be used there, it seems to me that it will be more complicated and less performant when based on this helper function.

In particular, the reliance on a tuple that gets translated to a C struct is what makes me nervous. If the carry is communicated with an output parameter, the compiler is a bit more free to perform register allocations in more optimal ways.

Also, ultimately, the carry should probably be obtained from the CPU itself, but I guess your plan is to replace the bodies of these functions in the future?

@arnetheduck (Member, Author) replied on May 10, 2023:

> If the carry is communicated with an output parameter, the compiler is a bit more free to perform register allocations in more optimal ways.

Typically, modern compilers are able to deal with this. Also, the fact that it's a return value and not a pointer gives the compiler some freedoms it otherwise doesn't have: in LLVM, for example, this is typically handled by the SROA pass, which decomposes structs into individual elements and then assigns registers based on the lifetimes of the fields themselves (which in this case are trivial).

See https://gcc.godbolt.org/z/Ex8P76fWr for an example of how it works with a struct return type.

These implementations are mainly meant for the VM; the actual (future) implementations would use compiler builtins, which unfortunately differ in their API between platforms and compilers. But yes, the ideal is that the compiler maps a function like this to the CPU's ADC instruction, which performs a 3-operand addition and returns the carry in a flag.

For bigints, the 3-parameter carry form in particular is interesting; for saturating arithmetic, the 2-parameter version without carry is more appropriate.

@arnetheduck (Member, Author) added:

Fixed a bug and improved the example: the code now shows that using the builtin actually has no advantage over the non-builtin code in this particular case; both end up using the expected ADC instructions for a 192-bit integer, for example:

https://gcc.godbolt.org/z/bvW6aTr5a
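
In Nim terms, the 192-bit case being compared looks roughly like the following; this is a sketch of the shape of the code only, not the code from the linked example or from this PR:

```nim
# Sketch: 192-bit addition as three chained 64-bit limb additions using only
# wrapping uint64 arithmetic. A good optimizer can lower the carry chain to
# ADD/ADC/ADC on x86-64.
type UInt192 = array[3, uint64]

func add192(a, b: UInt192): UInt192 =
  var carry = 0'u64
  for i in 0 .. 2:
    let s = a[i] + b[i]                       # wraps
    result[i] = s + carry                     # wraps
    carry = uint64(s < a[i]) + uint64(result[i] < s)
  # the carry out of the top limb is discarded (wrapping 192-bit addition)
```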

arnetheduck added a commit to status-im/nim-bncurve that referenced this pull request on Aug 8, 2024:
This PR makes bncurve less slow by reusing stint integer primitives and
unrolling a few loops and arrays to avoid array length checks and the
like.

To give an idea, it brings down processing of 8k nimbus-eth1 blocks around
the 18M block height mark from 24 to 16 minutes. This is quite
significant given that a lot of time in eth1 is spent reading the
database; it amounts to at least an order of magnitude of bncurve improvement,
and probably quite a lot more. How much doesn't greatly matter, but now
there's at least a decent baseline for any future performance work ;)

Of course, reusing private primitives from `stint` is not pretty - the
plan is to extract them to a separate library, work started in
status-im/nim-stew#187.