I've compiled the C example with and without NEON optimizations (setting the -DBLAKE3_USE_NEON=1 -O3 -mfpu=neon-vfpv4 compiler flags), and to my surprise the non-NEON variant seems to perform better. I tested on a ~30 MB file (both from RAM and from flash, to rule out I/O), and here are the results:
Without NEON:
time ./b3sum < /dev/mtd5ro
5420676b03e59d74cd44331c200ea841cd247374f307ce838dc6a0d367f73774
real 0m 2.11s
user 0m 0.67s
sys 0m 0.08s
With NEON:
time ./b3sum < /dev/mtd5ro
5420676b03e59d74cd44331c200ea841cd247374f307ce838dc6a0d367f73774
real 0m 2.17s
user 0m 0.76s
sys 0m 0.04s
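To put the two runs side by side, here is a rough sketch that turns the user times above into throughput and a relative slowdown. It assumes the input is exactly 30 MB (the real size is only approximate) and that user time is the right figure to compare; the function names are made up for illustration.

```c
#include <stdio.h>

/* Throughput in MB/s, given the input size and the CPU time spent hashing. */
double throughput_mb_s(double megabytes, double seconds) {
    return megabytes / seconds;
}

/* How much slower `slower` is than `baseline`, as a percentage of `baseline`. */
double slowdown_percent(double baseline, double slower) {
    return (slower - baseline) / baseline * 100.0;
}

/* Prints the comparison for the user times reported by `time` above. */
void print_comparison(void) {
    const double size_mb = 30.0;      /* assumption: the file is "~30MB" */
    const double user_portable = 0.67; /* user time without NEON */
    const double user_neon = 0.76;     /* user time with NEON */

    printf("portable: %.1f MB/s\n", throughput_mb_s(size_mb, user_portable));
    printf("NEON:     %.1f MB/s\n", throughput_mb_s(size_mb, user_neon));
    printf("NEON slower by %.1f%%\n", slowdown_percent(user_portable, user_neon));
}
```

On these numbers the NEON build comes out roughly 13% slower in user time. That is from a single run of each build, so some of the gap could be measurement noise.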
I saw that there were some changes in the 1.5.1 release, so I also tried 1.5.0, but the results are the same.
Any suggestions on what might cause this?