Looking for guidance on running XSBench on Crusher with different workloads.
The HIP variant stops running correctly at 2.2B lookups.
To reproduce, use this srun command within a batch script:
srun -n1 --ntasks-per-node=1 --gpus-per-task=1 --gpu-bind=closest $XSBENCH_HIP -l 2200000000 -m event
where $XSBENCH_HIP is the HIP executable built with the rocm-hip/4.3.0 module.
It only prints the usage instructions. XSBench runs fine for smaller numbers (e.g., 2B lookups).
My question is: is this supposed to be a "real" science case run? Am I hitting a memory limit in event mode?
Any help is appreciated.
The number of lookups in XSBench is stored as a 32-bit signed integer, so the maximum number of lookups is 2,147,483,647.
If you wanted to run more, you could convert that variable to 64 bits (along with any functions that use it, e.g. the `atoi()` call that parses it). However, in a real MC application the event width is likely never going to be more than the low tens of millions, due to restrictions on how many particles can be in flight at once because of memory constraints. As such, it's not really meaningful in terms of real-world physics for XSBench to process an event with billions of items. That said, if one is only interested in the kernel in a purely abstract way (e.g., testing random memory access patterns), then definitely feel free to tweak that variable. It might be good, though, to check in and understand why the value is being pushed to 2 billion.
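For concreteness, here is a minimal sketch of the kind of change involved. This is not the actual XSBench source and the names are illustrative only; it just contrasts a 32-bit `atoi()` parse of a lookup count with a 64-bit `strtoll()` parse.

```c
/*
 * Minimal sketch -- NOT the actual XSBench source. The variable names
 * are illustrative only. It contrasts the 32-bit atoi() parse with a
 * 64-bit strtoll() parse of a lookup count given on the command line.
 */
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <lookups>\n", argv[0]);
        return EXIT_FAILURE;
    }

    /* 32-bit parse: behavior is undefined once the value exceeds
     * INT_MAX (2,147,483,647); on common platforms 2.2B wraps negative,
     * which a "lookups > 0" sanity check would then reject. */
    int lookups_32 = atoi(argv[1]);

    /* 64-bit parse: strtoll() reports overflow via errno and covers
     * counts far beyond 2^31 - 1. */
    errno = 0;
    int64_t lookups_64 = (int64_t) strtoll(argv[1], NULL, 10);
    if (errno == ERANGE || lookups_64 <= 0) {
        fprintf(stderr, "lookup count out of range\n");
        return EXIT_FAILURE;
    }

    printf("atoi()    -> %d\n", lookups_32);
    printf("strtoll() -> %" PRId64 "\n", lookups_64);
    return EXIT_SUCCESS;
}
```

Running this with 2200000000 shows the 32-bit parse misbehaving while the 64-bit parse returns the expected count. In a real port the widening would also have to follow the value through any struct fields, loop counters, and index arithmetic that consume it, which is the "along with any functions that use it" caveat above.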
@jtramm thanks for the quick response and guidance, that makes sense. We are using XSBench as a meaningful proxy app to port to the Julia programming language, so I was just trying to generate runtime workloads for comparison. As you said, tweaking the variable would only be useful in a purely abstract way and wouldn't reflect meaningful physics. Thanks!