Running hip on Crusher up to 2.2B lookups #23

williamfgc opened this issue Mar 7, 2022 · 2 comments

williamfgc commented Mar 7, 2022

Looking for guidance on running different workloads on Crusher.
The hip variant runs fine for smaller workloads but fails at 2.2B lookups.

To reproduce, use this srun command within a batch script:
srun -n1 --ntasks-per-node=1 --gpus-per-task=1 --gpu-bind=closest $XSBENCH_HIP -l 2200000000 -m event

where $XSBENCH_HIP is the hip executable built with the rocm-hip/4.3.0 module.
It only prints the usage instructions. XSBench runs fine for smaller numbers (e.g. 2B lookups).

My question is: is this supposed to be a "real" science case run? Am I hitting a memory limit in event mode?
Any help is appreciated.

Contributor

jtramm commented Mar 7, 2022

The number of lookups in XSBench is stored as a 32 bit signed integer, so the maximum number of lookups is 2,147,483,647.
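To make that limit concrete, here is a minimal C sketch that checks whether a command-line value fits in a 32-bit signed integer. The helper name `fits_in_int32` is hypothetical and not part of XSBench; it only illustrates why 2,000,000,000 is accepted while 2,200,000,000 is out of range.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not part of XSBench): returns 1 if the decimal
 * string s parses to a value that a 32-bit signed integer can hold,
 * and 0 otherwise (including malformed input). */
int fits_in_int32(const char *s)
{
    char *end = NULL;
    errno = 0;
    long long v = strtoll(s, &end, 10);

    /* Reject empty input, trailing garbage, and 64-bit overflow. */
    if (end == s || *end != '\0' || errno == ERANGE)
        return 0;

    return v >= INT32_MIN && v <= INT32_MAX;
}
```

With this check, `fits_in_int32("2000000000")` returns 1 and `fits_in_int32("2200000000")` returns 0, since 2,200,000,000 is larger than `INT32_MAX` (2,147,483,647).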

If you wanted to run more, you could convert the variable to 64 bits (along with any functions that use that variable, e.g., `atoi()`). However, in a real MC application the event width is likely never going to be more than the low tens of millions, because memory constraints restrict how many particles can be in flight at once. As such, it's not really meaningful in terms of real-world physics for XSBench to have an event with billions of items. That said, if one is only interested in the kernel in a purely abstract way (e.g., testing random memory access patterns), then definitely feel free to tweak that variable. It might be good, though, to check in and understand why the value is being increased to 2 billion.
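As a rough sketch of what the 64-bit conversion could look like on the argument-parsing side, the function below reads the lookup count with `strtoll()` into an `int64_t` instead of going through `atoi()`. The name `parse_lookups` and its return convention are assumptions made for illustration, not XSBench code; any loop counters, index arithmetic, and kernel arguments that consume the count would also need to be widened.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical replacement for an atoi()-based parse of the -l argument.
 * On success, stores the value in *out and returns 0. Returns -1 if arg
 * is not a positive decimal integer that fits in 64 bits. */
int parse_lookups(const char *arg, int64_t *out)
{
    char *end = NULL;
    errno = 0;
    long long v = strtoll(arg, &end, 10);

    /* Reject empty input, trailing garbage, overflow, and counts below 1. */
    if (end == arg || *end != '\0' || errno == ERANGE || v < 1)
        return -1;

    *out = (int64_t)v;
    return 0;
}
```

Called as `parse_lookups("2200000000", &n)`, this returns 0 and leaves `n` equal to 2200000000, a value a 32-bit signed integer cannot hold.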

@williamfgc
Author

@jtramm thanks for the quick response and guidance. Makes sense. Overall, we are using XSBench as a meaningful proxy app to port to the Julia programming language, so I was just trying to generate runtime workloads for comparison. As you said, tweaking the variable will probably help in a purely abstract way rather than for meaningful physics. Thanks!
