Installation issue: dla-future #41511
Comments
Thanks @yizeyi18 for the report! So far we've successfully used ROCm 5.2-5.4. We haven't tested 5.5, and we know 5.6 is broken (at least on some systems, see eth-cscs/DLA-Future#1049). We haven't tested 5.7 yet either. This looks like a HIP issue, but I can't rule out that we're doing something silly in DLA-Future. I'd accept a PR to add a conflict with ROCm 5.7 in DLA-Future for the moment, until we understand what is actually going on here. By the way, don't be afraid to open issues like these directly on the DLA-Future repo. We'll happily look into these and apply fixes on both the DLA-Future side and here, but opening the issue there takes a bit of noise away from core spack issues. The important thing is of course reporting it somewhere though, so I appreciate you reporting it here as well.
@msimberg I suspect the fault lies with HIP too, but working around it in dla-future also makes sense: some machines, like the MI50 I have on hand, rely on ROCm-5.7.0 and will never get a new release that fixes it, yet they are still running.
Absolutely, if we can find a workaround we'll definitely include that in DLA-Future itself as well. I'm just not sure yet what the problem is, much less what the workaround looks like.
No need to apologize, I fully understand why you're opening it here and appreciate it either way.
@yizeyi18 I've created an issue on the HIP repository with a reproducer of the problem independently of DLA-Future: ROCm/HIP#3377. I can't see a straightforward way to patch DLA-Future to work around this issue. I think if/when there is an upstream fix we can try to patch HIP 5.7 in spack, but for now I would add a conflict. Do you need 5.7 specifically, or would an older version of HIP work for you as well (since it seems like this is a new problem in 5.7)?
@msimberg Actually I use HIP-5.7.0 only because I upgraded to it through apt XwX
Ok, that's good to hear. As I mentioned earlier 5.2-5.4 are the only relatively well tested versions for the moment, so if you can stick to those your chances of success will be higher. 5.5 is unknown and 5.6 is also known to be somewhat broken. 5.7 is obviously broken as you've found out.
I'm not sure I understand what you mean. Do you have a patch for DLA-Future to not use
try.patch
Ah, perfect! I think we can take it from here and turn that into a PR for you. I can't promise that it'll still happen this year, but by early next year at the latest we should have time to deal with this. Thank you for investigating!
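For reference, the ROCm version status reported in the comments above can be condensed into a small lookup. This is only a sketch in Python: the function name `rocm_status` and the status labels are made up for illustration, and it encodes nothing beyond what the thread states (5.2 to 5.4 tested, 5.5 untested, 5.6 and 5.7 broken). It is not part of DLA-Future or Spack.

```python
def rocm_status(version: str) -> str:
    """Classify a ROCm/HIP version using only what this thread reports.

    Hypothetical helper for illustration; expects at least "major.minor".
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    if major != 5:
        return "not discussed"
    if 2 <= minor <= 4:
        return "tested"          # 5.2-5.4: relatively well tested
    if minor == 5:
        return "untested"        # 5.5: unknown
    if minor in (6, 7):
        return "known broken"    # 5.6: broken on some systems; 5.7: this issue
    return "not discussed"


if __name__ == "__main__":
    for v in ("5.3", "5.5.1", "5.7.0"):
        print(v, rocm_status(v))
```

Anyone on an apt-managed system could run a check like this before building to see whether their installed version falls into the range that is reported as working.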
Steps to reproduce the issue
Error message
Information on your system
Additional information
@albestro @aurianer @msimberg @rasolca
The ROCm-5.7.0 compiler confuses hipFloatComplex/hipDoubleComplex with HIP_vector_type<float, 2>/HIP_vector_type<double, 2>, which makes operator '*' ambiguous.
I manually instantiated the function and got it to compile, but that is probably not the right way to solve it. Since rocm-5.7.0 is the last version that supports gfx906, I have to stay on it; is there a better solution?
spack-build-out.txt
spack-build-env.txt
CMakeCache.txt
try.patch
General information
I have run spack debug report and reported the version of Spack/Python/Platform
I have run spack maintainers <name-of-the-package> and @mentioned any maintainers