[Installation]: AMD MI60 (gfx906) installation errors with ROCm 6.1 and 6.2 #774
Comments
It looks like it is failing to compile paged attention, based on the logs above:
It's actually an issue with ROCm. There's a fix, though. Also, Aphrodite doesn't work on AMD right now anyway.
@Naomiusearch I see. Does your latest pull #775 fix this issue for AMD GPUs? Also, I see you have a fork of Aphrodite for AMD - https://github.com/Naomiusearch/aphrodite-engine/tree/amd-fix - were you able to compile the engine and run models successfully with it? Thanks!
It somewhat fixes the issue. You just have to run ./amdpatch.sh
Thanks! What might be the reason for it to take so long to profile? Also, what GPUs are you using?
I use a 7800 XT + 7900 XTX; no idea why profiling takes so long.
I see. I was able to install Aphrodite thanks to your fix. However, I stumbled on another issue when loading Llama 3 8B FP16. About a minute after profiling started, I saw this error:
Works fine on my PC; Llama 3 8B FP16 loads in about a minute and generates about 70 t/s. GFX906 looks to be deprecated, so that might be why it doesn't work nicely with Triton. Maybe running with APHRODITE_USE_TRITON_FLASH_ATTN=0 would work better for you?
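A minimal sketch of that suggestion, for anyone landing here later: set the variable before the engine is imported or launched. The exact launch entry point is an assumption and may differ per setup; only the variable name comes from the thread.

```python
import os

# Disable the Triton flash-attention path, as suggested above for
# deprecated targets like gfx906. With "0", Aphrodite should fall back
# to its non-Triton attention implementation.
os.environ["APHRODITE_USE_TRITON_FLASH_ATTN"] = "0"

# ...then import/launch the engine as usual, e.g.:
# from aphrodite import LLM   # (hypothetical entry point; yours may differ)
```

The same effect can be had from the shell by exporting the variable before starting the server process.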
Interesting. For me, flash attention works fine but Triton has some issues. Did you also compile Triton from https://github.com/ROCm/triton/blob/triton-mlir/python/setup.py ? Also, you mentioned you could not load models bigger than 70M in Aphrodite, but Llama 3 8B is running fast for you. So you were able to figure out how to load 70M+ models, right?
I just didn't have a model bigger than 70M to test FP16 with before (I was trying to load GPTQ earlier), so I had to download one. Also, I have pytorch-triton 3.1.0+cf34004b8a.
Your current environment
How did you install Aphrodite?
I have 2x AMD MI60 and 1x RTX 3060 for video output. I want to install aphrodite-engine to use the two AMD GPUs. I installed ROCm and PyTorch with all the dependencies.
I spent a few hours figuring out that I needed to change setup.py line 20 to APHRODITE_TARGET_DEVICE = os.getenv("APHRODITE_TARGET_DEVICE", "rocm"). After that, I struggled with the wrong Thrust library being picked up: CMake was using NVIDIA's Thrust from my NVIDIA GPU. I then located AMD's Thrust folder and replaced NVIDIA's Thrust with AMD's.
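For reference, since setup.py reads the target device from the environment, exporting the variable before building should have the same effect as editing the default in the file. This is a sketch based only on the line quoted above; the surrounding setup.py logic is not shown here.

```python
import os

# Setting the variable before the build avoids editing setup.py:
# the getenv() default is only used when the variable is absent.
os.environ.setdefault("APHRODITE_TARGET_DEVICE", "rocm")

# This is what setup.py line 20 evaluates to after the reported edit:
APHRODITE_TARGET_DEVICE = os.getenv("APHRODITE_TARGET_DEVICE", "rocm")
print(APHRODITE_TARGET_DEVICE)  # "rocm" unless overridden in the environment
```

From the shell, `export APHRODITE_TARGET_DEVICE=rocm` before running the install would do the same.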
At last, the engine was compiling, but at the end it failed with multiple warnings and errors. I tried both ROCm 6.1 and 6.2; both failed with the same error. The error text is around 6k lines, so I'm attaching it as a txt file here.
errors6_2_w_thrust.txt
Sharing some warning and error messages below from that text file:
Please let me know if this is a version-mismatch issue or a bug in the engine. Looking forward to a fix.
Thank you!