All inference outputs are "!" #121
Comments
This is very weird. What CPU/OS?

Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz

Which model?

dllama_model_llama3_8b_q40.m
As shown in the picture, every token the inference produces is "!".
I tried different approaches and found that the only thing that fixes it is building with -O0 optimization. But the program runs far too slowly at -O0, and the problem reappears whenever -O1, -O2, or -O3 is enabled.
Has anyone run into the same problem? How should we solve it?