Added streaming langchain example. #68
base: master
Conversation
So, I can't actually get this to produce any output. If I just run it as-is, with a prompt of "Hello?" and a breakpoint in the stream() function, the context passed to the model looks like this:
It looks like there are two nested prompt formats there. I would expect the generation to start from "### Response:" following the Alpaca template, and at least with the models I've tried, the model starts by generating " \n ###", which becomes a stop condition.
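For reference, the Alpaca template referred to here looks roughly like this (the exact preamble wording varies a little between fine-tunes):

```python
# Standard single-turn Alpaca prompt format. Generation should begin
# immediately after "### Response:". Wrapping this inside a second chat
# template makes the model emit the outer format's "###" markers, which
# then trip the stop condition.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="Hello?")
```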
I don't exactly know why the model wouldn't generate anything; potentially the models were being confused by the mixed formats, so I made some changes. If the issue persists, that's more troubling, as I'd have no idea what would cause nothing to be generated here. I've tested on about 10 models and they're all performing quite well. At any rate, let me know if the issues continue and I'll investigate if necessary.
I think adding this as an example makes the most sense; it's a relatively complete example of a conversation model setup using Exllama and langchain. I've probably made some dumb mistakes, as I'm not extremely familiar with the inner workings of Exllama, but this is a working example.
I should note that this is meant to serve as an example of streaming; it falls back to generate_simple for non-streaming calls, and that path isn't really meant to be used here.
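Roughly, the streaming/fallback structure looks like this. This is a simplified sketch rather than the exact code in this PR; `model_generate_simple` and `model_stream_tokens` are hypothetical stand-ins for the actual Exllama generator calls, and the `_call` signature is the classic `langchain.llms.base.LLM` interface:

```python
from typing import Any, List, Optional

from langchain.callbacks.manager import CallbackManagerForLLMRun
from langchain.llms.base import LLM


class StreamingExllamaExample(LLM):
    """Sketch of a streaming LLM wrapper; not the PR's actual class."""

    streaming: bool = True

    @property
    def _llm_type(self) -> str:
        return "exllama-streaming-example"

    def _call(
        self,
        prompt: str,
        stop: Optional[List[str]] = None,
        run_manager: Optional[CallbackManagerForLLMRun] = None,
        **kwargs: Any,
    ) -> str:
        if not self.streaming:
            # Non-streaming path: delegate to a one-shot generate call,
            # analogous to the generate_simple fallback mentioned above.
            return model_generate_simple(prompt)  # hypothetical stand-in

        text = ""
        for token in model_stream_tokens(prompt):  # hypothetical stand-in
            text += token
            if run_manager:
                run_manager.on_llm_new_token(token)
            # End generation once a stop string (e.g. the "###" marker
            # discussed above) appears in the accumulated output.
            if stop and any(s in text for s in stop):
                break
        return text
```

With this structure, passing a callback handler such as `StreamingStdOutCallbackHandler` to the chain prints tokens as they are generated, while constructing the wrapper with `streaming=False` behaves like a plain blocking call.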