Fix issue when streaming LLM response #1523
base: main
Conversation
```diff
@@ -153,7 +153,13 @@ def call(self, messages: List[Dict[str, str]], callbacks: List[Any] = []) -> str
         params = {k: v for k, v in params.items() if v is not None}

         response = litellm.completion(**params)
-        return response["choices"][0]["message"]["content"]
+        if params.get("stream", False):
```
If you're trying to listen to the stream, wouldn't you want this to be `True`?
"stream" is set to False in params by default but can be overridden through kwargs.
Despite this, here we pass a default value in get() function, to make the code more readable and independent of preceding code.
So, stream option can be enabled by passing stream=True when creating crewai.LLM instance.
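As a quick illustration of that usage (the model name here is illustrative, and the constructor is assumed to forward extra keyword arguments such as `stream` into the litellm params):

```python
from crewai import LLM

# Assumed behaviour: extra kwargs such as stream=True are forwarded into
# the params that LLM.call() eventually passes to litellm.completion().
llm = LLM(model="gpt-4o", stream=True)
```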
Hey @MottoX! Thank you for submitting this PR! Could you please elaborate more on this use case? It looks like you are trying to create an LLM that is going to stream a response. However, within crewAI, we don't really support streaming results. Are you trying to add support for streaming because you're going to use the same LLM with streaming elsewhere?
Hello @bhancockio. Our use case depends on streaming LLM responses (see the problem description below). However, after upgrading to the latest crewAI version, we found that crewAI, which now uses litellm for LLM invocation, only supports non-streaming requests. This PR aims to enhance crewAI by enabling users to interact with their LLMs in streaming mode by passing `stream=True` when creating the LLM instance.
Problem
Currently, if the `stream` flag is set to `True` in the LLM params, an exception is raised from `llm.py`. The reason is that litellm returns an instance of its `CustomStreamWrapper`, which cannot be accessed in the current way. This is a real issue for our use case, where we stream the LLM response for special handling of lengthy content as well as for optimizing application performance.
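For illustration only (this is not code from the PR; the model name is arbitrary), the underlying litellm behaviour is roughly as follows: with `stream=True`, `litellm.completion` returns a `CustomStreamWrapper` meant to be iterated, so indexing it like a normal response object raises a `TypeError`.

```python
import litellm

# With stream=True, litellm returns a CustomStreamWrapper (an iterator over
# incremental chunks) rather than a dict-like ModelResponse.
response = litellm.completion(
    model="gpt-4o",  # illustrative model name
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)

# The non-streaming access pattern then fails:
# response["choices"][0]["message"]["content"]
#   -> TypeError: 'CustomStreamWrapper' object is not subscriptable
```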
Proposed Change
To solve this, we can check the `stream` flag when calling `litellm.completion` and handle the streaming mode separately.
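A minimal sketch of how such separated handling could look inside the `call` method (this is not the exact code from the PR; variable names are mine, and chunks are assumed to follow litellm's OpenAI-style delta format):

```python
response = litellm.completion(**params)

if params.get("stream", False):
    # Streaming mode: iterate over the chunks yielded by litellm's
    # CustomStreamWrapper and accumulate the incremental content.
    pieces = []
    for chunk in response:
        delta = chunk.choices[0].delta
        if delta.content:
            pieces.append(delta.content)
    return "".join(pieces)

# Non-streaming mode: the original access pattern still applies.
return response["choices"][0]["message"]["content"]
```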
Test Case
The following sample code fails with `TypeError: 'CustomStreamWrapper' object is not subscriptable` and succeeds after the PR.
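A minimal example along these lines (not necessarily the author's exact sample; the model name and prompt are illustrative, and `stream=True` is assumed to be forwarded to litellm) reproduces the behaviour:

```python
from crewai import LLM

# Streaming-enabled LLM. Before this PR, call() indexes the
# CustomStreamWrapper returned by litellm and raises:
#   TypeError: 'CustomStreamWrapper' object is not subscriptable
# After the PR, call() consumes the stream and returns the full text.
llm = LLM(model="gpt-4o", stream=True)

result = llm.call(messages=[{"role": "user", "content": "Tell me a short joke."}])
print(result)
```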