Fix issue when streaming LLM response #1523

Open
MottoX wants to merge 1 commit into main
Conversation

MottoX (Contributor) commented Oct 28, 2024

Problem

Currently, if the stream flag is set to True in the LLM params, an exception is raised from llm.py. The reason is that litellm returns an instance of its CustomStreamWrapper, which cannot be subscripted the way a regular response object can. This is a real issue for our use case, where we stream the LLM response for special handling of lengthy content as well as to optimize application performance.

Proposed Change

To solve this, we can check the stream flag when calling litellm.completion and add separate handling for streaming mode, as sketched below.
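
As a rough illustration only (not the actual patch; the helper name, the structure, and the assumption that litellm yields OpenAI-style chunks exposing choices[0].delta.content are mine), the call could branch on the flag and accumulate the streamed deltas:

import litellm

def call_completion(params: dict) -> str:
    # Streaming mode: litellm.completion returns a CustomStreamWrapper,
    # which has to be iterated chunk by chunk rather than subscripted.
    if params.get("stream", False):
        pieces = []
        for chunk in litellm.completion(**params):
            delta = chunk.choices[0].delta.content
            if delta:
                pieces.append(delta)
        return "".join(pieces)

    # Non-streaming mode: the response can be accessed as before.
    response = litellm.completion(**params)
    return response["choices"][0]["message"]["content"]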

Test Case

The following sample code fails with TypeError: 'CustomStreamWrapper' object is not subscriptable before this change and succeeds after it.

from crewai import Agent, Task, LLM
from crewai import Crew, Process


api_key = "my_key"  # placeholder API key
my_llm = LLM(model="gpt-4o-mini", api_key=api_key, stream=True)  # stream=True reproduces the error without this PR

# Create a researcher agent
researcher = Agent(
    role='Senior Researcher',
    goal='Discover groundbreaking technologies',
    verbose=True,
    llm=my_llm,
    backstory='A curious mind fascinated by cutting-edge innovation and the potential to change the world, you know everything about tech.'
)

# Task for the researcher
research_task = Task(
    description='Identify the next big trend in AI',
    expected_output="A single short sentence.",
    async_execution=False,
    agent=researcher  # Assigning the task to the researcher
)

# Instantiate your crew
crew = Crew(
    agents=[researcher],
    tasks=[research_task],
    process=Process.sequential  # Tasks will be executed one after the other
)

# Begin the task execution
crew.kickoff()

@@ -153,7 +153,13 @@ def call(self, messages: List[Dict[str, str]], callbacks: List[Any] = []) -> str
params = {k: v for k, v in params.items() if v is not None}

response = litellm.completion(**params)
return response["choices"][0]["message"]["content"]
if params.get("stream", False):
Collaborator

If you're trying to listen to the stream, wouldn't you want this to be True?

MottoX (Contributor, author) commented Oct 29, 2024
"stream" is set to False in params by default but can be overridden through kwargs.
Despite this, here we pass a default value in get() function, to make the code more readable and independent of preceding code.
So, stream option can be enabled by passing stream=True when creating crewai.LLM instance.

@bhancockio (Collaborator)

Hey @MottoX!

Thank you for submitting this PR! Could you please elaborate more on this use case?

It looks like you are trying to create an LLM that is going to stream a response. However, within crewAI, we don't really support streaming results.

Are you trying to add support for streaming because you're going to use the same LLM with streaming elsewhere?

MottoX (Contributor, author) commented Oct 29, 2024

Hello @bhancockio
Our team is developing applications that use crewAI and in-house LLMs to process extensive data and to generate or proofread articles. In some tasks, particularly those with lengthy request messages, non-streaming requests can lead to server timeouts. Invoking the LLM in a streaming fashion addresses this issue and ensures that sufficient content is generated in the response. We have also implemented engineering optimizations around LLM streaming to improve its efficiency, stability, and fault tolerance.

However, after upgrading to the latest crewAI version, we found that crewAI, now using litellm for LLM invocation, only supports non-streaming requests.

This PR aims to enhance crewAI by enabling users to interact with their LLMs in streaming mode by passing stream=True when constructing crewai.LLM. This feature will benefit users like us and make crewAI more flexible and adaptable.
