Update module4, research assistant #34

Open · wants to merge 3 commits into main

Conversation

@rlancemartin (Collaborator) commented Sep 24, 2024

1/ Add alternating Human -> AI messages in interview (generate_question should be a HumanMessage) for compatibility w/ some LLMs that enforce it.

2/ Update Tavily.

3/ Also add an input schema for the research assistant graph in Studio.

@rlancemartin (Collaborator, Author)

Also @shiv248 this is an interesting one that came up on Module 4.

For some models, messages need to alternate AI -> Human. This change should resolve that.
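For reference, a minimal sketch of the pattern (this assumes llm, question_instructions, and the InterviewState from the Module 4 notebook, and is an illustration of the idea rather than the exact diff):

from langchain_core.messages import HumanMessage, SystemMessage

def generate_question(state: InterviewState):
    """ Analyst asks a question; returning it as a HumanMessage keeps the
    transcript alternating Human -> AI once the expert answers with an AIMessage """
    analyst = state["analyst"]
    messages = state["messages"]

    # Persona-specific instructions for the analyst, as in the notebook
    system_message = question_instructions.format(goals=analyst.persona)
    question = llm.invoke([SystemMessage(content=system_message)] + messages)

    # Re-wrap the model output as a HumanMessage instead of appending the raw
    # AIMessage, for providers that enforce strict user/assistant alternation
    return {"messages": [HumanMessage(content=question.content)]}

The expert's answer node can keep returning an AIMessage, so the two roles alternate cleanly.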

Also, this adds an input schema for the research assistant graph in Studio for Module 4, which is nice for visualization.
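Roughly, that looks like the following (the state classes here are illustrative stand-ins rather than the notebook's exact definitions, and it relies on the separate input schema that StateGraph accepts):

from typing_extensions import TypedDict
from langgraph.graph import StateGraph

class ResearchGraphState(TypedDict):
    # Full internal state (illustrative subset of the real fields)
    topic: str
    max_analysts: int
    analysts: list
    sections: list
    final_report: str

class ResearchGraphInput(TypedDict):
    # Only what the user should supply; Studio renders just these fields
    topic: str
    max_analysts: int

# A narrower input schema filters what Studio asks for when invoking the graph
builder = StateGraph(ResearchGraphState, input=ResearchGraphInput)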

@robbiemu (Contributor) commented Sep 24, 2024

My review notes; obviously I'll use my local model again :) (updating these live over the next bit of time):

1. Just a prompt-engineering issue

# Generate question 

The change to this prompt (the content property) actually makes it harder for my smallish models (mistral-nemo, but also qwen-2.5 instruct 32b) to succeed with the tool call that fills this structured output. So we get:

ValidationError: 3 validation errors for Perspectives
analysts.0.description
  Field required ...

2. Minor code change and prompt-engineering issue

def search_wikipedia(state: InterviewState):
    
    """ Retrieve docs from wikipedia """

    # Search query
...
from IPython.display import Markdown
thread = {"configurable": {"thread_id": "1"}}
interview = interview_graph.invoke({"analyst": analysts[0], "messages": [], "max_num_turns": 2}, thread)
Markdown(interview['sections'][0])

is giving me:

...
Cell In[32], line 44
     41 search_query = structured_llm.invoke([search_instructions]+state['messages'])
     43 # Search
---> 44 search_docs = WikipediaLoader(query=search_query.search_query, 
     45                               load_max_docs=2).load()
     47 # Format
     48 formatted_search_docs = "\n\n---\n\n".join(
     49     [
     50         f'<Document source="{doc.metadata["source"]}" page="{doc.metadata.get("page", "")}"/>\n{doc.page_content}\n</Document>'
     51         for doc in search_docs
     52     ]
     53 )

AttributeError: 'NoneType' object has no attribute 'search_query'

This appears to be related to the messages being passed in at invocation. This works:

from IPython.display import Markdown
messages = [HumanMessage(f"So you said you were writing an article on {topic}?")]
thread = {"configurable": {"thread_id": "1"}}
interview = interview_graph.invoke({"analyst": analysts[0], "messages": messages, "max_num_turns": 3}, thread)
Markdown(interview['sections'][0])

But then the Tavily search happens to fail. I see a human message coming back blank, so I'm not sure at the moment why. Edit: I forgot I had rewritten these prompts in the preceding graph; it's not just the search instructions. Although it is not necessary, prompt engineering is part of the journey, especially for local models, so please feel free to lift my prompts from the notebook I sent you.
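For anyone else hitting that AttributeError, a small guard keeps the node from crashing when a local model skips the structured-output tool call. This is only an illustration (not part of this PR); it assumes the notebook's SearchQuery schema, search_instructions, llm, InterviewState, and the context key on the interview state:

from langchain_community.document_loaders import WikipediaLoader

def search_wikipedia(state: InterviewState):
    """ Retrieve docs from wikipedia, tolerating a failed structured-output call """

    # Search query
    structured_llm = llm.with_structured_output(SearchQuery)
    search_query = structured_llm.invoke([search_instructions] + state["messages"])

    # Small local models sometimes return None instead of a parsed SearchQuery;
    # fall back to the last message text rather than raising AttributeError
    if search_query is not None and search_query.search_query:
        query = search_query.search_query
    else:
        query = state["messages"][-1].content

    # Search
    search_docs = WikipediaLoader(query=query, load_max_docs=2).load()

    # Format
    formatted_search_docs = "\n\n---\n\n".join(
        f'<Document source="{doc.metadata["source"]}"/>\n{doc.page_content}\n</Document>'
        for doc in search_docs
    )
    return {"context": [formatted_search_docs]}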

@shiv248 (Contributor) commented Sep 24, 2024

Hmm, this is interesting. It seems the adjustment toward more general model compatibility is causing issues within the ipynb for some, @robbiemu? I just tried @rlancemartin's research assistant notebook in this branch, rlm/m4_assistant_message, using Fireworks and OpenAI; both worked without a hitch. I'm assuming @rlancemartin tried with Gemini.

Might have to dig deeper into this.

@robbiemu (Contributor)


Hi Shiv, I was asked to give this a spin to see if it was more compatible with local models (I had gotten the original working with local models and written about it somewhere here). It essentially is; there's one code change outside of the prompts that I see mattering.

@rlancemartin (Collaborator, Author) commented Sep 24, 2024


i didn't test w/ Gemini, actually. but i do think these changes will enable compatibility w/ any models that require alternating user-assistant messages (Anthropic I know has this requirement, but also some Fireworks-hosted models and Gemini).

@rlancemartin (Collaborator, Author)


ya, local models will remain a problem for other reasons beyond alternating message types.

@rlancemartin (Collaborator, Author)


for local models --

@robbiemu what would help:

do you mind putting up a PR with the precise prompt changes (or other changes) needed?

i can test to --

1/ make sure it STILL works w/ OAI (default for the course)
2/ confirm it works w/ Ollama

if there are code changes that break 1, then i can't merge, but i can simply link to the closed PR in an issue we'll create to explain how to work with local LLMs.

@shiv248 (Contributor) commented Sep 24, 2024

IMO, the issue with opening up more accessibility to specific local solutions in this repo via ChatOllama or ChatLlamaCpp is that it is highly model-dependent, and the variance across models and parameter counts will always be there.
Maybe we narrow it down to one local foundation model with the highest success rate on the ipynbs and recommend that, possibly Llama 3.1 (instruct)? @rlancemartin reference
But then we still have the issue of having to tailor for model size, which would be somewhat of a struggle because we can't assume everyone can run the bigger models that are recommended for high-quality responses.
Open to ideas though!

I'd like to think the goal of this repo, as it grows, is to be as smooth as possible for learners as they work through concepts and try those concepts/paradigms out with different approaches, without having to deal with unexpected issues. Kind of like an "it just works" mentality, since this is an entry-level course.

I do like the idea of having a simple ipynb/wiki demo of how to convert from commercial models to local models, plus a brief overview of what hurdles learners might face.
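As a rough illustration of what such a demo could show, the swap is usually just the chat model constructor at the top of the notebook (a sketch, assuming langchain-openai and langchain-ollama are installed and an Ollama server is running locally; Perspectives stands in for whatever schema a given notebook binds):

# Commercial default used throughout the course
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# Local alternative; quality of tool calls and structured output varies a lot
# with the model family and parameter count
from langchain_ollama import ChatOllama
llm = ChatOllama(model="llama3.1", temperature=0)

# Downstream usage is unchanged either way
structured_llm = llm.with_structured_output(Perspectives)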

@robbiemu (Contributor) commented Sep 24, 2024


I don't disagree with this. My initial comment was that the structure (not using AIMessages as the final messages) often leads local models not to respond, and that this was an issue even with the Mistral models I run locally (it would remain so when using the Mistral API). Fixing that (much as I did in my local notebooks) was, I believe, the motivation for this PR. The rest is just prompt engineering, which we can expect to change over time.

The one code change I suggested above was to mirror details from the video that happen to greatly improve interviews on my local models... but ultimately this could probably also be resolved with prompt engineering. However, since the solution I have on hand is not strictly prompt engineering, I felt it worth noting in my review above.

If you all decide you are in agreement and are interested in my creating a PR for the prompts (and the code change I indicated above), I am happy to provide it. Let me know.

@shiv248 (Contributor) commented Sep 24, 2024

ah my apologies @robbiemu, that's understandable. If it is just prompt-engineering changes, and the OAI models still work, I think it is a step in the right direction!

In the PR, I would also mention how you yourself switch from the OAI integration in the notebooks to your local model solutions (i.e. the changes you make), along with your thoughts on which models (and parameter sizes, in B) you like and have found most successful as you work through the course. That way we can always link that PR # to future issues that come up or to users interested in the same thing.
