KeyError: 'user_input' when calculating RAGAS metric #2056

Closed
PierreMesure opened this issue Oct 29, 2024 · 7 comments
Labels: bug

@PierreMesure (Contributor) commented Oct 29, 2024

Issue Type

Bug

Source

source

Giskard Library Version

2.15.3

OS Platform and Distribution

No response

Python version

No response

Installed python packages

ragas==0.2.2

Current Behaviour?

When trying to evaluate a RAG assistant with some RAGAS metrics (context recall), the evaluation fails. See the stack trace below.
This happens when providing the answer as an AgentAnswer. We're not entirely clear about what should go in the documents parameter; the documentation doesn't give a clear example. We're using LlamaIndex, so agent_output.source_nodes doesn't return a list of strings. Here's what we've tried:

Standalone code OR list down the steps to reproduce the issue

from typing import List

from giskard.rag import AgentAnswer, evaluate
from giskard.rag.metrics.ragas_metrics import (
    ragas_answer_relevancy,
    ragas_context_precision,
    ragas_context_recall,
    ragas_faithfulness,
)
from llama_index.core.llms import ChatMessage, MessageRole

# chat_engine, testset and knowledge_base are defined elsewhere in our code.

def answer_fn(question: str, history: List[dict] = []) -> AgentAnswer:
    chat_history = [ChatMessage(role=MessageRole.USER, content=msg["content"]) for msg in history]
    agent_output = chat_engine.chat(question, chat_history=chat_history)

    answer = agent_output.response
    # agent_output.source_nodes is a list of NodeWithScore objects, not strings,
    # so pull the text out of each underlying node.
    documents = [node.node.text for node in agent_output.source_nodes if hasattr(node.node, "text")]

    return AgentAnswer(message=answer, documents=documents)

evaluate(
    answer_fn,
    testset=testset,
    knowledge_base=knowledge_base,
    metrics=[ragas_context_recall, ragas_context_precision, ragas_faithfulness, ragas_answer_relevancy],
)

Relevant log output

  File "/evaluation_manager.py", line 310, in _run_giskard_evaluation_and_return_generated_report
    return evaluate(
           ^^^^^^^^^
  File "/giskard/rag/evaluate.py", line 105, in evaluate
    metrics_results[sample["id"]].update(metric(sample, answer))
                                         ^^^^^^^^^^^^^^^^^^^^^^
  File "/giskard/rag/metrics/ragas_metrics.py", line 119, in __call__
    return {self.name: self.metric.score(ragas_sample)}
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragas/utils.py", line 159, in emit_warning
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/ragas/metrics/base.py", line 121, in score
    raise e
  File "/ragas/metrics/base.py", line 117, in score
    score = loop.run_until_complete(self._ascore(row=row, callbacks=group_cm))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nest_asyncio.py", line 98, in run_until_complete
    return f.result()
           ^^^^^^^^^^
  File "/asyncio/futures.py", line 203, in result
    raise self._exception.with_traceback(self._exception_tb)
  File "/asyncio/tasks.py", line 314, in __step_run_and_handle_result
    result = coro.send(None)
             ^^^^^^^^^^^^^^^
  File "/ragas/metrics/_context_recall.py", line 191, in _ascore
    return await super()._ascore(row, callbacks)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/ragas/metrics/_context_recall.py", line 156, in _ascore
    question=row["user_input"],
             ~~~^^^^^^^^^^^^^^
KeyError: 'user_input'
@Snow31ind

I'm experiencing the same issue, and I believe this is quite a critical bug. I couldn't run evaluations with additional metrics. Any update on this?

@alexcombessie (Member)

Hey @PierreMesure and @Snow31ind - This should be solved by #2052

Can you try again with the latest Giskard release?

@Snow31ind commented Oct 31, 2024

@alexcombessie Thanks for replying. Let me add more context on this issue. The versions of giskard and ragas in my requirements.txt file are:

giskard==2.15.3
ragas==0.2.2

I believe 2.15.3 is the latest release, and I still see the same error as above.

After inspecting the stack trace, I'm wondering whether the ragas sample built by the Giskard ragas metric wrapper matches the interface required by the base ragas metric's score method, since the sample doesn't contain the user_input key. That's why I strongly believe this is the root cause.

Could you help double-check that? And are there any tests being run to make sure there's no data-interface mismatch?

@PierreMesure (Contributor, Author) commented Oct 31, 2024

I just reverted ragas to 0.1.21 and it works. 😊
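If anyone wants the same workaround, the pin in requirements.txt looks like this (giskard version as discussed above):

giskard==2.15.3
ragas==0.1.21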

@alexcombessie, #2052 fixes another problem I reported; I don't think that PR will fix this one.
I think the problem stems from RAGAS renaming its parameters. In this commit, you can see the change in the documentation; I believe the renaming comes from this PR.
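For reference, here is the rename side by side. The v0.2 key names below come from RAGAS's SingleTurnSample schema; worth double-checking against the installed version:

# Keys a RAGAS v0.1 metric expects in a row:
row_v01 = {
    "question": "...",
    "contexts": ["..."],
    "answer": "...",
    "ground_truth": "...",
}

# The same row under RAGAS v0.2 (hence the KeyError: 'user_input'):
row_v02 = {
    "user_input": "...",
    "retrieved_contexts": ["..."],
    "response": "...",
    "reference": "...",
}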

@Snow31ind

@PierreMesure Awesome! You made my day. Anyway, this issue deserves a fix soon. Thanks, team!

@alexcombessie (Member)

Thanks! @henchaves, could you have a look when you have time next week? 🙏

@henchaves (Member)

Hello @PierreMesure and @Snow31ind, thanks a lot for reporting this bug.
Indeed, RAGAS v0.2 changed the parameter names, which breaks the RagasMetric call.
I opened a PR to make Giskard compatible with both the old version (v0.1) and the latest one (v0.2): #2073
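Roughly, the idea is to build the row with whichever key names the installed ragas expects. A simplified sketch of that approach (hypothetical, not the actual code from #2073):

from importlib.metadata import version

RAGAS_V2 = not version("ragas").startswith("0.1")

def to_ragas_row(question, contexts, answer, ground_truth):
    # Use the v0.2 field names (user_input, retrieved_contexts, response,
    # reference) when a recent ragas is installed; otherwise the v0.1 names.
    if RAGAS_V2:
        return {
            "user_input": question,
            "retrieved_contexts": contexts,
            "response": answer,
            "reference": ground_truth,
        }
    return {
        "question": question,
        "contexts": contexts,
        "answer": answer,
        "ground_truth": ground_truth,
    }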

It should be reviewed and merged soon!
Thanks again and sorry for the delay!

@henchaves henchaves added the bug Something isn't working label Nov 14, 2024