[WIP] Add support for OPEA LLMs in Llama-Index #16666

Draft
wants to merge 2 commits into main
Conversation

@logan-markewich (Collaborator) commented Oct 24, 2024

Description

This PR adds a llama-index wrapper for LLMs hosted by OPEA's GenAIComps library

A few unknowns/notes on my part; a rough usage sketch follows these notes:

  • Will any hosted LLM work with only chat messages? (If so, I can keep the implementation as is!)
  • I still need to try to spin up an OPEA LLM microservice to test this
  • It seems like OPEA LLMs try to bundle some RAG functionality into the LLM microservice, but I skipped that, since llama-index is driving the prompting
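
For reference, a rough sketch of how the wrapper might be used. The module path llama_index.llms.opea, the class name OPEALLM, the port, and the api_base parameter are placeholders inferred from the companion OPEAEmbedding class discussed below, not confirmed API of this draft:

from llama_index.core.llms import ChatMessage
from llama_index.llms.opea import OPEALLM  # placeholder module/class name, not confirmed by this PR

# Point the wrapper at an already-deployed OPEA LLM microservice
llm = OPEALLM(
    model="<model_name>",
    api_base="http://localhost:9000/v1",  # assumed endpoint of the deployed service
)

# The wrapper currently drives everything through chat messages (see the first note above)
response = llm.chat([ChatMessage(role="user", content="Hello!")])
print(response)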

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes
  • No

Type of Change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

  • TODO: I added new unit tests to cover this change

@ftian1 commented Oct 30, 2024

Thanks for this PR. I am one of the contributors to OPEA. One comment: this code assumes that an OPEA microservice has already been deployed somewhere, either locally or in a cloud environment, and the user then passes the service IP/endpoint to access it. I would suggest documenting this prerequisite.

from llama_index.embeddings.opea import OPEAEmbedding

embed_model = OPEAEmbedding(
    model="<model_name>",
A Contributor commented on this snippet:

This should be model_name=

@rbrugaro (Contributor) commented Nov 6, 2024

Thanks @logan-markewich!

I tested your PR with OPEA microservices. No issues with the LLM, but there is an issue with embeddings.

If you want to try it yourself, you can launch the OPEA embedding microservice following one of the options in this README: https://github.com/opea-project/GenAIComps/tree/main/comps/embeddings/tei/llama_index

Here are some checks after launching the OPEA embedding microservice:

Healthcheck

$ curl http://localhost:6000/v1/health_check \
    -X GET \
    -H 'Content-Type: application/json'
{"Service Title":"opea_service@embedding_tei_langchain/MicroService","Service Description":"OPEA Microservice Infrastructure"}

test with TextDoc input
$ curl http://localhost:6000/v1/embeddings \
    -X POST \
    -d '{"text":"Hello, world!"}' \
    -H 'Content-Type: application/json'
{"id":"99f4c65b5d0400e6e7872c1c40ccca22","text":"Hello, world!","embedding":[0.0072799586,0.03128243,0.042269558,-0.0063485163,-0.0077530267,0.04410854,0.0590988,0.045488138,-0.02227289,-0.037855495,-0.0067130174,-0.003363003,-0.08337993,-0.0051587764,0.018290782,0.05117798,0.031379085,0.013451039,0.006205124,0.03658391,-0.034309622,0.004967953,0.03619922,0.013945511,0.02123839

test with EmbeddingRequest input
$ curl http://localhost:6000/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
      "input": "The food was delicious and the waiter...",
      "model": "BAAI/bge-base-en-v1.5",
      "encoding_format": "float"
    }'
{"object":"list","model":null,"data":[{"index":0,"object":"embedding","embedding":[-0.047599554,-0.0055408874,0.018720696,-0.02567272,0.06591436,0.010764824,-0.017667083,0.05037338,-0.0014907457,-0.015081286,0.017558929,-0.013840443,-0.008849931,0.020852309,0.0034780612,0.008818663,0.0742632,-0.031206904,0.039376773,-0.037804432,-0.022693759,-0.014756503,-0.011191109,0.012695486,-0.014381637,-0.022319226,-0.007750338,0.024972478,-0.031080812,-0.044731475,0.050228924,0.031221831,-0.013695359,-0.030721791,-0.012819336,-0.0092149945,0.04743948,-0.0103669595,-0.03255411,0.03777905,-0.0052547073,0.00875531,0.024079613,0.041530963,-0.

The input in the EmbeddingRequest is a string:
this is input model='BAAI/bge-base-en-v1.5' input='The food was delicious and the waiter...' encoding_format='float' dimensions=None user=None request_type='embedding' and type <class 'comps.cores.proto.api_protocol.EmbeddingRequest'>

Using your code:

from llama_index.embeddings.opea import OPEAEmbedding

embed_model = OPEAEmbedding(
    model_name="BAAI/bge-base-en-v1.5",
    api_base="http://localhost:6000/v1",
    embed_batch_size=10,
)

embeddings = embed_model.get_text_embedding("text")

With the wrapper, the input of the EmbeddingRequest has the string converted to a list:
[2024-11-06 05:55:00,090] [ INFO] - embedding_tei_langchain - this is input model='BAAI/bge-base-en-v1.5' input=['text'] encoding_format='base64' dimensions=None user=None request_type='embedding' and type <class 'comps.cores.proto.api_protocol.EmbeddingRequest'>

To make it work, I had to remove the square brackets around the text in the OpenAI get_embedding function:

client.embeddings.create(input=[text], model=engine, **kwargs).data[0].embedding
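
That is, the workaround amounts to this change in that call (illustrative only; other input handling may also need adjusting):

# pass the bare string instead of wrapping it in a list,
# so the OPEA service receives input='text' rather than input=['text']
client.embeddings.create(input=text, model=engine, **kwargs).data[0].embedding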

Please have a check!

@logan-markewich (Collaborator, Author)
Thanks for testing @rbrugaro -- taking a look

@logan-markewich (Collaborator, Author) commented Nov 7, 2024

@rbrugaro it seems like the OPEA embeddings are not fully OpenAI-compatible?

Only being able to embed a single piece of text at a time is probably not ideal 😅 -- I think I raised a slightly related issue in the opea repo (mentioned above)
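
To illustrate the mismatch: an OpenAI-compatible /v1/embeddings endpoint accepts a list of strings in input so several texts can be embedded in one request, while the logs above show the OPEA microservice working with a bare string. A rough sketch of the two payload shapes (field names copied from the requests earlier in this thread; that the batched form is rejected by the OPEA service is an inference from this discussion, not something verified here):

# shape an OpenAI-compatible embeddings endpoint normally accepts (batched)
openai_style_payload = {
    "input": ["text one", "text two"],  # list of strings, one embedding returned per item
    "model": "BAAI/bge-base-en-v1.5",
    "encoding_format": "float",
}

# shape the OPEA microservice handled in the tests above (single bare string)
opea_style_payload = {
    "input": "The food was delicious and the waiter...",
    "model": "BAAI/bge-base-en-v1.5",
    "encoding_format": "float",
}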
