Request to Upgrade milvus-sdk-java to 2.4.2 for SparseVector Support in kafka-connect-milvus #28

Open
AGulshan opened this issue Nov 4, 2024 · 4 comments


AGulshan commented Nov 4, 2024

Hi,

I'm writing to request an upgrade of milvus-sdk-java to version 2.4.2 in the kafka-connect-milvus project. The newer version supports SparseVector, which is essential for managing sparse data efficiently.

Advantages of Upgrading:
SparseVector Support: Key for handling sparse, high-dimensional data.
Improved Features and Performance: New SDK versions typically offer enhanced performance, more features, and fixes.
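
To illustrate why a sparse representation matters for high-dimensional data, here is a minimal sketch in plain Java. It stores only the non-zero entries of a vector as an index-to-value map and computes an inner product over the shared indices. The class `SparseVectorSketch` and its methods are hypothetical names used for illustration and are not part of milvus-sdk-java. The choice of `SortedMap<Long, Float>` reflects how the 2.4.x SDK is understood to accept sparse rows, but treat that as an assumption to verify against the SDK documentation.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of milvus-sdk-java.
final class SparseVectorSketch {

    // Keep only the non-zero entries of a dense vector, keyed by dimension index.
    static SortedMap<Long, Float> toSparse(float[] dense) {
        SortedMap<Long, Float> sparse = new TreeMap<>();
        for (int i = 0; i < dense.length; i++) {
            if (dense[i] != 0.0f) {
                sparse.put((long) i, dense[i]);
            }
        }
        return sparse;
    }

    // Inner product of two sparse vectors: only indices present in both contribute.
    static float dot(SortedMap<Long, Float> a, SortedMap<Long, Float> b) {
        // Iterate over the smaller map and look up matches in the larger one.
        SortedMap<Long, Float> small = a.size() <= b.size() ? a : b;
        SortedMap<Long, Float> large = (small == a) ? b : a;
        float sum = 0.0f;
        for (Map.Entry<Long, Float> entry : small.entrySet()) {
            Float other = large.get(entry.getKey());
            if (other != null) {
                sum += entry.getValue() * other;
            }
        }
        return sum;
    }
}
```

For example, `SparseVectorSketch.toSparse(new float[]{0f, 0.5f, 0f, 0f, 2f})` keeps two entries (indices 1 and 4) instead of five, and the saving grows with the dimensionality of the embedding.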

Suggested Changes:
Update the SDK version in pom.xml to 2.4.2.
Test to ensure the new SDK version works well with the project.

Upgrading can significantly improve the project's capability in processing sparse data types.
I look forward to your thoughts and am ready to help with updating and testing the new version.

Thanks!
Gulshan

Collaborator

codingjaguar commented Nov 5, 2024

Hi @AGulshan, thanks for raising the request! We will plan this improvement. To help me better assess its priority, could you please share a bit more detail on how the sparse vector feature is used in your flow with the Milvus Kafka connector? For example, is this for hybrid search with BM25 or SPLADE?

@nianliuu can you help take a look at this feature request?

Collaborator

nianliuu commented Nov 5, 2024

Sure, I will take a look and make a release soon. Thank you! @AGulshan

Author

AGulshan commented Nov 5, 2024

Hi, @codingjaguar, @nianliuu!

Thank you for your prompt response!

We are currently experimenting with both dense and sparse embeddings from the BGE-M3 model to optimize our search capabilities.
We are also planning to explore other approaches that rely on sparse vectors for hybrid search, possibly including techniques like BM25 or SPLADE in the future.

SparseVector support in kafka-connect-milvus is essential for us to build and optimize our data pipelines effectively, and it would significantly improve our flexibility and performance in managing complex embeddings.

Your consideration of this upgrade is much appreciated!

Best regards,
Gulshan

@codingjaguar
Collaborator

Thanks for the context! We are also launching more hybrid search support in Milvus 2.5 (to be released by mid-November), such as full-text search capability equivalent to Elasticsearch's. Please feel free to email me with any questions you have during the experiment. My email is [email protected]
