[Enhancement]: Add a property to ExtraList for extracting data #2362

CaoHaiNam · 2024-11-20T04:27:32Z

Is there an existing issue for this?

I have searched the existing issues

What would you like to be added?

I would like the ExtraList class to expose a property that returns its underlying data. The property should make the data easy to use as-is, without the additional structure or string formatting that ExtraList adds.

Why is this needed?

On the latest master branch, the search and query functions return an ExtraList object. The additional metadata is helpful, but the wrapper gets in the way when the underlying data is needed directly for further processing or calculations.

Adding this feature will improve usability and make the class more intuitive for users who need direct access to the data.

Anything else?

Current behavior:

result = ExtraList([1, 2, 3], extra={"total": 3})
print(result)  # Output: data: ['1', '2', '3'], extra_info: {'total': 3}

# Attempting to process data
sum(result)  # Fails, as `ExtraList` doesn't directly behave like a list for some operations.

Desired behavior:

result = ExtraList([1, 2, 3], extra={"total": 3})
print(result.data)  # Output: [1, 2, 3]
sum(result.data)    # Works as expected.
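
One possible shape for such a property is sketched below on a stand-in class, not on the real pymilvus implementation. The name `ExtraListSketch`, its constructor, and its `__str__` format are hypothetical and only mirror the behavior described in this issue; the sketch assumes the container subclasses `list` and keeps its metadata in an `extra` attribute.

```python
from typing import Any, Dict, List, Optional


class ExtraListSketch(list):
    """Hypothetical stand-in for ExtraList: a list that carries extra metadata."""

    def __init__(self, *args: Any, extra: Optional[Dict[str, Any]] = None) -> None:
        super().__init__(*args)
        self.extra: Dict[str, Any] = dict(extra or {})

    @property
    def data(self) -> List[Any]:
        # Return a plain list copy: no metadata attached, and changes to the
        # returned list do not affect the container itself.
        return list(self)

    def __str__(self) -> str:
        # Mirrors the string form shown in this issue (items rendered as strings).
        return f"data: {list(map(str, self))}, extra_info: {self.extra}"


result = ExtraListSketch([1, 2, 3], extra={"total": 3})
plain = result.data   # plain list, items keep their original types
total = sum(plain)    # ordinary list operations apply
```

Returning a copy rather than `self` is a design choice in this sketch: it guarantees the caller receives an object whose type is exactly `list`, at the cost of one shallow copy per access.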

Here’s an example of the current issue when using ExtraList with the Milvus query functionality:

import random

from pymilvus import Collection, FieldSchema, CollectionSchema, DataType, connections

# Connect to a local Milvus instance
connections.connect("default", host="localhost", port="19530")

# Define schema and collection
schema = CollectionSchema([
    FieldSchema("film_id", DataType.INT64, is_primary=True),
    FieldSchema("film_date", DataType.INT64),
    FieldSchema("films", dtype=DataType.FLOAT_VECTOR, dim=2)
])
collection = Collection("test_collection_query", schema)

# Insert sample data
data = [
    [i for i in range(10)],
    [i + 2000 for i in range(10)],
    [[random.random() for _ in range(2)] for _ in range(10)],
]
collection.insert(data)

# Create index and load collection
index_param = {"index_type": "FLAT", "metric_type": "L2", "params": {}}
collection.create_index("films", index_param)
collection.load()

# Perform query
expr = "film_id == 3"
res = collection.query(expr, output_fields=["film_date"])

# Current behavior
print(res)
# Output: data: ["{'film_id': 3, 'film_date': 2003}"], cannot access data directly

# Desired behavior with a new property
print(res.data)
# Output: [{'film_id': 3, 'film_date': 2003}], a plain list object accessible for further calculations.

In the example above, querying the Milvus collection returns an ExtraList, which makes it difficult to extract the query result directly for further processing. A .data property, or a similar method, would give users direct access to the query results for calculations, transformations, or other downstream processing.
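
Until such a property exists, a caller-side helper might serve as a stopgap. The sketch below assumes only that the query result is iterable; `to_plain_list` and `FakeQueryResult` are hypothetical names that are not part of pymilvus, and the stand-in class exists solely so the example runs without a Milvus server.

```python
from typing import Any, Dict, Iterable, List


def to_plain_list(result: Iterable[Any]) -> List[Any]:
    """Copy any iterable query result into a plain list (hypothetical helper)."""
    return list(result)


class FakeQueryResult(list):
    """Stand-in for a query result that carries metadata; not a pymilvus class."""

    def __init__(self, rows: Iterable[Dict[str, Any]], extra: Dict[str, Any]) -> None:
        super().__init__(rows)
        self.extra = extra


res = FakeQueryResult([{"film_id": 3, "film_date": 2003}], extra={})
rows = to_plain_list(res)                        # plain list of row dicts
film_dates = [row["film_date"] for row in rows]  # downstream processing
```

Whether this works against a real query result depends on that result being iterable, which should be checked against the installed pymilvus version.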

CaoHaiNam · 2024-11-20T05:05:02Z

/assign @CaoHaiNam

CaoHaiNam added the kind/enhancement New feature or request label Nov 20, 2024

CaoHaiNam mentioned this issue Nov 20, 2024

[Enhancement]: Add a property to ExtraList for extracting data #2363

Open

sre-ci-robot assigned CaoHaiNam Nov 20, 2024
