Vector cube design: property preservation when using aggregate spatial #466

soxofaan · 2023-08-02T16:52:44Z

(Related to use case experiments discussed in #448 #449)

Set up:

vector cube vc1, loaded from a GeoJSON feature collection, where each feature is some polygon with some properties, e.g. crop type, population, an ML target value or class, ...
raster cube cube with e.g. NDVI data
vc2 = aggregate_spatial(data=cube, geometries=vc1, reducer="mean")

Question: are the original GeoJSON-style properties of vc1 still available in vc2?

Yes -> E.g. so that vc2 can directly be used to train a ML model?
No -> aggregate_spatial only considers vc1's geometry and ignores any existing cube data? The user then has to take some tedious steps to "join"/merge vc1 and vc2 again in order to use it for ML applications.

I kind of remember vector cube discussions where we wanted preservation of properties (the "Yes" approach), e.g. using aggregate_spatial to "enrich" a vector cube with additional "columns" of aggregation data.
However, I think the current design of vector cubes enforces the "No" approach because there are just cube values and you can not generically/automatically combine pre-existing cube data with new (aggregation) cube data.

The text was updated successfully, but these errors were encountered:

- flattening: move options to export phase iso vector cube constructor - introduce `VectorCube.from_geodataframe` wiith support for promoting selected columns to cube values - regardless of promotion: all properties are still associated with `VectorCube.geometries` for now (otherwise properties can not be preserved when using `aggregate_spatial`, see Open-EO/openeo-api#504) - only promote numerical values by default for now

jdries · 2023-08-03T10:49:27Z

Don't have a good answer to this one. For now, we kind of implement the 'no' approach, but we do try to preserve things like the feature identifier, as this is relevant to keep track of which timeseries belongs to which geometry.

m-mohr · 2023-09-30T08:31:38Z

The feature identifier is not part of the (GeoJSON) properties and belongs to the "core metadata" (as it resides at the top-level). That was at least always my "mental" model, based on GeoJSON. I'd think it's probably a good idea to keep track of it anyway. Maybe we need to clarify this?

My aim in 2.0.0 was to clearly communicate whether properties are preserved or not. Maybe it's not clear enough in all processes, but at least aggregate_spatial mentioned in the geometries parameter:

Feature properties are preserved for vector data cubes and all GeoJSON Features.

load_geojson, vector_to_random_points, vector_to_regular_points and vector_buffer similarly say:

Feature properties are preserved.

soxofaan · 2023-10-16T13:08:40Z

The feature identifier is not part of the (GeoJSON) properties and belongs to the "core metadata" (as it resides at the top-level).

I assume you are talking here about a "id" member of a Feature object, e.g.

{
  "type": "Feature", 
  "id": "abc123", 
  "geometry": {...}, 
  "properties": {...}

While this seems to be part of the GeoJSON RFC (If a Feature has a commonly used identifier, that identifier SHOULD be included as a member of the Feature object with the name "id"), I think I've seen quite some cases in the wild where the "id" is under the "properties" instead of more at the "Feature" top level. Maybe this can be fixed by adding an option to GeoJSON loading processes to promote a given property to more standard "id".

soxofaan · 2023-10-16T13:50:36Z

My aim in 2.0.0 was to clearly communicate whether properties are preserved or not. Maybe it's not clear enough in all processes, but at least aggregate_spatial mentioned in the geometries parameter:

Feature properties are preserved for vector data cubes and all GeoJSON Features.

Well, part of the problem I'm trying to raise here is that there is a conflict here regarding vector cube design:

the consensus has risen to "load" GeoJSON properties as vector cube values (like pixel values in a raster cube)
when using a vector cube as geometry in a process like aggregate_spatial, you can not preserve its original cube values because you are generating (aggregating) new cube values, which can not combined, generically, with the original cube values.

For example:

initial vector cube with dimensions ["geometry", "property"], where the "property" dimension has labels "population" and "cropclass"
you use that vector cube with aggregate_spatial on a raster cube with dimensions ["time", "bands", "x, "y"] (e.g. bands "R", "G", "B")
the output of aggregate_spatial is a vector cube with dimensions ["time", "bands", "geometry"]

You can not combine the original cube data ["geometry", "property"] with the aggregated cube data ["time", "bands", "geometry"] in a single cube, e.g. because the number of dimensions is different. The dimension type of "property" (type "other"?) and "bands" (type "bands") is probably also not compatible strictly speaking, but that could be adapted to relatively easy I guess.

So what I'm trying to say is this current statement in aggregate_spatial

Feature properties are preserved for vector data cubes and all GeoJSON Features.

is incompatible with the current consensus for vector cube design (store properties as cube values).

soxofaan mentioned this issue Aug 3, 2023

Running UDF on Vector Cube Open-EO/openeo-python-driver#197

Open

m-mohr transferred this issue from Open-EO/openeo-api Sep 30, 2023

m-mohr added vector discussion needed labels Jan 3, 2024

m-mohr added this to the 2.0.0 milestone Jan 3, 2024

soxofaan mentioned this issue Feb 2, 2024

VectorCube support Open-EO/openeo-python-driver#114

Open

This comment was marked as off-topic.

Sign in to view

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Vector cube design: property preservation when using aggregate spatial #466

Vector cube design: property preservation when using aggregate spatial #466

soxofaan commented Aug 2, 2023 •

edited

Loading

jdries commented Aug 3, 2023

m-mohr commented Sep 30, 2023

soxofaan commented Oct 16, 2023

soxofaan commented Oct 16, 2023

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

Vector cube design: property preservation when using aggregate spatial #466

Vector cube design: property preservation when using aggregate spatial #466

Comments

soxofaan commented Aug 2, 2023 • edited Loading

jdries commented Aug 3, 2023

m-mohr commented Sep 30, 2023

soxofaan commented Oct 16, 2023

soxofaan commented Oct 16, 2023

This comment was marked as off-topic.

This comment was marked as off-topic.

This comment was marked as off-topic.

soxofaan commented Aug 2, 2023 •

edited

Loading