Vector cube design: property preservation when using aggregate spatial #466
Comments
- flattening: move options to export phase instead of vector cube constructor
- introduce `VectorCube.from_geodataframe` with support for promoting selected columns to cube values
- regardless of promotion: all properties are still associated with `VectorCube.geometries` for now (otherwise properties cannot be preserved when using `aggregate_spatial`, see Open-EO/openeo-api#504)
- only promote numerical values by default for now
Don't have a good answer to this one. For now, we kind of implement the 'no' approach, but we do try to preserve things like the feature identifier, as this is relevant to keep track of which timeseries belongs to which geometry.
The feature identifier is not part of the (GeoJSON) properties and belongs to the "core metadata" (as it resides at the top level). That was at least always my "mental" model, based on GeoJSON. I'd think it's probably a good idea to keep track of it anyway. Maybe we need to clarify this? My aim in 2.0.0 was to clearly communicate whether properties are preserved or not. Maybe it's not clear enough in all processes, but at least aggregate_spatial mentions it in the description of the geometries parameter:
load_geojson, vector_to_random_points, vector_to_regular_points and vector_buffer similarly say:
I assume you are talking here about an "id" member of a Feature object, e.g.

    {
      "type": "Feature",
      "id": "abc123",
      "geometry": {...},
      "properties": {...}
    }

While this seems to be part of the GeoJSON RFC (
Well, part of the problem I'm trying to raise is that there is a conflict regarding vector cube design:
For example:
You cannot combine the original cube data ["geometry", "property"] with the aggregated cube data ["time", "bands", "geometry"] in a single cube, e.g. because the number of dimensions is different. Strictly speaking, the dimension types of "property" (type "other"?) and "bands" (type "bands") are probably not compatible either, but that could be adapted relatively easily, I guess. So what I'm trying to say is that this current statement in aggregate_spatial
is incompatible with the current consensus for vector cube design (store properties as cube values).
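
To make the dimension mismatch concrete, here is a minimal NumPy sketch. The sizes and values are made up, the arrays are plain stand-ins for cube values (not an openEO or backend data structure), and the helper `try_concatenate` is hypothetical. It shows that a ["geometry", "property"] array and a ["time", "bands", "geometry"] array cannot be combined directly, and that a manual alignment step is needed first.

```python
import numpy as np

# Illustrative sizes only (assumptions, not taken from any real cube)
n_geometries, n_properties = 3, 2
n_time, n_bands = 4, 1

# Stand-in for the original vector cube values: ("geometry", "property")
original = np.zeros((n_geometries, n_properties))
# Stand-in for the aggregated cube values: ("time", "bands", "geometry")
aggregated = np.ones((n_time, n_bands, n_geometries))


def try_concatenate(a, b):
    """Return the concatenated array, or None if the shapes are incompatible."""
    try:
        return np.concatenate([a, b])
    except ValueError:
        return None


# Naive combination fails: 2 dimensions versus 3 dimensions
naive = try_concatenate(original, aggregated)

# Manual alignment: repeat the properties along "time" and reorder the
# aggregated data to ("time", "geometry", "bands") before concatenating
original_per_time = np.broadcast_to(original, (n_time, n_geometries, n_properties))
aggregated_reordered = aggregated.transpose(0, 2, 1)
combined = np.concatenate([original_per_time, aggregated_reordered], axis=-1)

print(naive)           # None
print(combined.shape)  # (4, 3, 3)
```

The alignment here only works because the code knows which axis means what and treats "bands" and "property" as interchangeable, which is exactly the kind of decision that cannot be made generically.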
(Related to use case experiments discussed in #448 #449)

Set up:

- `vc1`, loaded from a GeoJSON feature collection, where each feature is some polygon with some properties, e.g. crop type, population, an ML target value or class, ...
- `cube` with e.g. NDVI data
- `vc2 = aggregate_spatial(data=cube, geometries=vc1, reducer="mean")`

Question: are the original GeoJSON-style properties of `vc1` still available in `vc2`?

- Yes: `vc2` can directly be used to train an ML model?
- No: `aggregate_spatial` only considers `vc1`'s geometry and ignores any existing cube data? The user then has to take some tedious steps to "join"/merge `vc1` and `vc2` again in order to use it for ML applications.

I kind of remember vector cube discussions where we wanted preservation of properties (the "Yes" approach), e.g. using `aggregate_spatial` to "enrich" a vector cube with additional "columns" of aggregation data.

However, I think the current design of vector cubes enforces the "No" approach because there are just cube values and you cannot generically/automatically combine pre-existing cube data with new (aggregation) cube data.
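
As an illustration of the "join"/merge step that the "No" approach leaves to the user, here is a minimal client-side sketch in plain Python. Everything in it is hypothetical: the feature ids, the property names, the NDVI values and the helper `join_properties` are made up, and it assumes the feature identifier survives aggregation so the two results can be matched.

```python
# Hypothetical stand-in for vc1: GeoJSON-style features with an id and properties
vc1_features = [
    {"type": "Feature", "id": "field-1", "geometry": None,
     "properties": {"crop_type": "wheat"}},
    {"type": "Feature", "id": "field-2", "geometry": None,
     "properties": {"crop_type": "maize"}},
]

# Hypothetical stand-in for vc2: aggregated mean NDVI keyed by feature id
vc2_ndvi_mean = {"field-1": 0.61, "field-2": 0.74}


def join_properties(features, aggregates, column):
    """Build one flat row per feature: id, original properties, aggregated value."""
    rows = []
    for feature in features:
        row = {"id": feature["id"], **feature["properties"]}
        # Features without an aggregated value get None in the new column
        row[column] = aggregates.get(feature["id"])
        rows.append(row)
    return rows


training_rows = join_properties(vc1_features, vc2_ndvi_mean, "ndvi_mean")
for row in training_rows:
    print(row)
```

With the "Yes" approach this step would not be needed, since the aggregated values would simply show up as extra "columns" next to the existing properties.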