Right now titiler-cmr can handle the case where granules are defined by a distinct time point and where each granule has the same set of variables. There are many datasets in CMR where each granule represents the same timestep but contains a different variable. For example, the Regridded Harmonized World Soil Database v1.2 dataset has 27 granules that each contain estimates of a different soil property.
To handle this case we could use the bands_regex parameter in the xarray backend case to filter the granule results down to the one that matches the regex. We would need to change the format of mosaic_assets in this case, since the ZarrReader can't handle the dictionary of {band: url} entries that get_assets produces when you provide bands_regex.
I hacked a solution together just to see if it is feasible; here are some tiles from the soil pH layer of that dataset:
$ git diff titiler/.
diff --git a/titiler/cmr/backend.py b/titiler/cmr/backend.py
index 50307e7..1d2a676 100644
--- a/titiler/cmr/backend.py
+++ b/titiler/cmr/backend.py
@@ -231,12 +231,21 @@ class CMRBackend(BaseBackend):
             access=s3_auth_config.access,
             bands_regex=bands_regex,
         )
-
         if not mosaic_assets:
             raise NoAssetFoundError(
                 f"No assets found for tile {tile_z}-{tile_x}-{tile_y}"
             )
+        # reformat the mosaic_assets to match expectation for xarray backend
+        # would only want to do this for the backend="xarray" case...
+        if bands_regex:
+            asset = mosaic_assets[0]
+            if len(asset) > 1:
+                raise ValueError("bands_regex returned multiple assets!")
+            url = list(asset["url"].values())[0]
+            provider = asset["provider"]
+            mosaic_assets = [{"url": url, "provider": provider}]
+
         def _reader(asset: Asset, x: int, y: int, z: int, **kwargs: Any) -> ImageData:
             if (
                 s3_auth_config.strategy == "environment"
diff --git a/titiler/cmr/factory.py b/titiler/cmr/factory.py
index e10d3af..8e7b73d 100644
--- a/titiler/cmr/factory.py
+++ b/titiler/cmr/factory.py
@@ -121,7 +121,9 @@ def parse_reader_options(
     if reader_params.backend == "xarray":
         reader = ZarrReader
-        read_options = {}
+        read_options = {
+            "bands_regex": rasterio_params.bands_regex,
+        }
         options = {
             "variable": zarr_params.variable,
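For discussion, the reshaping step from the backend.py hunk could be pulled out into a small standalone helper. This is only a sketch: the name flatten_single_asset is hypothetical, and the asset shape ({"url": {band: url}, "provider": ...}) is inferred from the diff rather than taken from the titiler-cmr code. It also assumes the intent of the length check is to count the entries in the url mapping, and like the diff it only looks at the first asset.

```python
from typing import Any, Dict, List


def flatten_single_asset(mosaic_assets: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Collapse a {band: url} mapping into a flat {"url", "provider"} asset.

    Hypothetical helper; the input shape is an assumption based on the diff.
    """
    if not mosaic_assets:
        raise ValueError("no assets to reformat")

    # Like the hack in the diff, only the first asset is considered.
    asset = mosaic_assets[0]
    urls = asset["url"]

    # Already flat (no regex was applied): pass it through unchanged.
    if isinstance(urls, str):
        return [{"url": urls, "provider": asset["provider"]}]

    # The regex is expected to narrow the mapping down to exactly one URL.
    if len(urls) != 1:
        raise ValueError(f"expected exactly one matching URL, got {len(urls)}")

    (url,) = urls.values()
    return [{"url": url, "provider": asset["provider"]}]
```

With a helper like this, the conditional in the backend would shrink to a single call in the xarray branch.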
netcdf stored as multiple assets (file per variable)
First, I think we should rename bands_regex -> assets_regex
I think I lost track, but for xarray datasets we need a variable= option, right? I'm not sure why we need to pass bands_regex to the reader (with read_options); the variable should be one asset from the list of assets returned.
The mixed xarray/rasterio logic is starting to get a bit messy with conditional checks in the single CMRBackend class. Maybe we are at the point where it would be cleaner to have several backends: CMRRasterioBackend and CMRXarrayBackend. There could be some shared utility functions but this structure might make it easier to do the right thing for each of these cases.
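A rough sketch of what that split could look like, using plain Python classes rather than the real titiler or rio-tiler base classes. CMRBaseBackend, assets_for_tile, and format_asset are hypothetical names for illustration, and the asset shape is assumed from the diff above; only CMRRasterioBackend and CMRXarrayBackend come from the suggestion itself.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

Asset = Dict[str, Any]


class CMRBaseBackend(ABC):
    """Hypothetical shared base holding the logic common to both readers."""

    def assets_for_tile(self, found: List[Asset]) -> List[Asset]:
        # Shared behaviour: fail early when the search returned nothing.
        if not found:
            raise LookupError("no assets found for tile")
        return [self.format_asset(asset) for asset in found]

    @abstractmethod
    def format_asset(self, asset: Asset) -> Asset:
        """Reshape one asset into the form the reader expects."""


class CMRRasterioBackend(CMRBaseBackend):
    def format_asset(self, asset: Asset) -> Asset:
        # Assumption: the rasterio reader consumes the {band: url} mapping as-is.
        return asset


class CMRXarrayBackend(CMRBaseBackend):
    def format_asset(self, asset: Asset) -> Asset:
        # Assumption: the xarray reader needs exactly one URL per asset.
        urls = asset["url"]
        if isinstance(urls, dict):
            if len(urls) != 1:
                raise ValueError(f"expected one URL, got {len(urls)}")
            (url,) = urls.values()
        else:
            url = urls
        return {"url": url, "provider": asset["provider"]}
```

The point of the sketch is that each backend owns its own asset format, so the `if backend == "xarray"` style checks would no longer need to live in one class.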