Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRS comparison fails due to WKT format differences when using rioxarray raster and geopandas vector (WKT1 vs. WKT2) #159

Open
AlexDo1 opened this issue Nov 18, 2024 · 2 comments · May be fixed by #160

Comments

@AlexDo1
Copy link

AlexDo1 commented Nov 18, 2024

Hello and thank you for the amazing package that computes (exact) zonal statistics in a fast and reliable way. I am using Python with geopandas and xarray/rioxarray together with exactextract and have found an issue in your code.

The CRS comparison:

def crs_matches(a, b):
if a.srs_wkt() is None or b.srs_wkt() is None:
return True
if a.srs_wkt() == b.srs_wkt():
return True
try:
from osgeo import osr
srs_a = osr.SpatialReference()
srs_b = osr.SpatialReference()
srs_a.ImportFromWkt(a.srs_wkt())
srs_b.ImportFromWkt(b.srs_wkt())
try:
srs_a.StripVertical()
srs_b.StripVertical()
except AttributeError:
pass
return srs_a.IsSame(srs_b)
except ImportError:
return False

relies on comparing the WKT representation of the CRS. However, geopandas and rioxarray handle WKT strings differently:

  • geopandas uses pyproj, which defaults to WKT2 (WKT2_2019).
  • rioxarray uses rasterio, which defaults to WKT1 (WKT1_GDAL).
    This results in discrepancies even when both datasets have the same CRS (e.g., EPSG:4326). As a result, the CRS comparison gives a warning, even though the underlying coordinate systems are identical ("Spatial reference system of input features does not exactly match raster.")

This behavior can be reproduced with the following example:

import geopandas as gpd
import rioxarray
import xarray as xr
import numpy as np
from pyproj import CRS
from exactextract.exact_extract import crs_matches
from exactextract.feature import GeoPandasFeatureSource
from exactextract.raster import XArrayRasterSource

# Define the CRS as EPSG:4326
crs = CRS.from_epsg(4326)

# Create a simple GeoDataFrame
gdf = gpd.GeoDataFrame(geometry=[], crs=crs)

# Create a simple xarray Dataset
data = np.random.rand(5, 5)
lat = np.linspace(-90, 90, 5)  # y-dimension (latitude)
lon = np.linspace(-180, 180, 5)  # x-dimension (longitude)

# Create the Dataset
ds = xr.Dataset(
    {
        "band1": (["y", "x"], data)
    },
    coords={
        "y": lat,
        "x": lon,
    },
)
ds = ds.rio.write_crs(crs)  # Write the CRS to the dataset

# Print WKT representations
print("GeoDataFrame CRS (WKT):")
print(gdf.crs.to_wkt())  # WKT2
# GEOGCRS["WGS 84",ENSEMBLE["World Geodetic System 1984 ensemble",MEMBER["World Geodetic System 1984 (Transit)"],MEMBER["World Geodetic System 1984 (G730)"],MEMBER["World Geodetic System 1984 (G873)"],MEMBER["World Geodetic System 1984 (G1150)"],MEMBER["World Geodetic System 1984 (G1674)"],MEMBER["World Geodetic System 1984 (G1762)"],MEMBER["World Geodetic System 1984 (G2139)"],ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]],ENSEMBLEACCURACY[2.0]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],CS[ellipsoidal,2],AXIS["geodetic latitude (Lat)",north,ORDER[1],ANGLEUNIT["degree",0.0174532925199433]],AXIS["geodetic longitude (Lon)",east,ORDER[2],ANGLEUNIT["degree",0.0174532925199433]],USAGE[SCOPE["Horizontal component of 3D system."],AREA["World."],BBOX[-90,-180,90,180]],ID["EPSG",4326]]

print("\nDataset CRS (WKT):")
print(ds.rio.crs.to_wkt())  # WKT1
# GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]

# Attempt to compare WKT strings
if gdf.crs.to_wkt() == ds.rio.crs.to_wkt():
    print("\nCRS match.")
else:
    print("\nCRS mismatch.")

# Attempt to compare CRS objects as exactextract does
if not crs_matches(GeoPandasFeatureSource(gdf), XArrayRasterSource(ds)):
    print("\nCRS mismatch (exactextract).")
else:
    print("\nCRS match (exactextract).")

So even though the crs of the GeoDataFrame and the Dataset were created from the same pyproj.CRS object, exactextracts gives a warning that the is a CRS mismatch.

As a possible solution I would propose to just compare EPSG codes instead of WKT representations.

@dbaston
Copy link
Member

dbaston commented Nov 18, 2024

If you have the GDAL Python bindings available, it will use those to check CRS equality. I guess another fallback could be added to use pyproj to do the same.

@AlexDo1
Copy link
Author

AlexDo1 commented Nov 19, 2024

I use exactextract in a Docker container where I don't have the GDAL Python bindings available.
So without the Python GDAL bindings, exactextract will always warn that the spatial reference systems do not match, which is a very concerning warning for users.
I will just ignore the warning for now, thank you for the clarification.

dbaston added a commit to dbaston/exactextract that referenced this issue Nov 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants