exactextract Python, no coverage_fraction? #125
Comments
At the moment, this function only exists on the R side. It would not be difficult to expose in Python, but I haven't done so yet. Here are some questions that come to mind on the implementation. A limitation of the R function is that it returns a list of in-memory rasters, which is simple but greatly limits the size of the data you can process.

- Should Python do the same thing, and if so, in what format? A list of GDAL in-memory rasters? A list of NumPy matrices with associated extents?
- Alternatively (or additionally), should it write the outputs directly to disk to avoid memory limitations? As individual rasters, and if so, how would they be named? Or as a multiband raster with one band per input feature?
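For the write-to-disk option, here is a minimal consumer-side sketch, assuming (purely hypothetically) that a Python coverage_fraction returned a list of (NumPy array, affine transform) pairs, one per feature. The rasterio calls are real; the return format and naming scheme are only illustrative.

```python
# Hypothetical: `results` is what a Python coverage_fraction *might* return,
# one (2-D array of fractions, affine transform) pair per input feature.
# Nothing here is an existing exactextract API.
import rasterio

def write_coverage_tiles(results, crs, prefix="coverage"):
    for fid, (frac, transform) in enumerate(results):
        path = f"{prefix}_{fid:05d}.tif"  # one GeoTIFF per feature, named by index
        with rasterio.open(
            path, "w", driver="GTiff",
            height=frac.shape[0], width=frac.shape[1],
            count=1, dtype=frac.dtype.name, crs=crs, transform=transform,
        ) as dst:
            dst.write(frac, 1)
```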
Hey @dbaston, thanks for the information! I am not sure how it would be applied, but I enjoy using xarray (which is backed by dask and numpy) to work with larger-than-memory and multi-dimensional raster files. On the vector side there is dask support for geopandas. For larger-than-memory operations, could there be some process that runs on chunks of the input vector file and produces raster tiles saved on disk, which could later be combined with xarray operations? I am not sure about naming; generally I am interested in rasterizing to a single layer (e.g., not different percentages for different vector attributes). Alternatively, if the operation would use too much memory, the function could just error out, and it would be up to the user to chunk their vector file into pieces that fit in memory and combine the rasterized outputs later.
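A rough sketch of that chunked workflow, assuming dask-geopandas for the partitioning; the input path is a placeholder, and the per-chunk rasterization step is left as a comment because no such function exists in exactextract today.

```python
# Sketch only: partition the vector input and process one chunk at a time.
import dask_geopandas
import geopandas as gpd

gdf = gpd.read_file("polygons.shp")                       # placeholder path
dgdf = dask_geopandas.from_geopandas(gdf, npartitions=8)  # chunked geometries

for i in range(dgdf.npartitions):
    chunk = dgdf.partitions[i].compute()  # plain GeoDataFrame for this chunk
    # ...rasterize `chunk` onto the target grid and write coverage_tile_{i}.tif,
    # then combine the tiles later (e.g. lazily via rioxarray/xarray)...
```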
This could be OK, but the structure of the output seems to most naturally fit an nD sparse array (e.g. https://sparse.pydata.org/en/stable/), i.e. we could add a "polygon" dimension where slices along this dimension contain the fractional weights for that polygon. I bet we could construct these sparse arrays if you returned dense NumPy arrays and associated extents from the C++ layer. Potentially it'd be better to return dense arrays that map to COO, CSR, or CSC representations of sparse arrays. Then you could imagine chunking along the "polygon" dimension in a way that maps to a chunked vector of geometries (e.g. dask-geopandas). PS: your talk at FOSS4G-NA was quite nice!
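To make that concrete, here is a minimal sketch assuming the C++ layer handed back, for each polygon, a dense NumPy array of coverage fractions plus the (row, col) offset of that array within the full raster grid. The per-polygon inputs below are toy data, not an existing exactextract return value.

```python
import numpy as np
import sparse  # https://sparse.pydata.org

nrows, ncols = 100, 100  # full raster grid

# Toy stand-in for (dense fractions, (row_offset, col_offset)) per polygon
per_polygon = [
    (np.array([[0.25, 1.0], [0.0, 0.5]]), (10, 20)),
    (np.array([[1.0]]), (40, 60)),
]

coords, data = [], []
for poly_idx, (frac, (r0, c0)) in enumerate(per_polygon):
    rows, cols = np.nonzero(frac)
    for r, c in zip(rows, cols):
        coords.append((poly_idx, r0 + r, c0 + c))
        data.append(frac[r, c])

# 3-D sparse array with a leading "polygon" dimension
cov = sparse.COO(np.array(coords).T, np.array(data),
                 shape=(len(per_polygon), nrows, ncols))
print(cov[0].sum())  # covered area (in pixel units) of polygon 0
```

Chunking along the polygon axis (e.g. wrapping `cov` in dask) would then line up naturally with a chunked vector of geometries.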
Hello,
I have a shapefile (GeoDataFrame in Python) that I want to convert to a raster (e.g., xarray in Python) on a certain pixel grid, with each pixel quantifying the % coverage of the shapefile over that pixel's area. After some searching, I came across exactextract.
I was able to install the Python version of exactextract using
pip install exactextract
and confirmed it is functioning properly with a quick test (see below), where xras is an xarray DataArray with the pixel grid I want to match and gdf is the GeoDataFrame whose % coverage I want. I know this test code does not get that result, but it does show the package is installed and working.
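(The original test snippet did not survive in the issue text; a minimal call along these lines, with placeholder file names, is enough to confirm the package imports and runs.)

```python
import geopandas as gpd
import rioxarray
from exactextract import exact_extract

xras = rioxarray.open_rasterio("grid.tif").squeeze()  # target pixel grid (placeholder)
gdf = gpd.read_file("polygons.shp")                   # polygons of interest (placeholder)
print(exact_extract(xras, gdf, "mean"))               # per-feature mean of the raster
```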
From what I can tell, there is no equivalent of coverage_fraction in the current Python version of exactextract?
I know the Python version is pretty new, so I wanted to confirm this functionality is not yet available. If not, does anyone have any options for achieving this result currently in Python? I know about rasterio's rasterize, but it classifies pixels based on centroids or intersection ("all_touched"), rather than % coverage.
Thanks!
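One workaround that is possible today, independent of exactextract: intersect each pixel box with the dissolved polygons and divide by the pixel area. It is exact but slow, since every pixel triggers a shapely intersection, so the sketch below is only practical for modest grid sizes; the grid parameters are assumptions rather than anything from this issue.

```python
# Slow-but-exact per-pixel coverage fractions using shapely/xarray only.
import numpy as np
import xarray as xr
from shapely.geometry import box
from shapely.ops import unary_union

def coverage_fraction_grid(gdf, xmin, ymin, xmax, ymax, res):
    geom = unary_union(list(gdf.geometry))  # dissolve all features into one geometry
    xs = np.arange(xmin, xmax, res)         # left edges of pixels
    ys = np.arange(ymax, ymin, -res)        # top edges of pixels
    frac = np.zeros((len(ys), len(xs)))
    for i, top in enumerate(ys):
        for j, left in enumerate(xs):
            pixel = box(left, top - res, left + res, top)
            frac[i, j] = pixel.intersection(geom).area / (res * res)
    return xr.DataArray(
        frac,
        coords={"y": ys - res / 2, "x": xs + res / 2},  # pixel centers
        dims=("y", "x"),
        name="coverage_fraction",
    )
```

A coverage_fraction exposed from the C++ core would give the same result without the per-pixel Python loop.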