exactextract Python, no coverage_fraction? #125
Comments
At the moment, this function only exists on the R side. It would not be difficult to expose in Python, but I haven't done so yet. Here are some questions that come to mind on the implementation. A limitation of the R function is that it returns a list of in-memory rasters, which is simple but greatly limits the size of the data you can process.

- Should Python do the same thing, and if so, in what format? A list of GDAL in-memory rasters? A list of NumPy matrices with associated extents?
- Alternatively (or additionally), should it write the outputs directly to disk to avoid memory limitations? As individual rasters, and if so, how would they be named? Or as a multiband raster with one band per input feature?
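For the write-to-disk option, here is a minimal consumer-side sketch, assuming (purely hypothetically) that a Python coverage_fraction returned a list of (NumPy array, affine transform) pairs, one per feature. The rasterio calls are real; the return format and naming scheme are only illustrative.

```python
# Hypothetical: `results` is what a Python coverage_fraction *might* return,
# one (2-D array of fractions, affine transform) pair per input feature.
# Nothing here is an existing exactextract API.
import rasterio

def write_coverage_tiles(results, crs, prefix="coverage"):
    for fid, (frac, transform) in enumerate(results):
        path = f"{prefix}_{fid:05d}.tif"  # one GeoTIFF per feature, named by index
        with rasterio.open(
            path, "w", driver="GTiff",
            height=frac.shape[0], width=frac.shape[1],
            count=1, dtype=frac.dtype.name, crs=crs, transform=transform,
        ) as dst:
            dst.write(frac, 1)
```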
Hey @dbaston, thanks for the information! I am not sure how it would be applied, but I enjoy using xarray (which is backed by dask and numpy) to work with larger-than-memory and multi-dimensional raster files. On the vector side there is dask support for geopandas. For larger-than-memory operations, could there be some process that runs on chunks of the input vector file and produces raster tiles saved on disk, which could later be combined with xarray operations? I am not sure about naming; generally I am interested in rasterizing to a single layer (e.g., not different percentages for different vector attributes). Alternatively, if the operation would use too much memory, the function could just error out, and it would be up to the user to chunk their vector file into pieces that fit in memory and combine the rasterized outputs later.
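A rough sketch of that chunked workflow, assuming dask-geopandas for the partitioning; the input path is a placeholder, and the per-chunk rasterization step is left as a comment because no such function exists in exactextract today.

```python
# Sketch only: partition the vector input and process one chunk at a time.
import dask_geopandas
import geopandas as gpd

gdf = gpd.read_file("polygons.shp")                       # placeholder path
dgdf = dask_geopandas.from_geopandas(gdf, npartitions=8)  # chunked geometries

for i in range(dgdf.npartitions):
    chunk = dgdf.partitions[i].compute()  # plain GeoDataFrame for this chunk
    # ...rasterize `chunk` onto the target grid and write coverage_tile_{i}.tif,
    # then combine the tiles later (e.g. lazily via rioxarray/xarray)...
```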
This could be OK, but the structure of the output seems to most naturally fit an nD sparse array (e.g. https://sparse.pydata.org/en/stable/), i.e. we could add a "polygon" dimension where slices along this dimension contain the fractional weights for that polygon. I bet we could construct these sparse arrays if you returned dense NumPy arrays and associated extents from the C++ layer. Potentially it'd be better to return dense arrays that map to COO, CSR, or CSC representations of sparse arrays. Then you could imagine chunking along the "polygon" dimension in a way that maps to a chunked vector of geometries (e.g. dask-geopandas). PS: your talk at FOSS4G-NA was quite nice!
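To make that concrete, here is a minimal sketch assuming the C++ layer handed back, for each polygon, a dense NumPy array of coverage fractions plus the (row, col) offset of that array within the full raster grid. The per-polygon inputs below are toy data, not an existing exactextract return value.

```python
import numpy as np
import sparse  # https://sparse.pydata.org

nrows, ncols = 100, 100  # full raster grid

# Toy stand-in for (dense fractions, (row_offset, col_offset)) per polygon
per_polygon = [
    (np.array([[0.25, 1.0], [0.0, 0.5]]), (10, 20)),
    (np.array([[1.0]]), (40, 60)),
]

coords, data = [], []
for poly_idx, (frac, (r0, c0)) in enumerate(per_polygon):
    rows, cols = np.nonzero(frac)
    for r, c in zip(rows, cols):
        coords.append((poly_idx, r0 + r, c0 + c))
        data.append(frac[r, c])

# 3-D sparse array with a leading "polygon" dimension
cov = sparse.COO(np.array(coords).T, np.array(data),
                 shape=(len(per_polygon), nrows, ncols))
print(cov[0].sum())  # covered area (in pixel units) of polygon 0
```

Chunking along the polygon axis (e.g. wrapping `cov` in dask) would then line up naturally with a chunked vector of geometries.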
Hello,
I have a shapefile (GeoDataFrame in Python) that I want to convert to a raster (e.g., xarray in Python) on a certain pixel grid, with each pixel quantifying the % coverage of the shapefile over that pixel's area. After some searching, I came across exactextract.
I was able to install the Python version of exactextract using
pip install exactextract
and confirmed it is functioning properly with a quick test (see below), where xras is an xarray DataArray with the pixel grid I want to match and gdf is the GeoDataFrame whose % coverage I want. I know this test code does not get that result, but it does show the package is installed and working.
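(The original test snippet did not survive in the issue text; a minimal call along these lines, with placeholder file names, is enough to confirm the package imports and runs.)

```python
import geopandas as gpd
import rioxarray
from exactextract import exact_extract

xras = rioxarray.open_rasterio("grid.tif").squeeze()  # target pixel grid (placeholder)
gdf = gpd.read_file("polygons.shp")                   # polygons of interest (placeholder)
print(exact_extract(xras, gdf, "mean"))               # per-feature mean of the raster
```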
From what I can tell, there is no equivalent of coverage_fraction in the current Python version of exactextract?
I know the Python version is pretty new, so I wanted to confirm this functionality is not yet available. If not, does anyone have any options for achieving this result currently in Python? I know about rasterio's rasterize, but it classifies pixels based on centroids or intersection ("all_touched"), rather than % coverage.
Thanks!
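One workaround that is possible today, independent of exactextract: intersect each pixel box with the dissolved polygons and divide by the pixel area. It is exact but slow, since every pixel triggers a shapely intersection, so the sketch below is only practical for modest grid sizes; the grid parameters are assumptions rather than anything from this issue.

```python
# Slow-but-exact per-pixel coverage fractions using shapely/xarray only.
import numpy as np
import xarray as xr
from shapely.geometry import box
from shapely.ops import unary_union

def coverage_fraction_grid(gdf, xmin, ymin, xmax, ymax, res):
    geom = unary_union(list(gdf.geometry))  # dissolve all features into one geometry
    xs = np.arange(xmin, xmax, res)         # left edges of pixels
    ys = np.arange(ymax, ymin, -res)        # top edges of pixels
    frac = np.zeros((len(ys), len(xs)))
    for i, top in enumerate(ys):
        for j, left in enumerate(xs):
            pixel = box(left, top - res, left + res, top)
            frac[i, j] = pixel.intersection(geom).area / (res * res)
    return xr.DataArray(
        frac,
        coords={"y": ys - res / 2, "x": xs + res / 2},  # pixel centers
        dims=("y", "x"),
        name="coverage_fraction",
    )
```

A coverage_fraction exposed from the C++ core would give the same result without the per-pixel Python loop.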