Releases: zarr-developers/VirtualiZarr
v1.1.0
This release adds Icechunk support!!
It also brings a complete internal refactoring of the readers and writers, which allowed us to make Kerchunk an optional dependency. There are also many other bugfixes and smaller improvements.
What's Changed
- xr.testing with ManifestArray fix (update isnan ufunc) by @ayushnag in #188
- Clarify that virtualizarr is a user-level replacement for kerchunk by @TomNicholas in #192
- Exclude empty paths on ChunkDict creation by @ghidalgo3 in #198
- Extend refspec support to [path] entries (without offset/length) by @maresb in #187
- Conformant ZarrV3 codecs and fill values by @ghidalgo3 in #193
- https access fix by @ayushnag in #196
- Handle scalar dataset variables by @ghidalgo3 in #205
- Set ZArray default fill_value as NaT for datetime64 by @thodson-usgs in #206
- Update .pre-commit-config mypy + bump ruff version by @norlandrhagen in #211
- Update static typing by @TomAugspurger in #213
- Adds concurrency to CI w/ cancel-in-progress=True by @norlandrhagen in #214
- Implement pydantic models as dataclasses by @TomAugspurger in #210
- use the theme options for pydata_sphinx_theme by @keewis in #223
- Removes default storage options by @norlandrhagen in #228
- open_virtual_dataset with dmr++ by @ayushnag in #113
- Internal refactor to separate reading and writing concerns by @TomNicholas in #231
- Let Xarray handle decode_times by @norlandrhagen in #232
- Support specifying single HDF Group in open_virtual_dataset by @scottyhq in #165
- Adds defaults in open_virtual_dataset_from_v3_store by @norlandrhagen in #234
- Virtualizarr + Coiled Serverless Example Notebook by @norlandrhagen in #233
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #236
- Add example to create a virtual dataset using lithops by @thodson-usgs in #203
- Update backend.py (tiny typo) by @mdsumner in #240
- Makes mypy a separate CI job by @norlandrhagen in #254
- Fix mypy errors around numpy functions not being strictly type hinted by @TomNicholas in #252
- Allow open_virtual_dataset to read existing Kerchunk references by @norlandrhagen in #251
- Skip tests that require kerchunk by @TomNicholas in #259
- allow creating references for empty archival datasets by @keewis in #260
- Split kerchunk reader up by @TomNicholas in #261
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #250
- Add CI job for testing upstream versions of dependencies by @TomNicholas in #264
- Add Icechunk Support by @mpiannucci in #256
New Contributors
- @ayushnag made their first contribution in #188
- @ghidalgo3 made their first contribution in #198
- @maresb made their first contribution in #187
- @thodson-usgs made their first contribution in #206
- @keewis made their first contribution in #223
- @mdsumner made their first contribution in #240
- @mpiannucci made their first contribution in #256
Full Changelog: v1.0.0...v1.1.0
v1.0.0
This release marks VirtualiZarr as mostly feature-complete, in the sense of achieving feature parity with kerchunk's logic for combining datasets, providing an easier way to manipulate kerchunk references in memory and generate kerchunk reference files on disk.
Future VirtualiZarr development will focus on generalizing and upstreaming useful concepts into the Zarr specification, the Zarr-Python library, Xarray, and possibly some new packages. See the roadmap in the documentation for details.
What's Changed
- Hypothesis test broadcasting by @TomNicholas in #139
- Empty release notes for v0.2 by @TomNicholas in #145
- Mark tests which require network access by @TomNicholas in #144
- Install dependencies for tests via mamba by @maxrjones in #148
- Use default version scheme for setuptools_scm by @maxrjones in #149
- Use 3 numpy arrays for manifest internally by @TomNicholas in #107
- Rename paths in manifest by @TomNicholas in #152
- Ensure _ARRAY_DIMENSIONS get dropped from attrs by @TomNicholas in #153
- Ensure attributes on coordinate variables are preserved during round-tripping by @TomNicholas in #154
- Identify non dimension coords by @TomNicholas in #156
- Also test exporting references to in-memory kerchunk reference dict by @TomNicholas in #158
- Use magic bytes to identify file formats by @scottyhq in #143
- Decoding cftime_variables by @jsignell in #122
- Fix opening tiff and fits files by @TomNicholas in #162
- Clarify that virtual datasets are not normal xarray datasets by @TomNicholas in #173
- Warn on index creation by @TomNicholas in #170
- Update roadmap for v1.0 by @TomNicholas in #164
- Add example of using cftime_variables to usage docs by @TomNicholas in #174
- Future-proof offset and size records in chunkmanifest by @moradology in #177
- Use a set to avoid duplicate var names from kerchunk by @moradology in #179
- v1.0 release notes by @TomNicholas in #181
New Contributors
- @scottyhq made their first contribution in #143
- @moradology made their first contribution in #177
Full Changelog: v0.1.0...v1.0
v0.1.0
The first release of VirtualiZarr!
This release presents the basic MVP of this library, including the ability to inspect netCDF4/HDF5 files, store the byte ranges in an xarray.Dataset via ManifestArray objects, concatenate those objects, then serialize the result to disk as kerchunk-formatted reference files.
Expect more features and significant optimizations soon.
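To make the workflow above concrete, here is a minimal pure-Python sketch of the underlying idea: a manifest maps chunk keys to byte ranges in files, manifests are concatenated by shifting chunk indices, and the result is written out in the kerchunk reference layout. This is not VirtualiZarr's API. The helper names (`concat_manifests`, `to_kerchunk_refs`), the file paths, and the byte offsets are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical stand-in for a chunk manifest:
# chunk key (e.g. "0.1") -> where that chunk's bytes live in a file.
Manifest = dict[str, dict[str, object]]


def concat_manifests(first: Manifest, second: Manifest, n_chunks_axis0: int) -> Manifest:
    """Concatenate two manifests along axis 0.

    Chunk keys of `second` are shifted by `n_chunks_axis0`, the number of
    chunks `first` has along axis 0. No array data is read or copied.
    """
    combined = dict(first)
    for key, entry in second.items():
        index = [int(i) for i in key.split(".")]
        index[0] += n_chunks_axis0
        combined[".".join(str(i) for i in index)] = entry
    return combined


def to_kerchunk_refs(var_name: str, manifest: Manifest) -> dict:
    """Serialize a manifest as kerchunk-style references: key -> [path, offset, length]."""
    refs = {
        f"{var_name}/{key}": [entry["path"], entry["offset"], entry["length"]]
        for key, entry in manifest.items()
    }
    return {"version": 1, "refs": refs}


# Two files, each holding one row of two chunks (illustrative values only).
day1 = {
    "0.0": {"path": "s3://bucket/day1.nc", "offset": 100, "length": 50},
    "0.1": {"path": "s3://bucket/day1.nc", "offset": 150, "length": 50},
}
day2 = {
    "0.0": {"path": "s3://bucket/day2.nc", "offset": 100, "length": 50},
    "0.1": {"path": "s3://bucket/day2.nc", "offset": 150, "length": 50},
}

combined = concat_manifests(day1, day2, n_chunks_axis0=1)
refs = to_kerchunk_refs("temperature", combined)
print(json.dumps(refs, indent=2))
```

The point of the sketch is that combining datasets only rewrites chunk keys; the byte ranges still point at the original files, so no array data is copied.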
What's Changed
- Xarray accessor to create kerchunk reference dict by @TomNicholas in #28
- Equality checking by @TomNicholas in #30
- Support xarray concat, including broadcasting by @TomNicholas in #34
- CI for running tests by @TomNicholas in #36
- Roughout of Sphinx Docs by @norlandrhagen in #27
- Updated doc.yml to include pip by @norlandrhagen in #40
- Adds netCDF3 vs netCDF4 distinction to _automatically_determine_filetype. by @norlandrhagen in #43
- Test concat of dimension coordinate not backed by an index by @TomNicholas in #44
- Rename open_dataset_via_kerchunk to open_virtual_dataset by @TomNicholas in #47
- Narrative docs by @TomNicholas in #48
- More narrative docs by @TomNicholas in #50
- Update CI checks with ruff by @norlandrhagen in #54
- open_virtual_dataset with and without indexes by @TomNicholas in #52
- API docs by @TomNicholas in #56
- Updated NetCDF IO path by @norlandrhagen in #55
- Created conftest.py and moved two fixtures into conftest by @norlandrhagen in #57
- Ab/filters dtype by @abarciauskas-bgse in #66
- Switching netcdf3 & netcdf4 filetype detection to file magic 🧙 by @norlandrhagen in #64
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #70
- Load selected variables instead of making them virtual by @TomNicholas in #69
- pin_kerchunk_0.2.2 by @norlandrhagen in #75
- Remove python 3.12 from CI matrix by @TomNicholas in #76
- Convert user defined filetype to FileType by @norlandrhagen in #79
- FAQ page by @TomNicholas in #81
- Try to remove sidebar in docs by @TomNicholas in #82
- main.yml CI installs from pyproject.toml by @norlandrhagen in #90
- Update installation instructions by @jbusecke in #91
- Write manifests to zarr store by @TomNicholas in #45
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #101
- Bump Ruff version and add formatting by @jbusecke in #98
- Opening 0D scalars by @TomNicholas in #102
- Fix bug with expand dims of a scalar array by @TomNicholas in #103
- Install xarray from main by @jsignell in #106
- Unpin kerchunk (set floor) and enable Python 3.12 by @jsignell in #108
- Depend on latest version of xarray by @TomNicholas in #109
- Remove python 3.9 by @jsignell in #112
- Adding reader_options kwargs to open_virtual_dataset. by @norlandrhagen in #67
- Write to parquet by @jsignell in #110
- Test fsspec roundtrip by @TomNicholas in #42
- Inline loaded variables into kerchunk references by @TomNicholas in #73
- Release notes page by @TomNicholas in #120
- requires-python = ">=3.10" by @abarciauskas-bgse in #127
- Pass args and add test by @abarciauskas-bgse in #128
- [pre-commit.ci] pre-commit autoupdate by @pre-commit-ci in #129
- Allow other fsspec protocols than local and s3 by @TomAugspurger in #126
- Add dunder version to top-level __init__.py by @maxrjones in #133
- Add release workflow by @maxrjones in #136
- Only run distribution workflow on releases by @maxrjones in #140
- change default reader_options to None by @TomNicholas in #137
- Replace np.NaN with np.nan in preparation for numpy 2.0 by @TomNicholas in #138
New Contributors
- @TomNicholas made their first contribution in #28
- @abarciauskas-bgse made their first contribution in #66
- @pre-commit-ci made their first contribution in #70
- @jbusecke made their first contribution in #91
- @jsignell made their first contribution in #106
- @TomAugspurger made their first contribution in #126
- @maxrjones made their first contribution in #133
Full Changelog: https://github.com/zarr-developers/VirtualiZarr/commits/v0.1