52.1.0 (2024-07-02)
Implemented enhancements:
- Implement `eq` comparison for StructArray #5960 [arrow]
- A new feature as a workaround hack to unavailable offset support in Arrow Java #5959 [arrow]
- Add `min_bytes` and `max_bytes` to `PageIndex` #5949 [parquet]
- Error message in ArrowNativeTypeOp::neg_checked doesn't include the operation #5944 [arrow]
- Add object_store_opendal as related projects #5925
- Opaque retry errors make debugging difficult #5923
- Implement arrow-row en/decoding for GenericByteView types #5921 [arrow]
- The arrow-rs repo is very large #5908
- [DISCUSS] Release arrow-rs / parquet patch release `52.0.1` #5906 [arrow]
- Implement `compare_op` for `GenericBinaryView` #5897 [arrow]
- New null with view types are not supported #5893 [arrow]
- Cleanup ByteView construction #5878 [parquet] [arrow]
- `cast` kernel support for `StringViewArray` and `BinaryViewArray` \<--\> `DictionaryArray` #5861 [arrow]
- parquet::ArrowWriter should allow writing Bloom filters before the end of the file #5859 [parquet]
- API to get memory usage for parquet ArrowWriter #5851 [parquet]
- Support writing `IntervalMonthDayNanoArray` to parquet via Arrow Writer #5849 [parquet]
- Write parquet statistics for `IntervalDayTimeArray`, `IntervalMonthDayNanoArray` and `IntervalYearMonthArray` #5847 [parquet]
- Make `RowSelection::from_consecutive_ranges` public #5846 [parquet]
- `Schema::try_merge` should be able to merge List of any data type with List of Null data type #5843 [arrow]
- Add a way to move `fields` out of parquet `Row` #5841 [parquet]
- Make `TimeUnit` and `IntervalUnit` `Copy` #5839 [arrow]
- Limit Parquet Page Row Count By Default to reduce writer memory requirements with highly compressible columns #5797 [parquet]
- Report / blog on parquet metadata sizes for "large" (1000+) numbers of columns #5770 [parquet] [arrow]
- Structured ByteView Access (underlying StringView/BinaryView representation) #5736 [arrow]
- [parquet_derive] support OPTIONAL (def_level = 1) columns by default #5716
- Maps cast to other Maps with different Elements, Key and Value Names #5702 [arrow]
- Provide Arrow Schema Hint to Parquet Reader #5657 [parquet] [arrow]
Fixed bugs:
- Wrong error type in case of invalid amount in Interval components #5986 [arrow]
- Empty and Null structarray fails to IPC roundtrip #5920
- FixedSizeList got out of range when the total length of the underlying values over i32::MAX #5901 [arrow]
- Out of range when extending on a slice of string array imported through FFI #5896 [arrow]
- cargo msrv test is failing on main for `object_store` #5864 [parquet]
Documentation updates:
- chore: update RunArray reference in run_iterator.rs #5892 [arrow] (Weijun-H)
- Minor: Clarify when page index structures are read #5886 [parquet] (alamb)
- Improve Parquet reader/writer properties docs #5863 [parquet] (alamb)
- Refine documentation for `unary_mut` and `binary_mut` #5798 [arrow] (alamb)
Closed issues:
Merged pull requests:
- fix: error in case of invalid amount interval component #5987 [arrow] (DDtKey)
- Minor: fix clippy complaint in parquet_derive #5984 (alamb)
- Reduce repo size by removing accumulative commits in CI job #5982 (Owen-CH-Leung)
- Add operation in ArrowNativeTypeOp::neg_check error message (#5944) #5980 [arrow] (zhao-gang)
- Implement directly build byte view array on top of parquet buffer #5972 [parquet] (XiangpengHao)
- Handle flight dictionary ID assignment automatically #5971 [arrow] [arrow-flight] (thinkharderdev)
- Add view buffer for parquet reader #5970 [parquet] [arrow] (XiangpengHao)
- Add benchmark for reading binary/binary view from parquet #5968 [parquet] (XiangpengHao)
- feat(5851): ArrowWriter memory usage #5967 [parquet] (wiedld)
- Add ParquetMetadata::memory_size size estimation #5965 [parquet] (alamb)
- Fix FFI array offset handling #5964 [arrow] (tustvold)
- Implement sort for String/BinaryViewArray #5963 [arrow] (XiangpengHao)
- Improve error message for unsupported nested comparison #5961 [arrow] (alamb)
- chore(5797): change default parquet data_page_row_limit to 20k #5957 [parquet] (wiedld)
- Document process for PRs with breaking changes #5953 (alamb)
- Minor: fixup contribution guide about clippy #5952 (alamb)
- feat: add max_bytes and min_bytes on PageIndex #5950 [parquet] (tshauck)
- test: Add unit test for extending slice of list array #5948 [arrow] (viirya)
- minor: row format benches for bool & nullable int #5943 [arrow] (korowa)
- Better document support for nested comparison #5942 [arrow] (tustvold)
- Provide Arrow Schema Hint to Parquet Reader - Alternative 2 #5939 [parquet] (efredine)
- `like` benchmark for StringView #5936 [arrow] (alamb)
- Fix typo in benchmark name `egexp` --> `regexp` #5935 [arrow] (alamb)
- Revert "Write Bloom filters between row groups instead of the end" #5932 [parquet] (alamb)
- Implement like/ilike etc for StringViewArray #5931 [arrow] (XiangpengHao)
- docs: Fix broken links of object_store_opendal README #5929 (Xuanwo)
- Expose `IntervalMonthDayNano` and `IntervalDayTime` and update docs #5928 [arrow] (alamb)
- Update proc-macro2 requirement from =1.0.85 to =1.0.86 #5927 [arrow] [arrow-flight] (dependabot[bot])
- docs: Add object_store_opendal as related projects #5926 (Xuanwo)
- Add eq benchmark for StringArray/StringViewArray #5924 [arrow] (XiangpengHao)
- Implement arrow-row encoding/decoding for view types #5922 [arrow] (XiangpengHao)
- fix(ipc): set correct row count when reading struct arrays with zero fields #5918 [arrow] (kawadakk)
- Update zstd-sys requirement from >=2.0.0, <2.0.10 to >=2.0.0, <2.0.12 #5913 [parquet] (dependabot[bot])
- fix: prevent potential out-of-range access in FixedSizeListArray #5902 [arrow] (BubbleCal)
- Implement compare operations for view types #5900 [arrow] (XiangpengHao)
- minor: use as_primitive replace downcast_ref #5898 [arrow] (Kikkon)
- fix: Adjust FFI_ArrowArray offset based on the offset of offset buffer #5895 [arrow] (viirya)
- implement `new_null_array` for view types #5894 [arrow] (XiangpengHao)
- chore: add view type single column tests #5891 [parquet] (ariesdevil)
- Minor: expose timestamp_tz_format for csv writing #5890 [arrow] (tmi)
- chore: implement parquet error handling for object_store #5889 [parquet] (abhiaagarwal)
- Document when the ParquetRecordBatchReader will re-read metadata #5887 [parquet] (alamb)
- Add simple GC for view array types #5885 [arrow] (XiangpengHao)
- Update for new clippy rules #5881 [parquet] [arrow] (XiangpengHao)
- clean up ByteView construction #5879 [parquet] [arrow] (XiangpengHao)
- Avoid copy/allocation when read view types from parquet #5877 [parquet] (XiangpengHao)
- Document parquet ArrowWriter type limitations #5875 [parquet] (alamb)
- Benchmark for casting view to dict arrays (and the reverse) #5874 [arrow] (XiangpengHao)
- Implement Take for Dense UnionArray #5873 [arrow] (gstvg)
- Improve performance of casting `StringView`/`BinaryView` to `DictionaryArray` #5872 [arrow] (XiangpengHao)
- Improve performance of casting `DictionaryArray` to `StringViewArray` #5871 [arrow] (XiangpengHao)
- fix: msrv CI for object_store #5866 (korowa)
- parquet: Fix warning about unused import #5865 [parquet] (progval)
- Preallocate for `FixedSizeList` in `concat` #5862 [arrow] (judahrand)
- Faster primitive arrays encoding into row format #5858 [arrow] (korowa)
- Added panic message to docs. #5857 [arrow] (SeeRightThroughMe)
- feat: call try_merge recursively for list field #5852 [arrow] (mnpw)
- Minor: refine row selection example more #5850 [parquet] (alamb)
- Make RowSelection's from_consecutive_ranges public #5848 [parquet] (advancedxy)
- Add exposing fields from parquet row #5842 [parquet] (SHaaD94)
- Derive `Copy` for `TimeUnit` and `IntervalUnit` #5840 [arrow] (mbrobbel)
- feat: support reading OPTIONAL column in parquet_derive #5717 (double-free)
- Add the ability for Maps to cast to another case where the field names are different #5703 [arrow] (HawaiianSpork)
* This Changelog was automatically generated by github_changelog_generator