Releases: googleapis/python-bigquery-dataframes

v1.16.0

04 Sep 20:53
6fdb6b1

1.16.0 (2024-09-04)

Features

  • Add DataFrame.struct.explode to add struct subfields to a DataFrame (#916) (ad2f75e)
  • Implement bigframes.bigquery.json_extract_array (#910) (575a29e)
  • Recover struct column from exploded Series (#904) (7dd304c)

Bug Fixes

  • Fix issue with iterating on >10gb dataframes (#949) (2b0f0fa)
  • Improve Series.replace for dict input (#907) (4208044)
  • NullIndex in ML model.predict error (#917) (612271d)
  • Struct field non-nullable type issue. (#914) (149d5ff)
  • Unordered mode errors in ml train_test_split (#925) (85d7c21)

Performance Improvements

Dependencies

  • Re-introduce support for numpy 1.24.x (#931) (3d71913)
  • Update minimum support to Pandas 1.5.3 and Pyarrow 10.0.1 (#903) (7ed3962)

Documentation

  • Add Claude3 ML and RemoteFunc notebooks (#930) (cfd16c1)
  • Create sample notebook to manipulate struct and array data (#883) (3031903)
  • Update struct examples. (#953) (d632cd0)
  • Use unstack() from BigQuery DataFrames instead of pandas in the PyPI sample notebook (#890) (d1883cc)

v1.15.0

20 Aug 18:44
e43e0e5

1.15.0 (2024-08-20)

Features

  • Add llm.TextEmbeddingGenerator to support new embedding models (#905) (6bc6a41)
  • Add ml.llm.Claude3TextGenerator model (#901) (7050038)

Documentation

  • Add columns for "requires ordering/index" to supported APIs summary (#892) (d2fc51a)
  • Remove duplicate description for kms_key_name (#898) (1053d56)
  • Update embedding model notebooks (#906) (d9b8ef5)

v1.14.0

14 Aug 02:21
ae07274

1.14.0 (2024-08-14)

Features

  • Implement bigframes.bigquery.json_extract (#868) (3dbf84b)
  • Implement Series.str.__getitem__ (#897) (e027b7e)

Bug Fixes

  • Fix caching from generating row numbers in partial ordering mode (#872) (52b7786)

Performance Improvements

  • Generate SQL with fewer CTEs (#877) (eb60804)
  • Speed up compilation by reducing redundant type normalization (#896) (e0b11bc)

Documentation

v1.13.0

05 Aug 22:43
5317327

1.13.0 (2024-08-05)

Features

  • df.apply(axis=1) to support remote function with multiple params (#851) (2158818)
  • Allow windowing in 'partial' ordering mode (#861) (ca26fe5)
  • Create a separate OrderingModePartialPreviewWarning for more fine-grained warning filters (#879) (8753bdd)

Bug Fixes

  • Fix issue with invalid sql generated by ml distance functions (#865) (9959fc8)

Documentation

  • Create sample notebook using ordering_mode="partial" (#880) (c415eb9)
  • Update streaming notebook (#875) (e9b0557)

v1.12.0

31 Jul 22:11
8e00fe2

1.12.0 (2024-07-31)

Features

  • Add bigframes-mode label to query jobs (#832) (c9eaff0)
  • Add config option to set partial ordering mode (#855) (823c0ce)
  • Add stratify param support to ml.model_selection.train_test_split method (#815) (27f8631)
  • Add streaming.StreamingDataFrame class (#864) (a7d7197)
  • Allow DataFrame.join for self-join on Null index (#860) (e950533)
  • Support remote function cleanup with session.close (#818) (ed06436)
  • Support to_csv/parquet/json to local files/objects (#858) (d0ab9cc)
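
To illustrate what a stratify option does in a train/test split, here is a minimal pure-Python sketch. It does not call BigQuery DataFrames; `stratified_split` and its parameters are hypothetical names chosen for illustration, not the library's implementation. Each label group is shuffled and split separately, so the test set keeps roughly the same label proportions as the input.

```python
import random
from collections import defaultdict


def stratified_split(rows, labels, test_size=0.25, seed=0):
    """Hypothetical helper: split rows so each label keeps its share in both parts."""
    rng = random.Random(seed)

    # Group rows by their label so every group can be split on its own.
    by_label = defaultdict(list)
    for row, label in zip(rows, labels):
        by_label[label].append(row)

    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)
        # Take the same fraction from every label group.
        n_test = round(len(group) * test_size)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


rows = list(range(8))
labels = ["a"] * 4 + ["b"] * 4
train, test = stratified_split(rows, labels, test_size=0.5)
```

With four rows per label and `test_size=0.5`, each label contributes two rows to the test set.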

Bug Fixes

  • Fewer relation joins from df self-operations (#823) (0d24f73)
  • Fix 'sql' property for null index (#844) (1b6a556)
  • Fix unordered mode using ordered path to print frame (#839) (93785cb)
  • Reduce redundant remote_function deployments (#856) (cbf2d42)

Documentation

  • Add partner attribution steps to integrations sample notebook (#835) (d7b333f)
  • Make get_global_session/close_session/reset_session appear in the docs (#847) (01d6bbb)

v1.11.1

09 Jul 01:28
ee2b660

1.11.1 (2024-07-08)

Documentation

  • Remove session and connection in llm notebook (#821) (74170da)
  • Remove the experimental flask icon from the public docs (#820) (067ff17)

v1.11.0

01 Jul 19:57
6d947a2

1.11.0 (2024-07-01)

Features

  • Add .agg support for size (#792) (87e6018)
  • Add bigframes.bigquery.json_set (#782) (1b613e0)
  • Add bigframes.streaming.to_pubsub method to create continuous query that writes to Pub/Sub (#801) (b47f32d)
  • Add DataFrame.to_arrow to create Arrow Table from DataFrame (#807) (1e3feda)
  • Add PolynomialFeatures support to to_gbq and pipelines (#805) (57d98b9)
  • Add Series.peek to preview data efficiently (#727) (580e1b9)
  • Expose gcf memory param in remote_function (#803) (014765c)
  • More informative error when query plan too complex (#811) (136dc24)

Bug Fixes

  • Include internally required packages in remote_function hash (#799) (4b8fc15)

Documentation

  • Document dtype limitation on row processing remote_function (#800) (487dff6)

v1.10.0

25 Jun 23:08
2e692e9

1.10.0 (2024-06-21)

Features

  • Add dataframe.insert (#770) (e8bab68)
  • Add groupby head API (#791) (44202bc)
  • Add ml.preprocessing.PolynomialFeatures class (#793) (b4fbb51)
  • bigframes.streaming module for continuous queries (#703) (0433a1c)
  • Include index columns in DataFrame.sql if they are named (#788) (c8d16c0)

Bug Fixes

  • Allow __repr__ to work with uninitialized DataFrame/Series/Index (#778) (e14c7a9)
  • df.loc with the 2nd input as bigframes boolean Series (#789) (a4ac82e)
  • Ensure numpy version matches in remote_function deployment (#798) (324d93c)
  • Fix temp table creation retries by now throwing if table already exists. (#787) (0e57d1f)
  • Self-join optimization doesn't needlessly invalidate caching (#797) (1b96b80)

v1.9.0

10 Jun 22:39
b7b134e

1.9.0 (2024-06-10)

Features

  • Allow functions returned from bpd.read_gbq_function to execute outside of apply (#706) (ad7d8ac)
  • Support bigquery.vector_search() (#736) (dad66fd)
  • Support score() in GeminiTextGenerator (#740) (b2c7d8b)
  • Support bytes type in remote_function (#761) (4915424)
  • Support fit() in GeminiTextGenerator (#758) (d751f5c)

Bug Fixes

  • ARIMAPlus loads auto_arima_min_order param (#752) (39d7013)
  • Improve to_pandas_batches for large results (#746) (61f18cb)
  • Resolve issue with unset thread-local options (#741) (d93dbaf)

Documentation

  • Fix ML.EVALUATE spelling (#749) (7899749)
  • Remove LogisticRegression normal_equation strategy (#753) (ea5d367)

v1.8.0

03 Jun 17:11
b5a3928

1.8.0 (2024-05-31)

Features

  • merge only generates a default index if both inputs already have an index (#733) (25d049c)
  • Add +, - as unary ops, ^ binary op (#724) (968d825)
  • Add GroupBy.size() to get number of rows in each group (#479) (1fca588)
  • Add DataFrame ~ operator (#721) (354abc1)
  • Add GeminiText 1.5 Preview models (#737) (56cbd3b)
  • Add slot_millis and add stats to session object (#725) (72e9583)
  • Adds bigframes.bigquery.array_to_string to convert array elements to delimited strings (#731) (f12c906)
  • Allow functions decorated with bpd.remote_function() to execute locally (#704) (d850da6)
  • Ensure "bigframes-api" label is always set on jobs, even if the API is unknown (#722) (1832778)
  • Support ml.SimpleImputer in bigframes (#708) (4c4415f)
  • Support type annotations to supply input and output types to bpd.remote_function() decorator (#717) (4a12e3c)
  • Support type annotations with bpd.remote_function() and axis=1 (a preview feature) (#730) (e5a2992)
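
As a rough illustration of the type-annotation entries above, the following pure-Python sketch shows how a decorator can fall back to a function's annotations when explicit input and output types are omitted. It does not use BigQuery DataFrames; `typed_function` and the `input_types`/`output_type` attributes it sets are hypothetical names for illustration, not the behavior or internals of `bpd.remote_function()`.

```python
import inspect
from typing import get_type_hints


def typed_function(input_types=None, output_type=None):
    """Hypothetical decorator: use annotations when explicit types are omitted."""

    def decorator(func):
        hints = get_type_hints(func)
        params = list(inspect.signature(func).parameters)
        # Explicit arguments win; otherwise read each parameter's annotation.
        # (An unannotated parameter raises KeyError in this simple sketch.)
        func.input_types = (
            list(input_types)
            if input_types is not None
            else [hints[name] for name in params]
        )
        func.output_type = (
            output_type if output_type is not None else hints["return"]
        )
        return func

    return decorator


@typed_function()
def add_tax(price: float, rate: float) -> float:
    return price * (1 + rate)
```

Here `add_tax.input_types` is `[float, float]` and `add_tax.output_type` is `float`, both taken from the annotations; the function itself remains callable as ordinary Python.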

Bug Fixes

  • Correct index labels in multiple aggregations for DataFrameGroupBy (#723) (6a78c89)
  • Fix Null index assign series to column (#711) (ffb4b57)
  • Set bpd.remote_function()'s input_types and output_types default to None to allow omitting them when type annotations are present (#729) (0e25a3b)
  • Warn and disable time travel for linked datasets (#712) (085fa9d)

Performance Improvements

  • Optimize dataframe-series alignment on axis=1 (#732) (3d39221)

Documentation

  • Add examples to DataFrameGroupBy and SeriesGroupBy (#701) (e7da0f0)