Releases: snowflakedb/spark-snowflake
v3.1.0
Improvements
- Upgraded JDBC to 3.19.0.
- Changed the internal file format from JSON to Parquet when loading structured data.
- Introduced a new parameter use_json_in_structured_data, which defaults to false. When enabled, the Parquet change is reverted and JSON is used again for structured data.
New Features
- Supported the Parquet file format when loading data from Spark to Snowflake.
- Introduced a new parameter use_parquet_in_write, which defaults to false. When enabled, the Spark connector only uses the Parquet file format when loading data from Spark to Snowflake (see the sketch below).
- Introduced a new dependency, parquet-avro. The default version is 1.13.1. Because its dependency, parquet-column, is a Spark built-in library, an incompatibility issue may occur at runtime. Manually adjust the version of parquet-avro to fix this issue.
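A minimal sketch of opting in to the Parquet write path via use_parquet_in_write, assuming it is passed like any other connector option; the connection values are placeholders.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("sfc-parquet-write").getOrCreate()
import spark.implicits._

// Placeholder connection options; substitute real account values.
val sfOptions = Map(
  "sfURL"       -> "<account>.snowflakecomputing.com",
  "sfUser"      -> "<user>",
  "sfPassword"  -> "<password>",
  "sfDatabase"  -> "<database>",
  "sfSchema"    -> "<schema>",
  "sfWarehouse" -> "<warehouse>"
)

val df = Seq((1, "a"), (2, "b")).toDF("id", "val")

df.write
  .format("net.snowflake.spark.snowflake")
  .options(sfOptions)
  .option("use_parquet_in_write", "true") // write via Parquet instead of the default format
  .option("dbtable", "TARGET_TABLE")
  .mode("append")
  .save()
```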
v3.0.0
Improvements
- Upgraded JDBC to 3.17.0 to support LOB.
- Supports Spark 3.5.0.
- Removed the Advanced Query Pushdown feature.
- Since version 3.0.0, the Spark connector has only one artifact per release, which is compatible with most Spark versions.
- The old version of the Spark connector (2.x.x) will continue to be supported for up to 2 years.
- A conversion tool that converts DataFrames between Spark and Snowpark will be introduced in a future Spark connector release as an alternative to the Advanced Query Pushdown feature.
Bug Fixes
- Removed the requirement of the SFUSER parameter when using OAUTH (see the sketch below).
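A minimal sketch of a read that authenticates with OAuth and omits SFUSER, assuming the standard sfAuthenticator and sfToken connector options; the token and connection values are placeholders, and spark is an existing SparkSession.

```scala
// OAuth connection options without sfUser (placeholder values).
val oauthOptions = Map(
  "sfURL"           -> "<account>.snowflakecomputing.com",
  "sfDatabase"      -> "<database>",
  "sfSchema"        -> "<schema>",
  "sfWarehouse"     -> "<warehouse>",
  "sfAuthenticator" -> "oauth",
  "sfToken"         -> "<oauth_access_token>"
)

val oauthDf = spark.read
  .format("net.snowflake.spark.snowflake")
  .options(oauthOptions)
  .option("dbtable", "SOURCE_TABLE")
  .load()
```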
Release Spark Connector 2.16.0
Bug Fixes
- Fixed an issue where the proxy protocol setting accidentally impacted the S3 protocol.
Improvements
- Upgrade JDBC to 3.16.1
- Clean up legacy Spark streaming code
- Disable abort_detached_query at the session level by default
Release Spark Connector 2.15.0
Bug Fixes
- Fixed "cancelled queries can be restarted in the Spark retries after application closed"
New Features
- Introduced a new parameter trim_space, which defaults to false. When enabled, the Spark connector automatically trims the values of StringType columns when saving to a Snowflake table (see the sketch below).
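A minimal sketch of enabling trim_space on a write; sfOptions and df are the placeholder connection map and DataFrame from the earlier sketch.

```scala
df.write
  .format("net.snowflake.spark.snowflake")
  .options(sfOptions)
  .option("trim_space", "true") // trim values of StringType columns on save
  .option("dbtable", "TARGET_TABLE")
  .mode("append")
  .save()
```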
Release Spark Connector 2.14.0
Improvement
- Upgraded JDBC to 3.14.4
New Features
- New parameter string_timestamp_format (see the sketch below)
  - Specifies the timestamp format used when saving string columns of a Spark DataFrame to timestamp columns of a Snowflake table.
  - The default value is TZHTZM YYYY-MM-DD HH24:MI:SS.FF9.
  - Details on supported timestamp formats can be found in the Snowflake documentation.
  - If the source DataFrame contains timestamp columns, this parameter is reset to the default value and can't be overwritten.
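A minimal sketch of overriding string_timestamp_format for a write whose string column holds timestamps; the format value is illustrative, and sfOptions is the placeholder connection map from the earlier sketch. Recall that the override is reset to the default when the source DataFrame already contains timestamp columns.

```scala
// "created_at" is a string column that lands in a Snowflake TIMESTAMP column.
val events = Seq(("2024-01-15 09:30:00.000", 42L)).toDF("created_at", "metric")

events.write
  .format("net.snowflake.spark.snowflake")
  .options(sfOptions)
  .option("string_timestamp_format", "YYYY-MM-DD HH24:MI:SS.FF3") // illustrative format
  .option("dbtable", "EVENTS")
  .mode("append")
  .save()
```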
Release Spark Connector 2.13.0
Bug Fixes
- Fixed an issue where binary data could not be uploaded from Spark to Snowflake if the source DataFrame contained structured columns.
Release Spark Connector 2.12.0
Support Spark 3.4:
- Added support for Spark 3.4
NOTE:
- Starting from version 2.12.0, the Snowflake Connector for Spark supports Spark 3.2, 3.3, and 3.4.
- Version 2.12.0 of the Snowflake Connector for Spark does not support Spark 3.1. Note that previous versions of the connector continue to support Spark 3.1.
Release Spark Connector 2.11.3
Updated the mechanism for writing DataFrames to accounts on GCP:
- Updated the mechanism for writing DataFrames to accounts on GCP. After December 2023, previous versions of the Spark Connector will no longer be able to write DataFrames, due to changes in GCP.
- Added the option to disable preactions and postactions validation for session sharing.
- To disable validation, set the option FORCE_SKIP_PRE_POST_ACTION_CHECK_FOR_SHARED_SESSION to true. The default is false. (See the sketch after this list.)
- Important: Before setting this option, make sure that the queries in preactions and postactions don't affect the session settings. Otherwise, you may encounter issues with results.
- Fixed an issue when performing a join or union across different schemas, where the two DataFrames access tables with different sfSchema values and a table with the same name exists in the sfSchema of the left DataFrame.
- Updated the connector to use the Snowflake JDBC driver 3.13.30.
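A minimal sketch of disabling the preactions/postactions validation for session sharing; the preactions query is illustrative, and sfOptions/df are the placeholders from the earlier sketch. Only skip the check when the preactions/postactions do not change session settings.

```scala
df.write
  .format("net.snowflake.spark.snowflake")
  .options(sfOptions)
  // Illustrative preaction that does not alter session settings.
  .option("preactions", "DELETE FROM TARGET_TABLE WHERE LOAD_DATE = CURRENT_DATE")
  // Skip the session-sharing validation of preactions/postactions.
  .option("FORCE_SKIP_PRE_POST_ACTION_CHECK_FOR_SHARED_SESSION", "true")
  .option("dbtable", "TARGET_TABLE")
  .mode("append")
  .save()
```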
Release Spark Connector 2.11.2
Added support for sharing the JDBC connection:
- Added support for using the same JDBC connection for different jobs and actions when the same Spark Connector options are used to access Snowflake. In previous versions, the Spark Connector created a new JDBC connection for each job or action.
- The Spark Connector supports the following options and API methods for enabling and disabling this feature (see the sketch after this list):
  - To specify that the connector should not use the same JDBC connection, set the support_share_connection connector option to false. (The default value is true, which means that the feature is enabled.)
  - To enable or disable the feature programmatically, call one of the following global static functions: SparkConnectorContext.disableSharedConnection() / SparkConnectorContext.enableSharingJDBCConnection().
- Note: In the following special cases, the Spark Connector will not use the shared connection:
  - If preactions or postactions are set, and those preactions or postactions are not CREATE TABLE, DROP TABLE, or MERGE INTO, the Spark Connector will not use the shared connection.
  - Utility functions in Utils, such as Utils.runQuery() and Utils.getJDBCConnection(), will not use the shared connection.
- Updated the connector to use the Snowflake JDBC driver 3.13.29.
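A minimal sketch of the two ways to control connection sharing, assuming SparkConnectorContext lives in the connector's usual net.snowflake.spark.snowflake package; sfOptions and df are the placeholders from the earlier sketch.

```scala
import net.snowflake.spark.snowflake.SparkConnectorContext

// Option 1: per job, via the connector option.
df.write
  .format("net.snowflake.spark.snowflake")
  .options(sfOptions)
  .option("support_share_connection", "false") // force a dedicated JDBC connection
  .option("dbtable", "TARGET_TABLE")
  .mode("append")
  .save()

// Option 2: globally, via the static functions.
SparkConnectorContext.disableSharedConnection()
// ... jobs run here each get their own JDBC connection ...
SparkConnectorContext.enableSharingJDBCConnection()
```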
Release Spark Connector 2.11.1
Added support for AWS VPCE deployments and fixed some bugs:
- Added support for AWS VPCE. Added the configuration option S3_STAGE_VPCE_DNS_NAME for specifying the VPCE DNS name at the session level.
- Updated the connector to close JDBC connections to avoid connection leakage.
- Fixed a NullPointerException issue when sending telemetry messages.
- Added a new configuration option treat_decimal_as_long to enable the Spark Connector to return Long values instead of BigDecimal values if the query returns Decimal(<any_precision>, 0). WARNING: If the value is greater than the maximum value of Long, an error will be raised.
- Added a new option proxy_protocol for specifying the proxy protocol (http or https) with AWS deployments. (The option has no effect on Azure and GCP deployments.) Both options appear in the sketch after this list.
- Added support for counting rows in a table where the row count is greater than the maximum value of Integer.
- Updated the connector to use the Snowflake JDBC driver 3.13.24.
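A minimal sketch combining treat_decimal_as_long and proxy_protocol on a read; the query is illustrative, sfOptions is the placeholder connection map from the earlier sketch, and proxy_protocol only matters on AWS deployments behind a proxy.

```scala
val orders = spark.read
  .format("net.snowflake.spark.snowflake")
  .options(sfOptions)
  .option("treat_decimal_as_long", "true") // Decimal(p, 0) results come back as Long
  .option("proxy_protocol", "https")       // AWS deployments only; ignored on Azure/GCP
  .option("query", "SELECT ID, QUANTITY FROM ORDERS")
  .load()
```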