Added integration test #483

Closed
wants to merge 14 commits into from

ravit-db
Contributor

  • Created the setup for Reconciliation Integration Test
  • Added integration tests for Databricks to Databricks data reconciliation

ravit-db requested a review from a team as a code owner June 24, 2024 06:33
ravit-db requested a review from himanishk June 24, 2024 06:33
ravit-db self-assigned this Jun 24, 2024

codecov bot commented Jun 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.19%. Comparing base (96d3ed5) to head (b7ce93e).
Report is 18 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #483      +/-   ##
==========================================
+ Coverage   98.03%   98.19%   +0.16%     
==========================================
  Files          64       63       -1     
  Lines        4371     4491     +120     
  Branches      502      507       +5     
==========================================
+ Hits         4285     4410     +125     
+ Misses         46       41       -5     
  Partials       40       40              


github-actions bot commented Jun 24, 2024

Coverage tests results

394 tests  ±0   353 ✅ +2   5s ⏱️ ±0s
  2 suites ±0     2 💤 ±0 
  2 files   ±0    39 ❌  - 2 

For more details on these failures, see this check.

Results for commit a23edbe. ± Comparison against base commit 951638b.

♻️ This comment has been updated with latest results.



@pytest.fixture
def setup_databricks_src(setup_teardown, spark, test_config):
Contributor

The setup can be moved into setup.py or conf.py.

Contributor Author

fixed.

)


def test_execute_report_type_is_data_with_all_match(setup_databricks_src, spark, ws, test_config, reconcile_config):
Contributor

Where are all the scenarios we are testing?

Contributor Author

It has all four report types and has tested all the configurations. Let me know if anything needs to be added or if something should be covered.

Contributor

We should also be testing other configurations and scenarios: complex types, thresholds, transformations, filters, etc.

.github/workflows/acceptance.yml (outdated)

def _get_test_config() -> TestConfig:
    return TestConfig(
        db_table_catalog="samples",
Collaborator

Keep in mind that all catalogs are wiped out by our test infrastructure, so you have to pre-create randomly named catalogs for every test and tear them down. Actually... let's create a test catalog just for you.

Contributor Author

A teardown process is already in place to clean up the catalog. Let me know if I missed anything or if there are alternatives.
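
For reference, the reviewer's suggestion in this thread (a randomly named catalog per test that is always torn down) could look roughly like the sketch below. It is not the code from this PR: `random_catalog_name`, `temporary_catalog`, the `recon_it` prefix, and the `run_sql` callable are hypothetical names chosen for illustration. The only assumption about the environment is that `run_sql` executes a SQL statement, for example by wrapping `spark.sql`.

```python
import contextlib
import uuid
from typing import Callable, Iterator


def random_catalog_name(prefix: str = "recon_it") -> str:
    """Build a catalog name that is unlikely to collide across test runs."""
    # Hypothetical naming scheme: prefix plus 8 random hex characters.
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


@contextlib.contextmanager
def temporary_catalog(
    run_sql: Callable[[str], object], prefix: str = "recon_it"
) -> Iterator[str]:
    """Create a randomly named catalog, yield its name, and always drop it."""
    name = random_catalog_name(prefix)
    run_sql(f"CREATE CATALOG IF NOT EXISTS {name}")
    try:
        yield name
    finally:
        # Runs even if the test body raises, so no catalog is left behind.
        run_sql(f"DROP CATALOG IF EXISTS {name} CASCADE")
```

In a pytest fixture this would presumably be used as `with temporary_catalog(spark.sql) as catalog: yield catalog`, so each test gets its own catalog and the teardown does not depend on the test passing.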


github-actions bot commented Jul 5, 2024

❌ 835/856 passed, 21 failed, 25m24s total

❌ test_execute_report_type_is_data_with_all_match: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (3m25.506s)
pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS. SQLSTATE: 42704
[gw0] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_execute_report_type_is_all: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.686s)
pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS. SQLSTATE: 42704
[gw0] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_execute_report_type_is_schema: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.37s)
pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS. SQLSTATE: 42704
[gw0] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_execute_report_type_is_row: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.546s)
pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS. SQLSTATE: 42704
[gw0] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_execute_fail_for_tables_not_available: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.384s)
pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS. SQLSTATE: 42704
[gw0] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_execute_report_type_is_data_with_all_without_keys: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.184s)
pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog.
If you did not qualify the name with a catalog, verify the current_schema() output, or qualify the name with the correct catalog.
To tolerate the error on drop use DROP SCHEMA IF EXISTS. SQLSTATE: 42704
[gw0] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_recon_for_report_type_is_data: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (33.483s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:30 INFO [databricks.labs.remorph.reconcile.execute] report_type: data, data_source: databricks 
09:30 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for source: SELECT LOWER(SHA2(CONCAT(TRIM(s_address), TRIM(s_name), COALESCE(TRIM(s_nationkey), '_null_recon_'), TRIM(s_phone), COALESCE(TRIM(s_suppkey), '_null_recon_')), 256)) AS hash_value_recon, s_nationkey AS s_nationkey, s_suppkey AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a'
09:30 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for target: SELECT LOWER(SHA2(CONCAT(TRIM(s_address_t), TRIM(s_name), COALESCE(TRIM(s_nationkey_t), '_null_recon_'), TRIM(s_phone_t), COALESCE(TRIM(s_suppkey_t), '_null_recon_')), 256)) AS hash_value_recon, s_nationkey_t AS s_nationkey, s_suppkey_t AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a'
09:30 WARNING [databricks.labs.remorph.reconcile.compare] Unmatched data is written to /tmp/pytest-of-runner/pytest-0/popen-gw6/test_recon_for_report_type_is_0 successfully
09:30 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for source: WITH recon AS (SELECT 22 AS s_nationkey, 2 AS s_suppkey), src AS (SELECT TRIM(s_address) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey), '_null_recon_') AS s_nationkey, TRIM(s_phone) AS s_phone, COALESCE(TRIM(s_suppkey), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:30 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for target: WITH recon AS (SELECT 22 AS s_nationkey, 2 AS s_suppkey), src AS (SELECT TRIM(s_address_t) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey_t), '_null_recon_') AS s_nationkey, TRIM(s_phone_t) AS s_phone, COALESCE(TRIM(s_suppkey_t), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:30 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for target: WITH recon AS (SELECT 44 AS s_nationkey, 4 AS s_suppkey), src AS (SELECT TRIM(s_address_t) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey_t), '_null_recon_') AS s_nationkey, TRIM(s_phone_t) AS s_phone, COALESCE(TRIM(s_suppkey_t), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:30 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for source: WITH recon AS (SELECT 33 AS s_nationkey, 3 AS s_suppkey), src AS (SELECT TRIM(s_address) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey), '_null_recon_') AS s_nationkey, TRIM(s_phone) AS s_phone, COALESCE(TRIM(s_suppkey), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:30 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'data' report completed.
09:30 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_generate_final_reconcile_output_row: AssertionError: assert ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[]) == ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=None, schema=None), exception_message='')]) (4m46.803s)
AssertionError: assert ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[]) == ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=None, schema=None), exception_message='')])
  
  Matching attributes:
  ['recon_id']
  Differing attributes:
  ['results']
  
  Drill down into differing attribute results:
    results: [] != [ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=None, schema=None), exception_message='')]
    Right contains one more item: ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=None, schema=None), exception_message='')
    
    Full diff:
    + []
    - [
    -     ReconcileTableOutput(
    -         target_table_name='target_test_catalog.target_test_schema.target_supplier',
    -         source_table_name='source_test_schema.supplier',
    -         status=StatusOutput(
    -                    row=False,
    -                    column=None,
    -                    schema=None,
    -                ),
    -         exception_message='',
    -     ),
    - ]
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:30 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.main successfully.
09:30 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.metrics successfully.
09:30 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:30 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:30 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:30 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:30 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:30 INFO [databricks.labs.remorph.reconcile.recon_capture] Final reconcile output: ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[])
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_generate_final_reconcile_output_data: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.metrics: [DELTA_TABLE_NOT_FOUND] Delta table `TEST_SCHEMA`.`metrics` doesn't exist.; (8.249s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.metrics: [DELTA_TABLE_NOT_FOUND] Delta table `TEST_SCHEMA`.`metrics` doesn't exist.;
AppendData RelationV2[] hive_metastore.TEST_SCHEMA.metrics hive_metastore.TEST_SCHEMA.metrics, true, false
+- Project [3938601196649161384 AS recon_table_id#16091L, named_struct(row_comparison, CASE WHEN (data IN (all,row,data) AND ( = )) THEN named_struct(missing_in_source, 3, missing_in_target, 4) ELSE cast(null as struct<missing_in_source:int,missing_in_target:int>) END, column_comparison, CASE WHEN (data IN (all,data) AND ( = )) THEN named_struct(absolute_mismatch, 2, threshold_mismatch, 2, mismatch_columns, name) ELSE cast(null as struct<absolute_mismatch:int,threshold_mismatch:int,mismatch_columns:string>) END, schema_comparison, CASE WHEN (data IN (all,schema) AND ( = )) THEN true ELSE cast(null as boolean) END) AS recon_metrics#16092, named_struct(status, false, run_by_user, remorph, exception_message, ) AS run_metrics#16093, cast(2024-07-08 09:31:03.011018 as timestamp) AS inserted_ts#16094]
   +- OneRowRelation
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.main successfully.
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.metrics: [DELTA_TABLE_NOT_FOUND] Delta table `TEST_SCHEMA`.`metrics` doesn't exist.;
AppendData RelationV2[] hive_metastore.TEST_SCHEMA.metrics hive_metastore.TEST_SCHEMA.metrics, true, false
+- Project [3938601196649161384 AS recon_table_id#16091L, named_struct(row_comparison, CASE WHEN (data IN (all,row,data) AND ( = )) THEN named_struct(missing_in_source, 3, missing_in_target, 4) ELSE cast(null as struct<missing_in_source:int,missing_in_target:int>) END, column_comparison, CASE WHEN (data IN (all,data) AND ( = )) THEN named_struct(absolute_mismatch, 2, threshold_mismatch, 2, mismatch_columns, name) ELSE cast(null as struct<absolute_mismatch:int,threshold_mismatch:int,mismatch_columns:string>) END, schema_comparison, CASE WHEN (data IN (all,schema) AND ( = )) THEN true ELSE cast(null as boolean) END) AS recon_metrics#16092, named_struct(status, false, run_by_user, remorph, exception_message, ) AS run_metrics#16093, cast(2024-07-08 09:31:03.011018 as timestamp) AS inserted_ts#16094]
   +- OneRowRelation
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_recon_for_report_type_schema: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (17.649s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 INFO [databricks.labs.remorph.reconcile.execute] report_type: schema, data_source: databricks 
09:31 WARNING [databricks.labs.remorph.reconcile.execute] Schema comparison is completed.
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_generate_final_reconcile_output_schema: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_PROTOCOL_CHANGED] ProtocolChangedException: The protocol version of the Delta table has been changed by a concurrent update. This happens when multiple writers are writing to an empty directory. Creating the table ahead of time will avoid this conflict. (7.032s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_PROTOCOL_CHANGED] ProtocolChangedException: The protocol version of the Delta table has been changed by a concurrent update. This happens when multiple writers are writing to an empty directory. Creating the table ahead of time will avoid this conflict. 
Conflicting commit: {"timestamp":1720431068601,"userId":"1425339244351829","userName":"3fe685a1-96cc-4fec-8cdb-6944f5c9787e","operation":"CREATE OR REPLACE TABLE AS SELECT","operationParameters":{"partitionBy":[],"clusterBy":[],"description":null,"isManaged":true,"properties":{"delta.enableDeletionVectors":"true"},"statsOnLoad":false},"clusterId":"DATABRICKS_CLUSTER_ID","isolationLevel":"WriteSerializable","isBlindAppend":false,"operationMetrics":{"numFiles":"0","numOutputRows":"0","numOutputBytes":"0"},"tags":{"noRowsCopied":"true","restoresDeletedRows":"false"},"engineInfo":"Databricks-Runtime/15.2.x-scala2.12","txnId":"84d46127-e533-4726-b2f1-60b405076db4"}
Refer to https://docs.microsoft.com/CLOUD_ENV/databricks/delta/concurrency-control for more details.
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_PROTOCOL_CHANGED] ProtocolChangedException: The protocol version of the Delta table has been changed by a concurrent update. This happens when multiple writers are writing to an empty directory. Creating the table ahead of time will avoid this conflict. 
Conflicting commit: {"timestamp":1720431068601,"userId":"1425339244351829","userName":"3fe685a1-96cc-4fec-8cdb-6944f5c9787e","operation":"CREATE OR REPLACE TABLE AS SELECT","operationParameters":{"partitionBy":[],"clusterBy":[],"description":null,"isManaged":true,"properties":{"delta.enableDeletionVectors":"true"},"statsOnLoad":false},"clusterId":"DATABRICKS_CLUSTER_ID","isolationLevel":"WriteSerializable","isBlindAppend":false,"operationMetrics":{"numFiles":"0","numOutputRows":"0","numOutputBytes":"0"},"tags":{"noRowsCopied":"true","restoresDeletedRows":"false"},"engineInfo":"Databricks-Runtime/15.2.x-scala2.12","txnId":"84d46127-e533-4726-b2f1-60b405076db4"}
Refer to https://docs.microsoft.com/CLOUD_ENV/databricks/delta/concurrency-control for more details.
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_recon_for_report_type_all: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (20.628s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 INFO [databricks.labs.remorph.reconcile.execute] report_type: all, data_source: snowflake 
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_suppkey number)
        Parse datatype: CREATE TABLE dummy (s_suppkey DECIMAL(38,0))
        Databricks datatype: create table dummy (s_suppkey number)
        
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_name varchar)
        Parse datatype: CREATE TABLE dummy (s_name STRING)
        Databricks datatype: create table dummy (s_name varchar)
        
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_address varchar)
        Parse datatype: CREATE TABLE dummy (s_address STRING)
        Databricks datatype: create table dummy (s_address varchar)
        
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_nationkey number)
        Parse datatype: CREATE TABLE dummy (s_nationkey DECIMAL(38,0))
        Databricks datatype: create table dummy (s_nationkey number)
        
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_phone varchar)
        Parse datatype: CREATE TABLE dummy (s_phone STRING)
        Databricks datatype: create table dummy (s_phone varchar)
        
09:31 WARNING [databricks.labs.remorph.reconcile.execute] Schema comparison is completed.
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for source: SELECT LOWER(SHA2(CONCAT(TRIM(s_address), TRIM(s_name), COALESCE(TRIM(s_nationkey), '_null_recon_'), TRIM(s_phone), COALESCE(TRIM(s_suppkey), '_null_recon_')), 256)) AS hash_value_recon, s_nationkey AS s_nationkey, s_suppkey AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a'
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for target: SELECT LOWER(SHA2(CONCAT(TRIM(s_address_t), TRIM(s_name), COALESCE(TRIM(s_nationkey_t), '_null_recon_'), TRIM(s_phone_t), COALESCE(TRIM(s_suppkey_t), '_null_recon_')), 256)) AS hash_value_recon, s_nationkey_t AS s_nationkey, s_suppkey_t AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a'
09:31 WARNING [databricks.labs.remorph.reconcile.compare] Unmatched data is written to /tmp/pytest-of-runner/pytest-0/popen-gw6/test_recon_for_report_type_all0 successfully
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for source: WITH recon AS (SELECT 22 AS s_nationkey, 2 AS s_suppkey), src AS (SELECT TRIM(s_address) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey), '_null_recon_') AS s_nationkey, TRIM(s_phone) AS s_phone, COALESCE(TRIM(s_suppkey), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for target: WITH recon AS (SELECT 22 AS s_nationkey, 2 AS s_suppkey), src AS (SELECT TRIM(s_address_t) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey_t), '_null_recon_') AS s_nationkey, TRIM(s_phone_t) AS s_phone, COALESCE(TRIM(s_suppkey_t), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for target: WITH recon AS (SELECT 44 AS s_nationkey, 4 AS s_suppkey), src AS (SELECT TRIM(s_address_t) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey_t), '_null_recon_') AS s_nationkey, TRIM(s_phone_t) AS s_phone, COALESCE(TRIM(s_suppkey_t), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for source: WITH recon AS (SELECT 33 AS s_nationkey, 3 AS s_suppkey), src AS (SELECT TRIM(s_address) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey), '_null_recon_') AS s_nationkey, TRIM(s_phone) AS s_phone, COALESCE(TRIM(s_suppkey), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:31 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'all' report completed.
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
09:31 INFO [databricks.labs.remorph.reconcile.execute] report_type: all, data_source: snowflake 
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_suppkey number)
        Parse datatype: CREATE TABLE dummy (s_suppkey DECIMAL(38,0))
        Databricks datatype: create table dummy (s_suppkey number)
        
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_name varchar)
        Parse datatype: CREATE TABLE dummy (s_name STRING)
        Databricks datatype: create table dummy (s_name varchar)
        
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_address varchar)
        Parse datatype: CREATE TABLE dummy (s_address STRING)
        Databricks datatype: create table dummy (s_address varchar)
        
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_nationkey number)
        Parse datatype: CREATE TABLE dummy (s_nationkey DECIMAL(38,0))
        Databricks datatype: create table dummy (s_nationkey number)
        
09:31 INFO [databricks.labs.remorph.reconcile.schema_compare] 
        Source datatype: create table dummy (s_phone varchar)
        Parse datatype: CREATE TABLE dummy (s_phone STRING)
        Databricks datatype: create table dummy (s_phone varchar)
        
09:31 WARNING [databricks.labs.remorph.reconcile.execute] Schema comparison is completed.
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for source: SELECT LOWER(SHA2(CONCAT(TRIM(s_address), TRIM(s_name), COALESCE(TRIM(s_nationkey), '_null_recon_'), TRIM(s_phone), COALESCE(TRIM(s_suppkey), '_null_recon_')), 256)) AS hash_value_recon, s_nationkey AS s_nationkey, s_suppkey AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a'
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for target: SELECT LOWER(SHA2(CONCAT(TRIM(s_address_t), TRIM(s_name), COALESCE(TRIM(s_nationkey_t), '_null_recon_'), TRIM(s_phone_t), COALESCE(TRIM(s_suppkey_t), '_null_recon_')), 256)) AS hash_value_recon, s_nationkey_t AS s_nationkey, s_suppkey_t AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a'
09:31 WARNING [databricks.labs.remorph.reconcile.compare] Unmatched data is written to /tmp/pytest-of-runner/pytest-0/popen-gw6/test_recon_for_report_type_all0 successfully
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for source: WITH recon AS (SELECT 22 AS s_nationkey, 2 AS s_suppkey), src AS (SELECT TRIM(s_address) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey), '_null_recon_') AS s_nationkey, TRIM(s_phone) AS s_phone, COALESCE(TRIM(s_suppkey), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for target: WITH recon AS (SELECT 22 AS s_nationkey, 2 AS s_suppkey), src AS (SELECT TRIM(s_address_t) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey_t), '_null_recon_') AS s_nationkey, TRIM(s_phone_t) AS s_phone, COALESCE(TRIM(s_suppkey_t), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for target: WITH recon AS (SELECT 44 AS s_nationkey, 4 AS s_suppkey), src AS (SELECT TRIM(s_address_t) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey_t), '_null_recon_') AS s_nationkey, TRIM(s_phone_t) AS s_phone, COALESCE(TRIM(s_suppkey_t), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.sampling_query] Sampling Query for source: WITH recon AS (SELECT 33 AS s_nationkey, 3 AS s_suppkey), src AS (SELECT TRIM(s_address) AS s_address, TRIM(s_name) AS s_name, COALESCE(TRIM(s_nationkey), '_null_recon_') AS s_nationkey, TRIM(s_phone) AS s_phone, COALESCE(TRIM(s_suppkey), '_null_recon_') AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a') SELECT src.s_address, src.s_name, src.s_nationkey, src.s_phone, src.s_suppkey FROM src INNER JOIN recon AS recon ON COALESCE(TRIM(src.s_nationkey), '_null_recon_') = COALESCE(TRIM(recon.s_nationkey), '_null_recon_') AND COALESCE(TRIM(src.s_suppkey), '_null_recon_') = COALESCE(TRIM(recon.s_suppkey), '_null_recon_')
09:31 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'all' report completed.
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_recon_for_report_type_is_row: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (18.337s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 INFO [databricks.labs.remorph.reconcile.execute] report_type: row, data_source: snowflake 
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for source: SELECT LOWER(SHA2(CONCAT(TRIM(s_address), TRIM(s_name), COALESCE(TRIM(s_nationkey), '_null_recon_'), TRIM(s_phone), COALESCE(TRIM(s_suppkey), '_null_recon_')), 256)) AS hash_value_recon, TRIM(s_address) AS s_address, TRIM(s_name) AS s_name, s_nationkey AS s_nationkey, TRIM(s_phone) AS s_phone, s_suppkey AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a'
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for target: SELECT LOWER(SHA2(CONCAT(TRIM(s_address_t), TRIM(s_name), COALESCE(TRIM(s_nationkey_t), '_null_recon_'), TRIM(s_phone_t), COALESCE(TRIM(s_suppkey_t), '_null_recon_')), 256)) AS hash_value_recon, TRIM(s_address_t) AS s_address, TRIM(s_name) AS s_name, s_nationkey_t AS s_nationkey, TRIM(s_phone_t) AS s_phone, s_suppkey_t AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a'
09:31 WARNING [databricks.labs.remorph.reconcile.compare] Unmatched data is written to /tmp/pytest-of-runner/pytest-0/popen-gw6/test_recon_for_report_type_is_1 successfully
09:31 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'row' report completed.
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
09:31 INFO [databricks.labs.remorph.reconcile.execute] report_type: row, data_source: snowflake 
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for source: SELECT LOWER(SHA2(CONCAT(TRIM(s_address), TRIM(s_name), COALESCE(TRIM(s_nationkey), '_null_recon_'), TRIM(s_phone), COALESCE(TRIM(s_suppkey), '_null_recon_')), 256)) AS hash_value_recon, TRIM(s_address) AS s_address, TRIM(s_name) AS s_name, s_nationkey AS s_nationkey, TRIM(s_phone) AS s_phone, s_suppkey AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address = 'a'
09:31 INFO [databricks.labs.remorph.reconcile.query_builder.hash_query] Hash Query for target: SELECT LOWER(SHA2(CONCAT(TRIM(s_address_t), TRIM(s_name), COALESCE(TRIM(s_nationkey_t), '_null_recon_'), TRIM(s_phone_t), COALESCE(TRIM(s_suppkey_t), '_null_recon_')), 256)) AS hash_value_recon, TRIM(s_address_t) AS s_address, TRIM(s_name) AS s_name, s_nationkey_t AS s_nationkey, TRIM(s_phone_t) AS s_phone, s_suppkey_t AS s_suppkey FROM :tbl WHERE s_name = 't' AND s_address_t = 'a'
09:31 WARNING [databricks.labs.remorph.reconcile.compare] Unmatched data is written to /tmp/pytest-of-runner/pytest-0/popen-gw6/test_recon_for_report_type_is_1 successfully
09:31 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'row' report completed.
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_generate_final_reconcile_output_all: AssertionError: assert ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[]) == ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=False, schema=True), exception_message='')]) (36.108s)
AssertionError: assert ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[]) == ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=False, schema=True), exception_message='')])
  
  Matching attributes:
  ['recon_id']
  Differing attributes:
  ['results']
  
  Drill down into differing attribute results:
    results: [] != [ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=False, schema=True), exception_message='')]
    Right contains one more item: ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=False, schema=True), exception_message='')
    
    Full diff:
    + []
    - [
    -     ReconcileTableOutput(
    -         target_table_name='target_test_catalog.target_test_schema.target_supplier',
    -         source_table_name='source_test_schema.supplier',
    -         status=StatusOutput(
    -                    row=False,
    -                    column=False,
    -                    schema=True,
    -                ),
    -         exception_message='',
    -     ),
    - ]
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.main successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.metrics successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Final reconcile output: ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[])
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.main successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.metrics successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Data written to TEST_SCHEMA.details successfully.
09:31 INFO [databricks.labs.remorph.reconcile.recon_capture] Final reconcile output: ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[])
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_generate_final_reconcile_output_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `TEST_SCHEMA`.`main` because it already exists. (5.762s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `TEST_SCHEMA`.`main` because it already exists.
Choose a different name, drop or replace the existing object, add the IF NOT EXISTS clause to tolerate pre-existing objects, or add the OR REFRESH clause to refresh the existing streaming table. SQLSTATE: 42P07
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `TEST_SCHEMA`.`main` because it already exists.
Choose a different name, drop or replace the existing object, add the IF NOT EXISTS clause to tolerate pre-existing objects, or add the OR REFRESH clause to refresh the existing streaming table. SQLSTATE: 42P07
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `TEST_SCHEMA`.`main` because it already exists.
Choose a different name, drop or replace the existing object, add the IF NOT EXISTS clause to tolerate pre-existing objects, or add the OR REFRESH clause to refresh the existing streaming table. SQLSTATE: 42P07
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_write_and_read_unmatched_df_with_volumes_with_exception: Failed: DID NOT RAISE (3.235s)
Failed: DID NOT RAISE <class 'databricks.labs.remorph.reconcile.exception.ReadAndWriteWithVolumeException'>
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_clean_unmatched_df_from_volume_with_exception: Failed: DID NOT RAISE (591ms)
Failed: DID NOT RAISE <class 'Exception'>
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 WARNING [databricks.labs.remorph.reconcile.recon_capture] Unmatched DF cleaned up from /path/that/does/not/exist successfully.
09:31 WARNING [databricks.labs.remorph.reconcile.recon_capture] Unmatched DF cleaned up from /path/that/does/not/exist successfully.
[gw7] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_schema_recon_with_data_source_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (11.099s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:31 INFO [databricks.labs.remorph.reconcile.execute] report_type: all, data_source: snowflake 
09:31 WARNING [databricks.labs.remorph.reconcile.connectors.data_source] Runtime exception occurred while fetching schema using (org, data, supplier) : Mock Exception
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
09:31 INFO [databricks.labs.remorph.reconcile.execute] report_type: all, data_source: snowflake 
09:31 WARNING [databricks.labs.remorph.reconcile.connectors.data_source] Runtime exception occurred while fetching schema using (org, data, supplier) : Mock Exception
09:31 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_schema_recon_with_general_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (11.002s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:32 INFO [databricks.labs.remorph.reconcile.execute] report_type: schema, data_source: snowflake 
09:32 WARNING [databricks.labs.remorph.reconcile.execute] Schema comparison is completed.
09:32 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
09:32 INFO [databricks.labs.remorph.reconcile.execute] report_type: schema, data_source: snowflake 
09:32 WARNING [databricks.labs.remorph.reconcile.execute] Schema comparison is completed.
09:32 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_data_recon_with_general_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (8.847s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:32 INFO [databricks.labs.remorph.reconcile.execute] report_type: data, data_source: snowflake 
09:32 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'data' report completed.
09:32 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
09:32 INFO [databricks.labs.remorph.reconcile.execute] report_type: data, data_source: snowflake 
09:32 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'data' report completed.
09:32 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
❌ test_data_recon_with_source_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (8.819s)
databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
09:32 INFO [databricks.labs.remorph.reconcile.execute] report_type: data, data_source: snowflake 
09:32 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'data' report completed.
09:32 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
09:32 INFO [databricks.labs.remorph.reconcile.execute] report_type: data, data_source: snowflake 
09:32 WARNING [databricks.labs.remorph.reconcile.execute] Reconciliation for 'data' report completed.
09:32 ERROR [databricks.labs.remorph.reconcile.recon_capture] Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id'
[gw6] linux -- Python 3.10.14 /home/runner/work/remorph/remorph/.venv/bin/python
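
Note on the log above: the two xdist workers (`gw6` and `gw7`) both write to `TEST_SCHEMA.main`, and the `DELTA_PROTOCOL_CHANGED` message itself points at multiple writers creating the same table. One possible way to keep workers apart is to give each its own schema name. This is a minimal sketch, not the project's fixture: `worker_schema` and the `local` fallback are hypothetical names, and it assumes the tests can take the schema name from a helper. `PYTEST_XDIST_WORKER` is the environment variable pytest-xdist sets in each worker process.

```python
import os
import re


def worker_schema(base, env=None):
    """Return a schema name unique to the current pytest-xdist worker.

    `base` is the shared schema name (e.g. the masked TEST_SCHEMA in the log).
    `env` is injectable so the helper can be checked without xdist running.
    """
    env = os.environ if env is None else env
    # pytest-xdist exports PYTEST_XDIST_WORKER ("gw0", "gw1", ...) in workers;
    # the variable is absent in a plain, single-process pytest run.
    worker = env.get("PYTEST_XDIST_WORKER", "local")
    # Keep only characters that are safe in an unquoted SQL identifier.
    suffix = re.sub(r"[^0-9A-Za-z_]", "_", worker)
    return f"{base}_{suffix}"


if __name__ == "__main__":
    print(worker_schema("TEST_SCHEMA", {"PYTEST_XDIST_WORKER": "gw6"}))
    print(worker_schema("TEST_SCHEMA", {"PYTEST_XDIST_WORKER": "gw7"}))
```

With separate schemas, one worker's `CREATE OR REPLACE TABLE` can no longer race another worker's write to the same table. Whether that also clears the `DELTA_FAILED_TO_MERGE_FIELDS` failures is not something the log confirms.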

Running from acceptance #13

run: pip install hatch==1.9.4

- name: Run integration tests
  uses: databrickslabs/sandbox/acceptance@acceptance/v0.2.2
Collaborator


You also need `acceptance_path` in `.codegen.json`; see https://github.com/databrickslabs/ucx/blob/main/.codegen.json#L9

Contributor Author


Added.

@nfx nfx added the feat/recon making sure that remorphed query produces the same results as original label Jul 19, 2024
@sundarshankar89
Contributor

Too divergent; will refactor and recreate a new one.

@nfx nfx deleted the feature/integration_test branch August 21, 2024 09:33
Labels
do-not-merge feat/recon making sure that remorphed query produces the same results as original
3 participants