Added integration test #483
Conversation
ravit-db
commented
Jun 24, 2024
- Created the setup for Reconciliation Integration Test
- Added integration tests for Databricks to Databricks data reconciliation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff             @@
##             main     #483      +/-   ##
==========================================
+ Coverage   98.03%   98.19%   +0.16%
==========================================
  Files          64       63       -1
  Lines        4371     4491     +120
  Branches      502      507       +5
==========================================
+ Hits         4285     4410     +125
+ Misses         46       41       -5
  Partials       40       40

☔ View full report in Codecov by Sentry.
Coverage tests results: 394 tests ±0, 353 ✅ +2, 5s ⏱️ ±0s. For more details on these failures, see this check. Results for commit a23edbe. ± Comparison against base commit 951638b.

♻️ This comment has been updated with latest results.
@pytest.fixture
def setup_databricks_src(setup_teardown, spark, test_config):
The setup can be moved into setup.py or conftest.py.
fixed.
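For readers following along: shared fixtures usually live in a conftest.py next to the tests, where pytest discovers them automatically. A hypothetical sketch (the path, names, and default values are illustrative, not the PR's real TestConfig):

```python
# tests/integration/conftest.py (hypothetical path)
from types import SimpleNamespace

import pytest

# Illustrative defaults; the real values come from the PR's TestConfig.
DEFAULT_TEST_CONFIG = {"db_table_catalog": "samples", "db_table_schema": "reconcile"}


@pytest.fixture(scope="session")
def test_config():
    # Session scope: built once and shared by every integration test.
    return SimpleNamespace(**DEFAULT_TEST_CONFIG)
```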
def test_execute_report_type_is_data_with_all_match(setup_databricks_src, spark, ws, test_config, reconcile_config):
Where are all the scenarios we are testing?
It covers all four report types and tests all the configurations. Let me know if anything needs to be added or if something else should be covered.
We should also be testing other configuration scenarios: complex types, thresholds, transformations, filters, etc.
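A hypothetical scenario matrix for those extra cases. The dict shapes below are illustrative only, not remorph's actual configuration schema:

```python
# Each entry names one extra scenario the review asks for; the key/value
# shapes are illustrative, not the project's real config classes.
EXTRA_SCENARIOS = [
    {"name": "threshold_on_amount",
     "thresholds": [{"column": "amount", "upper_bound": "5%"}]},
    {"name": "filtered_rows",
     "filters": {"source": "status = 'active'", "target": "status = 'active'"}},
    {"name": "transformed_column",
     "transformations": [{"column": "name", "source_expr": "TRIM(name)"}]},
    {"name": "complex_types",
     "column_types": ["ARRAY<STRING>", "MAP<STRING, INT>"]},
]


def scenario_ids(scenarios):
    """Readable ids for pytest.mark.parametrize(..., ids=scenario_ids(EXTRA_SCENARIOS))."""
    return [s["name"] for s in scenarios]
```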
def _get_test_config() -> TestConfig:
    return TestConfig(
        db_table_catalog="samples",
Keep in mind that all catalogs are wiped out by our test infrastructure, so you have to pre-create randomly named catalogs for every test and tear them down. Actually... let's create a test catalog just for you.
The teardown process is already in place to clean up the catalog. Let me know if I missed anything or if there are alternatives.
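A sketch of the reviewer's suggestion: pre-create a randomly named catalog per test and guarantee teardown even if the test fails. The fixture name and helper are assumptions; `spark` is the suite's session fixture:

```python
import uuid

import pytest


def random_catalog_name(prefix: str = "it") -> str:
    """Collision-safe catalog name such as 'it_3f9c1a2b'."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


@pytest.fixture
def test_catalog(spark):
    name = random_catalog_name()
    spark.sql(f"CREATE CATALOG IF NOT EXISTS {name}")
    try:
        yield name
    finally:
        # The finally block runs even when the test body raises, so the
        # catalog is always dropped.
        spark.sql(f"DROP CATALOG IF EXISTS {name} CASCADE")
```

Random names let tests run in parallel without clashing, and CASCADE removes any schemas or tables the test left behind.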
❌ 835/856 passed, 21 failed, 25m24s total

❌ test_execute_report_type_is_data_with_all_match: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (3m25.506s)
❌ test_execute_report_type_is_all: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.686s)
❌ test_execute_report_type_is_schema: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.37s)
❌ test_execute_report_type_is_row: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.546s)
❌ test_execute_fail_for_tables_not_available: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.384s)
❌ test_execute_report_type_is_data_with_all_without_keys: pyspark.errors.exceptions.connect.AnalysisException: [SCHEMA_NOT_FOUND] The schema `integration_test.reconcile` cannot be found. Verify the spelling and correctness of the schema and catalog. (1.184s)
❌ test_recon_for_report_type_is_data: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (33.483s)
❌ test_generate_final_reconcile_output_row: AssertionError: assert ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[]) == ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=None, schema=None), exception_message='')]) (4m46.803s)
❌ test_generate_final_reconcile_output_data: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.metrics: [DELTA_TABLE_NOT_FOUND] Delta table `TEST_SCHEMA`.`metrics` doesn't exist.; (8.249s)
❌ test_recon_for_report_type_schema: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (17.649s)
❌ test_generate_final_reconcile_output_schema: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_PROTOCOL_CHANGED] ProtocolChangedException: The protocol version of the Delta table has been changed by a concurrent update. This happens when multiple writers are writing to an empty directory. Creating the table ahead of time will avoid this conflict. (7.032s)
❌ test_recon_for_report_type_all: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (20.628s)
❌ test_recon_for_report_type_is_row: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (18.337s)
❌ test_generate_final_reconcile_output_all: AssertionError: assert ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[]) == ReconcileOutput(recon_id='73b44582-dbb7-489f-bad1-6a7e8f4821b1', results=[ReconcileTableOutput(target_table_name='target_test_catalog.target_test_schema.target_supplier', source_table_name='source_test_schema.supplier', status=StatusOutput(row=False, column=False, schema=True), exception_message='')]) (36.108s)
❌ test_generate_final_reconcile_output_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [TABLE_OR_VIEW_ALREADY_EXISTS] Cannot create table or view `TEST_SCHEMA`.`main` because it already exists. (5.762s)
❌ test_write_and_read_unmatched_df_with_volumes_with_exception: Failed: DID NOT RAISE (3.235s)
❌ test_clean_unmatched_df_from_volume_with_exception: Failed: DID NOT RAISE (591ms)
❌ test_schema_recon_with_data_source_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (11.099s)
❌ test_schema_recon_with_general_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (11.002s)
❌ test_data_recon_with_general_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (8.847s)
❌ test_data_recon_with_source_exception: databricks.labs.remorph.reconcile.exception.WriteToTableException: Error writing data to TEST_SCHEMA.main: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'recon_table_id' and 'recon_table_id' (8.819s)
Running from acceptance #13
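Most of the SCHEMA_NOT_FOUND failures above point at a missing `integration_test.reconcile` schema; one way to harden the suite is to pre-create it once per session. A sketch only: the fixture name and the autouse choice are assumptions, not the PR's fix:

```python
import pytest


def ensure_schema_sql(catalog: str, schema: str) -> list:
    """Statements that make the expected catalog and schema exist up front."""
    return [
        f"CREATE CATALOG IF NOT EXISTS {catalog}",
        f"CREATE SCHEMA IF NOT EXISTS {catalog}.{schema}",
    ]


@pytest.fixture(scope="session", autouse=True)
def _precreate_reconcile_schema(spark):
    # Runs once before the whole session, so tests never hit SCHEMA_NOT_FOUND
    # just because they ran first on a fresh workspace.
    for stmt in ensure_schema_sql("integration_test", "reconcile"):
        spark.sql(stmt)
```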
run: pip install hatch==1.9.4

- name: Run integration tests
  uses: databrickslabs/sandbox/acceptance@acceptance/v0.2.2
You also need acceptance_path in the .codegen.json: https://github.com/databrickslabs/ucx/blob/main/.codegen.json#L9
added
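For illustration, the entry might look roughly like this; the surrounding keys and the path value are assumptions, so copy the real layout from the ucx .codegen.json linked above:

```json
{
  "version": 2,
  "toolchain": {
    "required": ["hatch"],
    "pre_setup": ["hatch env create"],
    "acceptance_path": "tests/integration"
  }
}
```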
This reverts commit ab3b6f5.
…son" This reverts commit 821e00c.
Too divergent; I will refactor and recreate a new PR.