Clickhouse ingestion column type Map schema failure #9079

Open
spudstr opened this issue Oct 24, 2023 · 4 comments
Labels
accepted (An Issue that is confirmed as a bug by the DataHub Maintainers.), bug (Bug report)

Comments


spudstr commented Oct 24, 2023

Describe the bug

ClickHouse tables with a nested Map column fail schema extraction, while tables with a flat Map column work. Or is the nested case not supported?

A column of type Map(String, Map(String, Nullable(String))) fails to produce a schema.
A column of type Map(String, Nullable(String)) produces a schema as expected (and is boring).
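To illustrate why the nested case can behave differently from the flat one, here is a minimal sketch of splitting a Map type string into its key and value types. It is not the clickhouse-sqlalchemy parser; `split_top_level` and `map_arguments` are hypothetical helpers written only for this example. A plain split on commas cuts the inner Map in half, while a split that tracks parenthesis depth keeps it whole.

```python
def split_top_level(args: str) -> list[str]:
    """Split a type-argument string on commas that are not inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(args):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # Only a comma at depth 0 separates the key type from the value type.
            parts.append(args[start:i].strip())
            start = i + 1
    parts.append(args[start:].strip())
    return parts


def map_arguments(type_str: str) -> list[str]:
    """Return the key and value type strings of a Map(...) type (hypothetical helper)."""
    prefix = "Map("
    if not (type_str.startswith(prefix) and type_str.endswith(")")):
        raise ValueError(f"not a Map type: {type_str!r}")
    return split_top_level(type_str[len(prefix):-1])


nested = "Map(String, Map(String, Nullable(String)))"
inner = nested[len("Map("):-1]

# Naive approach: every comma is treated as a separator, so the inner Map is broken up.
naive = [p.strip() for p in inner.split(",")]

# Depth-aware approach: the inner Map stays intact as the value type.
aware = map_arguments(nested)

print(naive)
print(aware)
```

With the flat type Map(String, Nullable(String)) both approaches give the same two parts, which would be consistent with only the nested column failing.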

To Reproduce
In ClickHouse, create a table with a nested Map column:

CREATE TABLE IF NOT EXISTS bugtable on cluster 'local' (
	id Int,
	metadata Map(String, Map(String, Nullable(String)))
) ENGINE = MergeTree()
order by id

Then run a simple recipe:

source:
    type: clickhouse
    config:
        env: PROD
        platform_instance: clickhouse
        host_port: '${chi_cluster01_host}:8123'
        username: '${chi_cluster01_user}'
        password: '${chi_cluster01_pass}'
        profiling:
            enabled: false
        stateful_ingestion:
            enabled: true
            remove_stale_metadata: true

Expected behavior
The schema should be produced, as it is for other column types.

some logs

/tmp/datahub/ingest/venv-clickhouse-v0.11.0/lib/python3.10/site-packages/clickhouse_sqlalchemy/drivers/base.py:273: SAWarning: Did not recognize type 'name String' of column 'changes'
  warn("Did not recognize type '%s' of column '%s'" %
/tmp/datahub/ingest/venv-clickhouse-v0.11.0/lib/python3.10/site-packages/clickhouse_sqlalchemy/drivers/base.py:273: SAWarning: Did not recognize type 'previous_value String' of column 'changes'
  warn("Did not recognize type '%s' of column '%s'" %
/tmp/datahub/ingest/venv-clickhouse-v0.11.0/lib/python3.10/site-packages/clickhouse_sqlalchemy/drivers/base.py:273: SAWarning: Did not recognize type 'new_value String' of column 'changes'
  warn("Did not recognize type '%s' of column '%s'" %
/tmp/datahub/ingest/venv-clickhouse-v0.11.0/lib/python3.10/site-packages/clickhouse_sqlalchemy/drivers/base.py:273: SAWarning: Did not recognize type 'reason String' of column 'changes'
  warn("Did not recognize type '%s' of column '%s'" %
...  
  ... /tmp/datahub/ingest/venv-clickhouse-v0.11.0/lib/python3.10/site-packages/clickhouse_sqlalchemy/drivers/base.py:273: SAWarning: Did not recognize type 'Strin' of column 'metadata'

Warnings about the inability to produce schema.

 'at.audit_log_local': ["unable to get column information due to an error -> Map.__init__() missing 1 required positional argument: 'value_type'"]},
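The error text above is what Python raises when a two-argument constructor receives only one argument. The sketch below reproduces that class of error with a stand-in `Map` class; the class and the `build_map` helper are hypothetical and are not the clickhouse-sqlalchemy implementation. It only shows how a type parser that extracts a single argument from a nested Map string could end up with this message.

```python
class Map:
    """Hypothetical stand-in for a column type that needs a key type and a value type."""

    def __init__(self, key_type, value_type):
        self.key_type = key_type
        self.value_type = value_type


def build_map(parsed_args):
    # Unpack whatever the (assumed) type parser produced.
    # If the parser returned only one element, value_type is missing.
    return Map(*parsed_args)


try:
    build_map(["String"])  # simulates a parser that lost the value type
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With both arguments present, construction succeeds.
ok = build_map(["String", "Map(String, Nullable(String))"])

print(error_message)
print(ok.key_type, "->", ok.value_type)
```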


Desktop (please complete the following information):

 'cli_version': '0.11.0',
 'cli_entry_location': '/tmp/datahub/ingest/venv-clickhouse-v0.11.0/lib/python3.10/site-packages/datahub/__init__.py',
 'py_version': '3.10.10 (main, Mar 14 2023, 02:37:11) [GCC 10.2.1 20210110]',
 'py_exec_path': '/tmp/datahub/ingest/venv-clickhouse-v0.11.0/bin/python3',

Additional context
This seems more like a SQLAlchemy issue than a DataHub issue :/

spudstr added the bug label Oct 24, 2023
spudstr changed the title from "Clickhouse ingestion type Map, inside map fails to detect schema" to "Clickhouse ingestion type Map schema failure" Oct 24, 2023
spudstr changed the title from "Clickhouse ingestion type Map schema failure" to "Clickhouse ingestion column type Map schema failure" Oct 24, 2023
Collaborator

hsheth2 commented Nov 1, 2023

Depends on xzkostyan/clickhouse-sqlalchemy#269


github-actions bot commented Dec 2, 2023

This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io

@github-actions github-actions bot added the stale label Dec 2, 2023

github-actions bot commented Jan 1, 2024

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions bot closed this as not planned Jan 1, 2024
@hsheth2 hsheth2 reopened this Jan 31, 2024
Collaborator

hsheth2 commented Jan 31, 2024

Leaving this open for tracking, but we can't really do much until xzkostyan/clickhouse-sqlalchemy#269 is fixed.

hsheth2 added the accepted label and removed the stale label Jan 31, 2024