-
Notifications
You must be signed in to change notification settings - Fork 149
feat: clickhouse safe_cast macro #552
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: clickhouse safe_cast macro #552
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 To request another review, post a new comment with "/windsurf-review".
| -- This macro allows you to set a default value for columns based on their type. | ||
|
|
||
| {% macro clickhouse__safe_cast(field, dtype) %} | ||
| {%- if field == 'null' -%} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition field == 'null' checks if the field is literally the string 'null', not if it's a SQL NULL value. For SQL NULL values, you should use {{ field }} is null instead. The current implementation will only work if 'null' is passed as a string literal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're correct that the condition field == 'null' checks for the literal string 'null' rather than a SQL NULL value. This is actually intentional for this macro's specific use case.
The safe_cast macro is designed specifically to solve issue #315 - handling non-nullable ClickHouse columns in dbt unit tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @MaelleBaillet5 ! Thanks for contributing this. This PR is looking nice and I'd like to merge it ASAP.
The only thing I'm missing now is a proper test for the unit-testing part. As you mention, this is specifically designed to solve the issue in #315, so it would be great to add a test to ensure #315 is not happening again. I think creating an additional test and adding there the unit-test listed in the "Steps to reproduce" section would be enough. Would you add it?
|
/windsurf-review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other comments (2)
- tests/integration/adapter/utils/test_safe_cast.py (151-151) Missing newline at the end of file. Adding a newline at the end of files is a common convention that helps with text processing tools and version control systems.
-
tests/integration/adapter/utils/test_safe_cast.py (87-87)
Use lowercase `false` for boolean literals to follow Python style conventions.
assert result[10] == false # Bool default
💡 To request another review, post a new comment with "/windsurf-review".
| {% macro clickhouse__safe_cast(field, dtype) %} | ||
| {%- if field == 'null' -%} | ||
| CAST(defaultValueOfTypeName('{{ dtype }}') AS {{ dtype }}) | ||
| {%- else -%} | ||
| CAST({{ field }} AS {{ dtype }}) | ||
| {%- endif -%} | ||
| {% endmacro %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation only handles the literal string 'null', but might miss actual null values passed as variables. Consider enhancing the implementation to also check for actual null values:
| {% macro clickhouse__safe_cast(field, dtype) %} | |
| {%- if field == 'null' -%} | |
| CAST(defaultValueOfTypeName('{{ dtype }}') AS {{ dtype }}) | |
| {%- else -%} | |
| CAST({{ field }} AS {{ dtype }}) | |
| {%- endif -%} | |
| {% endmacro %} | |
| {% macro clickhouse__safe_cast(field, dtype) %} | |
| {%- if field is none or field == 'null' -%} | |
| CAST(defaultValueOfTypeName('{{ dtype }}') AS {{ dtype }}) | |
| {%- else -%} | |
| CAST({{ field }} AS {{ dtype }}) | |
| {%- endif -%} | |
| {% endmacro %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The safe_cast macro is exclusively designed for dbt unit test fixtures, where
all values are passed as string literals from YAML files. When a column is omitted
in a fixture, dbt-core's get_empty_schema_sql macro passes the literal string
'null' (not a SQL NULL value).
| def test_safe_cast_types(self, project): | ||
| """Test that safe_cast preserves the expected data types""" | ||
| # Run the model | ||
| results = run_dbt(["run", "--select", "safe_cast_test"]) | ||
| assert len(results) == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test method should also use the shared fixture to avoid running the model again.
| def test_safe_cast_types(self, project): | |
| """Test that safe_cast preserves the expected data types""" | |
| # Run the model | |
| results = run_dbt(["run", "--select", "safe_cast_test"]) | |
| assert len(results) == 1 | |
| def test_safe_cast_types(self, project, run_model): | |
| """Test that safe_cast preserves the expected data types""" | |
| # Get column types from ClickHouse |
dd53a2c to
aa5534c
Compare
|
Summary
Add
clickhouse__safe_castmacro that automatically provides default values for ClickHouse types when casting null values. This eliminates the need to specify all non-nullable columns in unit test fixtures.Fixes #315
Checklist
Delete items not relevant to your PR: