data quality checks for variant data type #1188
STEFANOVIVAS started this conversation in Ideas
Replies: 2 comments 1 reply
It works!

    from databricks.labs.dqx.engine import DQEngine
    from databricks.labs.dqx.rule import DQRowRule
    from databricks.labs.dqx import check_funcs
    from databricks.sdk import WorkspaceClient

    # Sample data: a variant column with an empty 'key1' and a populated 'key2'.
    # `spark` and `display` are provided by the Databricks notebook environment.
    df = (
        spark
        .range(5)
        .selectExpr(
            "id",
            "to_variant_object(map('key1', '', 'key2', 'b')) as example_var",
        )
    )

    # Extract the variant field as a string and run the check on that expression.
    checks = [
        DQRowRule(
            name="is_missing_variant_key1",
            criticality="error",
            check_func=check_funcs.is_not_empty,
            column="variant_get(example_var, '$.key1', 'string')",
        ),
    ]

    engine = DQEngine(WorkspaceClient())
    display(engine.apply_checks(df, checks))
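One thing that may be worth checking with the approach above: as far as I understand, `variant_get` returns NULL when the path is absent, and `is_not_empty` lets NULL values pass, so a key that is missing entirely might not be flagged. The sketch below uses `check_funcs.is_not_null_and_not_empty` instead and generates one rule per path. The helpers `variant_field` and `build_variant_rules` are hypothetical names made up for illustration; they are not part of DQX.

```python
def variant_field(column: str, path: str, dtype: str = "string") -> str:
    """Build the SQL expression that extracts one field from a variant column."""
    return f"variant_get({column}, '{path}', '{dtype}')"


def build_variant_rules(column: str, paths: list):
    """Create one row-level rule per variant path (hypothetical helper).

    DQX is imported lazily so this module can be loaded outside Databricks.
    """
    from databricks.labs.dqx import check_funcs
    from databricks.labs.dqx.rule import DQRowRule

    return [
        DQRowRule(
            name=f"{column}_{path.lstrip('$.')}_is_null_or_empty",
            criticality="error",
            # Assumption: this check fails on both NULL and empty strings.
            check_func=check_funcs.is_not_null_and_not_empty,
            column=variant_field(column, path),
        )
        for path in paths
    ]
```

In a notebook, `engine.apply_checks(df, build_variant_rules("example_var", ["$.key1", "$.key2"]))` would then apply the same kind of check to both keys.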
1 reply
Great! It works with YAML files too! Thanks.
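For anyone landing here later, a minimal sketch of what the YAML variant might look like. The YAML is embedded as a string and parsed with PyYAML so the structure can be inspected locally. The key names (`name`, `criticality`, `check`, `function`, `arguments`, `column`) follow the DQX metadata format as I understand it and should be verified against your installed version. `apply_yaml_checks` is a hypothetical wrapper, not a DQX function.

```python
import yaml  # PyYAML

# Assumed DQX metadata layout; the column expression is quoted so YAML
# treats it as a single string.
CHECKS_YAML = """
- name: is_missing_variant_key1
  criticality: error
  check:
    function: is_not_empty
    arguments:
      column: "variant_get(example_var, '$.key1', 'string')"
"""

# Parse the YAML into the list-of-dicts form used for metadata checks.
checks = yaml.safe_load(CHECKS_YAML)


def apply_yaml_checks(df, checks):
    """Apply metadata-defined checks (hypothetical wrapper).

    Needs a Databricks environment with DQX installed, so the imports
    are kept inside the function.
    """
    from databricks.labs.dqx.engine import DQEngine
    from databricks.sdk import WorkspaceClient

    engine = DQEngine(WorkspaceClient())
    return engine.apply_checks_by_metadata(df, checks)
```

The same content saved as a `.yml` file can be loaded with `yaml.safe_load` on the open file before passing it to the engine.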
Is it possible to apply data quality checks to columns with the variant data type? The reference guide only mentions Struct, Map, and Array types. If it is not supported, could we implement it?