data quality checks for variant data type #1188

STEFANOVIVAS · 2026-05-21T13:37:34Z

Is it possible to apply data quality checks to columns with the variant data type? The reference guide only mentions Struct, Map Type, and Array Type. If it is not possible, could we implement it?

ghanse (Maintainer) · 2026-05-28T01:31:30Z

It works!

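# `spark` and `display` are provided by the Databricks notebook environment.
from databricks.labs.dqx.engine import DQEngine
from databricks.labs.dqx.rule import DQRowRule
from databricks.labs.dqx import check_funcs
from databricks.sdk import WorkspaceClient

# Sample data: every row holds a variant object whose 'key1' is an empty string.
df = (
    spark
    .range(5)
    .selectExpr(
        "id",
        "to_variant_object(map('key1', '', 'key2', 'b')) as example_var",
    )
)

# The column is a SQL expression that extracts '$.key1' from the variant as a string,
# so the check runs on the extracted value rather than on the variant itself.
checks = [
    DQRowRule(
        name="is_missing_variant_key1",
        criticality="error",
        check_func=check_funcs.is_not_empty,
        column="variant_get(example_var, '$.key1', 'string')",
    ),
]

engine = DQEngine(WorkspaceClient())
display(engine.apply_checks(df, checks))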

1 reply

ghanse (Maintainer) · May 28, 2026

This should also work for rules defined using configuration/YAML.
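
For readers who want to see what the configuration form might look like, here is a minimal sketch of the same rule as metadata. It assumes the DQX metadata layout (`criticality`, `check.function`, `check.arguments.column`) and the `apply_checks_by_metadata` method, which are not shown in this thread, so verify both against your installed DQX version. The helper name `apply_metadata_checks` is hypothetical.

```python
# Metadata form of the rule from the example above. Each dict mirrors one
# entry of a YAML checks file. Assumed equivalent YAML:
#   - name: is_missing_variant_key1
#     criticality: error
#     check:
#       function: is_not_empty
#       arguments:
#         column: variant_get(example_var, '$.key1', 'string')
checks = [
    {
        "name": "is_missing_variant_key1",
        "criticality": "error",
        "check": {
            "function": "is_not_empty",
            "arguments": {
                # SQL expression that extracts the key from the variant column.
                "column": "variant_get(example_var, '$.key1', 'string')",
            },
        },
    },
]


def apply_metadata_checks(df):
    """Hypothetical helper: apply the metadata-defined checks to `df`.

    Needs a Databricks environment with DQX installed.
    """
    # Imported lazily so the rule definition can be inspected without DQX.
    from databricks.labs.dqx.engine import DQEngine
    from databricks.sdk import WorkspaceClient

    engine = DQEngine(WorkspaceClient())
    return engine.apply_checks_by_metadata(df, checks)
```

In a notebook this would be called as `display(apply_metadata_checks(df))` with the `df` from the example above.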

When defining the rules in code, you can also set column to a PySpark function expression (e.g. try_variant_get) instead of a SQL string.
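
A sketch of that variant, assuming `pyspark.sql.functions.try_variant_get` is available in your runtime (it was added in PySpark 4.0) and that `DQRowRule` accepts a `Column` for `column`, as the reply suggests. The constants and the helper name `build_variant_rule` are illustrative only.

```python
# Illustrative constants: the variant column, the path inside it, and the cast type.
VARIANT_COLUMN = "example_var"
KEY_PATH = "$.key1"
TARGET_TYPE = "string"


def build_variant_rule():
    """Hypothetical helper: build a rule whose column is a PySpark Column expression.

    Needs an active Spark session and DQX installed.
    """
    # Imported lazily so the module loads without Spark or DQX present.
    from pyspark.sql import functions as F
    from databricks.labs.dqx import check_funcs
    from databricks.labs.dqx.rule import DQRowRule

    return DQRowRule(
        name="is_missing_variant_key1",
        criticality="error",
        check_func=check_funcs.is_not_empty,
        # try_variant_get returns NULL instead of raising when the value
        # cannot be cast to the target type.
        column=F.try_variant_get(F.col(VARIANT_COLUMN), KEY_PATH, TARGET_TYPE),
    )
```

The resulting rule can be passed to `engine.apply_checks(df, [build_variant_rule()])` in the same way as the string-based rule.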

STEFANOVIVAS (Author) · 2026-05-29T13:02:56Z

Great! It works with YAML files too! Thanks.
