Fix charmap error on non-ASCII output and add warehouse ID setup step by vmariiechko · Pull Request #9 · vmariiechko/databricks-bundle-template

vmariiechko · 2026-05-13T19:21:34Z

Summary

Two small improvements to the dbx-ro-query asset triggered by real usage:

Encoding fix: query results containing non-ASCII characters (Greek, Cyrillic, emoji) caused a charmap codec error on Windows when printed to stdout. configure_text_streams() reconfigures both stdout and stderr to UTF-8 with errors="replace" at startup. The subprocess call in run_query already used UTF-8; this closes the remaining gap at the print() call in main().
Install message: first-time users had no prompt to set DATABRICKS_WAREHOUSE_ID before running the smoke check. A new step 2 in the success_message shows the databricks warehouses list lookup and the export pattern. Placed here rather than in SKILL.md because SKILL.md is read by agents during task execution, not by humans during initial setup.
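A minimal sketch of what the encoding fix might look like, assuming the helper simply calls `reconfigure()` on each standard stream. The function name and the `errors="replace"` setting come from the description above; the `getattr` guard and the body of `main()` are illustrative assumptions, not the script's actual code.

```python
import sys


def configure_text_streams() -> None:
    """Switch stdout and stderr to UTF-8, replacing unencodable characters."""
    for stream in (sys.stdout, sys.stderr):
        # reconfigure() is available on io.TextIOWrapper (Python 3.7+).
        # Streams swapped in by test runners or IDEs may not provide it,
        # so this sketch skips them instead of failing (an assumption).
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:
            reconfigure(encoding="utf-8", errors="replace")


def main() -> None:
    # Called first, before anything is printed.
    configure_text_streams()
    # Placeholder output standing in for real query results.
    print("Ελληνικά, Кирилиця, 🚀")
```

Calling the helper before the first `print()` matters because the stream encoding is fixed when the interpreter starts; any later write inherits the reconfigured UTF-8 encoder.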

Changes

scripts/dbx-ro-query.py: add configure_text_streams(), call it as first line of main()
databricks_template_schema.json: insert warehouse ID setup step (step 2) in success_message, renumber smoke check to step 3
tests/assets/test_dbx_ro_query.py: three new tests — configure_text_streams smoke test, non-ASCII scalar output, non-ASCII TSV output
CHANGELOG.md: add [1.7.1] - 2026-05-13 entry
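The two non-ASCII output tests listed above could take roughly the following shape. This is a sketch only: the `format_rows` defined here is a hypothetical stand-in with an assumed signature (`rows`, `as_tsv`), since the real function's interface is not shown in this PR description.

```python
def format_rows(rows, as_tsv=False):
    """Hypothetical stand-in: one cell prints as a scalar, otherwise as TSV."""
    if not as_tsv and len(rows) == 1 and len(rows[0]) == 1:
        return str(rows[0][0])
    return "\n".join("\t".join(str(cell) for cell in row) for row in rows)


def test_non_ascii_scalar():
    # A single Greek value should pass through unchanged.
    assert format_rows([["Αθήνα"]]) == "Αθήνα"


def test_non_ascii_tsv():
    # Cyrillic, emoji, and Greek cells should survive tab-separated formatting.
    rows = [["Київ", "🚀"], ["Ωμέγα", "ok"]]
    assert format_rows(rows, as_tsv=True) == "Київ\t🚀\nΩμέγα\tok"
```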

Change Area

Asset Library (assets/<name>/)

Configuration Axes Affected

Asset Library (new asset, asset schema, or framework changes)

Testing

All tests pass (pytest tests/ -v)
New tests added for new functionality (if applicable)

Asset Changes (if applicable)

Asset installs standalone via databricks bundle init . --template-dir assets/<name> --output-dir <dir>
Asset is self-contained (no references to library/helpers.tmpl or other assets)

Checklist

Documentation updated (if behavior changed)

Reconfigure stdout and stderr to UTF-8 at startup via configure_text_streams(), preventing charmap codec errors when query results contain Greek, Cyrillic, emoji, or other non-ASCII characters. The subprocess call in run_query already used UTF-8; this closes the remaining gap at the print() call in main().

Add a "Set your warehouse ID" step to the asset success_message so first-time users see the databricks warehouses list lookup and DATABRICKS_WAREHOUSE_ID export pattern immediately after install, before the smoke-check step. The instruction belongs here rather than in SKILL.md, which is read by agents, not by humans during setup.

Three new tests: a configure_text_streams smoke test and two format_rows tests covering non-ASCII scalar and TSV output.
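To illustrate where the charmap error comes from, the snippet below encodes a Greek character the way a console stream would. It assumes the legacy code page is cp1252, a common default on Western-locale Windows; the helper name `encode_for_console` is hypothetical and exists only for this demonstration.

```python
def encode_for_console(text, encoding="cp1252", errors="strict"):
    """Encode text as a text stream would before writing bytes out."""
    return text.encode(encoding, errors=errors)


# Strict encoding with a legacy code page fails on characters it lacks.
try:
    encode_for_console("Ω")
except UnicodeEncodeError as exc:
    print(f"strict cp1252 failed: {exc}")

# errors="replace" substitutes "?" instead of raising.
print(encode_for_console("Ω", errors="replace"))

# UTF-8 can represent the character, so nothing is lost.
print(encode_for_console("Ω", encoding="utf-8"))
```

With UTF-8 as the stream encoding, `errors="replace"` acts only as a safety net for input that cannot be encoded at all, such as lone surrogates.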

vmariiechko merged commit 6688827 into main May 13, 2026
1 check passed

vmariiechko deleted the feature/dbx-ro-query-unicode-encoding branch May 13, 2026 19:22

vmariiechko merged 1 commit into main from feature/dbx-ro-query-unicode-encoding
