A reference repo for managing Databricks Genie Spaces as code and promoting them across dev -> staging -> prod with Git and CI/CD.
It shows two approaches side by side:
- SDK + DAB (recommended, works today): export a Genie Space to a versioned JSON artifact, rewrite environment-specific references, and create/update it in the target workspace via the Genie SDK, all orchestrated by a Databricks Asset Bundle and GitHub Actions.
- Native genie_spaces DAB resource (preview / not GA): what the YAML will look like once a Genie Space becomes a first-class bundle resource. See docs/future-native-resource.md.
As of mid-2026, a Genie Space is not yet a supported Databricks Asset Bundle resource type (jobs, pipelines, dashboards, apps, etc. are; Genie is not). The production-ready way to do CI/CD for Genie Spaces today is to drive the Genie REST/SDK API and wrap it in a bundle. That is exactly what this repo does. When the native resource ships, migrating is mostly mechanical (see the doc above).
DEV workspace                 Git repo               STAGING / PROD workspace
-------------                 --------               ------------------------
build space in UI           spaces/*.json           create_space / update_space
       |                          ^                              ^
       | export_space.py          | commit                       | deploy_space.py
       +--------------------------+------------------------------+
                                  |
                      config/<env>.yml rewrites
                     (dev_sales. -> prod_sales.)
- Export: build/tune a space in the dev workspace UI, then export its full definition to a checked-in artifact:
    python src/export_space.py --space-id <dev-space-id> --out spaces/sales_assistant.json
  The artifact holds the serialized_space JSON (instructions, sample questions, table mappings, benchmarks) plus title/description.
- Transform: config/<env>.yml declares per-environment warehouse_id, parent_path, and substring replacements applied to the serialized JSON (e.g. rewrite dev_sales. table references to prod_sales.). This keeps one artifact promotable to every environment.
- Deploy: create the space if it doesn't exist in the target, else update it in place:
    python src/deploy_space.py --space spaces/sales_assistant.json --env prod --dry-run
    python src/deploy_space.py --space spaces/sales_assistant.json --env prod
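The Transform step can be sketched roughly as below. This is a minimal illustration, not the code in src/genie_ops.py: the function name apply_replacements, the sample table names, the parent_path value, and the artifact keys other than serialized_space are assumptions made for the example. It also shows why a replacement key without the trailing dot can over-match.

```python
import json


def apply_replacements(serialized: str, replacements: dict[str, str]) -> str:
    """Apply plain substring swaps to the serialized space JSON string."""
    for old, new in replacements.items():
        serialized = serialized.replace(old, new)
    # Fail fast if a swap left the string unparseable as JSON.
    json.loads(serialized)
    return serialized


# Hypothetical artifact; a real exported serialized_space payload is richer.
artifact = {
    "title": "Sales Assistant",
    "description": "Example space",
    "serialized_space": json.dumps(
        {"tables": ["dev_sales.orders", "dev_sales_archive.orders_2020"]}
    ),
}

# Hypothetical prod config with the three fields a config/<env>.yml declares.
prod_config = {
    "warehouse_id": "<prod-warehouse-id>",
    "parent_path": "/Workspace/Shared/genie",  # assumed path, for illustration
    "replacements": {"dev_sales.": "prod_sales."},
}

promoted = apply_replacements(artifact["serialized_space"], prod_config["replacements"])
tables = json.loads(promoted)["tables"]

# A key without the trailing dot also rewrites the unrelated archive schema.
loose = apply_replacements(artifact["serialized_space"], {"dev_sales": "prod_sales"})
loose_tables = json.loads(loose)["tables"]
```

With the trailing dot, only dev_sales.orders is rewritten; without it, dev_sales_archive is renamed as well, which is the partial-match problem specific keys avoid.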
databricks.yml                  bundle definition + dev/staging/prod targets
resources/deploy_job.yml        Databricks Job that runs the deploy on serverless
src/genie_ops.py                export / transform / deploy helpers (Genie SDK)
src/export_space.py             CLI: pull a space out of a workspace
src/deploy_space.py             CLI: promote an artifact to an environment
spaces/sales_assistant.json     example checked-in space artifact (source of truth)
config/{dev,staging,prod}.yml   per-environment warehouse + catalog rewrites
ci-github-actions-deploy.yml    PR validation + promote-on-merge + gated prod
                                (move to .github/workflows/deploy.yml to activate)
docs/future-native-resource.md  the forthcoming native genie_spaces YAML
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# Authenticate the Databricks SDK to your dev workspace (any one of):
# export DATABRICKS_HOST=https://your-dev-workspace.cloud.databricks.com
# export DATABRICKS_TOKEN=dapi...
# or use a CLI profile and pass --profile <name> to the scripts.
# 1. Export an existing dev space
PYTHONPATH=src python src/export_space.py --space-id <dev-space-id> --out spaces/sales_assistant.json
# 2. Preview a prod deploy without mutating anything
PYTHONPATH=src python src/deploy_space.py --space spaces/sales_assistant.json --env prod --dry-run
# 3. Promote for real
PYTHONPATH=src python src/deploy_space.py --space spaces/sales_assistant.json --env prod
Deploy the bundle (ships the code + the promote job), then run the job:
databricks bundle validate -t staging
databricks bundle deploy -t staging
databricks bundle run deploy_genie_space -t staging
The included GitHub Actions workflow (ci-github-actions-deploy.yml; move it to .github/workflows/deploy.yml to activate):
- on PR: bundle validate + a --dry-run deploy (no mutation)
- on merge to main: deploy bundle to staging and run the promote job
- manual dispatch: promote to prod, gated by a protected GitHub Environment
Set these secrets (a Databricks service principal using OAuth M2M):
DATABRICKS_HOST, DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET.
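A deploy script could check for these variables before touching the workspace, so a misconfigured CI run fails with a clear message. The sketch below is an assumption about how such a check might look; missing_secrets and the example host are hypothetical and not part of this repo.

```python
import os

# The three variables the workflow expects for OAuth M2M authentication.
REQUIRED_SECRETS = (
    "DATABRICKS_HOST",
    "DATABRICKS_CLIENT_ID",
    "DATABRICKS_CLIENT_SECRET",
)


def missing_secrets(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SECRETS if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {
    "DATABRICKS_HOST": "https://example.cloud.databricks.com",  # placeholder host
    "DATABRICKS_CLIENT_ID": "abc",
}
missing = missing_secrets(example_env)
```

In a real script, a non-empty result would typically lead to an early exit that names the missing variables.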
- Get the serialized JSON by exporting; don't hand-author it. Build the space in the UI, then run export_space.py. The artifact in spaces/ here is illustrative; a real export is richer.
- The deploy identity needs permissions on the target: CAN_MANAGE on the spaces / parent folder, plus access to the target SQL warehouse.
- Replacements are plain substring swaps applied to the JSON string. Keep keys specific (include the trailing dot) so you don't get partial matches.
- Idempotency is by title within a parent path: deploy looks up an existing space by title and updates it, otherwise creates a new one.
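The create-or-update rule in the last note can be sketched against an in-memory stand-in. FakeGenieClient, its method signatures, and the deploy function below are hypothetical; they only mimic the create_space / update_space shape and are not the Databricks SDK or this repo's genie_ops.py. Raising on duplicate titles is a design choice of the sketch, not something the source specifies.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeGenieClient:
    """In-memory stand-in for a target workspace (NOT the Databricks SDK)."""
    spaces: dict = field(default_factory=dict)  # space_id -> space fields
    _ids: count = field(default_factory=lambda: count(1))

    def list_spaces(self, parent_path):
        return [s for s in self.spaces.values() if s["parent_path"] == parent_path]

    def create_space(self, **fields):
        space_id = f"space-{next(self._ids)}"
        self.spaces[space_id] = {"space_id": space_id, **fields}
        return space_id

    def update_space(self, space_id, **fields):
        self.spaces[space_id].update(fields)
        return space_id


def deploy(client, title, parent_path, serialized_space, dry_run=False):
    """Update the space with this title under parent_path, or create it."""
    matches = [s for s in client.list_spaces(parent_path) if s["title"] == title]
    if len(matches) > 1:
        # Title lookup is ambiguous; refuse rather than guess which to update.
        raise RuntimeError(f"{len(matches)} spaces titled {title!r} in {parent_path}")
    action = "update" if matches else "create"
    if dry_run:
        return action, None  # report the plan without mutating anything
    if matches:
        space_id = client.update_space(
            matches[0]["space_id"], serialized_space=serialized_space
        )
    else:
        space_id = client.create_space(
            title=title, parent_path=parent_path, serialized_space=serialized_space
        )
    return action, space_id


client = FakeGenieClient()
path = "/Workspace/Shared/genie"  # assumed parent path, for illustration
plan = deploy(client, "Sales Assistant", path, "{}", dry_run=True)
first = deploy(client, "Sales Assistant", path, '{"v": 1}')
second = deploy(client, "Sales Assistant", path, '{"v": 2}')
```

Running the deploy twice leaves a single space: the first call creates it and the second updates it in place. One consequence of keying on title is that renaming a space in the artifact would create a new space in the target instead of updating the old one.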