Refactor databricks-apps around capability composition + warehouse mutations #132
MarioCadenas wants to merge 5 commits into
cf2c711 to 4c43aa3
Document Delta/UC DML via custom routes, unify write-path guidance across skills, and expand Lakebase scaffolding and deployment notes.
The prior reorder conflated Lakebase deploy-first with the full lifecycle; keep Scaffold → Develop → Validate → Deploy and call out the OLTP exception.
Add data-patterns and lifecycle guides, slim SKILL.md to a 5-step agent workflow, dedupe overview and plugin guides, and broaden skill frontmatter for multi-plugin apps.
Extract OLTP and synced-read guides from the monolithic lakebase doc, add a thin router, point data-patterns and cross-skill links at the right targets, and trim custom-endpoints/proto-first duplication.
f238e3f to 2cc1f99
Adds a Local-vs-agentic-mode split keyed to DATABRICKS_APPS_AGENTIC_MODE, plus the P1-P3 review fixes from the data-path refactor.

Agentic mode (DATABRICKS_APPS_AGENTIC_MODE=true):
- New references/appkit/environments.md as the canonical Local-vs-agentic delta; Step-0 detection branch in SKILL.md.
- In agentic mode the app is pre-scaffolded and all plugin resources are provisioned: read enabled plugins from appkit.plugins.json / app.yaml (don't infer); ambient auth (no profile, omit --profile); run only design+discovery gates; skip provisioning gates, scaffold, deploy, and smoke tests; npm run dev hits live resources; still run databricks apps validate. Stop and surface if a needed plugin isn't wired.
- Short agentic callouts in lifecycle, data-patterns, lakebase-oltp, genie, model-serving, files, jobs, overview, sql-queries, warehouse-mutations.

Doc fixes:
- Capability flags marked as concepts, not --features values.
- Single canonical write-path table in data-patterns; custom-endpoints and warehouse-mutations now guard-and-link instead of restating it.
- warehouse-mutations leads with the inline pattern; generic is optional.
- Reframed the warehouse smoke test to a non-mutating check.
- Simplified the lifecycle phase matrix; standardized await createApp.

Co-authored-by: Isaac
🧪 Dev eval run kicked off for this PR
Running the
Setup
Status: generation in progress (~45–60 min). I'll follow up with per-app
Note: a prod nightly is running concurrently and shares the Anthropic API key, so an isolated
✅ Eval results: no generation-quality regression from this PR
Run
Generation: all 1.0 (build + unit + smoke + typecheck +
The one miss (
| Run | Skill | genie_taxi_chat | Layout |
|---|---|---|---|
| original | this branch | 0.0 | genie-taxi-chat/genie-taxi-chat/ ✗ |
| re-run | this branch | 1.0 | genie-taxi-chat/ ✓ |
| control | stock main | 1.0 | genie-taxi-chat/ ✓ |
On re-run the PR skill produced a correct app identical to stock — the double-nesting was a one-off. Soft flag: worth a glance at whether the lifecycle/scaffold guidance reorg makes an extra wrapper directory slightly more likely, but it is not deterministic.
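A quick way to screen for the double-nesting seen in the original run is to compare the last two components of the generated app path. The sketch below is illustrative only: `is_double_nested` is a hypothetical helper, not part of the eval harness, and it assumes the layout is available as a relative POSIX-style path string.

```python
from pathlib import PurePosixPath


def is_double_nested(layout: str) -> bool:
    """Return True when the final two path components are identical.

    Hypothetical helper for illustration; not part of the eval tooling.
    """
    # PurePosixPath drops the trailing slash, so "a/a/" yields ("a", "a").
    parts = PurePosixPath(layout).parts
    return len(parts) >= 2 and parts[-1] == parts[-2]


if __name__ == "__main__":
    # The two layouts reported in the table above.
    for layout in ("genie-taxi-chat/genie-taxi-chat/", "genie-taxi-chat/"):
        print(layout, is_double_nested(layout))
```

A check like this only flags an exact repeated wrapper directory; it would not catch a wrapper with a different name.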
Edits — 0 build regressions
| Edit | Δ appeval | Note |
|---|---|---|
| property_search_app · add_emoji | −0.17 | smoke test pass→fail (build + unit OK); likely flaky selector |
| city_performance_app · fix_critical_issue | 0 | no critical issue found (legit no-op) |
| taxi_zones_map · simplify_code | 0 | clean |
| parts_catalog_app · drop_unrequested_feature | 0 | clean |
| parts_catalog_app · multi_turn_additive | 0 | clean |
Bottom line
The capability-composition refactor generates apps on par with stock skills across warehouse-read, Lakebase OLTP, Genie, model-serving, and devhub prompts — no build or quality regression. Skill confirmed installed from this branch (Using skills version refactor-app-capability-composition, 9 skills, no rate-limit on CLI v1.2.1).
Caveat: a prod nightly shared the Anthropic API key during the main run; it didn't materially affect results (the single transient slip cleared on re-run).
🧪 Full eval set now running on this PR
Following the
I'll follow up with the aggregate
Summary
Unifies the former #135 + #132 stack into a single PR (based on `main`). Refactors `databricks-apps` so agents compose apps from capabilities (`reads_warehouse`, `writes_oltp`, `genie`, `files`, …) instead of monolithic archetype docs, adds the warehouse-mutations write path, and teaches the skill to handle two environments (local vs agentic mode).
Capability refactor
- `warehouse-mutations.md`: Delta/UC DML via `appkit.analytics.query()` in custom routes
- `data-patterns.md`: canonical capability catalog, conditional gates, write/read paths, recipes, checklist slices
- `lifecycle.md`: dev / validate / deploy ordering
- Slim `SKILL.md` to a thin orchestrator
- Split `lakebase.md` into router + `lakebase-oltp.md` + `lakebase-synced-reads.md`
- `custom-endpoints.md` → points at data-patterns; mark `proto-first.md` advanced-only
Agentic mode (`DATABRICKS_APPS_AGENTIC_MODE=true`)
- New `environments.md` as the canonical Local-vs-agentic delta; Step-0 detection branch in `SKILL.md`
- The app is pre-scaffolded and all plugin resources are provisioned: read enabled plugins from `appkit.plugins.json` / `app.yaml` (don't infer); ambient auth (no profile, omit `--profile`); run only design+discovery gates; skip provisioning gates, scaffold, deploy, and smoke tests; `npm run dev` hits live resources; still run `databricks apps validate`. Stop and surface if a needed plugin isn't wired.
Review fixes (P1–P3)
- Capability flags marked as concepts, not `--features` values
- `warehouse-mutations.md` leads with the simple inline pattern (generic optional); non-mutating smoke check; simplified lifecycle matrix; standardized `await createApp`
Supersedes #135 (its commits are included here).
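The Step-0 detection branch described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual logic: the phase labels, the `Plan` type, and `build_plan` are hypothetical names, and the position of the gates and smoke tests relative to Scaffold → Develop → Validate → Deploy in local mode is an assumption. Only the environment variable, the skipped phases, and the `--profile` omission come from this PR.

```python
import os
from typing import Mapping, NamedTuple, Tuple

# Phase labels are names chosen for this sketch, not identifiers from the skill.
# The local order follows Scaffold -> Develop -> Validate -> Deploy; where the
# gates and smoke tests sit relative to those phases is an assumption.
LOCAL_PHASES: Tuple[str, ...] = (
    "design+discovery gates",
    "provisioning gates",
    "scaffold",
    "develop",
    "validate",
    "deploy",
    "smoke tests",
)
# Phases the PR says to skip when the app is pre-scaffolded and provisioned.
AGENTIC_SKIPPED = frozenset({"provisioning gates", "scaffold", "deploy", "smoke tests"})


class Plan(NamedTuple):
    agentic: bool
    phases: Tuple[str, ...]
    profile_args: Tuple[str, ...]


def build_plan(env: Mapping[str, str], profile: str) -> Plan:
    """Pick the phases to run and the CLI auth arguments for this environment."""
    # Exact match on "true" mirrors the value quoted in the PR; treating any
    # other spelling as local mode is an assumption of this sketch.
    agentic = env.get("DATABRICKS_APPS_AGENTIC_MODE") == "true"
    if agentic:
        phases = tuple(p for p in LOCAL_PHASES if p not in AGENTIC_SKIPPED)
        profile_args: Tuple[str, ...] = ()  # ambient auth: omit --profile
    else:
        phases = LOCAL_PHASES
        profile_args = ("--profile", profile)
    return Plan(agentic, phases, profile_args)


if __name__ == "__main__":
    # "my-profile" is a placeholder profile name for local mode.
    print(build_plan(os.environ, profile="my-profile"))
```

Keeping the skip set as data rather than branching per phase makes the Local-vs-agentic delta a single place to edit, which matches the intent of having one canonical environments doc.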
Test plan
- `python3 scripts/skills.py generate && python3 scripts/skills.py validate`
- Verify `appkit.analytics.query()` supports DML on the shipped AppKit version before relying on `warehouse-mutations.md`
- `DATABRICKS_APPS_AGENTIC_MODE=true` → reads `appkit.plugins.json`, no scaffold/deploy, ambient auth