diff --git a/experimental/databricks-aibi-dashboards/SKILL.md b/experimental/databricks-aibi-dashboards/SKILL.md index 9145891..ad9bd44 100644 --- a/experimental/databricks-aibi-dashboards/SKILL.md +++ b/experimental/databricks-aibi-dashboards/SKILL.md @@ -3,7 +3,7 @@ name: databricks-aibi-dashboards description: "Create Databricks AI/BI dashboards. Must use when creating, updating, or deploying Lakeview dashboards as Databricks Dashboard have a unique json structure. CRITICAL: You MUST test ALL SQL queries via CLI BEFORE deploying. Follow guidelines strictly." compatibility: Requires databricks CLI (>= v1.0.0) metadata: - version: "0.1.0" + version: "0.2.0" parent: databricks-core --- @@ -30,17 +30,34 @@ A dashboard should be showing something relevant for a human, typically some KPI --- -## CRITICAL: Widget Version Requirements +## Widget Index (Version + Where Documented) > **Wrong version = broken widget!** This is the #1 cause of dashboard errors. -| Widget Type | Version | Notes | -|-------------|---------|-------| -| `counter` | **2** | KPI cards | -| `table` | **2** | Data tables | -| `bar`, `line`, `area`, `pie`, `scatter` | **3** | Charts | -| `combo`, `choropleth-map` | **1** | Advanced charts | -| `filter-*` | **2** | All filter types | +| Widget Type | Version | Documented in | +|-------------|---------|---------------| +| text (markdown, no spec block) | N/A | [1-widget-specifications.md#text-headersdescriptions](references/1-widget-specifications.md#text-headersdescriptions) | +| `counter` (KPI + sparkline + comparison) | **2** | [1-widget-specifications.md#counter-kpi](references/1-widget-specifications.md#counter-kpi) | +| `table` | **2** | [1-widget-specifications.md#table](references/1-widget-specifications.md#table) | +| `bar`, `line` | **3** | [1-widget-specifications.md#line--bar-charts](references/1-widget-specifications.md#line--bar-charts) | +| `pie` | **3** | 
[1-widget-specifications.md#pie-chart](references/1-widget-specifications.md#pie-chart) | +| `symbol-map` (lat/lon point map) | **2** | [1-widget-specifications.md#symbol-map-bubble-map](references/1-widget-specifications.md#symbol-map-bubble-map) | +| `area` | **3** | [2-advanced-widget-specifications.md#area-chart](references/2-advanced-widget-specifications.md#area-chart) | +| `scatter` | **3** | [2-advanced-widget-specifications.md#scatter-plot--bubble-chart](references/2-advanced-widget-specifications.md#scatter-plot--bubble-chart) | +| `combo` (bar+line, dual-axis) | **1** | [2-advanced-widget-specifications.md#combo-chart-bar--line](references/2-advanced-widget-specifications.md#combo-chart-bar--line) | +| `choropleth-map` (regions colored by value) | **1** | [2-advanced-widget-specifications.md#choropleth-map](references/2-advanced-widget-specifications.md#choropleth-map) | +| `forecast-line` (with `AI_FORECAST` SQL) | **1** | [2-advanced-widget-specifications.md#forecast-line-with-ai_forecast](references/2-advanced-widget-specifications.md#forecast-line-with-ai_forecast) | +| `pivot` (with conditional cell rules) | **3** | [2-advanced-widget-specifications.md#pivot](references/2-advanced-widget-specifications.md#pivot) | +| `histogram` (with `bin(col, binWidth=N)`) | **3** | [2-advanced-widget-specifications.md#histogram](references/2-advanced-widget-specifications.md#histogram) | +| `sankey` | **1** | [2-advanced-widget-specifications.md#sankey](references/2-advanced-widget-specifications.md#sankey) | +| `heatmap` | **3** | [2-advanced-widget-specifications.md#heatmap](references/2-advanced-widget-specifications.md#heatmap) | +| `funnel` | **1** | [2-advanced-widget-specifications.md#funnel](references/2-advanced-widget-specifications.md#funnel) | +| `box` | **1** | [2-advanced-widget-specifications.md#box](references/2-advanced-widget-specifications.md#box) | +| `waterfall` | **1** | 
[2-advanced-widget-specifications.md#waterfall](references/2-advanced-widget-specifications.md#waterfall) | +| `filter-single-select`, `filter-multi-select`, `filter-date-range-picker` | **2** | [3-filters.md#filter-widget-structure](references/3-filters.md#filter-widget-structure) | +| `range-slider` | **2** | [3-filters.md#range-slider-numeric-range-filter](references/3-filters.md#range-slider-numeric-range-filter) | + +> Cohort retention charts are built as a `pivot` with a color-scale cell style — there is no `cohort` widget type. See pivot in [2-advanced-widget-specifications.md](references/2-advanced-widget-specifications.md). --- @@ -101,7 +118,7 @@ If values don't match expectations, ensure the query is correct, fix the data if Before writing JSON, plan your dashboard: -1. You must know the expected specific JSON structure. For this, **Read reference files**: [references/1-widget-specifications.md](references/1-widget-specifications.md), [references/3-filters.md](references/3-filters.md), [references/4-examples.md](references/4-examples.md) +1. You must know the expected specific JSON structure. For this, **Read reference files**: [1-widget-specifications.md](references/1-widget-specifications.md), [3-filters.md](references/3-filters.md), [4-examples.md](references/4-examples.md) 2. Think: **What widgets?** Map each visualization to a dataset: | Widget | Type | Dataset | Has filter field? | @@ -196,29 +213,81 @@ Every dashboard's `serialized_dashboard` content must follow this exact structur ``` **Structural rules (violations cause "failed to parse serialized dashboard"):** -- `queryLines`: Array of strings, NOT `"query": "string"` +- `queryLines`: Array of strings, NOT `"query": "string"`. Elements are **joined verbatim** with no separator — end each line with `\n` (or strip `-- comments`). A line ending in `-- comment` with no newline swallows the next line. 
- Widgets: INLINE in `layout[].widget`, NOT a separate `"widgets"` array - `pageType`: Required on every page (`PAGE_TYPE_CANVAS` or `PAGE_TYPE_GLOBAL_FILTERS`) - Query binding: `query.fields[].name` must exactly match `encodings.*.fieldName` -### Linking a Genie Space (Optional) +### Theme & Color (always set this — it makes or breaks the dashboard) + +Top-level `uiSettings.theme` controls colors, fonts, and widget chrome across every widget on the dashboard. Without it, the dashboard inherits the workspace default and looks generic. **Set the full block on every dashboard you create** — a coherent palette is the single highest-impact polish item. -To add an "Ask Genie" button to the dashboard, or to link a genie space/room with an ID, add `uiSettings.genieSpace` to the JSON: +Mental model — **60/30/10 rule** mapped to theme keys: **60% neutral** = canvas/widget/border backgrounds (set `widgetBorderColor = widgetBackgroundColor` to hide borders); **30% secondary** = `fontColor` + `visualizationColors` (the content weight); **10% accent** = `selectionColor` for filters / tabs / active selections — pick something distinct from text and palette; a safe-blue around `#2272B4` matches the hyperlink convention and works as a default. ```json { "datasets": [...], "pages": [...], "uiSettings": { - "genieSpace": { - "isEnabled": true, - "overrideId": "your-genie-space-id-here", - "enablementMode": "ENABLED" + "theme": { + "canvasBackgroundColor": {"light": "#FCFCFC", "dark": "#1F272D"}, + "widgetBackgroundColor": {"light": "#FFFFFF", "dark": "#11171C"}, + "fontColor": {"light": "#11171C", "dark": "#E8ECF0"}, + "selectionColor": {"light": "#2272B4", "dark": "#8ACAFF"}, + "visualizationColors": [ + "#FFA600", "#FF7054", "#DE5582", "#995495", + "#4E5185", "#1D425C", "#99DDB4" + ], + "widgetHeaderAlignment": "LEFT" } } } ``` +**Theme keys** (mechanics): + +- `visualizationColors`: ordered palette every chart series and category mapping cycles through. 
**Positions are 0-indexed**: `position: 0` = first color (`#FFA600` above), `position: 6` = seventh (`#99DDB4`). Length 5–8 is typical. +- Background / font / selection colors take `light` + `dark` pairs; the dashboard auto-selects based on viewer mode. +- `widgetHeaderAlignment`: `"LEFT"` (default), `"CENTER"`, or `"RIGHT"`. +- Per-widget color references (where a widget takes a color object, such as table cell rules and annotation colors): `{"themeColorType": "visualizationColors", "position": N}` (0-indexed) to pin to a palette slot, or `{"hex": "#FF0000"}` for an exact color outside the palette. Chart `color.scale.mappings` are the exception and take a bare hex string (see rule 2 below). + +**Palette-design rules** (this is what separates a polished dashboard from a noisy one): + +1. **One coherent color family per dashboard, distinct across the suite.** Walk **across hues** (e.g., amber → coral → pink → purple → navy), not one color faded toward white — a single-hue lightness ramp reads as one color and the viewer can't tell categories apart. Adjacent stops must be visually distinct: if you squint and two blur into one, push them further apart. Single-hue ramps are for **quantitative** widgets only (`colorRamp.mode: "custom-sequential"`), never for `visualizationColors`. +2. **Pin semantic colors as literal hex, not as palette references.** "Bad" = a warm coral (e.g. `#FF7E5C`), "good" = a calm teal/green. Use `color.scale.mappings` with a bare hex string, e.g. `{"value": "Critical", "color": "#FF7E5C"}`, **not** `{"hex": "..."}` or `themeColorType: position` (both are silently dropped on chart widgets). For "good", copy the hex of the teal/green that is already in the palette so it never clashes. +3. **Color non-categorical widgets explicitly so they join the family.** Maps & heatmaps: `colorRamp.mode: "custom-sequential"` with `{start, end}` from the family (if directional: `start` = bad color, `end` = good color). Forecast / multi-series: pin per-series via `color.scale.mappings` keyed on `displayName` (actual = solid family color, forecast = contrast/alert, threshold = muted tone). Sparkline counters: set `value.color` to a family color, not grey. +4. 
**"Lighter / more pastel" tweak**: nudge all stops up in lightness *together*; don't recolor individual ones. Re-sync the pinned semantic hex values; keep enough contrast on the alert color that it still reads as a warning. + +**Starter palettes** (pick one and adapt — extend to 7-8 stops if needed; semantic red/green stay as literal hex per rule 2): + +``` +#094074 #3C6997 #5ADBFF #FFDD4A #FE9000 +#003F5C #594E90 #BC4C96 #FF5F66 #FFA600 +#4A8CC7 #F59770 #FFD84A #F0E09E #6DD980 +#440154 #3B528B #21918C #5EC962 #FDE725 +#4E79A7 #F28E2C #E15759 #76B7B2 #59A14F +#0072B2 #E69F00 #009E73 #CC79A7 #D55E00 +#0D0887 #7E03A8 #CC4778 #F89441 #F0F921 +#6929C4 #1192E8 #005D5D #9F1853 #FA4D56 +``` + +~4-5% of viewers have color blindness (mostly red/green). Rows 4 and 6 above (viridis, Okabe-Ito) are CB-safe by design; verify customized palettes via simulator (Adobe Color, `colorbrewer2.org`). Don't put red and green adjacent, and rely on lightness contrast — not hue alone — between adjacent stops. + +### Linking a Genie Space (Optional) + +To add an "Ask Genie" button to the dashboard, or to link a genie space/room with an ID, add `uiSettings.genieSpace` to the JSON (alongside `theme` if you have one): + +```json +"uiSettings": { + "theme": { /* ... */ }, + "genieSpace": { + "isEnabled": true, + "overrideId": "your-genie-space-id-here", + "enablementMode": "ENABLED" + } +} +``` + > **Genie is NOT a widget.** Link via `uiSettings.genieSpace` only. There is no `"widgetType": "assistant"`. --- @@ -233,13 +302,15 @@ Apply unless user specifies otherwise: ## Reference Files +> **Before generating any dashboard JSON, read [4-examples.md](references/4-examples.md) first.** It's a complete reference dashboard exercising every construct (dataset measures + `MEASURE()`, sparkline counters, forecast-line with annotations, pivot with conditional cells, symbol-map, histogram, range-slider filter, theme). 
Use it to learn the JSON shape; then adapt to the user's data and demo story — keep the structure, swap the tables, metrics, palette, and narrative for the case you're building. + | What are you building? | Reference | |------------------------|-----------| -| Any widget (text, counter, table, chart) | [references/1-widget-specifications.md](references/1-widget-specifications.md) | -| Advanced charts (area, scatter/Bubble, combo (Line+Bar), Choropleth map) | [references/2-advanced-widget-specifications.md](references/2-advanced-widget-specifications.md) | -| Dashboard with filters (global or page-level) | [references/3-filters.md](references/3-filters.md) | -| Need a complete working template to adapt | [references/4-examples.md](references/4-examples.md) | -| Debugging a broken dashboard | [references/5-troubleshooting.md](references/5-troubleshooting.md) | +| **Start here** — full working dashboard template | [4-examples.md](references/4-examples.md) | +| Any widget (text, counter, table, chart) | [1-widget-specifications.md](references/1-widget-specifications.md) | +| Advanced charts (area, scatter/Bubble, combo (Line+Bar), Choropleth map) | [2-advanced-widget-specifications.md](references/2-advanced-widget-specifications.md) | +| Dashboard with filters (global or page-level) | [3-filters.md](references/3-filters.md) | +| Debugging a broken dashboard | [5-troubleshooting.md](references/5-troubleshooting.md) | --- @@ -247,8 +318,11 @@ Apply unless user specifies otherwise: ### 1) DATASET ARCHITECTURE -- **One dataset per domain** (e.g., orders, customers, products). Datasets shared across widgets benefit from the same filters. 
-- **Exactly ONE valid SQL query per dataset** (no multiple queries separated by `;`) +- **Fewer datasets is better — aim for one dataset that backs as many widgets as possible.** Clicking a value on a chart (e.g., a bar, a slice) acts as a filter on **that dataset**, and every other widget sharing the same dataset re-renders with the click applied. Splitting widgets across many narrow datasets breaks this cross-filtering and forces users to set explicit filter widgets for what should "just work". Prefer one wide dataset per domain (orders, cases, customers); only split when a widget genuinely needs different grain, pre-aggregation, or a parameter the others can't tolerate. +- **Two ways to define a dataset**: + - **SQL query**: `{"name": "ds_x", "displayName": "...", "queryLines": ["SELECT ...", "FROM table"]}` — full control, can include `WITH` / `JOIN` / `AI_FORECAST` / etc. + - **UC asset shorthand**: `{"name": "ds_x", "displayName": "...", "asset_name": "catalog.schema.table_or_view"}` — no SQL needed. Works for regular tables, views, and metric views. +- **Exactly ONE valid SQL query per dataset** when using `queryLines` (no multiple queries separated by `;`) - **Queries must use bare table names only** — no catalog, no schema prefix. Example: `FROM orders`, never `FROM gold.orders` or `FROM main.gold.orders`. The catalog and schema come from the `--dataset-catalog` and `--dataset-schema` flags at creation time. These flags only fill in missing parts — they do NOT override any catalog/schema written in the query. 
- SELECT must include all dimensions needed by widgets and all derived columns via `AS` aliases - Put ALL business logic (CASE/WHEN, COALESCE, ratios) into the dataset SELECT with explicit aliases @@ -258,6 +332,39 @@ Apply unless user specifies otherwise: - Rankings/Top-N: `ORDER BY metric DESC LIMIT 10` for "Top 10" charts - Categorical charts: `ORDER BY metric DESC` to show largest values first +#### Dataset-level measures + `MEASURE()` + +Widget expressions are usually inline aggregations (`{"name": "sum(x)", "expression": "SUM(\`x\`)"}`). But you can also **declare reusable measures on the dataset itself** and reference them by name — every widget that consumes the dataset can use the same metric without redefining it. + +Two ways to define measures: + +1. **Dashboard-level `columns`** (works on any dataset — SQL query or `asset_name`): + ```json + { + "name": "ds_support", + "queryLines": ["SELECT * FROM support_cases"], + "columns": [ + {"displayName": "Total Cases", "description": "Count of cases", + "expression": "COUNT(`case_id`)"}, + {"displayName": "Reopen Rate %", "description": "% of reopened cases", + "expression": "SUM(CASE WHEN `reopened_flag` THEN 1 ELSE 0 END) * 100.0 / COUNT(`case_id`)"}, + {"displayName": "Priority Level", "description": "Sorted priority label", + "expression": "CASE WHEN `priority`='Critical' THEN '1-Critical' ELSE '4-Low' END"} + ] + } + ``` + +2. **Metric-view source** — if the dataset's `asset_name` (or `FROM` clause) is a UC metric view, its YAML-defined measures are already queryable. **Do not redeclare them** in `columns`. See [databricks-metric-views](../databricks-metric-views/SKILL.md). 
+ +Either way, widgets reference the measure by name: + +```json +"fields": [{"name": "measure(Total Cases)", "expression": "MEASURE(`Total Cases`)"}], +"encodings": {"value": {"fieldName": "measure(Total Cases)", "displayName": "Total Cases"}} +``` + +`MEASURE(\`...\`)` works in counter, table, bar, line, pie, pivot — any widget that takes a field expression. Mix it with inline aggregations freely. + ### 2) WIDGET FIELD EXPRESSIONS > **CRITICAL: Field Name Matching Rule** @@ -314,7 +421,9 @@ If you need conditional logic or multi-field formulas, compute a derived column Each widget has a position: `{"x": 0, "y": 0, "width": 4, "height": 4}` -**CRITICAL**: Each row must fill width=12 exactly. No gaps allowed. +**Pick the subdivision based on the audience.** The 12-column grid divides cleanly into 3, 4, or 6 columns: a 3-column layout (each widget `width: 4`) reduces cognitive load and fits an executive overview; a 4-column (`width: 3`) is the all-rounder; a 6-column (`width: 2`) packs the most density for technical / operations dashboards where the reader is hunting through many metrics at once. + +**Default rule**: each row should fill width=12 exactly — no gaps. Once you're confident with the grid, you can stagger heights across columns (a tall widget on the left paired with several shorter ones on the right) so the two halves don't share row boundaries — see [4-examples.md](references/4-examples.md#layout-12-col-grid) for the pattern. Start with strict rows; relax only when the stagger reads better visually. 
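A rough way to check the default rule above before deploying is to group widget positions by their `y` value and confirm each row's widths sum to 12. This is a minimal sketch that assumes strict (non-staggered) rows and positions shaped like `{"x", "y", "width", "height"}`; the helper name `rows_fill_grid` is hypothetical and not part of any Databricks tooling.

```python
from collections import defaultdict

GRID_WIDTH = 12  # the dashboard grid is 12 columns wide


def rows_fill_grid(positions):
    """Return the y values of rows whose widths do not sum to GRID_WIDTH.

    Assumes strict rows: widgets in one row share the same y. Staggered
    layouts (widgets spanning several rows) would need a cell-level check.
    """
    width_by_row = defaultdict(int)
    for pos in positions:
        width_by_row[pos["y"]] += pos["width"]
    return sorted(y for y, total in width_by_row.items() if total != GRID_WIDTH)


# Three width-4 widgets fill one row of the 12-column grid.
good = [
    {"x": 0, "y": 0, "width": 4, "height": 3},
    {"x": 4, "y": 0, "width": 4, "height": 3},
    {"x": 8, "y": 0, "width": 4, "height": 3},
]
# Two width-4 widgets leave a gap at y=0; a lone width-6 widget leaves one at y=3.
bad = good[:2] + [{"x": 0, "y": 3, "width": 6, "height": 4}]

print(rows_fill_grid(good))  # []
print(rows_fill_grid(bad))   # [0, 3]
```

An empty result means every row is full. Once heights are staggered across columns, this row-based check no longer applies and will report false gaps.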
``` CORRECT: WRONG: @@ -396,5 +505,5 @@ If the range is very small relative to the scale (e.g., 83-89% on a 0-100 scale) ## Related Skills - **[databricks-unity-catalog](../databricks-unity-catalog/SKILL.md)** - for querying the underlying data and system tables -- **[databricks-spark-declarative-pipelines](../databricks-spark-declarative-pipelines/SKILL.md)** - for building the data pipelines that feed dashboards +- **[databricks-pipelines](../databricks-pipelines/SKILL.md)** - for building the data pipelines that feed dashboards - **[databricks-jobs](../databricks-jobs/SKILL.md)** - for scheduling dashboard data refreshes diff --git a/experimental/databricks-aibi-dashboards/references/1-widget-specifications.md b/experimental/databricks-aibi-dashboards/references/1-widget-specifications.md index d7851f0..acd62b3 100644 --- a/experimental/databricks-aibi-dashboards/references/1-widget-specifications.md +++ b/experimental/databricks-aibi-dashboards/references/1-widget-specifications.md @@ -7,6 +7,7 @@ Core widget types for AI/BI dashboards. For advanced visualizations (area, scatt - `widget.name`: alphanumeric + hyphens + underscores ONLY (max 60 characters) - `frame.title`: human-readable title (any characters allowed) - `frame.showTitle`: always set to `true` so users understand the widget +- `frame.description` + `frame.showDescription: true`: optional subtext under the title (e.g., `"All-time; 0% before the 2025-06 launch"`) — useful for giving a KPI number context without cluttering the chart itself - `displayName`: use in encodings to label axes/values clearly (e.g., "Revenue ($)", "Growth Rate (%)") - `widget.queries[].name`: use `"main_query"` for chart/counter/table widgets. Filter widgets with multiple queries can use descriptive names (see [3-filters.md](3-filters.md)) @@ -22,6 +23,7 @@ Core widget types for AI/BI dashboards. 
For advanced visualizations (area, scatt | bar | 3 | this file | | line | 3 | this file | | pie | 3 | this file | +| symbol-map | 2 | this file | | area | 3 | [2-advanced-widget-specifications.md](2-advanced-widget-specifications.md) | | scatter | 3 | [2-advanced-widget-specifications.md](2-advanced-widget-specifications.md) | | combo | 1 | [2-advanced-widget-specifications.md](2-advanced-widget-specifications.md) | @@ -74,83 +76,167 @@ Core widget types for AI/BI dashboards. For advanced visualizations (area, scatt - `widgetType`: "counter" - Percent values must be 0-1 in the data (not 0-100) -### Number Formatting +**Two strongly-recommended defaults:** -```json -"encodings": { - "value": { - "fieldName": "revenue", - "displayName": "Total Revenue", - "format": { - "type": "number-currency", - "currencyCode": "USD", - "abbreviation": "compact", - "decimalPlaces": {"type": "max", "places": 2} - } - } -} -``` - -Format types: `number`, `number-currency`, `number-percent` +1. **Add a sparkline** (`period` encoding) when the dataset has a temporal column. A bare number is context-free; the small trend line behind the value tells the user "rising / falling / flat" at a glance. Skip it only if the KPI is truly time-invariant (snapshot count with no historical column available). + > For the sparkline to render, the dataset query must **keep the temporal dimension** — i.e., return one row per period (`GROUP BY DATE_TRUNC(...)`), not a single fully-aggregated row. The counter's `value` then re-aggregates those rows; the `period` field drives the line behind it. +2. **Set a `format`** when the value has a unit — dollars, percent, large counts. "Revenue: 1287394.55" without `number-currency` formatting reads as noise. The only counters where `format` is fine to omit are unit-less small counts (e.g., "Open Tickets: 12") where the raw integer is already legible. See "Value formatting" below. 
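The two defaults above can be linted before deploying. The sketch below takes the inner `widget` object and assumes the shape used throughout this file (`queries[0].query.fields` and `spec.encodings`); `check_counter` is a hypothetical helper, not a Databricks API, and it only checks structure, not whether the dataset really returns one row per period.

```python
def check_counter(widget):
    """Return a list of human-readable problems found in a counter widget dict."""
    problems = []
    fields = {f["name"] for f in widget["queries"][0]["query"]["fields"]}
    encodings = widget["spec"]["encodings"]

    # Every encoding must point at a field declared in query.fields.
    for key, enc in encodings.items():
        if enc.get("fieldName") not in fields:
            problems.append(f"{key}: fieldName {enc.get('fieldName')!r} not in query.fields")

    # The two recommended defaults: a sparkline and a value format.
    if "period" not in encodings:
        problems.append("no period encoding (no sparkline)")
    if "format" not in encodings.get("value", {}):
        problems.append("value has no format")
    return problems


ok = {
    "queries": [{"name": "main_query", "query": {
        "datasetName": "spend_ds",
        "fields": [
            {"name": "sum(spend)", "expression": "SUM(`spend`)"},
            {"name": "weekly(spend_at)", "expression": 'DATE_TRUNC("WEEK", `spend_at`)'},
        ],
        "disaggregated": False,
    }}],
    "spec": {"version": 2, "widgetType": "counter", "encodings": {
        "value": {"fieldName": "sum(spend)", "displayName": "Total Spend",
                  "format": {"type": "number-currency", "currencyCode": "USD"}},
        "period": {"fieldName": "weekly(spend_at)"},
    }},
}
bare = {
    "queries": [{"name": "main_query", "query": {
        "datasetName": "summary_ds",
        "fields": [{"name": "revenue", "expression": "`revenue`"}],
        "disaggregated": True,
    }}],
    "spec": {"version": 2, "widgetType": "counter", "encodings": {
        "value": {"fieldName": "total", "displayName": "Total Revenue"},
    }},
}

print(check_counter(ok))  # []
for problem in check_counter(bare):
    print(problem)
```

The missing-sparkline and missing-format findings are recommendations, not errors: a time-invariant KPI or a small unit-less count can legitimately ignore them.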
### Counter Patterns -**Pre-aggregated dataset (1 row)** - use `disaggregated: true`: +**Multi-row dataset with aggregation — the recommended default (supports filters + sparkline)** — use `disaggregated: false`: +- Dataset query returns one row per period (`GROUP BY DATE_TRUNC(...)`) — keeps the temporal dimension so the counter can both re-aggregate to the headline value AND render the sparkline. +- **CRITICAL**: Field `name` MUST match `fieldName` exactly (e.g., `"sum(spend)"`). +- Include the `period` field in `query.fields` AND the `period` encoding in the spec. + ```json { "widget": { - "name": "total-revenue", + "name": "weekly-spend", "queries": [{ "name": "main_query", "query": { - "datasetName": "summary_ds", - "fields": [{"name": "revenue", "expression": "`revenue`"}], - "disaggregated": true + "datasetName": "spend_ds", + "fields": [ + {"name": "sum(spend)", "expression": "SUM(`spend`)"}, + {"name": "weekly(spend_at)", "expression": "DATE_TRUNC(\"WEEK\", `spend_at`)"} + ], + "disaggregated": false } }], "spec": { "version": 2, "widgetType": "counter", "encodings": { - "value": {"fieldName": "revenue", "displayName": "Total Revenue"} + "value": { + "fieldName": "sum(spend)", + "displayName": "Total Spend", + "format": {"type": "number-currency", "currencyCode": "USD", + "abbreviation": "compact", "decimalPlaces": {"type": "max", "places": 2}} + }, + "period": {"fieldName": "weekly(spend_at)"} }, - "frame": {"showTitle": true, "title": "Total Revenue"} + "frame": {"showTitle": true, "title": "Total Spend"} } }, "position": {"x": 0, "y": 0, "width": 4, "height": 3} } ``` -**Multi-row dataset with aggregation (supports filters)** - use `disaggregated: false`: -- Dataset returns multiple rows (e.g., grouped by a filter dimension) -- Use `"disaggregated": false` and aggregation expression -- **CRITICAL**: Field `name` MUST match `fieldName` exactly (e.g., `"sum(spend)"`) +Dataset SQL for the example above: + +```sql +-- One row per week — the counter 
re-aggregates rows into the headline value +-- AND uses the temporal column to draw the sparkline. +SELECT DATE_TRUNC('WEEK', spend_at) AS spend_at, SUM(spend) AS spend +FROM spend_table +GROUP BY 1 +``` + +In this example the headline number is the **total spend across the trend window** (the counter's `SUM(spend)` re-aggregates the weekly rows), and the sparkline shows the **per-week values** that make up that total. If you instead want the headline to be the **latest week's spend** (not the cumulative total), expose it as its own column in the dataset SQL (e.g., `MAX_BY(spend, spend_at) AS latest_weekly_spend`) and point `value.fieldName` at that column, while keeping the period rows for the sparkline. + +> **`MEASURE()` works here too.** If the dataset defines measures via `dataset.columns[]` or is sourced from a metric view, use `{"expression": "MEASURE(\`Total Cases\`)"}` as the field expression — same pattern, no duplication. See SKILL.md "Dataset-level measures + MEASURE()". + +**Pre-aggregated dataset (1 row, no sparkline)** — use `disaggregated: true`. 
Fallback shape when the metric is truly time-invariant or the data is already collapsed and no temporal column is available: ```json { "widget": { - "name": "total-spend", + "name": "total-revenue", "queries": [{ "name": "main_query", "query": { - "datasetName": "by_category", - "fields": [{"name": "sum(spend)", "expression": "SUM(`spend`)"}], - "disaggregated": false + "datasetName": "summary_ds", + "fields": [{"name": "revenue", "expression": "`revenue`"}], + "disaggregated": true } }], "spec": { "version": 2, "widgetType": "counter", "encodings": { - "value": {"fieldName": "sum(spend)", "displayName": "Total Spend"} + "value": { + "fieldName": "revenue", + "displayName": "Total Revenue", + "format": {"type": "number-currency", "currencyCode": "USD", + "abbreviation": "compact", "decimalPlaces": {"type": "max", "places": 2}} + } }, - "frame": {"showTitle": true, "title": "Total Spend"} + "frame": {"showTitle": true, "title": "Total Revenue"} } }, "position": {"x": 0, "y": 0, "width": 4, "height": 3} } ``` +### Sparkline (period encoding) + +The `period` field must be a temporal expression also present in `query.fields` — typically a `DATE_TRUNC(...)` over the dataset's timestamp column. Granularity choices: + +| Use | Why | +|---|---| +| `DATE_TRUNC("DAY", col)` | Short window (1-4 weeks), high-frequency metric | +| `DATE_TRUNC("WEEK", col)` | Standard default for ops metrics over a quarter | +| `DATE_TRUNC("MONTH", col)` | Long window (>1 year) or low-volume metric | + +Match the sparkline grain to whatever the surrounding charts use — consistent grain across the page makes the dashboard easier to read. + +### Value formatting + +Format types: `number`, `number-plain`, `number-currency`, `number-percent`. + +| Field type | Format | Why | +|---|---|---| +| Money | `number-currency` + `currencyCode: "USD"` (or `EUR` etc.) 
+ `abbreviation: "compact"` | "$1.2M" is readable, "1287394.55" isn't | +| Percentage | `number-percent` (data must be 0-1) | Renders "12.5%" from 0.125 | +| Large count | `number` + `abbreviation: "compact"` | Renders "1.5K" / "2.3M" | +| Small count (under ~1K) | `number` (no abbreviation) or omit `format` | Raw integer is fine | +| Value with custom unit (e.g., "8 hrs", "2 weeks") | `number-plain` + `formatTemplate: "{{ @formatted }} hrs"` | Append a unit cleanly without baking it into the dataset | + +Optional `format.suffix` (e.g., `"suffix": "h"`) appends a short unit directly after the number without a template. It is simpler than `formatTemplate` when you just need a single-character unit. + +> **Counters backed by `MEASURE()`**: do not set `format.type` to plain `"number"`; that combination triggers an "automatically fixed" warning on the rendered widget. Use `number-plain`, `number-currency`, `number-percent`, or no format at all. + +```json +"value": { + "fieldName": "revenue", + "displayName": "Total Revenue", + "format": { + "type": "number-currency", + "currencyCode": "USD", + "abbreviation": "compact", + "decimalPlaces": {"type": "max", "places": 2} + } +} +``` + +### Counter comparison (delta vs previous period) + +Show the current value AND the change vs a previous period. Use a second field in `query.fields` whose expression filters/aggregates the comparison value, and reference it via the `target` encoding: + +```json +"fields": [ + {"name": "current", "expression": "SUM(CASE WHEN week=:this_week THEN amount END)"}, + {"name": "previous", "expression": "SUM(CASE WHEN week=:last_week THEN amount END)"} +], +"encodings": { + "value": {"fieldName": "current", "displayName": "This Week"}, + "target": {"fieldName": "previous", "displayName": "vs Last Week"} +} +``` + +### Counter format template (custom prefix/suffix text) + +Wrap the value with surrounding text. Use `{{@}}` for the raw value and `{{@formatted}}` for the formatted one. 
Reference other dataset fields with `{{FieldName}}`. + +```json +"value": { + "fieldName": "sum(revenue)", + "format": {"type": "number-currency", "currencyCode": "USD", "abbreviation": "compact"}, + "formatTemplate": "{{@formatted}} (in {{Region}})" +} +``` + --- ## Table @@ -192,6 +278,40 @@ Format types: `number`, `number-currency`, `number-percent` } ``` +### Column-level options + +Each column object supports format, conditional styling, links, and tooltips. Common patterns (the `//` comments are explanatory only; remove them before use, since JSON does not allow comments): + +```json +{ + "fieldName": "amount", + "displayName": "Amount", + "format": {"type": "number-currency", "currencyCode": "USD", + "abbreviation": "compact", "decimalPlaces": {"type": "max", "places": 2}}, + + // Conditional background color (heat-map style) + "style": { + "type": "basic", + "rules": [ + {"condition": {"operand": {"type": "data-value", "value": "10000"}, "operator": ">="}, + "backgroundColor": {"themeColorType": "visualizationColors", "position": 0}}, + {"condition": {"operand": {"type": "data-value", "value": "5000"}, "operator": ">="}, + "backgroundColor": {"themeColorType": "visualizationColors", "position": 6}} + ] + }, + + // Make the cell a clickable link. {{@}} is the cell value, {{Field}} pulls another column. + "link": {"templatedURL": "/sql/dashboardsv3/{{@}}"}, + + // Hover tooltip + "tooltip": {"templatedText": "Customer ID: {{customer_id}}"} +} +``` + +Other display types: `"image"` (renders base64 strings as images), `"html"` (sanitized HTML), `"json"` (collapsible JSON tree), `"color-scale"` (continuous color gradient on numeric values without explicit thresholds). + +> Same `style.rules` and `link`/`tooltip` patterns work on **pivot** cells — see pivot in [2-advanced-widget-specifications.md](2-advanced-widget-specifications.md). 
+ --- ## Line / Bar Charts @@ -202,6 +322,10 @@ Format types: `number`, `number-currency`, `number-percent` - `scale.type`: `"temporal"` (dates), `"quantitative"` (numbers), `"categorical"` (strings) - Use `"disaggregated": true` with pre-aggregated dataset data +> **Two recommended defaults for time-series charts:** +> - **Mark meaningful events with an annotation.** A single `vertical-line` for a product launch, incident, holiday, or campaign turns a generic trend into a readable story. See [Annotations](#annotations-event-markers) below. +> - **For trend lines on time-series data, consider `forecast-line` with `AI_FORECAST`** instead of a plain `line`. Projects future values + confidence bands and makes a dashboard noticeably more compelling for demos. See [forecast-line in 2-advanced-widget-specifications.md](2-advanced-widget-specifications.md#forecast-line-with-ai_forecast). + **Multiple series - two approaches:** 1. **Multi-Y Fields** (different metrics): @@ -227,6 +351,7 @@ Format types: `number`, `number-currency`, `number-percent` |------|---------------| | Stacked (default) | No `mark` field | | Grouped | `"mark": {"layout": "group"}` | +| 100% stacked | `"mark": {"layout": "stack-100"}` | ### Horizontal Bar Chart @@ -238,10 +363,66 @@ Swap `x` and `y` - put quantitative on `x`, categorical/temporal on `y`: } ``` -### Color Scale +### Categorical sort with a custom order + +When the dimension has natural ordering that ASC/DESC won't capture (priority levels, weekdays, named tiers), pin the order explicitly: + +```json +"x": { + "fieldName": "channel", + "scale": { + "type": "categorical", + "sort": {"by": "custom-order", "orderedValues": ["Chat", "Email", "In-App", "Phone"]} + } +} +``` + +Other `sort.by` values: `"alphabetical"`, `"value"` (sort by the y measure), `"cell"` / `"cell-reversed"` (pivot only). + +### Color Scale + per-value mappings + +Default behaviour: theme colors are assigned to categories in order. 
To pin specific values (e.g., "Critical" must always be red), use `mappings`: + +```json +"color": { + "fieldName": "Priority Level", + "scale": { + "type": "categorical", + "mappings": [ + {"value": "1-Critical", "color": "#FF7E5C"}, + {"value": "4-Low", "color": "#99DDB4"} + ] + } +} +``` + +Inside `mappings[].color`, use a **bare hex string** (`"#FF0000"`) — that's the form chart widgets honor. Palette-position references (`themeColorType` / `position`) and the wrapped `{"hex": "..."}` object form are silently dropped on `mappings[].color`, so semantic pins must always be bare hex. + +> For continuous color ramps on quantitative encodings, use `colorRamp` — see Symbol Map below, or [Heatmap](2-advanced-widget-specifications.md#heatmap) and [Choropleth Map](2-advanced-widget-specifications.md#choropleth-map) in advanced specs. -> **CRITICAL**: For bar/line/pie, color scale ONLY supports `type` and `sort`. -> Do NOT use `scheme`, `colorRamp`, or `mappings` (only for choropleth-map). +### Annotations (event markers) + +Mark an event on a time-series chart — release, holiday, incident — with a vertical line. Works on `line`, `area`, `bar`, `combo`, and `forecast-line`. + +```json +"spec": { + "version": 3, + "widgetType": "line", + "encodings": { /* ... x, y, color ... */ }, + "annotations": [ + { + "type": "vertical-line", + "encodings": { + "x": {"dataValue": "2024-11-28T12:00:00.000", "dataType": "DATETIME"}, + "label": {"value": "Thanksgiving"}, + "color": {"value": {"themeColorType": "visualizationColors", "position": 3}} + } + } + ] +} +``` + +Multiple annotations are allowed — add more objects to the array. For non-datetime axes, use `"dataType": "STRING"` or `"NUMBER"` and set `dataValue` accordingly. 
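When several events need markers, the annotation objects can be generated rather than hand-written. The sketch below builds the same `vertical-line` shape shown above; the helper name `vertical_line_annotation` and the second event are assumptions made for illustration, and only the keys that appear in the documented example are emitted.

```python
from datetime import datetime


def vertical_line_annotation(when: datetime, label: str, palette_position: int = 3) -> dict:
    """Build one vertical-line annotation for a temporal x-axis."""
    return {
        "type": "vertical-line",
        "encodings": {
            # Millisecond precision matches the "2024-11-28T12:00:00.000" form used above.
            "x": {"dataValue": when.isoformat(timespec="milliseconds"), "dataType": "DATETIME"},
            "label": {"value": label},
            "color": {"value": {"themeColorType": "visualizationColors", "position": palette_position}},
        },
    }


annotations = [
    vertical_line_annotation(datetime(2024, 11, 28, 12), "Thanksgiving"),
    vertical_line_annotation(datetime(2024, 12, 25, 12), "Christmas"),  # hypothetical second event
]
```

The resulting list can be dropped into the `"annotations"` array of the widget spec when the dashboard JSON is assembled.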
--- @@ -249,9 +430,9 @@ Swap `x` and `y` - put quantitative on `x`, categorical/temporal on `y`: - `version`: **3** - `widgetType`: "pie" -- `angle`: quantitative field -- `color`: categorical dimension -- **Limit to 3-8 categories for readability** +- **`angle` is REQUIRED** — quantitative field (the slice size). Omitting it renders all slices the same size, which is meaningless: the pie no longer encodes any quantity. +- **`color` is REQUIRED** — categorical dimension (the slice grouping). +- **Limit to 3-8 categories for readability.** ```json "spec": { @@ -259,13 +440,55 @@ Swap `x` and `y` - put quantitative on `x`, categorical/temporal on `y`: "widgetType": "pie", "encodings": { "angle": {"fieldName": "revenue", "scale": {"type": "quantitative"}}, - "color": {"fieldName": "category", "scale": {"type": "categorical"}} + "color": {"fieldName": "category", "scale": {"type": "categorical"}}, + "label": {"show": true} } } ``` --- +## Symbol Map (bubble map) + +Lat/lon scatter plot on a map. Use for **point data** (customer locations, sensor readings); use `choropleth-map` for **regions** (countries, states) colored by aggregate. + +> **Strongly preferred whenever the data has a geographic dimension.** A bubble map is one of the highest-signal visuals in a dashboard — "where is the action" reads at a glance and grabs attention better than a bar chart of the same data. If the dataset has lat/lon (or a country/state column → `choropleth-map`), include a map widget. + +- `version`: **2** +- `widgetType`: "symbol-map" +- Dataset must include latitude and longitude columns (or a `GEOMETRY`/`GEOGRAPHY` column). 
+ +```json +"spec": { + "version": 2, + "widgetType": "symbol-map", + "encodings": { + "coordinates": { + "latitude": {"fieldName": "customer_latitude"}, + "longitude": {"fieldName": "customer_longitude"} + }, + "color": { + "fieldName": "sum(satisfaction_score)", + "scale": {"type": "quantitative", + "colorRamp": {"mode": "custom-sequential", "colors": {"start": "#FF7E5C", "end": "#99DDB4"}}}, + "legend": {"hide": true} + }, + "size": {"fieldName": "count(*)", "scale": {"type": "quantitative"}} + }, + "mark": {"opacity": 0.7}, + "frame": {"showTitle": true, "title": "Customer Locations"} +} +``` + +**`colorRamp` modes:** + +- `{"mode": "custom-sequential", "colors": {"start": "#FF7E5C", "end": "#99DDB4"}}` — your own gradient between two hex stops. **Prefer this for themed dashboards** so the map ties into the palette; if directional, `start` = bad color, `end` = good color. +- `{"mode": "scheme", "scheme": "<name>"}` — prebuilt ramps. Known names: `magma`, `viridis`, `plasma`, `inferno`, `YlGnBu`, `RdYlBu`, `blues`, `redyellowgreen`. Avoid `redyellowgreen` — clashes with most modern themes. + +For categorical color (e.g., colored by region), use `scale.type: "categorical"` with the same `mappings` syntax as bar charts. `mark.opacity` (0–1) controls point transparency — useful when many points cluster. + +--- + ## Axis Formatting Add `format` to any encoding to display values appropriately: @@ -291,7 +514,7 @@ Add `format` to any encoding to display values appropriately: **Options:** - `abbreviation`: `"compact"` (K/M/B) or omit for full numbers -- `decimalPlaces`: `{"type": "max", "places": N}` or `{"type": "fixed", "places": N}` +- `decimalPlaces`: `{"type": "max", "places": N}` for "up to N decimals, trailing zeros suppressed" ($1.2M / $1.25M — casual headline), or `{"type": "exact", "places": N}` for "always exactly N decimals" ($1.20M — polished/financial) --- @@ -313,7 +536,7 @@ Use `:param` syntax in SQL for dynamic filtering.
Parameters can be bound to fil **Parameter types:** - Single value: `"dataType": "INTEGER"` / `"DECIMAL"` / `"STRING"` -- Multi-select: Add `"complexType": "MULTI"` +- Multi-select: `"complexType": "MULTI"` — binds as a SQL `ARRAY`, filter with `array_contains(:p, col)`, not `col IN (:p)`. Full pattern in [3-filters.md](3-filters.md#multi-select-parameters-multi). - Range: `"dataType": "DATE", "complexType": "RANGE"` - use `:param.min` / `:param.max` --- diff --git a/experimental/databricks-aibi-dashboards/references/2-advanced-widget-specifications.md b/experimental/databricks-aibi-dashboards/references/2-advanced-widget-specifications.md index 140f618..d62b921 100644 --- a/experimental/databricks-aibi-dashboards/references/2-advanced-widget-specifications.md +++ b/experimental/databricks-aibi-dashboards/references/2-advanced-widget-specifications.md @@ -10,6 +10,8 @@ Advanced visualization types for AI/BI dashboards. For core widgets (text, count - `widgetType`: "area" - Same structure as line chart - useful for showing cumulative values or emphasizing volume +> Time-series area charts benefit from a `vertical-line` annotation marking a meaningful event (launch, incident, holiday) — turns a generic trend into a readable story. See [Annotations in 1-widget-specifications.md](1-widget-specifications.md#annotations-event-markers). + ```json "spec": { "version": 3, @@ -61,6 +63,8 @@ Combines bar and line visualizations on the same chart - useful for showing rela - `y.primary`: bar chart fields - `y.secondary`: line chart fields +> Mark meaningful events with a `vertical-line` annotation when the x-axis is temporal. See [Annotations in 1-widget-specifications.md](1-widget-specifications.md#annotations-event-markers). + ```json { "widget": { @@ -159,19 +163,326 @@ Displays geographic regions colored by aggregate values. 
Requires a field with g --- -## Other Visualization Types +## Forecast Line (with `AI_FORECAST`) + +Overlays a model prediction on top of historical data — historical line continues into a future band with upper/lower confidence bounds. + +- `version`: **1** +- `widgetType`: "forecast-line" +- The dataset SQL produces the original series **plus** three forecast columns: a point forecast, upper band, lower band. Databricks SQL's built-in `AI_FORECAST` table-valued function generates them. + +### Dataset SQL pattern + +> **Always exclude the current (in-progress) bucket from the historical series.** If you aggregate weekly and today is Tuesday, the current week's bucket is only 2 days of data — the line drops off a cliff right before the forecast starts. Filter with `WHERE bucket_start < DATE_TRUNC('<grain>', current_date())` using the **same grain as the aggregation**. + +```sql +WITH actuals AS ( + SELECT DATE_TRUNC('WEEK', opened_at) AS opened_at, COUNT(*) AS count + FROM support_cases + -- Drop the partial-elapsed bucket. Grain MUST match the DATE_TRUNC above — + -- weekly aggregation → exclude current week; monthly → exclude current month. + WHERE DATE_TRUNC('WEEK', opened_at) < DATE_TRUNC('WEEK', current_date()) + GROUP BY 1 +), +dates AS ( + SELECT MAX(opened_at) AS max_d, MIN(opened_at) AS min_d FROM actuals +), +forecast AS ( + SELECT opened_at, count_forecast, count_upper, count_lower, CAST(NULL AS BIGINT) AS count + FROM AI_FORECAST( + TABLE(actuals), + horizon => (SELECT max_d + MAKE_DT_INTERVAL( + CAST(FLOOR(DATEDIFF(max_d, min_d) * 0.5) AS INT), 0, 0, 0) FROM dates), + time_col => 'opened_at', + value_col => 'count' + ) +), +bridge AS ( + -- One-row "seam" that carries the last actual value into the forecast columns + -- so the historical line and the forecast band visually connect instead of breaking + -- with a gap at the boundary.
+ SELECT a.opened_at, + a.count AS count_forecast, + a.count AS count_upper, + a.count AS count_lower, + a.count + FROM actuals a + JOIN dates d ON a.opened_at = d.max_d +) +SELECT opened_at, CAST(NULL AS BIGINT) AS count_forecast, CAST(NULL AS BIGINT) AS count_upper, CAST(NULL AS BIGINT) AS count_lower, count FROM actuals +UNION ALL SELECT opened_at, count_forecast, count_upper, count_lower, count FROM bridge +UNION ALL SELECT opened_at, count_forecast, count_upper, count_lower, count FROM forecast +``` + +Three CTEs feed the final `UNION ALL` (a fourth, `dates`, only supplies the min/max timestamps): +- **`actuals`** — historical series (`count` populated, forecast columns NULL). +- **`forecast`** — `AI_FORECAST` output (forecast columns populated, `count` NULL). +- **`bridge`** — a **single row at the last actual timestamp** with the actual value duplicated into all three forecast columns. Without it, the historical line and the forecast band have a visible gap at the boundary; with it, they connect smoothly. + +> **The final `SELECT`s must list columns explicitly, in the same order, in every branch.** `SELECT * FROM actuals` (2 cols) `UNION ALL SELECT * FROM forecast` (5 cols) errors out with `NUM_COLUMNS_MISMATCH`. Project the same 5-column shape from every CTE — fill NULLs where a branch doesn't have a value (and `CAST(NULL AS <type>)` so the types align). + +The `horizon` expression projects forward 50% of the historical range. Tune the multiplier (0.5 → 1.0 for "predict as far as we've seen") to taste. + +**If you switch the aggregation grain, update both `DATE_TRUNC` calls.** They must match — a daily x-axis with a weekly cutoff filter would still show the cliff.
Common pairings: + +| Aggregation `DATE_TRUNC` | Cutoff filter | +|---|---| +| `'DAY'` | `WHERE event_ts < DATE_TRUNC('DAY', current_timestamp())` — drops today | +| `'WEEK'` | `WHERE DATE_TRUNC('WEEK', event_ts) < DATE_TRUNC('WEEK', current_date())` — drops current week | +| `'MONTH'` | `WHERE DATE_TRUNC('MONTH', event_ts) < DATE_TRUNC('MONTH', current_date())` — drops current month | +| `'QUARTER'` / `'YEAR'` | same shape with that grain | + +### Widget spec + +```json +"spec": { + "version": 1, + "widgetType": "forecast-line", + "encodings": { + "x": {"fieldName": "opened_at", "scale": {"type": "temporal"}}, + "y": { + "scale": {"type": "quantitative"}, + "original": {"fieldName": "count", "displayName": "Cases"}, + "prediction": {"fieldName": "count_forecast", "displayName": "Forecast"}, + "predictionUpper": {"fieldName": "count_upper"}, + "predictionLower": {"fieldName": "count_lower"} + } + }, + "annotations": [ /* vertical-line for known events — same shape as in 1-widget-specifications.md */ ], + "frame": {"showTitle": true, "title": "Case Volume Forecast"} +} +``` + +> Annotations (`vertical-line`) work on forecast-line — useful for marking known seasonal events (holidays, releases) inside both the historical window and the prediction band. Shape documented in [1-widget-specifications.md](1-widget-specifications.md#annotations-event-markers). + +--- + +## Pivot + +A cross-tab — dimensions on rows AND columns, measures in cells. Supports per-cell conditional styling (heat-map-style). + +- `version`: **3** +- `widgetType`: "pivot" +- Use for multi-dimensional aggregations like "category × priority". It supports drill-down totals and is the right widget for cohort retention (see end of section).
+ +```json +"spec": { + "version": 3, + "widgetType": "pivot", + "encodings": { + "rows": [ + {"fieldName": "industry"}, + {"fieldName": "customer_segment", "total": {"show": true}} + ], + "columns": [ + {"fieldName": "category"}, + {"fieldName": "Priority Level", "total": {"show": true}} + ], + "cell": { + "type": "multi-cell", + "fields": [ + { + "fieldName": "count(*)", + "cellType": "text", + "style": { + "type": "basic", + "rules": [ + {"condition": {"operand": {"type": "data-value", "value": "30"}, "operator": ">="}, + "backgroundColor": {"hex": "#FF7E5C"}}, + {"condition": {"operand": {"type": "data-value", "value": "20"}, "operator": ">="}, + "backgroundColor": {"themeColorType": "visualizationColors", "position": 0}}, + {"condition": {"operand": {"type": "data-value", "value": "15"}, "operator": ">="}, + "backgroundColor": {"themeColorType": "visualizationColors", "position": 6}} + ] + } + } + ] + } + }, + "frame": {"showTitle": true, "title": "Cases by Category × Priority"} +} +``` + +For a **continuous color gradient** instead of explicit thresholds, set `cell.fields[].cellType: "color-scale"` and drop the `style.rules`. The gradient auto-fits to min/max in the cell values. + +**Sort by cell values** (e.g., put the highest-volume column first) — useful for cohort tables: +```json +"columns": [{ + "fieldName": "category", + "scale": {"type": "categorical", "sort": {"by": "cell", "field": {"index": 0}}} +}] +``` +`"by": "cell-reversed"` flips the order. `field.index` picks which value field to sort by when there are multiple. + +> **Cohort retention charts** are built as a `pivot`. Rows = cohort date, columns = period offset (`0`, `1 year`, `2 years`…), cell = retention ratio with `cellType: "color-scale"`. There is no separate `cohort` widget type. + +--- + +## Histogram + +Frequency distribution. The bin width is set in the **widget's field expression**, not in the dataset SQL. 
+ +- `version`: **3** +- `widgetType`: "histogram" + +```json +"queries": [{ + "name": "main_query", + "query": { + "datasetName": "ds_cases", + "fields": [ + {"name": "bin(time_to_resolution_hours, binWidth=2)", + "expression": "BIN_FLOOR(`time_to_resolution_hours`, 2)"}, + {"name": "count(*)", "expression": "COUNT(`*`)"} + ], + "disaggregated": false + } +}], +"spec": { + "version": 3, + "widgetType": "histogram", + "encodings": { + "x": {"fieldName": "bin(time_to_resolution_hours, binWidth=2)", + "scale": {"type": "quantitative"}}, + "y": {"fieldName": "count(*)", "scale": {"type": "quantitative"}} + }, + "frame": {"showTitle": true, "title": "Resolution Time Distribution"} +} +``` + +The field `name` (and the widget's `fieldName`) is the readable `bin(col, binWidth=N)` label; the underlying `expression` uses ``BIN_FLOOR(`col`, N)`` — a Lakeview field-expression, not raw SQL. + +--- + +## Sankey -The following visualization types are available in Databricks AI/BI dashboards but are less commonly used. Refer to [Databricks documentation](https://docs.databricks.com/visualizations/visualization-types) for details: +Flow between two or more stages. Each stage is a categorical field; the value is a quantitative aggregate. -| Widget Type | Description | +- `version`: **1** +- `widgetType`: "sankey" + +```json +"spec": { + "version": 1, + "widgetType": "sankey", + "encodings": { + "value": {"fieldName": "count(*)"}, + "stages": [ + {"fieldName": "channel"}, + {"fieldName": "reopened_flag", "displayName": "Reopened"} + ] + }, + "frame": {"showTitle": true, "title": "Channel → Reopen flow"} +} +``` + +Add more `stages` entries for multi-step flows (e.g., funnel-with-attribution: `source → channel → outcome`). + +--- + +## Heatmap + +Color-intensity grid: x-axis categorical, y-axis categorical, color = numeric aggregate. Useful for "X by Y" matrices.
+ +- `version`: **3** +- `widgetType`: "heatmap" + +```json +"spec": { + "version": 3, + "widgetType": "heatmap", + "encodings": { + "x": {"fieldName": "priority", "scale": {"type": "categorical"}, "axis": {"hideTitle": true}}, + "y": {"fieldName": "ship_mode", "scale": {"type": "categorical"}, "axis": {"hideTitle": true}}, + "color": {"fieldName": "sum(order_count)", + "scale": {"type": "quantitative", + "colorRamp": {"mode": "scheme", "scheme": "viridis"}}} + }, + "frame": {"showTitle": true, "title": "Order count by priority × ship mode"} +} +``` + +Heatmap limit: 64K rows / 10MB. For larger data, pre-aggregate to a smaller grid. + +`axis.hideTitle: true` (shown above) drops the redundant "priority" / "ship_mode" axis labels — the row/column headers already tell you what they are. Same trick works on any x/y axis encoding (line, bar, heatmap, pivot) when the column name is obvious from context. + +--- + +## Funnel + +Stage-by-stage conversion: how many users / records make it from step 1 to step N. + +- `version`: **1** +- `widgetType`: "funnel" + +```json +"spec": { + "version": 1, + "widgetType": "funnel", + "encodings": { + "x": {"fieldName": "stage"}, + "y": {"fieldName": "count", "scale": {"type": "quantitative"}}, + "color": {"fieldName": "count"} + }, + "frame": {"showTitle": true, "title": "Signup funnel"} +} +``` + +Dataset SQL typically returns one row per stage, with an ordering column. Use `ORDER BY stage_order` in the SQL to guarantee top-to-bottom visualization. + +--- + +## Box + +Distribution summary (median, quartiles, whiskers, outliers). Compare distributions across categories. 
+ +- `version`: **1** +- `widgetType`: "box" + +```json +"spec": { + "version": 1, + "widgetType": "box", + "encodings": { + "x": {"fieldName": "return_flag", "displayName": "Return flag"}, + "y": {"fieldName": "l_extendedprice", "displayName": "Extended price", + "scale": {"type": "quantitative"}} + }, + "frame": {"showTitle": true, "title": "Price distribution by return flag"} +} +``` + +--- + +## Waterfall + +Cumulative effect of positive/negative deltas — useful for P&L bridges, MoM revenue walks, factor decomposition. + +- `version`: **1** +- `widgetType`: "waterfall" + +```json +"spec": { + "version": 1, + "widgetType": "waterfall", + "encodings": { + "x": {"fieldName": "monthly(date_col)", + "expression": "DATE_TRUNC(\"MONTH\", `date_col`)"}, + "y": {"fieldName": "sum(amount)", "scale": {"type": "quantitative"}} + }, + "frame": {"showTitle": true, "title": "Monthly P&L"} +} +``` + +Dataset typically returns one row per period with signed values (positive contributions, negative deductions). + +--- + +## Other (less common) + +| Widget Type | When to use | |-------------|-------------| -| heatmap | Color intensity grid for numerical data | -| histogram | Frequency distribution with configurable bins | -| funnel | Stage-based metric analysis | -| sankey | Flow visualization between value sets | -| box | Distribution summary with quartiles | -| marker-map | Latitude/longitude point markers | -| pivot | Drag-and-drop aggregation table | -| word-cloud | Word frequency visualization | -| sunburst | Hierarchical data in concentric circles | -| cohort | Group outcome analysis over time | +| `word-cloud` | Word/category frequency from a text field. | +| `sunburst` | Hierarchical data in nested rings (org chart, taxonomy). | + +These follow the same `version`/`widgetType`/`encodings` pattern — see the [official docs](https://docs.databricks.com/dashboards/manage/visualizations/types) for spec details. 
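Because a wrong `version` is the most common cause of a broken widget, it can help to lint each spec before deploying. The lookup below is a sketch that collects the version numbers documented across these reference files; `EXPECTED_VERSIONS` and `check_version` are hypothetical names, and widget types whose version is not documented here (such as `word-cloud` and `sunburst`) are reported as unknown rather than guessed.

```python
from typing import Optional

# Versions as documented in the widget index and the sections above.
EXPECTED_VERSIONS = {
    "counter": 2, "table": 2, "symbol-map": 2, "range-slider": 2,
    "bar": 3, "line": 3, "area": 3, "pie": 3, "scatter": 3,
    "pivot": 3, "histogram": 3, "heatmap": 3,
    "combo": 1, "choropleth-map": 1, "forecast-line": 1,
    "sankey": 1, "funnel": 1, "box": 1, "waterfall": 1,
}


def check_version(spec: dict) -> Optional[str]:
    """Return a problem description, or None when the version matches."""
    widget_type = spec.get("widgetType", "")
    if widget_type.startswith("filter-"):
        expected = 2  # all filter-* widgets are documented as version 2
    else:
        expected = EXPECTED_VERSIONS.get(widget_type)
    if expected is None:
        return f"{widget_type!r}: version not listed here, check the official docs"
    if spec.get("version") != expected:
        return f"{widget_type!r}: expected version {expected}, got {spec.get('version')}"
    return None


problems = [
    check_version({"version": 3, "widgetType": "pivot"}),
    check_version({"version": 2, "widgetType": "sankey"}),
]
```

A check like this only validates the version field; it says nothing about whether the encodings themselves are well-formed.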
diff --git a/experimental/databricks-aibi-dashboards/references/3-filters.md b/experimental/databricks-aibi-dashboards/references/3-filters.md index 0c98e49..14ea544 100644 --- a/experimental/databricks-aibi-dashboards/references/3-filters.md +++ b/experimental/databricks-aibi-dashboards/references/3-filters.md @@ -10,6 +10,7 @@ - `filter-date-range-picker`: for DATE/TIMESTAMP fields (date range selection) - `filter-single-select`: categorical with single selection - `filter-multi-select`: categorical with multiple selections (preferred for drill-down) +- `range-slider`: numeric range filter on a quantitative column (e.g., "resolution time hours", "order amount") > **Performance note**: Global filters automatically apply `WHERE` clauses to dataset queries at runtime. You don't need to pre-filter data in your SQL - the dashboard engine handles this efficiently. @@ -235,6 +236,87 @@ Each `queryName` in `encodings.fields` binds the filter to that specific dataset --- +## Multi-Select Parameters (`MULTI`) + +When the dataset **pre-aggregates** (`GROUP BY` in the SQL), uses a CTE, or wraps a table function (e.g. `AI_FORECAST`), a field-based filter can't auto-inject a `WHERE` — you must filter explicitly with a parameter. Same goes when you want the filter expressed in SQL for traceability. + +A `MULTI` parameter binds as a **SQL `ARRAY`**, not an `IN`-list. Two rules to filter correctly: + +```sql +-- ❌ WRONG: a MULTI param is an ARRAY → DATATYPE_MISMATCH (STRING vs ARRAY) +WHERE category IN (:category_filter) + +-- ✅ RIGHT: array_contains, with size()=0 guard so an empty selection means "all" +WHERE (size(:category_filter) = 0 OR array_contains(:category_filter, category)) +``` + +The empty-selection default is an **empty array, not NULL** — without the `size()=0` guard, the dashboard loads showing zero rows. 
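The effect of the `size()=0` guard is easy to see outside SQL. The snippet below is a plain-Python analogy of the two predicates, assuming the parameter arrives as a list; it ignores SQL NULL semantics, and the function names and sample rows are hypothetical.

```python
def keep_row_guarded(selected: list[str], category: str) -> bool:
    # Mirrors: size(:category_filter) = 0 OR array_contains(:category_filter, category)
    return len(selected) == 0 or category in selected


def keep_row_unguarded(selected: list[str], category: str) -> bool:
    # Mirrors: array_contains(:category_filter, category) with no guard
    return category in selected


categories = ["Billing", "Login", "Shipping"]  # hypothetical data

# Empty selection (the default): the guard keeps every row, the bare check keeps none.
all_rows = [c for c in categories if keep_row_guarded([], c)]
no_rows = [c for c in categories if keep_row_unguarded([], c)]

# Non-empty selection: both forms behave the same.
picked = [c for c in categories if keep_row_guarded(["Login"], c)]
```

This is why a dashboard without the guard opens empty: the default selection is an empty array, and nothing is contained in an empty array.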
+ +```json +{ + "name": "metric_by_category", + "queryLines": [ + "SELECT category, SUM(revenue) AS total FROM orders ", + "WHERE (size(:category_filter) = 0 OR array_contains(:category_filter, category)) ", + "GROUP BY category" + ], + "parameters": [{ + "keyword": "category_filter", + "dataType": "STRING", + "complexType": "MULTI", + "defaultSelection": {"values": {"dataType": "STRING", "values": []}} + }] +} +``` + +The filter widget binds the parameter via `parameterName` (NOT `fieldName`), same shape as the date-range parameter example above: + +```json +"encodings": {"fields": [{"parameterName": "category_filter", "queryName": "q_param"}]} +``` + +> **Parameters live on the dataset + the filter widget only.** Don't add `parameters` to a chart/counter widget's own query — the chart reads the dataset, which the filter has already parameterized. Adding parameters to the consuming widget makes it render blank with no error. + +--- + +## Range Slider (numeric range filter) + +For filtering on a numeric column where the user wants to drag a min/max slider — e.g., resolution-time hours, amount, age. The query exposes `MIN(col)` and `MAX(col)` so the dashboard knows the slider bounds; `encodings.fields[].fieldName` is the underlying column name. 
+ +```json +{ + "widget": { + "name": "time-to-resolution", + "queries": [{ + "name": "ds_resolution", + "query": { + "datasetName": "ds_cases", + "fields": [ + {"name": "min(time_to_resolution_hours)", "expression": "MIN(`time_to_resolution_hours`)"}, + {"name": "max(time_to_resolution_hours)", "expression": "MAX(`time_to_resolution_hours`)"} + ], + "disaggregated": false + } + }], + "spec": { + "version": 2, + "widgetType": "range-slider", + "encodings": { + "fields": [ + {"fieldName": "time_to_resolution_hours", "queryName": "ds_resolution"} + ] + }, + "frame": {"showTitle": true, "title": "Resolution time (hours)"} + } + }, + "position": {"x": 0, "y": 0, "width": 4, "height": 2} +} +``` + +`range-slider` only works on numeric / temporal columns. On a categorical field it will fail at render. To filter a numeric field by an explicit min/max in SQL (rather than a UI-only WHERE), bind to a `:param.min`/`:param.max` parameter — same pattern as date-range, see "Date Range Filtering" above. + +--- + ## Filter Layout Guidelines - Global filters: Position on dedicated filter page, stack vertically at `x=0` diff --git a/experimental/databricks-aibi-dashboards/references/4-examples.md b/experimental/databricks-aibi-dashboards/references/4-examples.md index 81129e2..1021572 100644 --- a/experimental/databricks-aibi-dashboards/references/4-examples.md +++ b/experimental/databricks-aibi-dashboards/references/4-examples.md @@ -1,402 +1,847 @@ # Complete Dashboard Example -This is a **reference example** to understand the JSON structure and layout patterns. **Always adapt to what the user requests** - use their tables, metrics, and visualizations. This example demonstrates the correct syntax; your dashboard should reflect the user's actual requirements. +A working dashboard JSON that exercises the new feature set: -## Key Patterns (Read First) +- **`dataset.columns[]` + `MEASURE()`** — reusable named measures across widgets. 
+- **`forecast-line`** with `AI_FORECAST` SQL and a **vertical-line annotation** for a known event. +- **`pivot`** with conditional cell coloring. +- **`symbol-map`** (lat/lon) with a continuous color ramp. +- **`range-slider`** filter on a numeric column. +- **Counter sparkline** via the `period` encoding. -### 1. Page Types (Required) -- `PAGE_TYPE_CANVAS` - Main content page with widgets -- `PAGE_TYPE_GLOBAL_FILTERS` - Dedicated filter page that affects all canvas pages +**Adapt this to the user's actual data and story** — the structure and feature mix is what to copy, not the column names. -### 2. Widget Versions (Critical!) -| Widget Type | Version | -|-------------|---------| -| `counter`, `table` | **2** | -| `bar`, `line`, `area`, `pie` | **3** | -| `filter-*` | **2** | +## Key Patterns (read first) -### 3. KPI Counter with Currency Formatting -```json -"format": { - "type": "number-currency", - "currencyCode": "USD", - "abbreviation": "compact", - "decimalPlaces": {"type": "max", "places": 1} -} -``` +### Page types +- `PAGE_TYPE_CANVAS` — content page with widgets. +- `PAGE_TYPE_GLOBAL_FILTERS` — dedicated filter page, applies to all canvas pages whose datasets contain the filter field. -### 4. Filter Binding to Multiple Datasets -Each filter query binds the filter to one dataset. Add multiple queries to filter multiple datasets: -```json -"queries": [ - {"name": "ds1_region", "query": {"datasetName": "dataset1", ...}}, - {"name": "ds2_region", "query": {"datasetName": "dataset2", ...}} -] -``` +### Widget versions used in this example + +| Widget | Version | +|---|---| +| `counter`, `table`, `filter-*`, `range-slider`, `symbol-map` | **2** | +| `bar`, `line`, `area`, `pie`, `pivot`, `histogram`, `heatmap` | **3** | +| `combo`, `choropleth-map`, `forecast-line`, `sankey`, `funnel`, `box`, `waterfall` | **1** | + +See [SKILL.md](../SKILL.md#widget-index-version--where-documented) for the full version table. + +### Layout (12-col grid) -### 5. 
Layout Grid (12 columns) ``` -y=0: Header with title + description (w=12, h=2) -y=2: KPI(w=4,h=3) | KPI(w=4,h=3) | KPI(w=4,h=3) ← fills 12 -y=5: Section header (w=12, h=1) -y=6: Area chart (w=12, h=5) -y=11: Section header (w=12, h=1) -y=12: Pie(w=4,h=5) | Bar chart(w=8,h=5) ← fills 12 +y=0: Header (w=12, h=3) ← story prose tying the dashboard together +y=3: KPI (w=3) | KPI w/ sparkline (w=3) | KPI (w=3) | KPI (w=3) ← fills 12 +y=6: Forecast w/ release annotation (w=8, h=6) | Histogram (w=4, h=8) +y=12: Symbol map (w=8, h=5) | +y=14: | Pie by channel (w=4, h=4) +y=17: Detail table (w=8, h=7) | +y=18: | Heatmap (w=4, h=6) ``` -Use `\n\n` in text widget lines array to create line breaks within a single widget. +The right-hand column uses **staggered heights** — the histogram extends past the forecast, the pie sits in the middle, the heatmap aligns to the bottom of the detail table. The widgets on the left and right don't share row boundaries; the engine tolerates this as long as the canvas reads naturally. Pair tall widgets on one side with several shorter ones on the other to vary the rhythm rather than forcing strict row alignment. ---- - -## Full Dashboard: Sales Analytics +This example's header carries a short narrative tying the widgets together, and the forecast widget uses a `vertical-line` annotation to mark a notable date. That's one way to structure a story — useful if there's a real inflection point in the data — but it's not required: a dashboard can also just present the metrics neutrally, or anchor the story on a different widget. Treat it as illustrative. 
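Staggered layouts are easier to get wrong than strict rows, so a quick overlap check on the `position` objects can be useful. The sketch below uses the `y`, `width`, and `height` values from the layout above; the `x` offsets (0 for the left column, 8 for the right) are an assumption, as are the helper and widget names.

```python
from itertools import combinations

GRID_COLUMNS = 12


def layout_problems(positions: dict[str, dict]) -> list[str]:
    """Report widgets that leave the 12-column grid or overlap each other."""
    problems = []
    for name, p in positions.items():
        if p["x"] < 0 or p["x"] + p["width"] > GRID_COLUMNS:
            problems.append(f"{name} exceeds the {GRID_COLUMNS}-column grid")
    for (a, pa), (b, pb) in combinations(positions.items(), 2):
        # Two rectangles overlap only if their spans intersect on both axes.
        overlap_x = pa["x"] < pb["x"] + pb["width"] and pb["x"] < pa["x"] + pa["width"]
        overlap_y = pa["y"] < pb["y"] + pb["height"] and pb["y"] < pa["y"] + pa["height"]
        if overlap_x and overlap_y:
            problems.append(f"{a} overlaps {b}")
    return problems


layout = {
    "forecast":  {"x": 0, "y": 6,  "width": 8, "height": 6},
    "histogram": {"x": 8, "y": 6,  "width": 4, "height": 8},
    "map":       {"x": 0, "y": 12, "width": 8, "height": 5},
    "pie":       {"x": 8, "y": 14, "width": 4, "height": 4},
    "table":     {"x": 0, "y": 17, "width": 8, "height": 7},
    "heatmap":   {"x": 8, "y": 18, "width": 4, "height": 6},
}
ok = layout_problems(layout)

# Moving the pie up one row makes it collide with the histogram above it.
shifted = dict(layout, pie={"x": 8, "y": 13, "width": 4, "height": 4})
bad = layout_problems(shifted)
```

With these assumed offsets, both columns end at `y=24`, which is what lets the right-hand stack line up with the bottom of the detail table.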
-This example shows a complete dashboard with: -- Title and subtitle text widgets -- 3 KPI counters with currency/number formatting -- Area chart for time series trends -- Pie chart for category breakdown -- Bar chart with color grouping by region -- Data table for detailed records -- Global filters (date range, region, category) +--- -> **Note**: Queries reference bare table names only (no catalog, no schema). Catalog and schema are set via `--dataset-catalog "my_catalog" --dataset-schema "gold"` when creating the dashboard. These flags only apply when the query omits catalog/schema — they will NOT override anything you hardcode in the `FROM` clause. +## Full Dashboard: Support Operations ```json { "datasets": [ { - "name": "ds_daily_sales", - "displayName": "Daily Sales", + "name": "ds_support", + "displayName": "Support cases", "queryLines": [ - "SELECT sale_date, region, department, total_orders, total_units, total_revenue, total_cost, profit_margin ", - "FROM daily_sales ", - "ORDER BY sale_date" + "SELECT case_id, opened_at, closed_at, priority, channel, region_name,\n", + " customer_id, reopened_flag, satisfaction_score,\n", + " customer_latitude, customer_longitude,\n", + " (unix_timestamp(closed_at) - unix_timestamp(opened_at)) / 3600.0 AS time_to_resolution_hours\n", + "FROM support_cases" + ], + "columns": [ + { + "displayName": "Total Cases", + "description": "Count of support cases", + "expression": "COUNT(`case_id`)" + }, + { + "displayName": "Avg Resolution Hours", + "description": "Mean resolution time across closed cases", + "expression": "AVG(`time_to_resolution_hours`)" + }, + { + "displayName": "Reopen Rate %", + "description": "Percent of cases reopened after closure", + "expression": "SUM(CASE WHEN `reopened_flag`=true THEN 1 ELSE 0 END) * 1.0 / COUNT(`case_id`)" + }, + { + "displayName": "Avg Satisfaction", + "description": "Average customer satisfaction (1-10)", + "expression": "AVG(`satisfaction_score`)" + }, + { + "displayName": "Priority 
Level", + "description": "Sortable priority label", + "expression": "CASE WHEN `priority`='Critical' THEN '1-Critical' WHEN `priority`='High' THEN '2-High' WHEN `priority`='Medium' THEN '3-Medium' ELSE '4-Low' END" + } ] }, { - "name": "ds_products", - "displayName": "Product Performance", + "name": "ds_forecast", + "displayName": "Cases forecast", "queryLines": [ - "SELECT product_id, product_name, department, region, units_sold, revenue, cost, profit ", - "FROM product_performance" + "WITH actuals AS (\n", + " SELECT DATE_TRUNC('WEEK', opened_at) AS opened_at, COUNT(*) AS count\n", + " FROM support_cases\n", + " WHERE DATE_TRUNC('WEEK', opened_at) < DATE_TRUNC('WEEK', current_date())\n", + " GROUP BY 1\n", + "),\n", + "dates AS (SELECT MAX(opened_at) AS max_d, MIN(opened_at) AS min_d FROM actuals),\n", + "forecast AS (\n", + " SELECT opened_at, count_forecast, count_upper, count_lower, CAST(NULL AS BIGINT) AS count\n", + " FROM AI_FORECAST(TABLE(actuals),\n", + " horizon => (SELECT max_d + MAKE_DT_INTERVAL(CAST(FLOOR(DATEDIFF(max_d, min_d) * 0.5) AS INT), 0, 0, 0) FROM dates),\n", + " time_col => 'opened_at', value_col => 'count')\n", + "),\n", + "bridge AS (\n", + " SELECT a.opened_at, a.count AS count_forecast, a.count AS count_upper, a.count AS count_lower, a.count\n", + " FROM actuals a JOIN dates d ON a.opened_at = d.max_d\n", + ")\n", + "SELECT opened_at, CAST(NULL AS BIGINT) AS count_forecast, CAST(NULL AS BIGINT) AS count_upper, CAST(NULL AS BIGINT) AS count_lower, count FROM actuals\n", + "UNION ALL SELECT opened_at, count_forecast, count_upper, count_lower, count FROM bridge\n", + "UNION ALL SELECT opened_at, count_forecast, count_upper, count_lower, count FROM forecast" ] } ], "pages": [ { - "name": "sales_overview", - "displayName": "Sales Overview", - "pageType": "PAGE_TYPE_CANVAS", - "layoutVersion": "GRID_V1", + "name": "overview", + "displayName": "Overview", "layout": [ { "widget": { "name": "header", "multilineTextboxSpec": { - "lines": ["# 
Sales Dashboard\n\nMonitor daily sales, revenue, and profit margins across regions and departments."] + "lines": [ + "# Support Operations \u2014 Post-Release Surge (4.1)\n", + "\n", + "**The story this week:** a clear volume spike in mid-February \u2014 the date the new Product 4.1 release went out (marked on the forecast chart). The release introduced a regression that drove a wave of Critical/High cases over the following 6 weeks: case volume jumps, average resolution time creeps up, reopen rate climbs, and customer satisfaction dips on the affected metros \u2014 visible on the satisfaction map as warmer (lower) scores. The forecast extends the trend forward so the team can size the cleanup ahead. Use the filters page to slice by region or resolution-time bucket to localize the impact." + ] } }, - "position": {"x": 0, "y": 0, "width": 12, "height": 2} + "position": { + "x": 0, + "y": 0, + "width": 12, + "height": 3 + } }, { "widget": { - "name": "kpi_revenue", - "queries": [{ - "name": "main_query", - "query": { - "datasetName": "ds_daily_sales", - "fields": [{"name": "sum(total_revenue)", "expression": "SUM(`total_revenue`)"}], - "disaggregated": false + "name": "kpi-total-cases", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_support", + "fields": [ + { + "name": "measure(Total Cases)", + "expression": "MEASURE(`Total Cases`)" + } + ], + "disaggregated": false + } } - }], + ], "spec": { "version": 2, "widgetType": "counter", "encodings": { "value": { - "fieldName": "sum(total_revenue)", - "displayName": "Total Revenue", - "format": { - "type": "number-currency", - "currencyCode": "USD", - "abbreviation": "compact", - "decimalPlaces": {"type": "max", "places": 1} - } + "fieldName": "measure(Total Cases)", + "displayName": "Total Cases" } }, - "frame": {"title": "Total Revenue", "showTitle": true, "description": "For the selected period", "showDescription": true} + "frame": { + "title": "Total Cases", + "showTitle": true + } } }, - 
"position": {"x": 0, "y": 2, "width": 4, "height": 3} + "position": { + "x": 0, + "y": 3, + "width": 3, + "height": 3 + } }, { "widget": { - "name": "kpi_orders", - "queries": [{ - "name": "main_query", - "query": { - "datasetName": "ds_daily_sales", - "fields": [{"name": "sum(total_orders)", "expression": "SUM(`total_orders`)"}], - "disaggregated": false + "name": "kpi-volume-trend", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_support", + "fields": [ + { + "name": "weekly(opened_at)", + "expression": "DATE_TRUNC(\"WEEK\", `opened_at`)" + }, + { + "name": "measure(Total Cases)", + "expression": "MEASURE(`Total Cases`)" + } + ], + "disaggregated": false, + "orders": [ + { + "direction": "DESC", + "expression": "DATE_TRUNC(\"WEEK\", `opened_at`)" + } + ] + } } - }], + ], "spec": { "version": 2, + "frame": { + "title": "Weekly Case Volume", + "showTitle": true + }, "widgetType": "counter", "encodings": { "value": { - "fieldName": "sum(total_orders)", - "displayName": "Total Orders", - "format": { - "type": "number", - "abbreviation": "compact", - "decimalPlaces": {"type": "max", "places": 0} - } + "fieldName": "measure(Total Cases)", + "displayName": "This Week" + }, + "period": { + "fieldName": "weekly(opened_at)" } - }, - "frame": {"title": "Total Orders", "showTitle": true, "description": "For the selected period", "showDescription": true} + } } }, - "position": {"x": 4, "y": 2, "width": 4, "height": 3} + "position": { + "x": 3, + "y": 3, + "width": 3, + "height": 3 + } }, { "widget": { - "name": "kpi_profit", - "queries": [{ - "name": "main_query", - "query": { - "datasetName": "ds_daily_sales", - "fields": [{"name": "avg(profit_margin)", "expression": "AVG(`profit_margin`)"}], - "disaggregated": false + "name": "kpi-resolution", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_support", + "fields": [ + { + "name": "measure(Avg Resolution Hours)", + "expression": "MEASURE(`Avg Resolution Hours`)" + } +
], + "disaggregated": false + } } - }], + ], "spec": { "version": 2, + "frame": { + "title": "Avg Resolution Time", + "showTitle": true + }, "widgetType": "counter", "encodings": { "value": { - "fieldName": "avg(profit_margin)", - "displayName": "Avg Profit Margin", - "format": { - "type": "number-percent", - "decimalPlaces": {"type": "max", "places": 1} - } + "fieldName": "measure(Avg Resolution Hours)", + "formatTemplate": "{{ @formatted }} hrs", + "displayName": "Avg Hours" } - }, - "frame": {"title": "Profit Margin", "showTitle": true, "description": "Average for period", "showDescription": true} + } } }, - "position": {"x": 8, "y": 2, "width": 4, "height": 3} + "position": { + "x": 6, + "y": 3, + "width": 3, + "height": 3 + } }, { "widget": { - "name": "section_trends", - "multilineTextboxSpec": { - "lines": ["## Revenue Trend"] + "name": "kpi-reopen", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_support", + "fields": [ + { + "name": "measure(Reopen Rate %)", + "expression": "MEASURE(`Reopen Rate %`)" + } + ], + "disaggregated": false + } + } + ], + "spec": { + "version": 2, + "frame": { + "title": "Reopen Rate (%)", + "showTitle": true + }, + "widgetType": "counter", + "encodings": { + "value": { + "fieldName": "measure(Reopen Rate %)", + "format": { + "type": "number-percent", + "decimalPlaces": { + "type": "max", + "places": 2 + } + }, + "displayName": "Reopen Rate" + } + } } }, - "position": {"x": 0, "y": 5, "width": 12, "height": 1} + "position": { + "x": 9, + "y": 3, + "width": 3, + "height": 3 + } }, { "widget": { - "name": "chart_revenue_trend", - "queries": [{ - "name": "main_query", - "query": { - "datasetName": "ds_daily_sales", - "fields": [ - {"name": "sale_date", "expression": "`sale_date`"}, - {"name": "sum(total_revenue)", "expression": "SUM(`total_revenue`)"} - ], - "disaggregated": false + "name": "case-forecast", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_forecast", + 
"fields": [ + { + "name": "opened_at", + "expression": "`opened_at`" + }, + { + "name": "count", + "expression": "`count`" + }, + { + "name": "count_forecast", + "expression": "`count_forecast`" + }, + { + "name": "count_upper", + "expression": "`count_upper`" + }, + { + "name": "count_lower", + "expression": "`count_lower`" + } + ], + "disaggregated": true + } } - }], + ], "spec": { - "version": 3, - "widgetType": "area", + "version": 1, + "widgetType": "forecast-line", "encodings": { "x": { - "fieldName": "sale_date", - "scale": {"type": "temporal"}, - "axis": {"title": "Date"}, - "displayName": "Date" + "fieldName": "opened_at", + "scale": { + "type": "temporal" + } }, "y": { - "fieldName": "sum(total_revenue)", - "scale": {"type": "quantitative"}, - "format": { - "type": "number-currency", - "currencyCode": "USD", - "abbreviation": "compact" + "scale": { + "type": "quantitative", + "domainMin": 0 + }, + "original": { + "fieldName": "count", + "displayName": "Cases" }, - "axis": {"title": "Revenue ($)"}, - "displayName": "Revenue ($)" + "prediction": { + "fieldName": "count_forecast", + "displayName": "Forecast" + }, + "predictionUpper": { + "fieldName": "count_upper" + }, + "predictionLower": { + "fieldName": "count_lower" + } } }, + "annotations": [ + { + "type": "vertical-line", + "encodings": { + "x": { + "dataValue": "2026-02-16T09:00:00.000", + "dataType": "DATETIME" + }, + "label": { + "value": "Product release 4.1" + }, + "color": { + "value": { + "hex": "#FF7E5C" + } + } + } + } + ], "frame": { - "title": "Daily Revenue", "showTitle": true, - "description": "Track daily revenue trends" + "title": "Case Volume \u2014 actuals + forecast" } } }, - "position": {"x": 0, "y": 6, "width": 12, "height": 5} + "position": { + "x": 0, + "y": 6, + "width": 8, + "height": 6 + } }, { "widget": { - "name": "section_breakdown", - "multilineTextboxSpec": { - "lines": ["## Breakdown"] + "name": "priority-by-channel", + "queries": [ + { + "name": "main_query", + "query": 
{ + "datasetName": "ds_support", + "fields": [ + { + "name": "count(case_id)", + "expression": "COUNT(`case_id`)" + }, + { + "name": "Priority Level", + "expression": "`Priority Level`" + }, + { + "name": "channel", + "expression": "`channel`" + } + ], + "disaggregated": false + } + } + ], + "spec": { + "version": 3, + "frame": { + "showTitle": true, + "title": "Cases by channel \u00d7 priority" + }, + "widgetType": "heatmap", + "encodings": { + "x": { + "fieldName": "Priority Level", + "scale": { + "type": "categorical" + } + }, + "y": { + "fieldName": "channel", + "scale": { + "type": "categorical" + } + }, + "color": { + "fieldName": "count(case_id)", + "scale": { + "type": "quantitative", + "colorRamp": { + "mode": "custom-sequential", + "colors": { + "start": "#FFA600", + "end": "#995495" + } + } + } + }, + "label": { + "show": true + } + } } }, - "position": {"x": 0, "y": 11, "width": 12, "height": 1} + "position": { + "x": 8, + "y": 18, + "width": 4, + "height": 6 + } }, { "widget": { - "name": "chart_by_department", - "queries": [{ - "name": "main_query", - "query": { - "datasetName": "ds_daily_sales", - "fields": [ - {"name": "department", "expression": "`department`"}, - {"name": "sum(total_revenue)", "expression": "SUM(`total_revenue`)"} - ], - "disaggregated": false + "name": "customer-map", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_support", + "fields": [ + { + "name": "measure(Avg Satisfaction)", + "expression": "MEASURE(`Avg Satisfaction`)" + }, + { + "name": "customer_latitude", + "expression": "`customer_latitude`" + }, + { + "name": "customer_longitude", + "expression": "`customer_longitude`" + }, + { + "name": "count(*)", + "expression": "COUNT(`*`)" + } + ], + "disaggregated": false + } } - }], + ], "spec": { - "version": 3, - "widgetType": "pie", + "version": 2, + "frame": { + "showTitle": true, + "title": "Customer Satisfaction Map" + }, + "mark": { + "opacity": 0.7 + }, + "widgetType": "symbol-map", 
"encodings": { - "angle": { - "fieldName": "sum(total_revenue)", - "scale": {"type": "quantitative"}, - "displayName": "Revenue" + "coordinates": { + "latitude": { + "fieldName": "customer_latitude" + }, + "longitude": { + "fieldName": "customer_longitude" + } }, "color": { - "fieldName": "department", - "scale": {"type": "categorical"}, - "displayName": "Department" + "fieldName": "measure(Avg Satisfaction)", + "scale": { + "type": "quantitative", + "colorRamp": { + "mode": "custom-sequential", + "colors": { + "start": "#FFDC00", + "end": "#995495" + } + } + } }, - "label": {"show": true} - }, - "frame": {"title": "Revenue by Department", "showTitle": true} + "size": { + "fieldName": "count(*)", + "scale": { + "type": "quantitative" + } + } + } } }, - "position": {"x": 0, "y": 12, "width": 4, "height": 5} + "position": { + "x": 0, + "y": 12, + "width": 8, + "height": 5 + } }, { "widget": { - "name": "chart_by_region", - "queries": [{ - "name": "main_query", - "query": { - "datasetName": "ds_daily_sales", - "fields": [ - {"name": "sale_date", "expression": "`sale_date`"}, - {"name": "region", "expression": "`region`"}, - {"name": "sum(total_revenue)", "expression": "SUM(`total_revenue`)"} - ], - "disaggregated": false + "name": "resolution-distribution", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_support", + "fields": [ + { + "name": "channel", + "expression": "`channel`" + }, + { + "name": "bin(time_to_resolution_hours, binWidth=2)", + "expression": "BIN_FLOOR(`time_to_resolution_hours`, 2)" + }, + { + "name": "count(*)", + "expression": "COUNT(`*`)" + } + ], + "disaggregated": false + } } - }], + ], "spec": { "version": 3, - "widgetType": "bar", + "frame": { + "showTitle": true, + "title": "Resolution time (hours)" + }, + "widgetType": "histogram", "encodings": { "x": { - "fieldName": "sale_date", - "scale": {"type": "temporal"}, - "axis": {"title": "Date"}, - "displayName": "Date" + "fieldName": "bin(time_to_resolution_hours, 
binWidth=2)", + "scale": { + "type": "quantitative", + "domain": { + "max": 175 + } + } }, "y": { - "fieldName": "sum(total_revenue)", - "scale": {"type": "quantitative"}, - "format": { - "type": "number-currency", - "currencyCode": "USD", - "abbreviation": "compact" - }, - "axis": {"title": "Revenue ($)"}, - "displayName": "Revenue ($)" + "fieldName": "count(*)", + "scale": { + "type": "quantitative" + } }, "color": { - "fieldName": "region", - "scale": {"type": "categorical"}, - "displayName": "Region" + "fieldName": "channel", + "scale": { + "type": "categorical", + "mappings": [ + { + "value": "Email", + "color": "#FF7054" + } + ] + } } - }, - "frame": {"title": "Revenue by Region", "showTitle": true} - } - }, - "position": {"x": 4, "y": 12, "width": 8, "height": 5} - }, - { - "widget": { - "name": "section_products", - "multilineTextboxSpec": { - "lines": ["## Top Products"] + } } }, - "position": {"x": 0, "y": 17, "width": 12, "height": 1} + "position": { + "x": 8, + "y": 6, + "width": 4, + "height": 8 + } }, { "widget": { - "name": "table_products", - "queries": [{ - "name": "main_query", - "query": { - "datasetName": "ds_products", - "fields": [ - {"name": "product_name", "expression": "`product_name`"}, - {"name": "department", "expression": "`department`"}, - {"name": "units_sold", "expression": "`units_sold`"}, - {"name": "revenue", "expression": "`revenue`"}, - {"name": "profit", "expression": "`profit`"} - ], - "disaggregated": true + "name": "case-detail", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_support", + "fields": [ + { + "name": "case_id", + "expression": "`case_id`" + }, + { + "name": "opened_at", + "expression": "`opened_at`" + }, + { + "name": "channel", + "expression": "`channel`" + }, + { + "name": "Priority Level", + "expression": "`Priority Level`" + }, + { + "name": "time_to_resolution_hours", + "expression": "`time_to_resolution_hours`" + }, + { + "name": "satisfaction_score", + "expression": 
"`satisfaction_score`" + } + ], + "disaggregated": true + } } - }], + ], "spec": { "version": 2, "widgetType": "table", "encodings": { "columns": [ - {"fieldName": "product_name", "displayName": "Product"}, - {"fieldName": "department", "displayName": "Department"}, - {"fieldName": "units_sold", "displayName": "Units Sold"}, - {"fieldName": "revenue", "displayName": "Revenue ($)"}, - {"fieldName": "profit", "displayName": "Profit ($)"} + { + "fieldName": "case_id", + "displayName": "Case" + }, + { + "fieldName": "opened_at", + "displayName": "Opened" + }, + { + "fieldName": "channel", + "displayName": "Channel" + }, + { + "fieldName": "Priority Level", + "displayName": "Priority" + }, + { + "fieldName": "time_to_resolution_hours", + "displayName": "Hours to resolve", + "format": { + "type": "number", + "decimalPlaces": { + "type": "exact", + "places": 1 + } + }, + "style": { + "type": "basic", + "rules": [ + { + "condition": { + "operand": { + "type": "data-value", + "value": "24" + }, + "operator": ">" + }, + "backgroundColor": { + "hex": "#FF7E5C" + } + } + ] + } + }, + { + "fieldName": "satisfaction_score", + "displayName": "CSAT" + } ] }, "frame": { - "title": "Product Performance", "showTitle": true, - "description": "Top products by revenue" + "title": "Case Detail" } } }, - "position": {"x": 0, "y": 18, "width": 12, "height": 6} + "position": { + "x": 0, + "y": 17, + "width": 8, + "height": 7 + } + }, + { + "widget": { + "name": "b4dd0785", + "queries": [ + { + "name": "main_query", + "query": { + "datasetName": "ds_support", + "fields": [ + { + "name": "measure(Total Cases)", + "expression": "MEASURE(`Total Cases`)" + }, + { + "name": "channel", + "expression": "`channel`" + } + ], + "disaggregated": false + } + } + ], + "spec": { + "version": 3, + "frame": { + "showTitle": true, + "title": "Cases by channel", + "showDescription": true, + "description": "Distribution of support cases across intake channels." 
+ }, + "widgetType": "pie", + "encodings": { + "angle": { + "fieldName": "measure(Total Cases)", + "scale": { + "type": "quantitative" + }, + "displayName": "Cases" + }, + "color": { + "fieldName": "channel", + "displayName": "Channel", + "scale": { + "type": "categorical", + "mappings": [ + { + "value": "Email", + "color": "#FF7054" + }, + { + "value": "Chat", + "color": "#FFA600" + }, + { + "value": "Phone", + "color": "#DE5582" + }, + { + "value": "Web Form", + "color": "#995495" + } + ] + } + }, + "label": { + "show": true + } + } + } + }, + "position": { + "x": 8, + "y": 14, + "width": 4, + "height": 4 + } } - ] + ], + "pageType": "PAGE_TYPE_CANVAS", + "layoutVersion": "GRID_V1" }, { - "name": "global_filters", + "name": "filters", "displayName": "Filters", - "pageType": "PAGE_TYPE_GLOBAL_FILTERS", - "layoutVersion": "GRID_V1", "layout": [ { "widget": { - "name": "filter_date_range", + "name": "filter-date", "queries": [ { - "name": "ds_sales_date", + "name": "ds_date", "query": { - "datasetName": "ds_daily_sales", - "fields": [{"name": "sale_date", "expression": "`sale_date`"}], + "datasetName": "ds_support", + "fields": [ + { + "name": "opened_at", + "expression": "`opened_at`" + } + ], "disaggregated": false } } @@ -406,40 +851,39 @@ This example shows a complete dashboard with: "widgetType": "filter-date-range-picker", "encodings": { "fields": [ - {"fieldName": "sale_date", "displayName": "Date", "queryName": "ds_sales_date"} - ] - }, - "selection": { - "defaultSelection": { - "range": { - "dataType": "DATE", - "min": {"value": "now/y"}, - "max": {"value": "now/y"} + { + "fieldName": "opened_at", + "queryName": "ds_date" } - } + ] }, - "frame": {"showTitle": true, "title": "Date Range"} + "frame": { + "showTitle": true, + "title": "Date" + } } }, - "position": {"x": 0, "y": 0, "width": 4, "height": 2} + "position": { + "x": 0, + "y": 0, + "width": 4, + "height": 2 + } }, { "widget": { - "name": "filter_region", + "name": "filter-region", "queries": [ { - 
"name": "ds_sales_region", - "query": { - "datasetName": "ds_daily_sales", - "fields": [{"name": "region", "expression": "`region`"}], - "disaggregated": false - } - }, - { - "name": "ds_products_region", + "name": "ds_region", "query": { - "datasetName": "ds_products", - "fields": [{"name": "region", "expression": "`region`"}], + "datasetName": "ds_support", + "fields": [ + { + "name": "region_name", + "expression": "`region_name`" + } + ], "disaggregated": false } } @@ -449,52 +893,128 @@ This example shows a complete dashboard with: "widgetType": "filter-multi-select", "encodings": { "fields": [ - {"fieldName": "region", "displayName": "Region", "queryName": "ds_sales_region"}, - {"fieldName": "region", "displayName": "Region", "queryName": "ds_products_region"} + { + "fieldName": "region_name", + "queryName": "ds_region", + "displayName": "Region" + } ] }, - "frame": {"showTitle": true, "title": "Region"} + "frame": { + "showTitle": true, + "title": "Region" + } } }, - "position": {"x": 4, "y": 0, "width": 4, "height": 2} + "position": { + "x": 4, + "y": 0, + "width": 4, + "height": 2 + } }, { "widget": { - "name": "filter_department", + "name": "filter-resolution-time", "queries": [ { - "name": "ds_sales_dept", + "name": "ds_resolution", "query": { - "datasetName": "ds_daily_sales", - "fields": [{"name": "department", "expression": "`department`"}], - "disaggregated": false - } - }, - { - "name": "ds_products_dept", - "query": { - "datasetName": "ds_products", - "fields": [{"name": "department", "expression": "`department`"}], + "datasetName": "ds_support", + "fields": [ + { + "name": "min(time_to_resolution_hours)", + "expression": "MIN(`time_to_resolution_hours`)" + }, + { + "name": "max(time_to_resolution_hours)", + "expression": "MAX(`time_to_resolution_hours`)" + } + ], "disaggregated": false } } ], "spec": { "version": 2, - "widgetType": "filter-multi-select", + "widgetType": "range-slider", "encodings": { "fields": [ - {"fieldName": "department", 
"displayName": "Department", "queryName": "ds_sales_dept"}, - {"fieldName": "department", "displayName": "Department", "queryName": "ds_products_dept"} + { + "fieldName": "time_to_resolution_hours", + "queryName": "ds_resolution" + } ] }, - "frame": {"showTitle": true, "title": "Department"} + "frame": { + "showTitle": true, + "title": "Resolution time (hrs)" + } } }, - "position": {"x": 8, "y": 0, "width": 4, "height": 2} + "position": { + "x": 8, + "y": 0, + "width": 4, + "height": 2 + } } - ] + ], + "pageType": "PAGE_TYPE_GLOBAL_FILTERS", + "layoutVersion": "GRID_V1" + } + ], + "uiSettings": { + "theme": { + "canvasBackgroundColor": { + "light": "#FCFCFC", + "dark": "#1F272D" + }, + "widgetBackgroundColor": { + "light": "#FFFFFF", + "dark": "#11171C" + }, + "widgetBorderColor": { + "light": "#FFFFFF", + "dark": "#11171C" + }, + "fontColor": { + "light": "#11171C", + "dark": "#E8ECF0" + }, + "selectionColor": { + "light": "#2272B4", + "dark": "#8ACAFF" + }, + "visualizationColors": [ + "#FFA600", + "#FF7054", + "#DE5582", + "#995495", + "#4E5185", + "#1D425C", + "#99DDB4" + ], + "widgetHeaderAlignment": "LEFT" } - ] + } } ``` + +This is the "warm sunset" family used in the live Customer Support dashboard — amber → coral → pink → purple → navy, plus a mint-green at position 6 (0-indexed) for "good/safe" semantic use. The categorical palette covers chart series; **the alert/critical color (`#FF7E5C`) is pinned as a literal `hex` in the conditional-cell rules and the annotation** (NOT a palette position), so semantic meaning holds even if the palette is reshuffled later. 
+ +## What each widget demonstrates + +| Widget | Feature shown | +|---|---| +| 4 KPI counters | `MEASURE()` referencing dataset-level `columns[]` | +| `kpi-volume-trend` counter | `period` encoding (sparkline behind the value) | +| `case-forecast` | `forecast-line` with `AI_FORECAST` SQL + `vertical-line` annotation | +| `priority-by-channel` | `heatmap` with a custom sequential `colorRamp` and cell labels | +| `customer-map` | `symbol-map` with continuous `colorRamp` | +| `resolution-distribution` | `histogram` with `bin(col, binWidth=N)` | +| `case-detail` table | per-column `format` + conditional `style.rules` for high-hour cells | +| `filter-resolution-time` | `range-slider` filter on a numeric column | +| Global filters page | Filters bound to one source dataset cascade to every widget that uses it | + +Adapt the table names, columns, story, and palette to your domain; the structure stays the same. diff --git a/experimental/databricks-aibi-dashboards/references/5-troubleshooting.md b/experimental/databricks-aibi-dashboards/references/5-troubleshooting.md index f6477c0..132e982 100644 --- a/experimental/databricks-aibi-dashboards/references/5-troubleshooting.md +++ b/experimental/databricks-aibi-dashboards/references/5-troubleshooting.md @@ -98,3 +98,41 @@ These errors occur when the JSON structure is wrong: - Use TOP-N + "Other" bucketing in dataset SQL - Aggregate to a higher level (region instead of store) - Use a table widget instead of a chart for high-cardinality data + +## `MEASURE()` errors + +- **"Cannot resolve `MEASURE(\`X\`)`"**: measure `X` is not defined on the dataset. Either add it to `dataset.columns[]` with `displayName: "X"`, or (if the source is a metric view) confirm the YAML defines a measure named `X`. Name matching is case-sensitive, and backticks are required if the name has spaces. +- **`MEASURE()` returns wrong number**: check that `query.disaggregated: false` is set on the widget. 
With `true`, the widget bypasses dataset measures and shows raw rows.
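The "Cannot resolve" case above can be caught before deploying with a small local check. The following is a minimal sketch, not part of the skill or the Databricks CLI: the function name `find_undefined_measures` and the sample dictionary are hypothetical, and it assumes measures are defined in `datasets[].columns[]` (measures coming from a metric view's YAML would not be visible to it).

```python
import re

# Matches MEASURE(`Name`) and captures the name between the backticks.
MEASURE_RE = re.compile(r"MEASURE\(`([^`]+)`\)")


def find_undefined_measures(dashboard: dict) -> list[tuple[str, str, str]]:
    """Return (widget, dataset, measure) for every MEASURE() with no matching column.

    Hypothetical helper: assumes measures live in datasets[].columns[].displayName.
    """
    # Map each dataset name to the displayNames declared in its columns[].
    defined = {
        ds["name"]: {col["displayName"] for col in ds.get("columns", [])}
        for ds in dashboard.get("datasets", [])
    }
    problems = []
    for page in dashboard.get("pages", []):
        for item in page.get("layout", []):
            widget = item.get("widget", {})
            for q in widget.get("queries", []):
                query = q.get("query", {})
                ds_name = query.get("datasetName", "")
                for field in query.get("fields", []):
                    for name in MEASURE_RE.findall(field.get("expression", "")):
                        # Case-sensitive comparison, as the rule above requires.
                        if name not in defined.get(ds_name, set()):
                            problems.append((widget.get("name", ""), ds_name, name))
    return problems


# Illustrative input: one valid reference and one with the wrong case.
sample = {
    "datasets": [
        {"name": "ds_support", "columns": [{"displayName": "Total Cases"}]}
    ],
    "pages": [
        {
            "layout": [
                {"widget": {"name": "kpi-ok", "queries": [{"query": {
                    "datasetName": "ds_support",
                    "fields": [{"expression": "MEASURE(`Total Cases`)"}]}}]}},
                {"widget": {"name": "kpi-broken", "queries": [{"query": {
                    "datasetName": "ds_support",
                    "fields": [{"expression": "MEASURE(`total cases`)"}]}}]}},
            ]
        }
    ],
}

print(find_undefined_measures(sample))
```

Text widgets have no `queries` key and are skipped naturally. To check a real dashboard, load the serialized JSON with `json.load` and pass the resulting dictionary to the function.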
+ +## Forecast-line shows blank or partial line + +- Dataset must return both historical AND forecast columns, with historical rows having `NULL` in the forecast columns and forecast rows having `NULL` in the historical column. Use `UNION ALL` to glue them — see [2-advanced-widget-specifications.md](2-advanced-widget-specifications.md#forecast-line-with-ai_forecast). +- All four y-encoding fields (`original`, `prediction`, `predictionUpper`, `predictionLower`) must reference columns that exist in `query.fields`. +- `AI_FORECAST` requires the time column to be sorted and have no gaps — pre-aggregate (e.g., `DATE_TRUNC('WEEK', ts)`) before passing to the table function. + +## Forecast-line dips right before the prediction starts + +The last historical bucket is the **current (partial) period** — e.g., aggregating weekly but today is Tuesday → "this week" bucket has only 2 days of data and looks like a cliff. Filter the partial bucket out in the dataset SQL with a cutoff using the **same `DATE_TRUNC` grain as the aggregation**: + +```sql +WHERE DATE_TRUNC('WEEK', event_ts) < DATE_TRUNC('WEEK', current_date()) +``` + +If you change the chart's aggregation grain (weekly → monthly), update **both** the `DATE_TRUNC` in `GROUP BY` and the one in the `WHERE`. Mismatched grains cause the same cliff. + +## Range-slider filter shows error or no min/max + +- The filter's `query.fields[]` must expose `MIN(col)` and `MAX(col)` — the dashboard reads these to set the slider bounds. See [3-filters.md](3-filters.md#range-slider-numeric-range-filter). +- Slider only works on numeric / temporal columns. Categorical fields fail at render — use `filter-single-select` / `filter-multi-select` instead. + +## Symbol-map shows no points + +- Verify the dataset returns both `latitude` and `longitude` columns with valid floats (not strings, not nulls for all rows). +- Lat values must be in [-90, 90], lon in [-180, 180]. Out-of-range rows are silently dropped. 
+- For region-based maps (countries, states), use `choropleth-map`, not `symbol-map`. + +## Annotations not appearing on chart + +- `annotations` is a sibling of `encodings` inside `spec`, not nested inside it. +- Each annotation needs `type: "vertical-line"`, `encodings.x.dataValue` matching the chart's x-axis type, and a matching `dataType` (`DATETIME` / `STRING` / `NUMBER`). +- Annotations are only rendered on time-series chart types (`line`, `area`, `bar`, `combo`, `forecast-line`). Pie / pivot / map ignore them. diff --git a/manifest.json b/manifest.json index 2f2876e..4ed2467 100644 --- a/manifest.json +++ b/manifest.json @@ -43,7 +43,7 @@ "references/5-troubleshooting.md" ], "repo_dir": "experimental", - "version": "0.1.0" + "version": "0.2.0" }, "databricks-apps": { "description": "Build apps on Databricks Apps platform.",