FMEPRD-306 Add Warehouse Native Experimentation Beta Documentation #11720
Merged
+1,477 −13
Changes from all commits (26 commits)
- efa51c7 FMEPRD-306 (alai97)
- 4169bab Fix Links (alai97)
- 58580a6 More Links (alai97)
- 0e1da68 Merge branch 'main' into add-warehouse-native-docs (alai97)
- 11178c7 Wording Nit (alai97)
- 845c065 Add Beta Badge + Create Metrics Page (alai97)
- b182192 Merge branch 'main' into add-warehouse-native-docs (alai97)
- 687e4ed Replace Experiment Results + Add Integrations (alai97)
- 620f957 Add Formatting (alai97)
- be7fd4a Update Tabs (alai97)
- e6356a0 Fix Link (alai97)
- c481435 Add Bullet (alai97)
- 84f4bc4 Remove BigQuery and Trino (alai97)
- 3c2e675 Add Redshift and Snowflake Content (#11757) (alai97)
- 5760c2a Bump Release Date (alai97)
- 28755a5 Replace Experiment Results Screenshot + Remove CUPED (alai97)
- ffa6779 Clarify FME Settings (or Admin?) (alai97)
- b74ab51 Update Navigation (alai97)
- 2b481b5 Replace Analyze Results Screenshot (alai97)
- 925b887 Swap Images (alai97)
- 32af1fa SME Review (alai97)
- 4a2cc91 Add Content for View Experiment Results (alai97)
- b90bbde Add Schema + SQL Commands (alai97)
- 1c116c0 Add Create Experiment Content (alai97)
- 0db0163 More Changes (alai97)
- 8dfdc40 Remove More Cloud Experimentation Content (alai97)
8 changes: 8 additions & 0 deletions
.../warehouse-native/experiment-results/analyze-experiment-results/health-check.md
@@ -0,0 +1,8 @@
---
title: Health Check
sidebar_position: 40
---

import HealthCheck from '/docs/feature-management-experimentation/60-experimentation/experiment-results/analyzing-experiment-results/health-check.md';

<HealthCheck />
8 changes: 8 additions & 0 deletions
...ntation/warehouse-native/experiment-results/analyze-experiment-results/index.md
@@ -0,0 +1,8 @@
---
title: Analyze Experiment Results
sidebar_position: 20
---

import AnalyzeResults from '/docs/feature-management-experimentation/60-experimentation/experiment-results/analyzing-experiment-results/index.md';

<AnalyzeResults />
43 changes: 43 additions & 0 deletions
...feature-management-experimentation/warehouse-native/experiment-results/index.md
@@ -0,0 +1,43 @@
---
title: Warehouse Native Experimentation Results
sidebar_label: Warehouse Native Experiment Results
description: Analyze your experiment results in Harness FME.
sidebar_position: 5
---

<CTABanner
  buttonText="Request Access"
  title="Warehouse Native is in beta!"
  tagline="Get early access to run Harness FME experiments directly in your data warehouse."
  link="https://developer.harness.io/docs/feature-management-experimentation/fme-support"
  closable={true}
  target="_self"
/>

## Overview

Understanding how your experiment is performing, and whether it's driving meaningful impact, is key to making confident, data-informed product decisions. Warehouse Native experiment results help you interpret metrics derived directly from your <Tooltip id="fme.warehouse-native.data-warehouse">data warehouse</Tooltip>, assess experiment health, and share validated outcomes with stakeholders.

## View experiment results

Review key experiment metrics and overall significance in Harness FME.



Explore [how each metric performs](/docs/feature-management-experimentation/warehouse-native/experiment-results/view-experiment-results/) across treatments, inspect query-based data directly from your warehouse, and understand how results are calculated based on your metric definitions.

## Analyze experiment results

Drill down into experiment details to validate setup, confirm metric source alignment, and investigate user or account-level behavior.



Use [detailed metric breakdowns](/docs/feature-management-experimentation/warehouse-native/experiment-results/analyze-experiment-results/) to identify anomalies or confirm expected outcomes.

## Share results

Download experiment metrics, statistical summaries, and warehouse query outputs in CSV or JSON format for further analysis or collaboration with your team.



You can also share experiment results directly within Harness FME to maintain visibility across product, data, and engineering teams.
78 changes: 78 additions & 0 deletions
...imentation/warehouse-native/experiment-results/view-experiment-results/index.md
@@ -0,0 +1,78 @@
---
title: View Experiment Results
sidebar_position: 10
---

## Overview

You can view your experiment results from the **Experiments** page. This page provides a centralized view of all experiments and allows you to quickly access performance metrics, significance levels, and summary details for each treatment group.

Click into any experiment to view detailed results, including the following:

* Experiment metadata, such as:

  - Experiment name, owners, and tags
  - Start and end dates
  - Active targeting rule
  - Total number of exposures
  - Treatment group assignment counts and percentages

* Treatment comparison, including:

  - The baseline treatment (e.g. `off`)
  - One or more comparison treatments (e.g. `low`)

## Use AI Summarize

For faster interpretation of experiment outcomes, the Experiments page includes an **AI Summarize** button. This button analyzes key and guardrail metric results to generate a summary of your experiment, making it easier to share results and next steps with your team.



The summary is broken into three sections:

* **Winner Analysis**: Highlights whether a clear winner emerged across key metrics and guardrails.
* **Overall Impact Summary**: Summarizes how the treatment impacted user behavior or business outcomes.
* **Next Steps Suggestion**: Recommends what to do next, whether to iterate, roll out, or revisit your setup.

## Manually recalculating metrics

You can manually run calculations on demand by clicking the **Recalculate** button. Recalculations can be run for key metrics only, or for all metrics (key, guardrail, and supporting). **Most recalculations take up to five minutes, but can take longer depending on the size of your data and the length of your experiment.**

Reasons you may choose to recalculate metrics:

* If you create or modify a metric after the most recent metric impact calculation, recalculate to get the latest results.
* If you assign a metric to the Key metrics or Supporting metrics groups, recalculate to populate results for those metrics.

The **Recalculate** button is disabled when:

* **A forced recalculation is already scheduled.** A calculation is in progress. You can click the **Recalculate** button again as soon as the currently running calculation finishes.

## Concluding on interim data

Although we show the statistical results for multiple interim points, we caution against drawing conclusions from interim data. Each interim point at which the data is analyzed has its own chance of producing a false positive result, so looking at more points brings more chance of a false positive. For more information about statistical significance and false positives, see [Statistical significance](/docs/feature-management-experimentation/release-monitoring/metrics/statistical-significance/).

If you were to look at all the p-values from the interim analysis points and claim a significant result if any of those were below your significance threshold, then you would have a substantially higher false positive rate than expected based on the threshold alone. For example, you would have far more than a 5% chance of seeing a falsely significant result when using a significance threshold of 0.05, if you concluded on any significant p-value shown in the metric details and trends view. This is because there are multiple chances for you to happen upon a time when the natural noise in the data happened to look like a real impact.
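As a rough illustration, under the simplifying assumption of $k$ independent looks at significance threshold $\alpha$ (interim looks at cumulative data are actually correlated, so the true inflation is somewhat smaller, but still substantial):

$$
P(\text{at least one false positive}) = 1 - (1 - \alpha)^k
$$

For example, with $\alpha = 0.05$ and $k = 10$ interim checks, $1 - 0.95^{10} \approx 0.40$, roughly eight times the nominal 5% rate.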
For this reason, it is good practice to only draw conclusions from your experiment at the predetermined conclusion point(s), such as at the end of the review period.
### Interpreting the line chart and trends

The line chart provides a visualization of how the measured impact has changed since the beginning of the feature flag. This may be useful for gaining insights on any seasonality or for identifying any unexpected sudden changes in the performance of the treatments.

However, it is important to remember that there will naturally be noise and variation in the data, especially when the sample size is low at the beginning of a feature flag, so some differences in the measured impact over time are to be expected.

Additionally, since the data is cumulative, the impact may be expected to change as the run time of your feature flag increases. For example, the fraction of users who have done an event may be expected to increase over time simply because the users have had more time to do the action.
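To build intuition for how early noise settles as data accumulates, here is a minimal simulation sketch (illustrative Python, not Harness FME code; the conversion rate and sample sizes are arbitrary assumptions):

```python
import random

# Illustrative sketch: simulate an A/A test where both treatments share the
# same true conversion rate, then recompute the cumulative measured impact
# after each day of data.
random.seed(7)
TRUE_RATE = 0.10       # assumed conversion rate for both treatments
USERS_PER_DAY = 200    # assumed daily sample size per treatment

control_hits = treatment_hits = users = 0
for day in range(1, 15):
    users += USERS_PER_DAY
    control_hits += sum(random.random() < TRUE_RATE for _ in range(USERS_PER_DAY))
    treatment_hits += sum(random.random() < TRUE_RATE for _ in range(USERS_PER_DAY))
    # Cumulative measured impact: difference in conversion rates so far.
    impact = treatment_hits / users - control_hits / users
    print(f"day {day:2d}: cumulative measured impact = {impact:+.4f}")
```

The earliest days typically show the largest swings in the cumulative impact, which shrink toward the true value of zero as the sample grows, mirroring the A/A example below.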
### Example interpretation

The image below shows the impact over time line chart for an example A/A test, a feature flag where there is no true difference between the performance of the treatments. Despite there being no difference between the treatments, and hence a constant true impact of zero, the line chart shows a large measured difference at the beginning and an apparent upward trend over time.

This is due only to noise in the data at the early stages of the feature flag, when the sample size is low, and the measured impact moving towards the true value as more data arrives.



Note also that in the chart above there are 3 calculation buckets for which the error margin is entirely below zero, and hence the p-values at those points in time would imply a statistically significant impact. This is again due to noise and the unavoidable chance of false positive results.

If you weren't aware of the risk of peeking at the data, or of evaluating your feature flag at multiple points in time, then you may have concluded that a meaningful impact had been detected. However, by following the recommended practice of concluding only at the predetermined end time of your feature flag, you would eventually have seen a statistically inconclusive result, as expected for an A/A test.

If you have questions or need help troubleshooting, contact [[email protected]](mailto:[email protected]).
114 changes: 114 additions & 0 deletions
docs/feature-management-experimentation/warehouse-native/index.md
@@ -0,0 +1,114 @@
---
title: Warehouse Native Experimentation
id: index
slug: /feature-management-experimentation/warehouse-native
sidebar_label: Overview
sidebar_position: 1
description: Learn how to run experiments in your data warehouse using Harness Feature Management & Experimentation (FME).
---

<CTABanner
  buttonText="Request Access"
  title="Warehouse Native is in beta!"
  tagline="Get early access to run Harness FME experiments directly in your data warehouse."
  link="https://developer.harness.io/docs/feature-management-experimentation/fme-support"
  closable={true}
  target="_self"
/>

## Overview

Warehouse Native enables [experimentation](/docs/feature-management-experimentation/experimentation/setup/) workflows, from targeting and assignment to analysis, and provides a statistical engine for analyzing existing experiments with measurement tools in Harness Feature Management & Experimentation (FME).

## How Warehouse Native works

Warehouse Native runs experimentation jobs directly in your <Tooltip id="fme.warehouse-native.data-warehouse">data warehouse</Tooltip> by using your existing data to calculate metrics and enrich experiment analyses.



The data model is designed around two primary types of data: **assignment data** and **performance/behavioral data**, which power the FME statistical engine in your warehouse.

Key components include the following (a conceptual sketch follows the list):

- **Assignment data**: Tracks user or entity assignments to experiments. This includes metadata about the experiment.
- **Performance and behavioral data**: Captures metrics, events, and user behavior relevant to the experiment.
- **Experiment metadata**: Contains definitions for experiments, including the experiment ID, name, start/end dates, traffic allocation, and grouping logic.
- **Metric definitions**: Defines how metrics are computed in the warehouse, including aggregation logic and denominators. These definitions ensure analyses are standardized across experiments.
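To make the relationship between these components concrete, here is a minimal conceptual sketch (illustrative Python, not Harness FME code; the row shapes and field names are hypothetical):

```python
from collections import defaultdict

# Hypothetical row shapes for the two primary data types.
assignments = [  # assignment data: which entity saw which treatment
    {"user_id": "u1", "experiment": "checkout_test", "treatment": "on"},
    {"user_id": "u2", "experiment": "checkout_test", "treatment": "off"},
    {"user_id": "u3", "experiment": "checkout_test", "treatment": "on"},
]
events = [  # performance/behavioral data: events users generated
    {"user_id": "u1", "event": "purchase", "value": 20.0},
    {"user_id": "u3", "event": "purchase", "value": 35.0},
]

# Illustrative metric definition: average purchase value per assigned user,
# grouped by treatment. Denominator = assigned users; numerator = summed
# event values, joined on user_id.
treatment_of = {a["user_id"]: a["treatment"] for a in assignments}
totals, counts = defaultdict(float), defaultdict(int)
for a in assignments:
    counts[a["treatment"]] += 1
for e in events:
    t = treatment_of.get(e["user_id"])
    if t is not None and e["event"] == "purchase":
        totals[t] += e["value"]

for t in sorted(counts):
    print(t, round(totals[t] / counts[t], 2))  # off 0.0, on 27.5
```

In practice these computations run as queries inside your warehouse; the sketch only shows the join-and-aggregate shape a metric definition describes: a numerator built from behavioral events over a denominator built from assignments.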
### Cloud Experimentation

<Tooltip id="fme.warehouse-native.cloud-experimentation">Cloud Experiments</Tooltip> are executed and analyzed within Harness FME, which collects feature flag impressions and performance data from your application and integrations. For more information, see the [Cloud Experimentation documentation](/docs/feature-management-experimentation/experimentation).

```mermaid
flowchart LR
    %% Customer infrastructure
    subgraph CI["Customer Infrastructure"]
        direction TB
        subgraph APP["Your Application"]
            FME["FME SDK"]
            style FME fill:#9b5de5,stroke:#9b5de5,color:#fff
        end
        integrations["Integrations including Google Analytics, Segment, Sentry, mParticle, Amplitude, and Amazon S3"]
        style integrations fill:none,stroke:none,color:#fff
    end
    style CI fill:#8110B5,stroke:#8110B5,color:#fff
    %% Harness FME system
    subgraph HFM["Harness FME"]
        direction TB
        FF["FME Feature Flags"]
        PD["Performance and behavioral data"]
        style FF fill:#9b5de5,stroke:#9b5de5,color:#fff
        style PD fill:#9b5de5,stroke:#9b5de5,color:#fff
        AE["FME Attribution Engine"]
        style AE fill:#9b5de5,stroke:#9b5de5,color:#fff
        %% Connect inputs to the Attribution Engine
        FF --> AE
        PD --> AE
    end
    style HFM fill:#8110B5,stroke:#8110B5,color:#fff
    %% Arrows from customer infrastructure to input boxes
    CI -- "Feature flag impression data" --> FF
    CI -- "Performance and additional event data" --> PD
```
### Warehouse Native

<Tooltip id="fme.warehouse-native.warehouse-native">Warehouse Native Experiments</Tooltip> are executed directly in your data warehouse, leveraging assignment and behavioral data from Harness FME to calculate metrics and run statistical analyses at scale.

```mermaid
flowchart LR
    subgraph DW["Data Warehouse"]
        style DW fill:#8110B5,stroke:#8110B5,color:#fff
        direction TB
        AF["Assignment and FME feature flag data"]
        PB["Performance and behavioral data"]
        AE["FME Attribution Engine"]
        style AF fill:#9b5de5,stroke:#9b5de5,color:#fff
        style PB fill:#9b5de5,stroke:#9b5de5,color:#fff
        style AE fill:#9b5de5,stroke:#9b5de5,color:#fff
    end
    subgraph HFME[" "]
        direction TB
        HFM["Harness FME"]
        PAD1[" "]:::invisible
        PAD2[" "]:::invisible
    end
    classDef invisible fill:none,stroke:none;
    style HFM fill:#8110B5,stroke:#8110B5,color:#fff
    DW --> HFM
```
## Get started

To get started, [connect a data warehouse](/docs/feature-management-experimentation/warehouse-native/integrations/) and set up [assignment and metric sources](/docs/feature-management-experimentation/warehouse-native/setup/) to enable Warehouse Native Experimentation in Harness FME.