Rewards (#93)

kalantar · web-flow · commit a707c2c25723 · 2023-06-12T09:42:43.000-04:00
* experiment with reward

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>

* update index

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>

* wordsmith and spelling

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>

* wordsmith and spelling

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>

* mockoon configuration

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>

* update reference in rewards

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>

* update links

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>

* add explanation

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>

---------

Signed-off-by: Michael Kalantar <kalantar@us.ibm.com>
diff --git a/.github/wordlist.txt b/.github/wordlist.txt
@@ -50,7 +50,9 @@ LitmusChaos
 localhost
 minikube
 MLOps
+mockoon
 modelmesh
+msec
 namespace
 namespaces
 NewRelic
diff --git a/docs/tutorials/abn/rewards.md b/docs/tutorials/abn/rewards.md
@@ -0,0 +1,115 @@
+---
+template: main.html
+---
+
+# A/B/n Experiments with Rewards
+
+This tutorial describes how to use Iter8 to evaluate two or more versions of an application or ML model to identify the "best" version according to one or more reward metrics.
+
+A reward metric is a metric that measures the benefit or profit of a version of an application or ML model. Reward metrics are usually application- or model-specific. User engagement, sales, and net profit are examples.
+
+## Assumptions
+
+We assume that you have deployed multiple versions of an application (or ML model) with the following characteristics:
+
+- There is a way to route user traffic to the deployed versions. This might be done using the Iter8 SDK, the Iter8 traffic control features, or some other mechanism.
+- Metrics, including reward metrics, are being exported to a metrics store such as Prometheus.
+- Metrics can be retrieved from the metrics store by application (model) version.
+
+In this tutorial, we mock a Prometheus service and demonstrate how to write an Iter8 experiment that evaluates reward metrics.
+
+## Mock Prometheus
+
+For simplicity, we use [mockoon](https://mockoon.com/) to create a mocked Prometheus service instead of deploying Prometheus itself:
+
+```shell
+kubectl create deploy prometheus-mock \
+--image mockoon/cli:latest \
+--port 9090 \
+-- mockoon-cli start --daemon-off \
+--port 9090 \
+--data https://raw.githubusercontent.com/kalantar/docs/rewards/samples/abn/model-prometheus-abn-tutorial.json
+kubectl expose deploy prometheus-mock --port 9090
+```
+
+## Define template
+
+Create a [_provider specification_](../../user-guide/tasks/custommetrics.md#provider-spec) that describes how Iter8 should fetch each metric value from the metrics store. The specification provides the provider URL, the HTTP method to use, and any common headers. In addition, for each metric, it provides:
+- metadata, such as name, type, and description,
+- HTTP query parameters, and
+- a jq expression describing how to extract the metric value from the response.
+
+For example, a specification for the mean latency metric from Prometheus can look like the following:
+
+```yaml
+metric:
+- name: latency-mean
+  type: gauge
+  description: |
+    Mean latency
+  params:
+  - name: query
+    value: |
+      (sum(last_over_time(revision_app_request_latencies_sum{
+        {{- template "labels" . }}
+      }[{{ .elapsedTimeSeconds }}s])) or on() vector(0))/(sum(last_over_time(revision_app_request_latencies_count{
+        {{- template "labels" . }}
+      }[{{ .elapsedTimeSeconds }}s])) or on() vector(0))
+  jqExpression: .data.result[0].value[1] | tonumber
+```
+
+Note that the template is parameterized. Values are provided by the Iter8 experiment at run time.
+
+A sample provider specification for Prometheus is provided [here](https://gist.githubusercontent.com/kalantar/80c9efc0fd4cc34572d893cc82bdc4d2/raw/f3629aa62cdc9fd7e39ee2b6b113a8bf7b6b4463/model-prometheus-abn-tutorial.tpl).
+
+It describes the following metrics:
+
+- request-count
+- latency-mean
+- profit-mean
+
+## Launch experiment
+
+```shell
+iter8 k launch \
+--set "tasks={custommetrics,assess}" \
+--set custommetrics.templates.model-prometheus="https://gist.githubusercontent.com/kalantar/80c9efc0fd4cc34572d893cc82bdc4d2/raw/f3629aa62cdc9fd7e39ee2b6b113a8bf7b6b4463/model-prometheus-abn-tutorial.tpl" \
+--set custommetrics.values.labels.model_name=wisdom \
+--set 'custommetrics.versionValues[0].labels.mm_vmodel_id=wisdom-1' \
+--set 'custommetrics.versionValues[1].labels.mm_vmodel_id=wisdom-2' \
+--set assess.SLOs.upper.model-prometheus/latency-mean=50 \
+--set "assess.rewards.max={model-prometheus/profit-mean}" \
+--set runner=cronjob \
+--set cronjobSchedule="*/1 * * * *"
+```
+
+This experiment executes in a [loop](../../user-guide/topics/parameters.md), once every minute. It uses the [`custommetrics` task](../../user-guide/tasks/custommetrics.md) to read metrics from the (mocked) Prometheus provider. Finally, the [`assess` task](../../user-guide/tasks/assess.md) verifies that the `latency-mean` is below 50 msec and identifies which version provides the greatest reward; that is, the greatest mean profit.
+
+## Inspect experiment report
+
+=== "Text"
+    ```shell
+    iter8 k report
+    ```
+=== "HTML"
+    ```shell
+    iter8 k report -o html > report.html # view in a browser
+    ```
+
+Because the experiment loops, the reported results will change over time.
+
+***
+
+## Cleanup
+
+Delete the experiment:
+
+```shell
+iter8 k delete
+```
+
+Terminate the mocked Prometheus service:
+
+```shell
+kubectl delete deploy/prometheus-mock svc/prometheus-mock
+```
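
As a rough illustration of the assessment described in `docs/tutorials/abn/rewards.md` above (check `latency-mean` against the upper limit of 50, and identify the version with the greatest `profit-mean`), the following Python sketch applies the same two checks to made-up metric values. The function name `assess`, the dictionary layout, the sample numbers, and the inclusive comparison are all assumptions for illustration; this is not Iter8's implementation. The tutorial does not say whether the reward comparison is restricted to versions that satisfy the SLO, so the sketch reports the two results independently.

```python
from typing import Dict, Tuple

# Hypothetical per-version metric values; in the tutorial these would come
# from the (mocked) Prometheus provider via the custommetrics task.
Metrics = Dict[str, Dict[str, float]]


def assess(metrics: Metrics, latency_upper: float = 50.0) -> Tuple[Dict[str, bool], str]:
    """Return (SLO result per version, version with the greatest mean profit)."""
    # Upper-limit SLO: assumed here to be satisfied when the value is <= the limit.
    slo_ok = {
        version: values["latency-mean"] <= latency_upper
        for version, values in metrics.items()
    }
    # "max" reward: the version whose profit-mean is largest.
    best = max(metrics, key=lambda version: metrics[version]["profit-mean"])
    return slo_ok, best


sample = {
    "wisdom-1": {"latency-mean": 32.5, "profit-mean": 41.0},
    "wisdom-2": {"latency-mean": 47.1, "profit-mean": 63.0},
}
slo_ok, best = assess(sample)
print(slo_ok)
print(best)
```

Because the mocked service returns random values on every query, the version reported as best by a real experiment can change from one loop iteration to the next.
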
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -122,7 +122,9 @@ nav:
   - Load test gRPC with SLOs: tutorials/load-test-grpc.md
   - Load test multiple gRPC methods: tutorials/load-test-grpc-multiple.md
   - Chaos injection with SLOs: tutorials/chaos/slo-validation-chaos.md
-  - A/B experiments: tutorials/abn/abn.md
+  - A/B experiments:
+    - Iter8 SDK: tutorials/abn/abn.md
+    - Evaluating rewards: tutorials/abn/rewards.md
   - Automated experiments: tutorials/autox/autox.md
   - Custom metrics:
     - One version: tutorials/custom-metrics/one-version.md
diff --git a/samples/abn/model-prometheus-abn-tutorial.json b/samples/abn/model-prometheus-abn-tutorial.json
@@ -0,0 +1 @@
+{"uuid":"010a623b-dcbe-499c-a964-5501b725e663","lastMigration":25,"name":"Prometheus (model)","endpointPrefix":"api/v1/","latency":0,"port":9090,"hostname":"0.0.0.0","folders":[],"routes":[{"uuid":"387e3484-79f3-4844-8228-4cc2700a24d6","documentation":"","method":"get","endpoint":"query","responses":[{"uuid":"dc1c57ee-fe48-47f3-846e-8f67a9ac38e8","body":"{\n  \"response\": \"wisdom-1: request-count\",\n  \"status\":\"success\",\n  \"data\": {\n    \"resultType\": \"vector\",\n    \"result\": [\n      {\n        \"metric\":{},\n        \"value\": [\n          {{ divide (now 'T') 1000 }},\n          \"{{ int 0 100 }}\"\n          ]\n      }]\n  }\n}","latency":0,"statusCode":200,"label":"wisdom-1: request-count","headers":[],"bodyType":"INLINE","filePath":"","databucketID":"","sendFileAsBody":false,"rules":[{"target":"query","modifier":"query","value":"model_request_latencies_count","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"wisdom-1","invert":false,"operator":"regex"}],"rulesOperator":"AND","disableTemplating":false,"fallbackTo404":false,"default":false},{"uuid":"fa57be05-b2b1-4284-bf21-7d7a8fc3c779","body":"{\n  \"response\": \"wisdom-2: request-count\",\n  \"status\":\"success\",\n  \"data\": {\n    \"resultType\": \"vector\",\n    \"result\": [\n      {\n        \"metric\":{},\n        \"value\": [\n          {{ divide (now 'T') 1000 }},\n          \"{{ int 0 100 }}\"\n          ]\n      }]\n  }\n}","latency":0,"statusCode":200,"label":"wisdom-2: request-count","headers":[],"bodyType":"INLINE","filePath":"","databucketID":"","sendFileAsBody":false,"rules":[{"target":"query","modifier":"query","value":"model_request_latencies_count","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"wisdom-2","invert":false,"operator":"regex"}],"rulesOperator":"AND","disableTemplating":false,"fallbackTo404":false,"default":false},{"uuid":"2e36070b-162b-4af5-81c6-0df83ab2503c","body":"{\n  \"response\": \"v1: latency-mean\",\n  \"status\":\"success\",\n  \"data\": {\n    \"resultType\": \"vector\",\n    \"result\": [\n      {\n        \"metric\":{},\n        \"value\": [\n          {{ divide (now 'T') 1000 }},\n          \"{{ float 0 50 }}\"\n          ]\n      }]\n  }\n}","latency":0,"statusCode":200,"label":"wisdom-1: latency-mean","headers":[],"bodyType":"INLINE","filePath":"","databucketID":"","sendFileAsBody":false,"rules":[{"target":"query","modifier":"query","value":"model_request_latencies_sum","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"model_request_count","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"\\)\\s*/\\s*\\(","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"wisdom-1","invert":false,"operator":"regex"}],"rulesOperator":"AND","disableTemplating":false,"fallbackTo404":false,"default":false},{"uuid":"9e7e7ef3-7aad-46bd-a469-2bed8c90917f","body":"{\n  \"response\": \"v2: latency-mean\",\n  \"status\":\"success\",\n  \"data\": {\n    \"resultType\": \"vector\",\n    \"result\": [\n      {\n        \"metric\":{},\n        \"value\": [\n          {{ divide (now 'T') 1000 }},\n          \"{{ float 0 50 }}\"\n          ]\n      }]\n  }\n}","latency":0,"statusCode":200,"label":"wisdom-2: latency-mean","headers":[],"bodyType":"INLINE","filePath":"","databucketID":"","sendFileAsBody":false,"rules":[{"target":"query","modifier":"query","value":"model_request_latencies_sum","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"model_request_latencies_sum","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"\\)\\s*/\\s*\\(","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"wisdom-2","invert":false,"operator":"regex"}],"rulesOperator":"AND","disableTemplating":false,"fallbackTo404":false,"default":false},{"uuid":"00e55214-d6f6-414a-8b52-10b202fef479","body":"{\n  \"response\": \"v1: profit-mean\",\n  \"status\":\"success\",\n  \"data\": {\n    \"resultType\": \"vector\",\n    \"result\": [\n      {\n        \"metric\":{},\n        \"value\": [\n          {{ divide (now 'T') 1000 }},\n          \"{{ int 10 80 }}\"\n          ]\n      }]\n  }\n}","latency":0,"statusCode":200,"label":"wisdom-1: profit-mean","headers":[],"bodyType":"INLINE","filePath":"","databucketID":"","sendFileAsBody":false,"rules":[{"target":"query","modifier":"query","value":"profit_sum","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"wisdom-1","invert":false,"operator":"regex"}],"rulesOperator":"AND","disableTemplating":false,"fallbackTo404":false,"default":false},{"uuid":"e2a07264-2c5e-4877-993b-750296a31dab","body":"{\n  \"response\": \"v2: profit-mean\",\n  \"status\":\"success\",\n  \"data\": {\n    \"resultType\": \"vector\",\n    \"result\": [\n      {\n        \"metric\":{},\n        \"value\": [\n          {{ divide (now 'T') 1000 }},\n          \"{{ int 5 100 }}\"\n          ]\n      }]\n  }\n}","latency":0,"statusCode":200,"label":"wisdom-2: profit-mean","headers":[],"bodyType":"INLINE","filePath":"","databucketID":"","sendFileAsBody":false,"rules":[{"target":"query","modifier":"query","value":"profit_sum","invert":false,"operator":"regex"},{"target":"query","modifier":"query","value":"wisdom-2","invert":false,"operator":"regex"}],"rulesOperator":"AND","disableTemplating":false,"fallbackTo404":false,"default":false},{"uuid":"785190e8-3e45-4e7f-9352-fe8e06a4928b","body":"{\n  \"response\": \"unable to identify query\",\n  \"query\": \"{{ queryParam 'query' }}\"\n}","latency":0,"statusCode":400,"label":"unmatched query","headers":[],"bodyType":"INLINE","filePath":"","databucketID":"","sendFileAsBody":false,"rules":[],"rulesOperator":"OR","disableTemplating":false,"fallbackTo404":false,"default":true},{"uuid":"566f29dc-0bff-4fa9-8449-fb4b37e8f6df","body":"{}","latency":0,"statusCode":200,"label":"","headers":[],"bodyType":"INLINE","filePath":"","databucketID":"","sendFileAsBody":false,"rules":[],"rulesOperator":"OR","disableTemplating":false,"fallbackTo404":false,"default":false}],"enabled":true,"responseMode":null}],"rootChildren":[{"type":"route","uuid":"387e3484-79f3-4844-8228-4cc2700a24d6"}],"proxyMode":false,"proxyHost":"","proxyRemovePrefix":false,"tlsOptions":{"enabled":false,"type":"CERT","pfxPath":"","certPath":"","keyPath":"","caPath":"","passphrase":""},"cors":true,"headers":[{"key":"Content-Type","value":"application/json"}],"proxyReqHeaders":[{"key":"","value":""}],"proxyResHeaders":[{"key":"","value":""}],"data":[]}
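
The mocked response bodies in `samples/abn/model-prometheus-abn-tutorial.json` follow the shape of a Prometheus instant-query result, and the provider specification in the tutorial reads the metric value with the jq expression `.data.result[0].value[1] | tonumber`. The Python sketch below mirrors that extraction on a sample body. The fixed timestamp and value stand in for the Mockoon template helpers (`{{ divide (now 'T') 1000 }}` and `{{ int 10 80 }}`), and the function name `extract_value` is hypothetical.

```python
import json

# Sample body shaped like the mocked "profit-mean" response, with the Mockoon
# template helpers replaced by fixed, made-up values.
SAMPLE_BODY = """
{
  "response": "v1: profit-mean",
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {}, "value": [1686577363, "42"]}
    ]
  }
}
"""


def extract_value(body: str) -> float:
    """Mirror the jq expression: .data.result[0].value[1] | tonumber."""
    payload = json.loads(body)
    # "value" is [timestamp, "string-encoded sample"]; index 1 is the sample.
    return float(payload["data"]["result"][0]["value"][1])


print(extract_value(SAMPLE_BODY))
```

The sample value is a string in the response, which is why the jq expression ends with `tonumber` and the sketch converts with `float`.
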
