docs/tutorials/abn/abn.md (8 additions, 12 deletions)
@@ -92,7 +92,7 @@ data:
EOF
```

-In this definition, each version of the application is composed of a `Service` and a `Deployment`. In the primary version, both are named `backend`. In any candidate version they are named `backend-candidate-1`. Iter8 uses this definition to identify when any of the versions of the application are available. It can then respond appropriate to `Lookup()` requests.
+In this definition, each version of the application is composed of a `Service` and a `Deployment`. In the primary version, both are named `backend`. In any candidate version they are named `backend-candidate-1`. Iter8 uses this definition to identify when any of the versions of the application are available. It can then respond appropriately to `Lookup()` requests.
-Until the candidate version is ready; that is, until all expected resources are deployed and available, calls to `Lookup()` will return only the index 0; the existing version.
-Once the candidate version is ready, `Lookup()` will return both indices (0 and 1) so that requests can be distributed across versions.
+Until the candidate version is ready; that is, until all expected resources are deployed and available, calls to `Lookup()` will return only the version number `0`; the existing version.
+Once the candidate version is ready, `Lookup()` will return both version numbers (`0` and `1`) so that requests can be distributed across versions.

## Compare versions using Grafana
@@ -127,24 +127,20 @@ Inspect the metrics using Grafana. If Grafana is deployed to your cluster, port-
kubectl port-forward service/grafana 3000:3000
```

-Open Grafana in a browser:
-
-```shell
-http://localhost:3000/
-```
+Open Grafana in a browser by going to [http://localhost:3000](http://localhost:3000)
[Add a JSON API data source](http://localhost:3000/connections/datasources/marcusolsson-json-datasource) `Iter8` with:

-- URL `http://iter8.default:8080/metrics` and
-query string `application=default%2Fbackend`
+- URL: `http://iter8.default:8080/metrics`
+- Query string: `application=default%2Fbackend`
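If you want to see what this data source will read, you can query the Iter8 metrics endpoint directly. The following is only a sketch, not part of the tutorial: it assumes the endpoint above is exposed by a `Service` named `iter8` in the `default` namespace, so it can be port-forwarded and queried locally.

```shell
# Assumption: http://iter8.default:8080/metrics is served by service "iter8"
# in namespace "default"; port-forward it and query it locally.
kubectl -n default port-forward service/iter8 8080:8080 &
curl -s 'http://localhost:8080/metrics?application=default%2Fbackend'
```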

-[Create a new dashboard](http://localhost:3000/dashboards) by *import*. Do so by pasting the contents of this [JSON definition](https://gist.githubusercontent.com/Alan-Cha/aa4ba259cc4631aafe9b43500502c60f/raw/034249f24e2c524ee4e326e860c06149ae7b2677/gistfile1.txt) into the box and *load* it. Associate it with the JSON API data source defined above.
+[Create a new dashboard](http://localhost:3000/dashboards) by *import*. Copy and paste the contents of this [JSON definition](https://gist.githubusercontent.com/Alan-Cha/aa4ba259cc4631aafe9b43500502c60f/raw/034249f24e2c524ee4e326e860c06149ae7b2677/gistfile1.txt) into the text box and *load* it. Associate it with the JSON API data source above.
The Iter8 dashboard allows you to compare the behavior of the two versions of the backend component against each other and select a winner. Since user requests are being sent by the load generation script, the values in the report may change over time. The Iter8 dashboard may look like the following:


-Once a winner is identified, the winner can be promoted, and the candidate version deleted.
+Once you identify a winner, it can be promoted, and the candidate version deleted.
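Promotion and cleanup are ordinary Kubernetes operations. As a sketch (assuming the resource names used earlier in this tutorial), deleting the candidate's resources could look like:

```shell
# Remove the candidate version's Service and Deployment (names from this tutorial).
kubectl delete service/backend-candidate-1 deployment/backend-candidate-1
```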
docs/tutorials/integrations/ghactions.md (2 additions, 2 deletions)
@@ -8,11 +8,11 @@ There are two ways that you can use Iter8 with GitHub Actions. You can [run Iter

# Use Iter8 in a GitHub Actions workflow

-Install the latest version of the Iter8 CLI using `iter8-tools/iter8@v0.14`. Once installed, the Iter8 CLI can be used as documented in various tutorials. For example:
+Install the latest version of the Iter8 CLI using `iter8-tools/iter8@v0.15`. Once installed, the Iter8 CLI can be used as documented in various tutorials. For example:

```yaml linenums="1"
- name: Install Iter8
-  run: GOBIN=/usr/local/bin go install github.com/iter8-tools/iter8@v0.14
+  run: GOBIN=/usr/local/bin go install github.com/iter8-tools/iter8@v0.15
```

# Launch an experiment inside Kubernetes
# This assumes that your Kubernetes cluster is accessible from the GitHub Actions pipeline
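A later step in the same workflow can then launch an Iter8 experiment against that cluster. The step below is only a sketch and is not part of this change: it assumes an HTTP performance test task, and the target URL `http://httpbin.default/get` is an illustrative placeholder for your own service.

```yaml
- name: Launch Iter8 experiment
  run: |
    # Sketch: launch a Kubernetes experiment with an HTTP load test task.
    # The URL and runner values below are illustrative; adjust them for your experiment.
    iter8 k launch \
      --set "tasks={http}" \
      --set http.url=http://httpbin.default/get \
      --set runner=job
```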
docs/tutorials/integrations/kserve-mm/blue-green.md (64 additions, 58 deletions)
@@ -4,26 +4,31 @@ template: main.html
# Blue-Green Rollout of an ML Model

-This tutorial shows how Iter8 can be used to implement a blue-green rollout of ML models hosted in a KServe modelmesh serving environment. In a blue-green rollout, a percentage of inference requests are directed to a candidate version of the model. The remaining requests go to the primary, or initial, version of the model. Iter8 enables a blue-green rollout by automatically configuring the network to distribute inference requests.
+This tutorial shows how Iter8 can be used to implement a blue-green rollout of ML models hosted in a KServe modelmesh serving environment. In a blue-green rollout, a percentage of inference requests are directed to a candidate version of the model. The remaining requests go to the primary, or initial, version of the model. Iter8 enables a blue-green rollout by automatically configuring routing resources to distribute inference requests.

-After a one time initialization step, the end user merely deploys candidate models, evaluates them, and either promotes or deletes them. Optionally, the end user can modify the percentage of inference requests being sent to the candidate model. Iter8 automatically handles all underlying network configuration.
+After a one time initialization step, the end user merely deploys candidate models, evaluates them, and either promotes or deletes them. Optionally, the end user can modify the percentage of inference requests being sent to the candidate model. Iter8 automatically handles all underlying routing configuration.



In this tutorial, we use the Istio service mesh to distribute inference requests between different versions of a model.
???+ "Before you begin"
1. Ensure that you have the [kubectl CLI](https://kubernetes.io/docs/reference/kubectl/).
-2. Have access to a cluster running [KServe ModelMesh Serving](https://github.com/kserve/modelmesh-serving). For example, you can create a modelmesh-serving [Quickstart](https://github.com/kserve/modelmesh-serving/blob/main/docs/quickstart.md) environment.
+2. Have access to a cluster running [KServe ModelMesh Serving](https://github.com/kserve/modelmesh-serving). For example, you can create a modelmesh-serving [Quickstart](https://github.com/kserve/modelmesh-serving/blob/release-0.11/docs/quickstart.md) environment. If using the Quickstart environment, change your default namespace to `modelmesh-serving` (see the command after this list):
3. Install [Istio](https://istio.io). You can install the [demo profile](https://istio.io/latest/docs/setup/getting-started/).
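The namespace switch mentioned in step 2 is a standard `kubectl` operation; a minimal sketch, assuming you want `modelmesh-serving` as the current context's default namespace:

```shell
# Make modelmesh-serving the default namespace for the current context.
kubectl config set-context --current --namespace=modelmesh-serving
```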
-## Install the Iter8 controller
+## Install Iter8

--8<-- "docs/tutorials/installiter8controller.md"

-## Deploy a primary model
+## Initialize primary

-Deploy the primary version of a model using an `InferenceService`:
+### Application
+
+Deploy the primary version of the application. In this tutorial, the application is an ML model. Initialize the resources for the primary version of the model (`v0`) by deploying an `InferenceService` as follows:

```shell
cat <<EOF | kubectl apply -f -
@@ -48,36 +53,36 @@ EOF
```

??? note "About the primary `InferenceService`"
-Naming the model with the suffix `-0` (and the candidate with the suffix `-1`) simplifies the rollout initialization. However, any name can be specified.
+The base name (`wisdom`) and version (`v0`) are identified using the labels `app.kubernetes.io/name` and `app.kubernetes.io/version`, respectively. These labels are not required.
+
+Naming the instance with the suffix `-0` (and the candidate with the suffix `-1`) simplifies the routing initialization (see below). However, any name can be specified.

-The label `iter8.tools/watch: "true"` lets Iter8 know that it should pay attention to changes to this `InferenceService`.
+The label `iter8.tools/watch: "true"` is required. It lets Iter8 know that it should pay attention to changes to this application resource.
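Putting the note above together, the metadata of the primary `InferenceService` would look roughly like the following sketch (only the metadata is shown; the rest of the spec, including the predictor and storage details, is elided):

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: wisdom-0                      # instance name: base name plus the "-0" suffix
  labels:
    app.kubernetes.io/name: wisdom    # base name of the model (optional)
    app.kubernetes.io/version: v0     # version of the model (optional)
    iter8.tools/watch: "true"         # required: tells Iter8 to watch this resource
```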

-Inspect the deployed `InferenceService`:
+You can inspect the deployed `InferenceService`. When the `READY` field becomes `True`, the model is fully deployed.

```shell
kubectl get inferenceservice wisdom-0
```
-
-When the `READY` field becomes `True`, the model is fully deployed.

-##Initialize the Blue-Green routing policy
+### Routing

Initialize model rollout with a blue-green traffic pattern as follows:

-The `initialize-rollout` template (with `trafficStrategy: blue-green`) configures the Istio service mesh to route all requests to the primary version of the model (`wisdom-0`). Further, it defines the routing policy that will be used by Iter8 when it observes changes in the models. By default, this routing policy splits inference requests 50-50 between the primary and candidate versions. For detailed configuration options, see the Helm chart.
+The `initialize` action (with strategy `blue-green`) configures the (Istio) service mesh to route all requests to the primary version of the application (`wisdom-0`). It further defines the routing policy that will be used when changes are observed in the application resources. By default, this routing policy splits requests 50-50 between the primary and candidate versions. For detailed configuration options, see the [Helm chart](https://github.com/iter8-tools/iter8/blob/v0.15.5/charts/routing-actions/values.yaml).

-## Verify network configuration
+## Verify routing

-To verify the network configuration, you can inspect the network configuration:
+To verify the routing configuration, you can inspect the `VirtualService`:

```shell
kubectl get virtualservice -o yaml wisdom
@@ -88,7 +93,7 @@ To send inference requests to the model:
=== "From within the cluster"
1. Create a "sleep" pod in the cluster from which requests can be made:
```shell
-curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.14.3/samples/modelmesh-serving/sleep.sh | sh -
+curl -s https://raw.githubusercontent.com/iter8-tools/docs/v0.15.2/samples/modelmesh-serving/sleep.sh | sh -
```
2. exec into the sleep pod:
@@ -111,21 +116,22 @@ To send inference requests to the model:
-Note that the model version responding to each inference request can be determined from the `modelName` field of the response.
+Note that the model version responding to each inference request is noted in the response header `app-version`. In the requests above, we display only this header.
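The inference request itself is not shown in this hunk; whatever form it takes, the responding version can be surfaced by printing the response headers and filtering for `app-version`. A generic sketch follows; the `$INFERENCE_URL` and `input.json` placeholders stand in for the tutorial's actual request and are not part of this change.

```shell
# Placeholder request: substitute the inference URL and payload used in the tutorial.
curl -s -o /dev/null -D - "$INFERENCE_URL" -H 'Content-Type: application/json' -d @input.json \
  | grep -i '^app-version'
```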

-## Deploy a candidate model
+## Deploy candidate

Deploy a candidate model using a second `InferenceService`:
@@ -152,45 +158,43 @@ EOF
```

??? note "About the candidate `InferenceService`"
-The model name (`wisdom`) and version (`v1`) are recorded using the labels `app.kubernets.io/name` and `app.kubernets.io.version`.
-
In this tutorial, the model source (field `spec.predictor.model.storageUri`) is the same as for the primary version of the model. In a real world example, this would be different.

-## Verify network configuration changes
+## Verify routing changes

-The deployment of the candidate model triggers an automatic reconfiguration by Iter8. Inspect the `VirtualService` to see that inference requests are now distributed between the primary model and the secondary model:
+The deployment of the candidate model triggers an automatic reconfiguration by Iter8. Inspect the `VirtualService` to see that the routing has been changed. Requests are now distributed between the primary and candidate:

```shell
kubectl get virtualservice wisdom -o yaml
```

-Send additional inference requests as described above.
+You can send additional inference requests as described above. They will be handled by both versions of the model.

## Modify weights (optional)

-You can modify the weight distribution of inference requests using the Iter8 `traffic-template` chart:
+You can modify the weight distribution of inference requests as follows: