Complete reference for using the AICR API Server.
The AICR API Server provides HTTP REST access to recipe generation and bundle creation for GPU-accelerated infrastructure. Use the API for programmatic access to configuration recommendations and deployment artifacts.
Version numbers in the sample requests and responses below (server version, chart versions, driver versions) are illustrative. The authoritative, current versions are in the Component Catalog and the Container Images BOM.
┌──────────────┐      ┌──────────────┐
│ GET /recipe  │─────▶│    Recipe    │
└──────────────┘      └──────────────┘
                             │
                             ▼
┌──────────────┐      ┌──────────────┐
│ POST /bundle │─────▶│ bundles.zip  │
└──────────────┘      └──────────────┘
- Use the API for remote recipe generation and bundle creation
- Use the CLI for local operations, snapshot capture, and ConfigMap integration
| Feature | API | CLI |
|---|---|---|
| Recipe generation | ✅ GET /v1/recipe | ✅ aicr recipe |
| Value query | ✅ GET /v1/query | ✅ aicr query |
| Bundle creation | ✅ POST /v1/bundle | ✅ aicr bundle |
| Snapshot capture | ❌ Use CLI | ✅ aicr snapshot |
| ConfigMap I/O | ❌ Use CLI | ✅ cm:// URIs |
| Agent deployment | ❌ Use CLI | ✅ aicr snapshot |
Local development (example):
http://localhost:8080
Start the local server:
docker pull ghcr.io/nvidia/aicrd:latest
docker run -p 8080:8080 ghcr.io/nvidia/aicrd:latest
Generate an optimized configuration recipe for your environment:
# GET: Basic recipe for H100 on EKS (query parameters)
curl "http://localhost:8080/v1/recipe?accelerator=h100&service=eks"
# GET: Training workload on Ubuntu
curl "http://localhost:8080/v1/recipe?accelerator=h100&service=eks&intent=training&os=ubuntu"
# POST: Recipe from criteria file (YAML body)
curl -X POST "http://localhost:8080/v1/recipe" \
  -H "Content-Type: application/x-yaml" \
  -d 'kind: RecipeCriteria
apiVersion: aicr.run/v1alpha2
metadata:
  name: my-config
spec:
  service: eks
  accelerator: h100
  intent: training'
# Save recipe to file
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" -o recipe.json
Create deployment bundles from a recipe:
# Pipe recipe directly to bundle endpoint.
# The POST body must be a fully-hydrated RecipeResult; piping GET /v1/recipe
# output (as below) supplies one. Do not hand-author a partial body.
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
curl -X POST "http://localhost:8080/v1/bundle" \
-H "Content-Type: application/json" -d @- -o bundles.zip
# Extract the bundles
unzip bundles.zip -d ./bundles
Service information and available routes.
curl "http://localhost:8080/"
Response:
{
"service": "aicrd",
"version": "v0.14.0",
"routes": ["/v1/recipe", "/v1/query", "/v1/bundle"]
}
Generate an optimized configuration recipe based on environment parameters.
Query Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| service | string | any | K8s service: eks, gke, aks, oke, ocp, kind, lke, bcm, any |
| accelerator | string | any | GPU type: h100, h200, gb200, b200, a100, l40, l40s, rtx-pro-6000, any |
| gpu | string | any | Alias for accelerator |
| intent | string | any | Workload: training, inference, any |
| os | string | any | Node OS: ubuntu, rhel, cos, amazonlinux, ol, talos, any |
| platform | string | any | Platform/framework: dynamo, kubeflow, nim, runai, slurm, any |
| nodes | integer | 0 | GPU node count (0 = any) |
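The query parameters above map directly onto a URL query string. The following is a minimal Python sketch of a client-side helper that builds a GET /v1/recipe URL and omits parameters left at their documented defaults. The function name `recipe_url` and the choice to drop defaults are assumptions made for illustration; they are not part of the AICR API.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080"  # local development address used in this guide


def recipe_url(base_url=BASE_URL, **criteria):
    """Build a GET /v1/recipe URL, omitting parameters left at their defaults."""
    params = {}
    for key, value in criteria.items():
        # "any" (strings) and 0 (nodes) are the documented defaults, so skip them.
        if value in (None, "", "any", 0):
            continue
        params[key] = value
    # Sort for a stable, reproducible URL (hypothetical convenience, not required).
    query = urlencode(sorted(params.items()))
    return f"{base_url}/v1/recipe" + (f"?{query}" if query else "")
```

For example, `recipe_url(service="eks", accelerator="h100", os="any")` yields a URL carrying only `accelerator` and `service`, since `os=any` is the default.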
Examples:
# Minimal request
curl "http://localhost:8080/v1/recipe"
# Specify accelerator
curl "http://localhost:8080/v1/recipe?accelerator=h100"
# Full specification
curl "http://localhost:8080/v1/recipe?service=eks&accelerator=h100&intent=training&os=ubuntu&nodes=8"
# Using gpu alias
curl "http://localhost:8080/v1/recipe?gpu=gb200&service=gke"
# Pretty print with jq
curl -s "http://localhost:8080/v1/recipe?accelerator=h100" | jq '.'
Generate an optimized configuration recipe from a criteria file body. This endpoint provides an alternative to query parameters, accepting a Kubernetes-style RecipeCriteria resource in the request body.
Content Types:
- application/json - JSON format
- application/x-yaml - YAML format
Request Body:
The request body must be a RecipeCriteria resource:
kind: RecipeCriteria
apiVersion: aicr.run/v1alpha2
metadata:
  name: my-criteria
spec:
  service: eks
  accelerator: gb200
  os: ubuntu
  intent: training
  platform: kubeflow
  nodes: 8
Examples:
# POST with YAML body
curl -X POST "http://localhost:8080/v1/recipe" \
  -H "Content-Type: application/x-yaml" \
  -d 'kind: RecipeCriteria
apiVersion: aicr.run/v1alpha2
metadata:
  name: training-config
spec:
  service: eks
  accelerator: h100
  intent: training'
# POST with JSON body
curl -X POST "http://localhost:8080/v1/recipe" \
-H "Content-Type: application/json" \
-d '{
"kind": "RecipeCriteria",
"apiVersion": "aicr.run/v1alpha2",
"metadata": {"name": "training-config"},
"spec": {
"service": "eks",
"accelerator": "h100",
"intent": "training"
}
}'
# POST with criteria file (--data-binary keeps the newlines YAML needs; -d @file strips them)
curl -X POST "http://localhost:8080/v1/recipe" \
  -H "Content-Type: application/x-yaml" \
  --data-binary @criteria.yaml
# Pretty print response
curl -s -X POST "http://localhost:8080/v1/recipe" \
-H "Content-Type: application/json" \
-d '{"kind":"RecipeCriteria","apiVersion":"aicr.run/v1alpha2","spec":{"service":"eks","accelerator":"h100"}}' \
| jq '.'
Error Responses:
- 400 Bad Request - Invalid criteria format, missing required fields, or invalid enum values
- 405 Method Not Allowed - Only GET and POST are supported
Response:
{
"apiVersion": "aicr.run/v1alpha2",
"kind": "RecipeResult",
"metadata": {
"version": "v0.14.0",
"appliedOverlays": [
"base",
"eks",
"eks-training",
"gb200-eks-training"
],
"excludedOverlays": [
{
"name": "h100-eks-ubuntu-training",
"reason": "mixin-constraint-failed"
}
],
"constraintWarnings": [
{
"overlay": "h100-eks-ubuntu-training",
"constraint": "OS.sysctl./proc/sys/kernel/osrelease",
"expected": ">= 6.8",
"actual": "5.15.0",
"reason": "mixin-constraint-failed: expected >= 6.8, got 5.15.0"
}
]
},
"criteria": {
"service": "eks",
"accelerator": "gb200",
"intent": "training",
"os": "any",
"platform": "any"
},
"constraints": [
{
"name": "GPU.driver.version",
"value": "580.82.07"
},
{
"name": "GPU.driver.cudaVersion",
"value": "13.1"
}
],
"componentRefs": [
{
"name": "gpu-operator",
"type": "Helm",
"chart": "gpu-operator",
"source": "https://helm.ngc.nvidia.com/nvidia",
"version": "v25.3.3"
},
{
"name": "network-operator",
"type": "Helm",
"chart": "network-operator",
"source": "https://helm.ngc.nvidia.com/nvidia",
"version": "v25.4.0"
}
],
"deploymentOrder": [
"gpu-operator",
"network-operator"
]
}
metadata.excludedOverlays is optional. When present, each entry includes the overlay name and a machine-readable reason such as constraint-failed or mixin-constraint-failed.
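A client usually wants a compact view of which overlays were applied or excluded and which components were selected. The sketch below walks a decoded RecipeResult using only the field names shown in the sample response; the function name `summarize_recipe` and the output format are hypothetical.

```python
def summarize_recipe(recipe):
    """Return a short, human-readable summary of a decoded RecipeResult."""
    meta = recipe.get("metadata", {})
    lines = ["applied: " + ", ".join(meta.get("appliedOverlays", []))]
    # excludedOverlays is optional, so default to an empty list.
    for item in meta.get("excludedOverlays", []):
        lines.append(f"excluded: {item['name']} ({item['reason']})")
    for warn in meta.get("constraintWarnings", []):
        lines.append(
            f"warning: {warn['constraint']} expected {warn['expected']}, got {warn['actual']}"
        )
    for ref in recipe.get("componentRefs", []):
        lines.append(f"component: {ref['name']} {ref['version']}")
    return lines
```

Pass it the result of `resp.json()` from a recipe request and print the returned lines one per row.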
Query a specific value from a fully hydrated recipe. Resolves a recipe from criteria (same parameters as GET /v1/recipe), merges all base, overlay, and inline overrides, then returns the value at the given selector path.
Query Parameters:
All GET /v1/recipe parameters are supported, plus:
| Parameter | Type | Required | Description |
|---|---|---|---|
| selector | string | Yes | Dot-delimited path to the value to extract (e.g. components.gpu-operator.values.driver.version). Empty string returns the entire hydrated recipe. |
Response:
- Scalar values (string, number, bool) are returned as plain JSON values
- Complex values (maps, lists) are returned as JSON objects/arrays
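To make the selector semantics concrete, here is a small Python sketch of dot-delimited path resolution over a nested dictionary. It illustrates the behavior described above (empty selector returns everything, scalars and subtrees are both valid results) and is not the server's implementation; the `select` name and the use of `KeyError` for a missing path are assumptions.

```python
def select(document, selector):
    """Walk a dot-delimited selector through nested dicts; "" returns everything."""
    if selector == "":
        return document
    current = document
    for key in selector.split("."):
        # A missing key corresponds to the NOT_FOUND case in the API.
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"selector path not found: {selector}")
        current = current[key]
    return current
```

Serializing the result with `json.dumps` then gives a plain JSON value for scalars and a JSON object or array for maps and lists.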
Examples:
# Get a specific Helm value
curl -s "http://localhost:8080/v1/query?service=eks&accelerator=h100&intent=training&selector=components.gpu-operator.values.driver.version"
# Get deployment order
curl -s "http://localhost:8080/v1/query?service=eks&accelerator=h100&intent=training&selector=deploymentOrder" | jq '.'
# Get a component subtree
curl -s "http://localhost:8080/v1/query?service=eks&accelerator=h100&selector=components.gpu-operator.values.driver" | jq '.'
Alternative to GET /v1/query that accepts the criteria and selector in the request body. The body is a QueryRequest with a criteria object (same fields as the RecipeCriteria spec) and a selector string.
Content Types:
- application/json - JSON format
- application/x-yaml - YAML format
Request Body:
criteria:
  service: eks
  accelerator: h100
  intent: training
selector: "components.gpu-operator.values.driver.version"
Examples:
curl -X POST "http://localhost:8080/v1/query" \
-H "Content-Type: application/json" \
-d '{
"criteria": {"service": "eks", "accelerator": "h100", "intent": "training"},
"selector": "components.gpu-operator.values.driver.version"
}'
The response format matches GET /v1/query: scalar values are returned as plain JSON values; maps and lists are returned as JSON objects/arrays.
Generate deployment bundles from a recipe.
Query Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| bundlers | string | (all) | Comma-delimited list of bundler types to execute. Not currently honored: all bundlers run regardless (#1531). |
| set | string[] | | Value overrides (format: bundler:path.to.field=value). Repeat for multiple. |
| dynamic | string[] | | Declare value paths as install-time parameters (format: component:path.to.field). Repeat for multiple. Supported with deployer=helm, deployer=argocd-helm, deployer=flux, and deployer=helmfile. |
| system-node-selector | string[] | | Node selectors for system components (format: key=value). Repeat for multiple. |
| system-node-toleration | string[] | | Tolerations for system components (format: key=value:effect). Repeat for multiple. |
| accelerated-node-selector | string[] | | Node selectors for GPU nodes (format: key=value). Repeat for multiple. |
| accelerated-node-toleration | string[] | | Tolerations for GPU nodes (format: key=value:effect). Repeat for multiple. |
| nodes | int | 0 | Estimated number of GPU nodes (0 = unset). Written to Helm value paths declared in the registry under nodeScheduling.nodeCountPaths. |
| vendor-charts | bool | false | Pull upstream Helm chart bytes into the bundle at bundle time so the artifact is fully self-contained and air-gap deployable. Each vendored chart is recorded in provenance.yaml with name, version, source URL, and SHA256. Trades the upstream CVE-yank fail-loud signal for offline deployability; see the CLI reference's "Vendoring Charts for Air-Gap" section for the full tradeoff. Requires the helm binary on the API server's $PATH and registry credentials configured for any private upstream repos (HELM_REPOSITORY_USERNAME/HELM_REPOSITORY_PASSWORD for HTTP(S); docker config for OCI). If prerequisites are missing, the request fails with a structured error code (SERVICE_UNAVAILABLE / HTTP 503 for missing helm, UNAUTHORIZED / HTTP 401 for credentials). |
| deployer | string | helm | Deployment method: helm, argocd, argocd-helm, flux, or helmfile |
| repo | string | | Git repository URL for GitOps deployments (used with deployer=argocd and deployer=flux; ignored by deployer=argocd-helm) |
| app-name | string | | Parent Argo Application name (default: aicr-stack for deployer=argocd-helm, nvidia-stack for deployer=argocd). Must be a DNS-1123 subdomain. Required when deploying multiple non-overlapping AICR bundles to the same Argo CD namespace so the parent Applications do not collide. For deployer=argocd-helm, the value is the chart default and can still be overridden at install time via helm install --set appName=.... Rejected with HTTP 400 on other deployers. |
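Several of the parameters above are repeatable (`string[]`), which means the same key appears more than once in the query string. The following Python sketch shows one way to build such a URL with the standard library. The `bundle_url` helper and its underscore-to-hyphen keyword convention are illustrative assumptions, not part of the API.

```python
from urllib.parse import urlencode


def bundle_url(base_url, deployer="helm", **repeated):
    """Build a POST /v1/bundle URL; list values become repeated parameters."""
    params = [("deployer", deployer)]
    for name, values in repeated.items():
        # Python identifiers cannot contain "-", so accept "_" and convert.
        key = name.replace("_", "-")
        if isinstance(values, str):
            values = [values]
        params.extend((key, value) for value in values)
    # urlencode percent-encodes ":" and "=" inside values, so they survive intact.
    return f"{base_url}/v1/bundle?{urlencode(params)}"
```

For instance, `bundle_url("http://localhost:8080", system_node_selector=["nodeGroup=system", "tier=infra"])` emits `system-node-selector` twice, once per value.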
Request Body:
The request body is the recipe (RecipeResult) directly. No wrapper object needed.
These are the recipe components in recipes/registry.yaml. (The bundlers query parameter that would select a subset is currently ignored — all run regardless, #1531.) The registry is the authoritative source — see the component catalog for the full, current list with detailed descriptions. The table below is illustrative of commonly used components:
| Component | Description |
|---|---|
| agentgateway | Kubernetes Gateway API implementation for AI/ML inference (InferencePool routing) |
| agentgateway-crds | Kubernetes Gateway API CRDs for AI/ML inference (Gateway API + Inference Extension) |
| aws-ebs-csi-driver | Amazon EBS CSI driver (EKS) |
| aws-efa | AWS Elastic Fabric Adapter device plugin (EKS) |
| cert-manager | TLS certificate management |
| dynamo-platform | NVIDIA Dynamo inference serving platform |
| gatekeeper | OPA Gatekeeper policy controller |
| gke-nccl-tcpxo | NCCL TCPxO network plugin for optimized collective communication (GKE) |
| gpu-operator | NVIDIA GPU Operator: driver and runtime lifecycle |
| gpu-operator-ocp | GPU Operator variant for OpenShift (OCP) |
| gpu-operator-ocp-olm | GPU Operator for OpenShift via Operator Lifecycle Manager (OLM) |
| grove | Dynamo pod lifecycle management |
| k8s-ephemeral-storage-metrics | Ephemeral storage usage metrics |
| k8s-nim-operator | NVIDIA NIM Operator for inference microservice deployments |
| kai-scheduler | DRA-aware gang scheduler with topology-aware placement |
| kube-prometheus-stack | Prometheus, Grafana, Alertmanager monitoring stack |
| kubeflow-trainer | Kubeflow Training Operator for distributed training |
| kueue | Kubernetes-native job queuing for batch and AI workloads |
| network-operator | NVIDIA Network Operator: RDMA, SR-IOV, host networking |
| network-operator-ocp | Network Operator variant for OpenShift (OCP) |
| network-operator-ocp-olm | Network Operator for OpenShift via Operator Lifecycle Manager (OLM) |
| nfd | Node Feature Discovery: labels nodes with hardware features; publishes per-node NodeResourceTopology CRDs on production GPU recipes |
| nfd-ocp | Node Feature Discovery variant for OpenShift (OCP) |
| nfd-ocp-olm | Node Feature Discovery for OpenShift via Operator Lifecycle Manager (OLM) |
| nodewright-customizations | Environment-specific node tuning profiles |
| nodewright-operator | OS-level node tuning and kernel configuration |
| nvidia-dra-driver-gpu | Dynamic Resource Allocation driver for GPUs |
| nvsentinel | GPU health monitoring and automated remediation |
| prometheus-adapter | Custom metrics for HPA scaling |
| prometheus-operator-crds | CRDs for the prometheus-operator (Alertmanager, Prometheus, ServiceMonitor, etc.) |
| slinky-slurm | Slinky-managed Slurm cluster instance (Controller, LoginSet, NodeSet, RestApi); reconciled by slinky-slurm-operator |
| slinky-slurm-operator | SchedMD Slinky Slurm operator and admission webhook |
| slinky-slurm-operator-crds | CRDs for the SchedMD Slinky Slurm operator (slinky.slurm.net) |
Examples:
Note: The POST body must be a fully-hydrated RecipeResult. The server adopts the body as-is and does not hydrate registry defaults, so a hand-authored partial body (missing namespace, valuesFile, overrides, dependencyRefs) yields empty values and namespaces in the generated bundle. Obtain a complete body from aicr recipe ... --format json --output - (the CLI defaults to YAML, but POST /v1/bundle JSON-decodes its body) or GET /v1/recipe and pass it unchanged. The inline bodies below are elided for brevity (only a few component fields shown); use a generated RecipeResult, not these literals.
The bundlers query parameter is not currently honored; all bundlers run regardless (#1531). There is no supported way to bundle a subset via the API today: hand-trimming componentRefs is unsafe (it silently drops required dependencies and breaks deployers like Helmfile on dangling dependencyRefs). Bundle the full hydrated result.
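Because the server does not hydrate a partial body, it can help to sanity-check a RecipeResult on the client before posting it. The sketch below is a heuristic only: it flags components missing namespace or valuesFile and names that do not resolve. It assumes dependencyRefs is a list of component names, which this guide does not confirm, and the function name `partial_body_problems` is hypothetical.

```python
def partial_body_problems(recipe):
    """List signs that a RecipeResult was hand-trimmed rather than generated."""
    problems = []
    refs = recipe.get("componentRefs", [])
    names = {ref.get("name") for ref in refs}
    for ref in refs:
        for field in ("namespace", "valuesFile"):
            if not ref.get(field):
                problems.append(f"{ref.get('name')}: missing {field}")
        # Assumption: dependencyRefs is a list of component names.
        for dep in ref.get("dependencyRefs", []):
            if dep not in names:
                problems.append(f"{ref.get('name')}: dangling dependencyRef {dep}")
    for name in recipe.get("deploymentOrder", []):
        if name not in names:
            problems.append(f"deploymentOrder: unknown component {name}")
    return problems
```

An empty list does not prove the body is complete; the reliable path remains passing a generated RecipeResult through unchanged.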
# Basic: pipe recipe to bundle
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
curl -X POST "http://localhost:8080/v1/bundle" \
-H "Content-Type: application/json" -d @- -o bundles.zip
# Advanced: with value overrides and Argo CD deployer
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
curl -X POST "http://localhost:8080/v1/bundle?deployer=argocd&repo=https://github.com/my-org/my-gitops-repo.git&set=gpuoperator:gds.enabled=true" \
-H "Content-Type: application/json" -d @- -o bundles.zip
# With node scheduling for system and GPU nodes
# (recipe.json must be a fully-hydrated RecipeResult, e.g. from GET /v1/recipe)
curl -X POST "http://localhost:8080/v1/bundle?system-node-selector=nodeGroup=system&system-node-toleration=dedicated=system:NoSchedule&accelerated-node-selector=nvidia.com/gpu.present=true&accelerated-node-toleration=nvidia.com/gpu=present:NoSchedule" \
-H "Content-Type: application/json" \
-d @recipe.json \
-o bundles.zip
# Generate bundles from a saved (fully-hydrated) recipe
curl -X POST "http://localhost:8080/v1/bundle" \
-H "Content-Type: application/json" \
-d @recipe.json \
-o bundles.zip
# Elided literal body (NOT complete — use a generated RecipeResult instead)
curl -X POST "http://localhost:8080/v1/bundle" \
-H "Content-Type: application/json" \
-d '{
"apiVersion": "aicr.run/v1alpha2",
"kind": "RecipeResult",
"componentRefs": [
{"name": "gpu-operator", "type": "Helm", "chart": "gpu-operator", "source": "https://helm.ngc.nvidia.com/nvidia", "version": "v26.3.2", "namespace": "gpu-operator", "valuesFile": "components/gpu-operator/values.yaml"},
{"name": "network-operator", "type": "Helm", "chart": "network-operator", "source": "https://helm.ngc.nvidia.com/nvidia", "version": "26.1.1", "namespace": "nvidia-network-operator", "valuesFile": "components/network-operator/values.yaml"}
],
"deploymentOrder": ["gpu-operator", "network-operator"]
}' \
-o bundles.zip
Response Headers:
| Header | Description | Example |
|---|---|---|
| Content-Type | Always application/zip | application/zip |
| Content-Disposition | Download filename | attachment; filename="bundles.zip" |
| X-Bundle-Files | Total files in archive | 10 |
| X-Bundle-Size | Uncompressed size (bytes) | 45678 |
| X-Bundle-Duration | Generation time | 1.234s |
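The X-Bundle-* headers arrive as strings, so a client that logs or alerts on them needs to convert them. This is a minimal sketch, assuming the duration is a decimal number with an "s" or "ms" suffix as in the 1.234s example; other units are rejected rather than guessed. The `parse_bundle_headers` name and the returned keys are hypothetical.

```python
def parse_bundle_headers(headers):
    """Convert X-Bundle-* response headers into typed values."""
    duration = headers["X-Bundle-Duration"]
    # Assumption: a decimal number followed by "ms" or "s" (e.g. "1.234s").
    if duration.endswith("ms"):
        seconds = float(duration[:-2]) / 1000.0
    elif duration.endswith("s"):
        seconds = float(duration[:-1])
    else:
        raise ValueError(f"unrecognized duration: {duration}")
    return {
        "files": int(headers["X-Bundle-Files"]),
        "size_bytes": int(headers["X-Bundle-Size"]),
        "duration_seconds": seconds,
    }
```

With the `requests` library, `resp.headers` is a case-insensitive mapping and can be passed in directly; a plain dict must use the exact header spelling.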
bundles.zip
├── deploy.sh                 # root automation script (executable)
├── README.md                 # root deployment guide
├── checksums.txt             # SHA256 checksums (always set for /v1/bundle)
├── recipe.yaml               # canonical post-resolution recipe (helm deployer)
├── 001-<component>/          # per-component folder (NNN-prefixed)
│   ├── install.sh            # component install script
│   ├── values.yaml           # static Helm values
│   ├── cluster-values.yaml   # per-cluster dynamic values
│   └── upstream.env          # CHART/REPO/VERSION (upstream-helm only)
└── 002-<component>/
    ├── install.sh
    ├── values.yaml
    └── cluster-values.yaml
Checksums are root-level only; component folders carry install.sh at their
root (no scripts/ subdirectory), and no uninstall.sh/undeploy.sh is
generated.
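Where `sha256sum` is not available, the root-level checksums.txt can be verified with a few lines of Python. This sketch assumes the file uses the `sha256sum` line format (hex digest, whitespace, path relative to the bundle root), which is consistent with running `sha256sum -c checksums.txt` from the extracted directory. The `verify_checksums` name is hypothetical.

```python
import hashlib
from pathlib import Path


def verify_checksums(bundle_dir):
    """Return the relative paths whose SHA256 does not match checksums.txt."""
    root = Path(bundle_dir)
    mismatched = []
    for line in (root / "checksums.txt").read_text().splitlines():
        if not line.strip():
            continue
        # sha256sum format: "<hex digest>  <path>"; a leading "*" marks binary mode.
        expected, _, name = line.partition(" ")
        name = name.strip().lstrip("*")
        actual = hashlib.sha256((root / name).read_bytes()).hexdigest()
        if actual != expected.lower():
            mismatched.append(name)
    return mismatched
```

An empty return value means every listed file matched; a missing file raises `FileNotFoundError` instead of being reported as a mismatch.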
Service health check (liveness probe).
curl "http://localhost:8080/health"
Response:
{
"status": "healthy",
"timestamp": "2026-01-11T10:30:00Z"
}
Service readiness check (readiness probe).
curl "http://localhost:8080/ready"
Response:
{
"status": "ready",
"timestamp": "2026-01-11T10:30:00Z"
}
Prometheus metrics endpoint.
curl "http://localhost:8080/metrics"
Key Metrics:
| Metric | Type | Description |
|---|---|---|
| aicr_http_requests_total | counter | Total HTTP requests by method, path, status |
| aicr_http_request_duration_seconds | histogram | Request latency distribution |
| aicr_http_requests_in_flight | gauge | Current concurrent requests |
| aicr_rate_limit_rejects_total | counter | Rate limit rejections |
Fetch a recipe and generate bundles in one workflow:
#!/bin/bash
# Step 1: Get recipe for H100 on EKS for training
echo "Fetching recipe..."
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks&intent=training" \
-o recipe.json
# Display recipe summary
echo "Recipe components:"
jq -r '.componentRefs[] | " - \(.name): \(.version)"' recipe.json
# Step 2: Generate bundles from the saved recipe
# recipe.json is the fully-hydrated RecipeResult fetched in Step 1.
echo "Generating bundles..."
curl -s -X POST "http://localhost:8080/v1/bundle" \
-H "Content-Type: application/json" \
-d @recipe.json \
-o bundles.zip
# Alternative: one-liner without intermediate file
# curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
# curl -X POST "http://localhost:8080/v1/bundle" \
# -H "Content-Type: application/json" -d @- -o bundles.zip
# Step 3: Extract and verify
echo "Extracting bundles..."
unzip -q bundles.zip -d ./deployment
# Verify checksums (checksums.txt is at the bundle root, not per-component)
echo "Verifying checksums..."
cd deployment
sha256sum -c checksums.txt
# Step 4: Deploy (example)
echo "Bundle ready for deployment:"
ls -la
{
"code": "ERROR_CODE",
"message": "Human-readable error description",
"details": { ... },
"requestId": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2026-01-11T10:30:00Z",
"retryable": true
}
| Code | HTTP Status | Description | Retryable |
|---|---|---|---|
| INVALID_REQUEST | 400 | Invalid query parameters, request body, or disallowed criteria value | No |
| UNAUTHORIZED | 401 | Authentication or authorization failure | No |
| NOT_FOUND | 404 | Selector path not found in the resolved configuration | No |
| METHOD_NOT_ALLOWED | 405 | Wrong HTTP method | No |
| CONFLICT | 409 | Resource state conflict (e.g., already exists or version mismatch) | No |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests | Yes |
| INTERNAL | 500 | Server error | Yes |
| SERVICE_UNAVAILABLE | 503 | Server temporarily unavailable | Yes |
| TIMEOUT | 504 | Operation exceeded its time limit | Yes |
INVALID_REQUEST is not always 400: POST /v1/query and POST /v1/recipe return it with HTTP 413 Request Entity Too Large when the request body exceeds the server's body-size limit (MaxRecipePOSTBytes).
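Since INVALID_REQUEST can arrive with more than one HTTP status, a client is safer keying its retry decision on the error body than on the status code alone. The following is a minimal sketch that prefers the retryable flag and falls back to the code table above; the `should_retry` name and the fallback behavior are assumptions.

```python
# Codes marked "Yes" in the Retryable column of the error table.
RETRYABLE_CODES = {"RATE_LIMIT_EXCEEDED", "INTERNAL", "SERVICE_UNAVAILABLE", "TIMEOUT"}


def should_retry(error_body):
    """Decide whether to retry from a decoded error response body."""
    # Prefer the explicit flag; fall back to the documented code table.
    if isinstance(error_body.get("retryable"), bool):
        return error_body["retryable"]
    return error_body.get("code") in RETRYABLE_CODES
```

Call it with the decoded JSON of a non-2xx response, for example `should_retry(resp.json())`.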
# Check rate limit headers
curl -I "http://localhost:8080/v1/recipe?accelerator=h100"
# Response headers:
# X-RateLimit-Limit: 100
# X-RateLimit-Remaining: 95
# X-RateLimit-Reset: 1736589000
When rate limited (HTTP 429), use the Retry-After header:
# Retry once, honoring Retry-After from the same response that returned 429
headers=$(mktemp)
status=$(curl -s -o recipe.json -D "$headers" -w "%{http_code}" "http://localhost:8080/v1/recipe?accelerator=h100")
if [ "$status" = "429" ]; then
  # Header lines end in CRLF, so strip the carriage return before using the value
  retry_after=$(grep -i '^Retry-After:' "$headers" | awk '{print $2}' | tr -d '\r')
  echo "Rate limited. Retrying after ${retry_after}s..."
  sleep "${retry_after:-1}"
  curl -s -o recipe.json "http://localhost:8080/v1/recipe?accelerator=h100"
fi
rm -f "$headers"
- Limit: 100 requests per second (a single process-global token bucket shared across all clients, not per-IP)
- Burst: 200 requests
- Headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset
- 429 Response: Includes Retry-After header
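To see how a rate of 100 requests per second and a burst of 200 interact, the sketch below simulates a generic token bucket in Python. It illustrates the general technique only and is not the server's implementation; the class name, the choice to start with a full bucket, and the explicit `now` argument (used to keep the simulation deterministic) are assumptions.

```python
class TokenBucket:
    """Token bucket with a refill rate (tokens/second) and a burst capacity."""

    def __init__(self, rate=100.0, burst=200.0):
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # assumption: start full, so an initial burst is allowed
        self.last = 0.0      # timestamp of the previous call, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` is admitted."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above the burst capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Under these assumptions, 250 simultaneous requests against a fresh bucket admit 200 and reject 50, and one second later another 100 are admitted. Because the bucket is shared by all clients, one busy client can consume the burst for everyone.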
The API server can be configured to restrict which criteria values are allowed. This enables operators to limit the API to specific accelerators, services, intents, or OS types.
Allowlists are configured via environment variables when starting the server:
| Environment Variable | Description | Example |
|---|---|---|
| AICR_ALLOWED_ACCELERATORS | Comma-separated list of allowed GPU types | h100,l40 |
| AICR_ALLOWED_SERVICES | Comma-separated list of allowed K8s services | eks,gke |
| AICR_ALLOWED_INTENTS | Comma-separated list of allowed workload intents | training |
| AICR_ALLOWED_OS | Comma-separated list of allowed OS types | ubuntu,rhel |
Behavior:
- If an environment variable is not set, all values for that criteria are allowed
- If an environment variable is set, only the specified values are permitted
- The any value is always allowed regardless of allowlist configuration
- Allowlists apply to both /v1/recipe and /v1/bundle endpoints
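The behavior rules above can be expressed as a small predicate. This Python sketch mirrors the three documented rules for a single criteria value; whitespace trimming and case-sensitive matching are assumptions, and how the server treats a variable set to an empty string is not covered by this guide. The `is_allowed` name is hypothetical.

```python
def is_allowed(value, allowlist_env):
    """Apply the allowlist rules to one criteria value.

    allowlist_env is the raw environment variable value, or None if unset.
    """
    if allowlist_env is None:
        return True  # variable not set: every value is allowed
    if value == "any":
        return True  # "any" is always allowed
    # Assumption: entries are trimmed and compared case-sensitively.
    allowed = {item.strip() for item in allowlist_env.split(",") if item.strip()}
    return value in allowed
```

In a client-side pre-check, `is_allowed("gb200", os.environ.get("AICR_ALLOWED_ACCELERATORS"))` would predict the HTTP 400 before sending the request, provided the client sees the same environment as the server.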
# Start server allowing only H100 and L40 GPUs on EKS
docker run -p 8080:8080 \
-e AICR_ALLOWED_ACCELERATORS=h100,l40 \
-e AICR_ALLOWED_SERVICES=eks \
ghcr.io/nvidia/aicrd:latest
When a disallowed criteria value is requested:
curl "http://localhost:8080/v1/recipe?accelerator=gb200&service=eks"
Response (HTTP 400):
{
"code": "INVALID_REQUEST",
"message": "accelerator type not allowed",
"details": {
"requested": "gb200",
"allowed": ["h100", "l40"]
},
"requestId": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2026-01-27T10:30:00Z",
"retryable": false
}
The CLI (aicr) is not affected by allowlists. Allowlists only apply to the API server, allowing operators to restrict API access while maintaining full CLI functionality for administrative tasks.
import io
import zipfile
import requests
BASE_URL = "http://localhost:8080"
# Get recipe
params = {
    "accelerator": "h100",
    "service": "eks",
    "intent": "training",
    "os": "ubuntu",
}
resp = requests.get(f"{BASE_URL}/v1/recipe", params=params)
resp.raise_for_status()
recipe = resp.json()
print(f"Recipe has {len(recipe['componentRefs'])} components")
# Generate bundles: the (fully-hydrated) recipe is the request body.
resp = requests.post(
    f"{BASE_URL}/v1/bundle",
    json=recipe,
)
resp.raise_for_status()
# Extract zip
with zipfile.ZipFile(io.BytesIO(resp.content)) as zf:
    zf.extractall("./deployment")
    print(f"Extracted {len(zf.namelist())} files")
package main
import (
    "encoding/json"
    "fmt"
    "net/http"
    "net/url"
)
func main() {
    baseURL := "http://localhost:8080"
    // Get recipe
    params := url.Values{}
    params.Add("accelerator", "h100")
    params.Add("service", "eks")
    resp, err := http.Get(baseURL + "/v1/recipe?" + params.Encode())
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        panic(fmt.Sprintf("unexpected status: %s", resp.Status))
    }
    // Decode only the field this example needs
    var recipe struct {
        ComponentRefs []json.RawMessage `json:"componentRefs"`
    }
    if err := json.NewDecoder(resp.Body).Decode(&recipe); err != nil {
        panic(err)
    }
    fmt.Printf("Got recipe with %d components\n", len(recipe.ComponentRefs))
}
const BASE_URL = "http://localhost:8080";
async function main() {
  // Get recipe
  const params = new URLSearchParams({
    accelerator: "h100",
    service: "eks",
    intent: "training"
  });
  const recipeResp = await fetch(`${BASE_URL}/v1/recipe?${params}`);
  if (!recipeResp.ok) throw new Error(`recipe request failed: ${recipeResp.status}`);
  const recipe = await recipeResp.json();
  console.log(`Recipe has ${recipe.componentRefs.length} components`);
  // Generate bundles: the (fully-hydrated) recipe is the request body.
  const bundleResp = await fetch(`${BASE_URL}/v1/bundle`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(recipe),
  });
  if (!bundleResp.ok) throw new Error(`bundle request failed: ${bundleResp.status}`);
  // Save zip
  const buffer = await bundleResp.arrayBuffer();
  require("fs").writeFileSync("bundles.zip", Buffer.from(buffer));
  console.log("Bundles saved to bundles.zip");
}
main().catch((err) => {
  console.error(err);
  process.exit(1);
});
#!/bin/bash
# Generate recipes for multiple environments
environments=(
"os=ubuntu&accelerator=h100&service=eks"
"os=ubuntu&accelerator=gb200&service=gke"
"os=rhel&accelerator=a100&service=aks"
)
for env in "${environments[@]}"; do
  echo "Fetching recipe for: $env"
  curl -s "http://localhost:8080/v1/recipe?${env}" \
    | jq -r '.componentRefs[] | "\(.name): \(.version)"'
  echo ""
done
The full OpenAPI 3.1 specification is available at: api/aicr/v1/server.yaml
Generate client SDKs:
# Download spec
curl https://raw.githubusercontent.com/NVIDIA/aicr/main/api/aicr/v1/server.yaml \
-o openapi.yaml
# Generate Python client
openapi-generator-cli generate -i openapi.yaml -g python -o ./python-client
# Generate Go client
openapi-generator-cli generate -i openapi.yaml -g go -o ./go-client
# Generate TypeScript client
openapi-generator-cli generate -i openapi.yaml -g typescript-fetch -o ./ts-client
"Invalid accelerator type" error:
# Use valid values: h100, h200, gb200, b200, a100, l40, l40s, rtx-pro-6000, any
curl "http://localhost:8080/v1/recipe?accelerator=h100"
"Recipe is required" error:
# The body IS the RecipeResult itself — not wrapped in a {"recipe": ...} field.
# Pass a fully-hydrated RecipeResult (e.g. from GET /v1/recipe) directly:
curl -s "http://localhost:8080/v1/recipe?accelerator=h100&service=eks" | \
curl -X POST "http://localhost:8080/v1/bundle" \
-H "Content-Type: application/json" -d @- -o bundles.zip
Empty zip file:
# Check recipe has componentRefs
curl -s "http://localhost:8080/v1/recipe?accelerator=h100" | jq '.componentRefs'
Connection refused (local):
# Start local server first
make server
- CLI Reference - Command-line interface
- Agent Deployment - Kubernetes agent for snapshot capture
- Installation Guide - Setup instructions
- Data Flow - Understanding recipe data architecture
- Automation Guide - CI/CD integration patterns
- Kubernetes Deployment - Self-hosted API server deployment