WIP: Add OCI skill image mounting to AgentRuntime by cooktheryan · Pull Request #332 · kagenti/kagenti-operator

cooktheryan · 2026-05-06T15:27:30Z

Summary

Adds a skills field to AgentRuntimeSpec for declaring OCI skill images to mount into agent pods as Kubernetes ImageVolumes
Gated behind a skillImageVolumes feature gate (default off), requires Kubernetes 1.31+
Uses the skillimage OCI format: FROM scratch images with skill.yaml + SKILL.md
Each skill specifies a mountPath, making the feature framework-agnostic (Claude, Cursor, custom agents, etc.)

Example

apiVersion: agent.kagenti.dev/v1alpha1
kind: AgentRuntime
metadata:
  name: resume-agent-runtime
spec:
  type: agent
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: resume-agent
  skills:
    - name: resume-reviewer
      image: ghcr.io/redhat-et/skillimage/resume-reviewer:v1.0.0
      mountPath: /agent/skills/resume-reviewer
    - name: blog-writer
      image: ghcr.io/redhat-et/skillimage/blog-writer:latest
      mountPath: /app/.claude/skills/blog-writer
      pullPolicy: Always
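
For readers who want to see how the manifest above might map onto Go types, here is a minimal sketch. The names `SkillImageRef` and `SkillPullPolicy` come from the Changes table in this PR, but the field layout, JSON tags, the `IfNotPresent` constant, and the `parseSkills` helper are assumptions made for illustration, not the operator's actual definitions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SkillPullPolicy mirrors the pullPolicy value of a skill entry.
// Only "Always" appears in the example manifest; IfNotPresent is assumed.
type SkillPullPolicy string

const (
	PullAlways       SkillPullPolicy = "Always"
	PullIfNotPresent SkillPullPolicy = "IfNotPresent"
)

// SkillImageRef is a hypothetical shape for one entry under spec.skills.
type SkillImageRef struct {
	Name       string          `json:"name"`
	Image      string          `json:"image"`
	MountPath  string          `json:"mountPath"`
	PullPolicy SkillPullPolicy `json:"pullPolicy,omitempty"`
}

// parseSkills decodes a JSON array of skill entries (illustrative helper).
func parseSkills(raw []byte) ([]SkillImageRef, error) {
	var skills []SkillImageRef
	if err := json.Unmarshal(raw, &skills); err != nil {
		return nil, fmt.Errorf("decode skills: %w", err)
	}
	return skills, nil
}

func main() {
	raw := []byte(`[
	  {"name": "resume-reviewer",
	   "image": "ghcr.io/redhat-et/skillimage/resume-reviewer:v1.0.0",
	   "mountPath": "/agent/skills/resume-reviewer"},
	  {"name": "blog-writer",
	   "image": "ghcr.io/redhat-et/skillimage/blog-writer:latest",
	   "mountPath": "/app/.claude/skills/blog-writer",
	   "pullPolicy": "Always"}
	]`)
	skills, err := parseSkills(raw)
	if err != nil {
		panic(err)
	}
	for _, s := range skills {
		fmt.Printf("%s -> %s (pullPolicy=%q)\n", s.Name, s.MountPath, s.PullPolicy)
	}
}
```

Keeping `mountPath` on each entry, rather than deriving it from the name, is what lets the same skill image serve frameworks with different directory layouts.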

Changes

| Area | Files | What |
| --- | --- | --- |
| CRD types | `api/v1alpha1/agentruntime_types.go` | `SkillImageRef`, `SkillPullPolicy`, `Skills` field |
| Feature gate | `internal/webhook/config/feature_gates.go` | `SkillImageVolumes` (default false) |
| Controller | `internal/controller/agentruntime_controller.go`, `agentruntime_skills.go` | Reconcile ImageVolumes on Deployment/StatefulSet, cleanup on deletion |
| Config hash | `internal/controller/agentruntime_config.go` | Skills in hash → rolling updates on change |
| Webhook | `internal/webhook/v1alpha1/agentruntime_webhook.go` | Validate duplicate names, reserved volume collisions |
| Wiring | `cmd/main.go` | Pass feature gate loader to reconciler |
| Docs | `docs/api-reference.md`, `docs/architecture.md` | SkillImageRef reference, conditions, examples |
| Samples | `config/samples/agent_v1alpha1_agentruntime_skills.yaml`, updated `_full.yaml` | New and updated sample manifests |
| Helm | `charts/kagenti-operator/values.yaml`, CRD YAML | Feature gate + CRD schema |
| Tests | `agentruntime_skills_test.go`, `agentruntime_webhook_test.go` | Volume reconciliation, config hash, validation |
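
The "Config hash" row says that skills feed into a hash so that changing them triggers a rolling update. A minimal sketch of that idea follows. The `skill` type and `skillsHash` function are hypothetical. Sorting by name before hashing is an assumption made here so that reordering the list does not cause a rollout, and the operator may hash differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
	"sort"
)

// skill is a hypothetical stand-in for one spec.skills entry.
type skill struct {
	Name       string `json:"name"`
	Image      string `json:"image"`
	MountPath  string `json:"mountPath"`
	PullPolicy string `json:"pullPolicy,omitempty"`
}

// skillsHash returns a hex SHA-256 digest of the skill list.
// The input is copied and sorted by name so declaration order
// does not affect the result (an assumption of this sketch).
func skillsHash(skills []skill) (string, error) {
	sorted := append([]skill(nil), skills...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	b, err := json.Marshal(sorted)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	v1 := []skill{{Name: "resume-reviewer", Image: "example/resume-reviewer:v1.0.0", MountPath: "/agent/skills/resume-reviewer"}}
	v2 := []skill{{Name: "resume-reviewer", Image: "example/resume-reviewer:v1.1.0", MountPath: "/agent/skills/resume-reviewer"}}

	h1, err := skillsHash(v1)
	if err != nil {
		panic(err)
	}
	h2, err := skillsHash(v2)
	if err != nil {
		panic(err)
	}
	// A changed image tag yields a different digest, which a controller
	// could record on the pod template to trigger a rollout.
	fmt.Println("changed:", h1 != h2)
}
```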

Relationship to ConfigMap-based skill linking (kagenti/kagenti#1440)

This feature complements the ConfigMap-based skill mounting in kagenti/kagenti#1440. Both deliver skill files into agent pods, but target different maturity stages from kagenti/kagenti#1342:

| Aspect | #1440 (ConfigMap) | This PR (OCI ImageVolume) |
| --- | --- | --- |
| Storage | Kubernetes ConfigMap (~1MB limit) | OCI registry (no size limit) |
| Versioning | None (mutable ConfigMap) | OCI tags + digests (immutable) |
| Lifecycle | Create/delete ConfigMap | draft → testing → published → deprecated → archived |
| Declaration | Backend API at deploy time | AgentRuntime CR (declarative, GitOps-friendly) |
| Mount path | Hardcoded `/app/skills/<name>` | User-specified per skill |
| K8s version | Any | 1.31+ (ImageVolume feature gate) |

Integration opportunities for discussion

SKILL_FOLDERS env var — #1440 sets SKILL_FOLDERS so agents discover mounted skills. This operator feature could inject the same env var so agents work transparently with both delivery mechanisms.
kagenti.io/skills annotation — #1440 stores linked skills in this annotation. The operator could write this annotation when skills are declared on the AgentRuntime CR, enabling the UI/backend to display OCI-mounted skills alongside ConfigMap-mounted ones.
Coexistence — Both mechanisms can coexist on the same pod. ConfigMap volumes use names like skill-0, skill-1; OCI ImageVolumes use skill-<name>. Different volume types, different names, no conflicts.
Migration path — ConfigMap skills work today on any K8s version. OCI ImageVolume skills are the upgrade path when clusters reach K8s 1.31+. Teams can adopt incrementally.
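
The coexistence point relies on the two naming schemes never producing the same volume name. The sketch below shows one way a validation step could check that, assuming the `skill-<name>` convention described above. The function names are hypothetical. Note that under this convention a skill literally named `0` would map to `skill-0`, so a check against volumes already present on the pod still seems worthwhile.

```go
package main

import "fmt"

// ociVolumeName applies the skill-<name> convention described in the PR.
func ociVolumeName(skill string) string {
	return "skill-" + skill
}

// validateSkillNames rejects duplicate skill names and any skill whose
// derived volume name is already used on the pod (hypothetical helper).
func validateSkillNames(skills []string, existingVolumes map[string]bool) error {
	seen := make(map[string]bool, len(skills))
	for _, s := range skills {
		if seen[s] {
			return fmt.Errorf("duplicate skill name %q", s)
		}
		seen[s] = true
		if v := ociVolumeName(s); existingVolumes[v] {
			return fmt.Errorf("skill %q maps to volume %q, which already exists", s, v)
		}
	}
	return nil
}

func main() {
	// Volume names in the ConfigMap style mentioned above.
	existing := map[string]bool{"skill-0": true, "skill-1": true}

	fmt.Println(validateSkillNames([]string{"resume-reviewer", "blog-writer"}, existing))
	fmt.Println(validateSkillNames([]string{"blog-writer", "blog-writer"}, existing))
	fmt.Println(validateSkillNames([]string{"0"}, existing))
}
```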

Test plan

Unit tests: volume reconciliation (add/remove/update/multi-container), config hash, webhook validation
make manifests generate — CRD and deepcopy regenerated
go build ./... — compiles cleanly
go test ./internal/controller/ ./internal/webhook/... — all tests pass
Kind cluster (K8s 1.31): CRD installs, schema validation works, fields round-trip correctly
E2E: Full operator deployment with skill ImageVolumes on K8s 1.33+ cluster (requires kind v0.29.0+ with containerd 2.1.1 for runtime-level ImageVolume support)

Assisted-By: Claude Code

cooktheryan · 2026-05-06T15:31:16Z

DO NOT MERGE at the current time. I would like feedback based on kagenti/kagenti#1342

cwiklik

Solid implementation with proper feature gating, comprehensive tests (unit + E2E), clean separation of concerns (controller, webhook, config hash), and thorough docs. The ImageVolume K8s requirement (1.31+) is well-documented and the graceful degradation (condition + event when gate is disabled) is good UX.

Areas reviewed: Go (types, controller, webhook), Helm, CRD, Docs, Tests
Commits: 3 commits, all signed-off: yes
CI status: all passing (E2E pending manual trigger)

Suggestions (non-blocking)

1. PR body attribution (nit)
PR body ends with "Generated with Claude Code" — per repo conventions this should be "Assisted-By: Claude Code".

2. Commit hygiene (suggestion)
Commits 45e06bc ("include e2e tests for oci") and 4bd2f5a ("fixes due to code review") don't follow the imperative commit convention and are vague. Consider squashing into the main commit before merge.

3. Skill mounts applied to all containers (suggestion)
Skills are currently mounted into ALL containers including sidecars (envoy-proxy, spiffe-helper). For pods with AuthBridge injection, sidecars don't need skill files. Consider targeting only the agent container in a follow-up. Not a blocker for alpha — the extra read-only mounts are harmless — but worth tracking to avoid clutter in complex pod specs.
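
As a rough illustration of the follow-up suggested in item 3, the sketch below adds a read-only skill mount to every container except known sidecars. The `container` and `volumeMount` types are simplified stand-ins rather than the Kubernetes API types, and filtering by a fixed list of sidecar names (taken from the names mentioned above) is only one possible approach.

```go
package main

import "fmt"

// volumeMount and container are simplified stand-ins for the
// corresponding Kubernetes types, kept local so the sketch is self-contained.
type volumeMount struct {
	Name      string
	MountPath string
	ReadOnly  bool
}

type container struct {
	Name   string
	Mounts []volumeMount
}

// knownSidecars lists the sidecar names mentioned in the review.
var knownSidecars = map[string]bool{"envoy-proxy": true, "spiffe-helper": true}

// mountSkill returns a copy of containers in which every non-sidecar
// container has a read-only mount for the given skill.
func mountSkill(containers []container, skillName, mountPath string) []container {
	out := make([]container, len(containers))
	for i, c := range containers {
		out[i] = container{Name: c.Name, Mounts: append([]volumeMount(nil), c.Mounts...)}
		if knownSidecars[c.Name] {
			continue // sidecars do not need skill files
		}
		out[i].Mounts = append(out[i].Mounts, volumeMount{
			Name:      "skill-" + skillName,
			MountPath: mountPath,
			ReadOnly:  true,
		})
	}
	return out
}

func main() {
	pod := []container{{Name: "agent"}, {Name: "envoy-proxy"}, {Name: "spiffe-helper"}}
	for _, c := range mountSkill(pod, "resume-reviewer", "/agent/skills/resume-reviewer") {
		fmt.Printf("%s: %d mount(s)\n", c.Name, len(c.Mounts))
	}
}
```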

pavelanni · 2026-05-06T17:39:31Z

It's important to make sure that the mounted skills are listed in the AgentCard exposed by the agent running in Agent Runtime. There is a section in the AgentCard spec for that.

https://agent2agent.info/docs/concepts/agentcard/

In my agent harness (https://github.com/redhat-et/docsclaw) it is implemented by the agent itself, but it would be good to have it implemented at the runtime level to make it agent-agnostic.

Another important thing is to ensure that images are mounted in containers read-only, to avoid any risk of them being mutated by malicious agents. If the Operator mounts them, this should be part of its logic.

pavelanni · 2026-05-06T21:57:10Z

Please take a look at the SkillCard schema that I use in Skill Image: https://github.com/redhat-et/skillimage/blob/main/schemas/skillcard-v1.json
It might be used as a prototype for Kagenti skills.

Add kagenti.io/skills annotation on target workload metadata with a JSON array of mounted skill names for downstream discovery (agent card controllers, UI). The annotation is set when the skillImageVolumes feature gate is enabled and removed on skill clearing or AgentRuntime deletion.

Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
Signed-off-by: Ryan Cook <rcook@redhat.com>
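
The commit message above describes the annotation as a JSON array of skill names that is removed when skills are cleared. A minimal sketch of that behaviour follows. The helper name is hypothetical, and sorting the names is an assumption added here to keep the annotation value stable between reconciles.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

const skillsAnnotation = "kagenti.io/skills"

// setSkillsAnnotation writes the skill names as a JSON array under
// kagenti.io/skills, or removes the key when no skills remain.
func setSkillsAnnotation(annotations map[string]string, names []string) (map[string]string, error) {
	if annotations == nil {
		annotations = map[string]string{}
	}
	if len(names) == 0 {
		delete(annotations, skillsAnnotation)
		return annotations, nil
	}
	sorted := append([]string(nil), names...)
	sort.Strings(sorted) // assumed: keeps the value deterministic
	b, err := json.Marshal(sorted)
	if err != nil {
		return nil, err
	}
	annotations[skillsAnnotation] = string(b)
	return annotations, nil
}

func main() {
	ann, err := setSkillsAnnotation(nil, []string{"resume-reviewer", "blog-writer"})
	if err != nil {
		panic(err)
	}
	fmt.Println(ann[skillsAnnotation]) // ["blog-writer","resume-reviewer"]

	ann, err = setSkillsAnnotation(ann, nil)
	if err != nil {
		panic(err)
	}
	_, present := ann[skillsAnnotation]
	fmt.Println("still present:", present) // still present: false
}
```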

cooktheryan · 2026-05-07T19:58:21Z

@kevincogan one thing my brain is stuck on right now: when we build the container image for an agent, we build it with the agentcard, and that agentcard is r/o. The sticking point is that the OCI mounting for skills may be dynamic, but the SKILL section of an agentcard is pretty much locked with our mechanism. Any advice or thoughts here?

cooktheryan · 2026-05-07T20:25:55Z

Additionally, @Ladas, do you have any opinions here based on your work launching Claude Code etc. using the OCI mounting mechanisms?

kevincogan · 2026-05-07T21:07:12Z

@cooktheryan I don't think we should touch the signed card the agent serves. That stays locked down. But the AgentCardReconciler already fetches and caches the card into status, so we can just append the runtime skills to that cached copy after verification completes. One flow, one CR, just an enriched status at the end.

Security-wise nothing changes. Verification (JWS or mTLS) still runs against the original card before any merging happens. The Verified condition, NetworkPolicy, and identity binding are all driven by the original signed card. The appended skills are purely informational for discovery and the UI.

Your kagenti.io/skills annotation is basically all I'd need on my side. The AgentCard controller reads that and appends anything not already in the card.

Let me know if I am missing anything. If not I can pick this up as a follow-up once yours lands.
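
The merge step proposed in this comment (append runtime skills to the cached card copy, skipping any the card already lists) could look roughly like the sketch below. Everything here is hypothetical: the `cardSkill` type, matching skills by an `ID` field, and the `Source` marker are assumptions, and the signed card itself is never modified.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cardSkill is a simplified, hypothetical view of one skill entry in the
// cached AgentCard status. Source marks where the entry came from.
type cardSkill struct {
	ID     string
	Source string
}

// mergeRuntimeSkills appends skill names from the kagenti.io/skills
// annotation value to a copy of the cached card skills, skipping names
// the card already declares. It is meant to run after verification.
func mergeRuntimeSkills(card []cardSkill, annotation string) ([]cardSkill, error) {
	var names []string
	if annotation != "" {
		if err := json.Unmarshal([]byte(annotation), &names); err != nil {
			return nil, fmt.Errorf("parse skills annotation: %w", err)
		}
	}
	merged := append([]cardSkill(nil), card...)
	known := make(map[string]bool, len(card))
	for _, s := range card {
		known[s.ID] = true
	}
	for _, n := range names {
		if known[n] {
			continue // already declared by the signed card
		}
		known[n] = true
		merged = append(merged, cardSkill{ID: n, Source: "runtime"})
	}
	return merged, nil
}

func main() {
	card := []cardSkill{{ID: "resume-reviewer", Source: "card"}}
	merged, err := mergeRuntimeSkills(card, `["blog-writer","resume-reviewer"]`)
	if err != nil {
		panic(err)
	}
	for _, s := range merged {
		fmt.Printf("%s (%s)\n", s.ID, s.Source)
	}
}
```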

pdettori · 2026-05-07T23:01:58Z

@cooktheryan should we set this PR as draft until ready to merge ?

cooktheryan · 2026-05-08T00:01:55Z

@pdettori yes, for sure... I was feeling confident in the PR early on, then I realized how many pieces we have to tie in.

eranra · 2026-05-10T10:03:06Z

@cooktheryan @pavelanni @pdettori are you in sync with the initial community effort around OCI and skills here: https://github.com/agentskills/agentskills/discussions/292?ref=thomasvitale.com? It would be best if we can make Kagenti as "generic" as possible, join forces with the community effort, and align the code.

pavelanni · 2026-05-10T15:01:21Z

@eranra Yes, I reached out to Thomas Vitale on Slack and we are working on organizing a meeting. There is also a CNCF initiative around that: cncf/toc#1740 which I am participating in as well.
I'm also in contact with the Lola project: https://github.com/LobsterTrap/lola where we are adding an OCI extension to their toolset.

eranra · 2026-05-11T16:05:30Z

@pavelanni Thanks for sharing ;-)

I looked at the link/initiative, and it is indeed very interesting. I think we should also consider a more “shift-right” approach that automates processes and moves more of the intelligence and optimization into the runtime space.

Focusing on the AI developer persona makes a lot of sense today, but as the skills and AI ecosystem evolves toward greater automation and iterative optimization, the outer loop will become just as important. In particular, the ability to automatically improve, adapt, and incorporate new skills over time will be critical for long-term scalability and operational efficiency. I think that dynamic interaction with skills is a characteristic we need to consider in the interface between agents and skills.

cooktheryan requested a review from a team as a code owner May 6, 2026 15:27

rubambiza added this to Kagenti Issue Prioritization May 6, 2026

github-project-automation Bot moved this to Backlog in Kagenti Issue Prioritization May 6, 2026

cooktheryan force-pushed the feat/skill-image-volumes branch from b32eaa8 to d3257c1 Compare May 6, 2026 15:29

cooktheryan mentioned this pull request May 6, 2026

Proposal: Skills Management for Kagenti kagenti/kagenti#1342

Open

1 task

cooktheryan force-pushed the feat/skill-image-volumes branch from d3257c1 to 274bd62 Compare May 6, 2026 15:42

cwiklik approved these changes May 6, 2026

cooktheryan force-pushed the feat/skill-image-volumes branch 2 times, most recently from bd605a8 to 2961db3 Compare May 6, 2026 18:45

cooktheryan force-pushed the feat/skill-image-volumes branch from 2961db3 to 2f3bf03 Compare May 7, 2026 19:47

cooktheryan force-pushed the feat/skill-image-volumes branch from 2f3bf03 to 58f3815 Compare May 7, 2026 19:57

cooktheryan changed the title ~~Feat: Add OCI skill image mounting to AgentRuntime~~ WIP: Add OCI skill image mounting to AgentRuntime May 8, 2026

pdettori marked this pull request as draft May 8, 2026 02:19

xjacka mentioned this pull request May 11, 2026

Weekly Report 2026-05-11 kagenti/kagenti#1533

Open

clawgenti mentioned this pull request May 18, 2026

Weekly Report 2026-05-18 kagenti/kagenti#1608

Open

cooktheryan wants to merge 1 commit into kagenti:main from cooktheryan:feat/skill-image-volumes

7 participants
