feat(skills): scope the per-agent skill catalog via a profile allowlist by vprudnikoff · Pull Request #351 · awslabs/cli-agent-orchestrator

vprudnikoff · 2026-06-30T15:54:31Z

Summary

By default CAO injects an "## Available Skills" catalog — every installed skill's name and description — into every runtime-prompt agent (Claude Code / Codex / Gemini / Kimi / Antigravity) at launch. Since #277 added extra_skill_dirs, that catalog spans every registered directory, so an agent in one project is advertised every other project's skills (and a large repo's worth of unrelated entries). This PR adds an optional skills profile field that scopes which skills appear in a given agent's catalog — the per-agent visibility control #277 explicitly left as future work.

Scope: this is catalog scoping / prompt relevance, not an access-control boundary. It changes what an agent sees advertised, not what it can load — load_skill resolution is unchanged, and any installed skill is still loadable by name.

Motivation

I run many agents with different jobs against a skill store that spans every project on my machine (via extra_skill_dirs, #277). extra_skill_dirs made project-local skills first-class but kept the catalog global and flat: resolution is shared, so every agent is advertised every skill across every registered directory. For a multi-project setup — or a single project with many capability skills — that is noisy: a research director gets DB, infra, frontend, ads, etc. skills injected into its prompt, dozens of entries for use cases it will never touch. skills lets each agent advertise only the skills it actually uses, so the injected catalog stays small and on-topic — without copying or unregistering anything.

What changed

models/agent_profile — new optional skills: Optional[List[str]].
utils/skills — build_skill_catalog(skill_filter=None) filters list_skills() by the patterns. Each pattern is an exact skill name or a case-sensitive fnmatch glob (fnmatchcase, so matching is consistent with how skill names resolve on disk). Patterns that match no installed skill are logger.warning-ed (rendered with repr()), to surface profile typos / stale names. None (the default) lists every skill — unchanged behavior; [] advertises none.
services/terminal_service — threads profile.skills into build_skill_catalog, only on the runtime-prompt path; providers that deliver skills natively (Kiro skill:// resources, OpenCode symlink, Q / Copilot baked catalog) are unaffected.
docs — skills.md (new "Scoping the catalog per agent" section, explicit that this scopes the catalog only, plus the runtime-prompt-only caveat) and agent-profile.md (field reference).
tests — exact / glob / mixed / [] / None, case-sensitivity, the unmatched-pattern warning, and the terminal_service pass-through.
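
For readers who want the filter semantics in one place, here is a minimal sketch of the matching described above. It is not the code in utils/skills: `filter_catalog`, the `installed` dict, the `cao-supervisor` and `infra-deploy` names, and all descriptions are hypothetical, and only `fnmatchcase` and `logger.warning` are the calls the PR text names.

```python
import logging
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

logger = logging.getLogger(__name__)


def filter_catalog(
    installed: Dict[str, str], skill_filter: Optional[List[str]] = None
) -> Dict[str, str]:
    """Return the name -> description entries one agent would be advertised."""
    if skill_filter is None:
        # Field omitted: full catalog, unchanged behavior.
        return dict(installed)

    # Keep a skill if any pattern matches; fnmatchcase never folds case.
    selected = {
        name: desc
        for name, desc in installed.items()
        if any(fnmatchcase(name, pattern) for pattern in skill_filter)
    }

    # A pattern that matches nothing usually means a typo or a stale name.
    unmatched = [
        p for p in skill_filter if not any(fnmatchcase(n, p) for n in installed)
    ]
    if unmatched:
        logger.warning(
            "Skill-catalog filter matched no installed skill for pattern(s): %s",
            ", ".join(repr(p) for p in unmatched),
        )
    return selected


# Hypothetical installed skills (names and descriptions are illustrative).
installed = {
    "ads-db": "Query the ads database",
    "ads-query-logs": "Inspect ads query logs",
    "cao-supervisor": "Illustrative built-in skill",
    "infra-deploy": "Deploy infrastructure",
}

scoped = filter_catalog(installed, ["ads-*", "cao-*"])
```

Under these assumptions, `None` returns everything, `[]` returns an empty catalog without warning, and `["ADS-*"]` returns nothing and logs the pattern as unmatched because the match is case-sensitive.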

Example

---
name: ads-backend-developer
role: developer
skills: ["ads-db", "ads-query-logs"]   # exact names
---

---
name: ads-cto
role: developer
skills: ["ads-*", "cao-*"]              # globs: this project's skills + CAO built-ins
---

Omit skills for the full catalog (backward-compatible); skills: [] advertises none.

Scope / compatibility

Additive and backward-compatible: a profile without the field deserializes and behaves exactly as before (full catalog) — Pydantic ignores the absent field, and None skips the filter. Only the runtime-prompt providers consume it. Skill discovery and resolution (extra_skill_dirs, first-valid-match-wins) is unchanged — this filters the injected catalog, not how a name resolves when load_skill runs, so it is not an access boundary. A hard per-agent allowlist would need enforcement in the load_skill path; that is a larger, separate change and out of scope here.
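
To make the "catalog, not access boundary" distinction concrete, here is a small sketch under stated assumptions. `SKILL_STORE`, `advertised_names`, and `resolve_skill_stub` are hypothetical stand-ins, not CAO's actual functions; the point is only that the advertised list and name resolution are independent.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional

# Hypothetical in-memory skill store: name -> skill body.
SKILL_STORE: Dict[str, str] = {
    "ads-db": "How to query the ads database...",
    "infra-deploy": "How to deploy infrastructure...",
}


def advertised_names(skill_filter: Optional[List[str]]) -> List[str]:
    """Names that would appear in one agent's injected catalog."""
    if skill_filter is None:
        return sorted(SKILL_STORE)
    return sorted(
        name
        for name in SKILL_STORE
        if any(fnmatchcase(name, pattern) for pattern in skill_filter)
    )


def resolve_skill_stub(name: str) -> str:
    """Stand-in for global resolution: it never consults the profile filter."""
    return SKILL_STORE[name]


visible = advertised_names(["ads-*"])          # only the ads skill is advertised
hidden_body = resolve_skill_stub("infra-deploy")  # still resolves by name
```

A hard allowlist would mean passing the agent's patterns into the resolution step as well, which is the separate change described above.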

Testing

pytest — full unit suite green (test/utils/test_skills.py, test/services/test_terminal_service_full.py, and the broader test/ --ignore=test/e2e -m "not integration").
black / isort / mypy — clean.

…` allowlist

By default every installed skill is advertised to every CAO agent. Add an optional `skills` field to the agent profile: a list of skill-name patterns, each an exact name or an fnmatch glob (e.g. "ads-*"). When set, only matching skills appear in that agent's "## Available Skills" catalog. Omitting the field keeps the full catalog (backward-compatible); an empty list advertises none.

This lets a project's agents see only their own skills instead of every skill registered under a shared extra_skill_dirs, and lets an orchestrator hide capability skills from workers that shouldn't reach for them. An agent cannot load_skill what it cannot see.

- models: add AgentProfile.skills
- skills: build_skill_catalog(skill_filter) filters list_skills() by fnmatch
- terminal_service: thread profile.skills into the catalog builder
- docs/skills.md: document the field
- tests: filter matching (exact/glob/mixed/empty/None) + profile pass-through

codecov-commenter · 2026-07-01T03:30:33Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@5dcf319). Learn more about missing BASE report.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #351   +/-   ##
=======================================
  Coverage        ?   87.49%           
=======================================
  Files           ?      115           
  Lines           ?    13551           
  Branches        ?        0           
=======================================
  Hits            ?    11856           
  Misses          ?     1695           
  Partials        ?        0

Flag	Coverage Δ
unittests	`87.49% <100.00%> (?)`

Copilot

Pull request overview

Adds an optional skills allowlist to agent profiles to scope the injected “Available Skills” catalog per agent, reducing prompt noise in multi-project / multi-skill setups while keeping default behavior unchanged when the field is omitted.

Changes:

Introduces AgentProfile.skills: Optional[List[str]] and threads it into terminal creation for runtime-prompt providers.
Updates skill-catalog generation to support exact-name and case-sensitive fnmatch glob filtering, including warnings for unmatched patterns.
Adds documentation and test coverage for the new filtering behavior and terminal-service pass-through.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

File	Description
`src/cli_agent_orchestrator/models/agent_profile.py`	Adds optional `skills` field to profiles to configure catalog scoping.
`src/cli_agent_orchestrator/utils/skills.py`	Implements `build_skill_catalog(skill_filter=...)` filtering + unmatched-pattern warning.
`src/cli_agent_orchestrator/services/terminal_service.py`	Passes `profile.skills` into catalog building for runtime-prompt providers.
`test/utils/test_skills.py`	Adds unit tests for exact/glob/mixed filters, empty list, case-sensitivity, and warning logging.
`test/services/test_terminal_service_full.py`	Updates/extends coverage to assert `profile.skills` is threaded through correctly.
`docs/skills.md`	Documents per-agent catalog scoping semantics and provider caveat.
`docs/agent-profile.md`	Documents the new `skills` field in the profile reference.

+            logger.warning(
+                "Skill-catalog filter matched no installed skill for pattern(s): %s",
+                ", ".join(unmatched),
+            )


-You can explicitly instruct the agent to load specific skills eagerly in the agent profile body:
+### Scoping the catalog per agent (`skills`)
+
+To advertise only a subset of skills to a given agent, set the `skills` field in its profile frontmatter — a list of skill-name patterns, each an exact name or a case-sensitive [`fnmatch`](https://docs.python.org/3/library/fnmatch.html) glob. Only matching skills appear in that agent's catalog; the rest stay hidden (an agent cannot `load_skill` what it cannot see). A pattern that matches no installed skill is logged as a warning, to catch typos and stale names.


The single "activity since review" flag was noisy: it keyed off the PR's updatedAt, which bumps on anything (CI, labels, bot comments), so it fired on Codecov-only activity (awslabs#350/awslabs#351) and conflated "code changed" with "someone commented." Now two precise signals in metadata + dashboard pills:

- code_changed (🔁 re-review): a review exists at an older sha but not the current head → the author pushed, so it genuinely needs re-review.
- human_activity (💬 discussion since review): the newest NON-BOT comment or review is later than our review file. Bots (codecov/dependabot/github-actions/copilot) are excluded so their automated comments don't masquerade as human engagement.

Verified: awslabs#231/awslabs#336/awslabs#115 flag human_activity (real maintainer comments); awslabs#350/awslabs#351 (Codecov-only) no longer flag; none flag code_changed (no pushes).

haofeif · 2026-07-01T07:04:16Z

@vprudnikoff Thanks for the PR. I want to make sure I understand the problem it is solving.

We already have cao skills add, which installs skills into CAO’s shared skill store. So can you clarify the concrete use case here that cao skills add does not cover?

As I understand it:

`cao skills add` is a global install/discovery mechanism
this PR adds per-agent catalog filtering for runtime-prompt providers

If that is the intent, then this seems more like a prompt-noise / relevance feature than a true skill access-control feature.

That is where I want to clarify scope: right now load_skill still looks global, so a hidden skill name appears to still be loadable directly if the agent knows the name. In other words, this PR seems to filter what is advertised, not what is actually resolvable.

So my question is:

1. Is the goal here just to reduce prompt bloat / irrelevant skill visibility per agent?
2. Or is the goal to create a real per-agent skill allowlist?

If it is (1), I think the feature makes sense, but the PR/docs should probably describe it as catalog scoping only.

If it is (2), then I think we would also need enforcement in the load_skill path, which looks like a larger change than what this PR currently implements.

… control

Address review feedback on the per-agent skill-catalog scoping PR:

- docs/skills.md: drop the "an agent cannot load_skill what it cannot see" wording, which overstated the field as an access boundary. Make explicit that this scopes only the injected catalog (what an agent sees advertised); load_skill resolution is unchanged and any installed skill is still loadable by name. Note that a hard per-agent allowlist would need enforcement in the load_skill path (out of scope).
- utils/skills: render unmatched filter patterns with repr() in the warning so a newline-y / hostile skill name cannot garble the log line (consistent with validate_tmux_name).
- test/utils/test_skills: assert the unmatched pattern is repr-quoted.

vprudnikoff · 2026-07-01T11:13:47Z

Thanks @haofeif — it's (1): catalog scoping for prompt relevance, not an access-control boundary.

The concrete use case: I run many agents with different jobs, against a skill store that spans every project on my machine (via extra_skill_dirs, #277). Today each agent is advertised all of them, so a research director gets DB, infra, frontend, ads, etc. skills injected into its prompt — dozens of entries for use cases it will never touch. I want each agent to see only its own set, so the injected catalog stays small and on-topic instead of carrying every skill that happens to exist anywhere on the box. That's what this field does: it scopes what's advertised per agent.

It deliberately does not touch resolution — load_skill stays global, and a known name is still loadable. Good catch that the docs blurred that line; the "an agent cannot load_skill what it cannot see" parenthetical overstated it.

I've pushed 6744869 and reworded the PR description accordingly:

docs — the skills section (and the PR description) now describe catalog scoping only: it reduces prompt noise / irrelevant skill visibility per agent; load_skill resolution is unchanged and a known name is still loadable. Dropped the access-control phrasing, and noted that a hard allowlist would need enforcement in the load_skill path.
utils/skills — unmatched patterns are now rendered with repr() in the warning (matching validate_tmux_name), per Copilot's note, so newline-y / hostile names can't garble the log line.
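
As an illustration of why the repr() rendering matters, here is a minimal sketch. `render_plain` and `render_quoted` are hypothetical helpers, and the hostile pattern is invented for the example; the actual change lives in utils/skills.

```python
from typing import List


def render_plain(patterns: List[str]) -> str:
    """Join patterns verbatim, as in the first revision of the warning."""
    return ", ".join(patterns)


def render_quoted(patterns: List[str]) -> str:
    """Join patterns with repr(), so control characters are escaped."""
    return ", ".join(repr(p) for p in patterns)


# Hypothetical pattern containing a newline that could forge a second log line.
unmatched = ["ads-*", "typo\nWARNING fake log line"]

plain = render_plain(unmatched)    # contains a raw newline
quoted = render_quoted(unmatched)  # newline appears as the two characters \ and n
```

With the quoted form the whole warning stays on one line, and each pattern is visibly delimited by quotes, which also makes stray whitespace in a profile easier to spot.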

A real per-agent allowlist enforced in the load_skill path is a bigger, separate change — happy to follow up with it if you'd want that layer, but I'd keep this PR as the pure advertise-filter, since you'd likely still want catalog scoping as the UX layer on top of any enforcement.

haofeif requested a review from Copilot July 1, 2026 03:28

Copilot started reviewing on behalf of haofeif July 1, 2026 03:29

Copilot AI reviewed Jul 1, 2026

haofeif self-assigned this Jul 1, 2026

vprudnikoff wants to merge 2 commits into awslabs:main from vprudnikoff:feat/per-agent-skill-catalog

4 participants
