Skip to content

feat(skills): scope the per-agent skill catalog via a profile allowlist#351

Open
vprudnikoff wants to merge 2 commits into
awslabs:mainfrom
vprudnikoff:feat/per-agent-skill-catalog
Open

feat(skills): scope the per-agent skill catalog via a profile allowlist#351
vprudnikoff wants to merge 2 commits into
awslabs:mainfrom
vprudnikoff:feat/per-agent-skill-catalog

Conversation

@vprudnikoff

@vprudnikoff vprudnikoff commented Jun 30, 2026

Copy link
Copy Markdown
Contributor

Summary

By default CAO injects an "## Available Skills" catalog — every installed skill's name and description — into every runtime-prompt agent (Claude Code / Codex / Gemini / Kimi / Antigravity) at launch. Since #277 added extra_skill_dirs, that catalog spans every registered directory, so an agent in one project is advertised every other project's skills (and a large repo's worth of unrelated entries). This PR adds an optional skills profile field that scopes which skills appear in a given agent's catalog — the per-agent visibility control #277 explicitly left as future work.

Scope: this is catalog scoping / prompt relevance, not an access-control boundary. It changes what an agent sees advertised, not what it can loadload_skill resolution is unchanged, and any installed skill is still loadable by name.

Motivation

I run many agents with different jobs against a skill store that spans every project on my machine (via extra_skill_dirs, #277). extra_skill_dirs made project-local skills first-class but kept the catalog global and flat: resolution is shared, so every agent is advertised every skill across every registered directory. For a multi-project setup — or a single project with many capability skills — that is noisy: a research director gets DB, infra, frontend, ads, etc. skills injected into its prompt, dozens of entries for use cases it will never touch. skills lets each agent advertise only the skills it actually uses, so the injected catalog stays small and on-topic — without copying or unregistering anything.

What changed

  • models/agent_profile — new optional skills: Optional[List[str]].
  • utils/skillsbuild_skill_catalog(skill_filter=None) filters list_skills() by the patterns. Each pattern is an exact skill name or a case-sensitive fnmatch glob (fnmatchcase, so matching is consistent with how skill names resolve on disk). Patterns that match no installed skill are logger.warning-ed (rendered with repr()), to surface profile typos / stale names. None (the default) lists every skill — unchanged behavior; [] advertises none.
  • services/terminal_service — threads profile.skills into build_skill_catalog, only on the runtime-prompt path; providers that deliver skills natively (Kiro skill:// resources, OpenCode symlink, Q / Copilot baked catalog) are unaffected.
  • docsskills.md (new "Scoping the catalog per agent" section, explicit that this scopes the catalog only, plus the runtime-prompt-only caveat) and agent-profile.md (field reference).
  • tests — exact / glob / mixed / [] / None, case-sensitivity, the unmatched-pattern warning, and the terminal_service pass-through.

Example

---
name: ads-backend-developer
role: developer
skills: ["ads-db", "ads-query-logs"]   # exact names
---
---
name: ads-cto
role: developer
skills: ["ads-*", "cao-*"]              # globs: this project's skills + CAO built-ins
---

Omit skills for the full catalog (backward-compatible); skills: [] advertises none.

Scope / compatibility

Additive and backward-compatible: a profile without the field deserializes and behaves exactly as before (full catalog) — Pydantic ignores the absent field, and None skips the filter. Only the runtime-prompt providers consume it. Skill discovery and resolution (extra_skill_dirs, first-valid-match-wins) is unchanged — this filters the injected catalog, not how a name resolves when load_skill runs, so it is not an access boundary. A hard per-agent allowlist would need enforcement in the load_skill path; that is a larger, separate change and out of scope here.

Testing

  • pytest — full unit suite green (test/utils/test_skills.py, test/services/test_terminal_service_full.py, and the broader test/ --ignore=test/e2e -m "not integration").
  • black / isort / mypy — clean.

…` allowlist

By default every installed skill is advertised to every CAO agent. Add an
optional `skills` field to the agent profile: a list of skill-name patterns,
each an exact name or an fnmatch glob (e.g. "ads-*"). When set, only matching
skills appear in that agent's "## Available Skills" catalog. Omitting the field
keeps the full catalog (backward-compatible); an empty list advertises none.

This lets a project's agents see only their own skills instead of every skill
registered under a shared extra_skill_dirs, and lets an orchestrator hide
capability skills from workers that shouldn't reach for them. An agent cannot
load_skill what it cannot see.

- models: add AgentProfile.skills
- skills: build_skill_catalog(skill_filter) filters list_skills() by fnmatch
- terminal_service: thread profile.skills into the catalog builder
- docs/skills.md: document the field
- tests: filter matching (exact/glob/mixed/empty/None) + profile pass-through
@codecov-commenter

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (main@5dcf319). Learn more about missing BASE report.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #351   +/-   ##
=======================================
  Coverage        ?   87.49%           
=======================================
  Files           ?      115           
  Lines           ?    13551           
  Branches        ?        0           
=======================================
  Hits            ?    11856           
  Misses          ?     1695           
  Partials        ?        0           
Flag Coverage Δ
unittests 87.49% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an optional skills allowlist to agent profiles to scope the injected “Available Skills” catalog per agent, reducing prompt noise in multi-project / multi-skill setups while keeping default behavior unchanged when the field is omitted.

Changes:

  • Introduces AgentProfile.skills: Optional[List[str]] and threads it into terminal creation for runtime-prompt providers.
  • Updates skill-catalog generation to support exact-name and case-sensitive fnmatch glob filtering, including warnings for unmatched patterns.
  • Adds documentation and test coverage for the new filtering behavior and terminal-service pass-through.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/cli_agent_orchestrator/models/agent_profile.py Adds optional skills field to profiles to configure catalog scoping.
src/cli_agent_orchestrator/utils/skills.py Implements build_skill_catalog(skill_filter=...) filtering + unmatched-pattern warning.
src/cli_agent_orchestrator/services/terminal_service.py Passes profile.skills into catalog building for runtime-prompt providers.
test/utils/test_skills.py Adds unit tests for exact/glob/mixed filters, empty list, case-sensitivity, and warning logging.
test/services/test_terminal_service_full.py Updates/extends coverage to assert profile.skills is threaded through correctly.
docs/skills.md Documents per-agent catalog scoping semantics and provider caveat.
docs/agent-profile.md Documents the new skills field in the profile reference.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +191 to +194
logger.warning(
"Skill-catalog filter matched no installed skill for pattern(s): %s",
", ".join(unmatched),
)
Comment thread docs/skills.md Outdated
You can explicitly instruct the agent to load specific skills eagerly in the agent profile body:
### Scoping the catalog per agent (`skills`)

To advertise only a subset of skills to a given agent, set the `skills` field in its profile frontmatter — a list of skill-name patterns, each an exact name or a case-sensitive [`fnmatch`](https://docs.python.org/3/library/fnmatch.html) glob. Only matching skills appear in that agent's catalog; the rest stay hidden (an agent cannot `load_skill` what it cannot see). A pattern that matches no installed skill is logged as a warning, to catch typos and stale names.
gutosantos82 added a commit to gutosantos82/cli-agent-orchestrator that referenced this pull request Jul 1, 2026
The single "activity since review" flag was noisy — it keyed off the PR's
updatedAt, which bumps on anything (CI, labels, bot comments), so it fired
on Codecov-only activity (awslabs#350/awslabs#351) and conflated "code changed" with
"someone commented."

Now two precise signals in metadata + dashboard pills:
  - code_changed (🔁 re-review): a review exists at an older sha but not the
    current head → the author pushed, so it genuinely needs re-review.
  - human_activity (💬 discussion since review): the newest NON-BOT comment
    or review is later than our review file. Bots (codecov/dependabot/
    github-actions/copilot) are excluded so their automated comments don't
    masquerade as human engagement.

Verified: awslabs#231/awslabs#336/awslabs#115 flag human_activity (real maintainer comments);
awslabs#350/awslabs#351 (Codecov-only) no longer flag; none flag code_changed (no pushes).
@haofeif

haofeif commented Jul 1, 2026

Copy link
Copy Markdown
Collaborator

@vprudnikoff Thanks for the PR. I want to make sure I understand the problem it is solving.

We already have cao skills add, which installs skills into CAO’s shared skill store. So can you clarify the concrete use case here that cao skills add does not cover?

As I understand it:

  • cao skills add` is a global install/discovery mechanism
  • this PR adds per-agent catalog filtering for runtime-prompt providers

If that is the intent, then this seems more like a prompt-noise / relevance feature than a true skill access-control feature.

That is where I want to clarify scope: right now load_skill still looks global, so a hidden skill name appears to still be loadable directly if the agent knows the name. In other words, this PR seems to filter what is advertised, not what is actually resolvable.

So my question is as below :

  1. Is the goal here just to reduce prompt bloat / irrelevant skill visibility per agent?
  2. Or is the goal to create a real per-agent skill allowlist?

If it is (1), I think the feature makes sense, but the PR/docs should probably describe it as catalog scoping only.

If it is (2), then I think we would also need enforcement in the load_skill path, which looks like a larger change than what this PR currently implements.

@haofeif haofeif self-assigned this Jul 1, 2026
… control

Address review feedback on the per-agent skill-catalog scoping PR:

- docs/skills.md: drop the "an agent cannot load_skill what it cannot
  see" wording, which overstated the field as an access boundary. Make
  explicit that this scopes only the injected catalog (what an agent
  sees advertised); load_skill resolution is unchanged and any installed
  skill is still loadable by name. Note that a hard per-agent allowlist
  would need enforcement in the load_skill path (out of scope).
- utils/skills: render unmatched filter patterns with repr() in the
  warning so a newline-y / hostile skill name cannot garble the log line
  (consistent with validate_tmux_name).
- test/utils/test_skills: assert the unmatched pattern is repr-quoted.
@vprudnikoff

Copy link
Copy Markdown
Contributor Author

Thanks @haofeif — it's (1): catalog scoping for prompt relevance, not an access-control boundary.

The concrete use case: I run many agents with different jobs, against a skill store that spans every project on my machine (via extra_skill_dirs, #277). Today each agent is advertised all of them, so a research director gets DB, infra, frontend, ads, etc. skills injected into its prompt — dozens of entries for use cases it will never touch. I want each agent to see only its own set, so the injected catalog stays small and on-topic instead of carrying every skill that happens to exist anywhere on the box. That's what this field does: it scopes what's advertised per agent.

It deliberately does not touch resolution — load_skill stays global, and a known name is still loadable. Good catch that the docs blurred that line; the "an agent cannot load_skill what it cannot see" parenthetical overstated it.

I've pushed 6744869 and reworded the PR description accordingly:

  • docs — the skills section (and the PR description) now describe catalog scoping only: it reduces prompt noise / irrelevant skill visibility per agent; load_skill resolution is unchanged and a known name is still loadable. Dropped the access-control phrasing, and noted that a hard allowlist would need enforcement in the load_skill path.
  • utils/skills — unmatched patterns are now rendered with repr() in the warning (matching validate_tmux_name), per Copilot's note, so newline-y / hostile names can't garble the log line.

A real per-agent allowlist enforced in the load_skill path is a bigger, separate change — happy to follow up with it if you'd want that layer, but I'd keep this PR as the pure advertise-filter, since you'd likely still want catalog scoping as the UX layer on top of any enforcement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants