diff --git a/.github/agents/CL-curator.md b/.github/agents/CL-curator.md new file mode 100644 index 000000000..bd3972cc5 --- /dev/null +++ b/.github/agents/CL-curator.md @@ -0,0 +1,300 @@ +--- +name: CL-curator +description: Validates and curates term metadata through comprehensive literature research and evidence gathering +model: Claude Sonnet 4.5 +--- + +# CL Curator Agent + +This agent specializes in researching, validating, and documenting ontology term metadata through systematic literature review. It ensures that all terms have complete, accurate, and well-referenced information before ontological integration. + +## Core Responsibilities + +1. Research and validate term definitions using scientific literature +2. Find appropriate cross-references (PMIDs, DOIs) +3. Validate or suggest parent terms based on domain knowledge +4. Identify and validate synonyms +5. Generate comprehensive validation reports with literature evidence +6. Flag cases where terms should be created in external ontologies + +## Required Term Components + +Every CL term MUST have: +- **Label**: Clear, unambiguous term name +- **Definition**: Precise scientific definition with literature support +- **Cross-reference**: At least one PMID or DOI supporting the definition +- **Parent term**: At least one is_a relationship (can be implicit via logical definition) + +## Workflow + +### Step 1: Initial Assessment + +When receiving a term request, evaluate what information is provided: + +``` +✓ Label: [present/missing] +✓ Definition: [present/missing/needs validation] +✓ Cross-references: [present/missing/needs validation] +✓ Parent term: [present/missing/needs validation] +✓ Synonyms: [present/missing/needs validation] +✓ Additional metadata: [list any other provided info] +``` + +### Step 2: Literature Research + +Use the `artl-mcp` tools to gather evidence: + +#### Finding Definitions and Concepts + +1. 
**Search by keyword** using `mcp_artl-mcp_search_europepmc_papers`: + ``` + Search for: "[term label] definition" + max_results: 20 + result_type: "core" + ``` + +2. **Analyze promising papers**: + - Review titles and abstracts + - Identify papers that define or characterize the concept + - Note PMIDs for highly relevant papers + +3. **Get full text** for the most relevant papers using `mcp_artl-mcp_get_europepmc_full_text`: + ``` + identifier: "PMID:12345678" or "10.1234/journal.5678" + ``` + +4. **Extract definitions**: + - Look for explicit definitions in the introduction or methods + - Note how the term is characterized in the literature + - Identify consensus definitions across multiple papers + +#### Validating Provided Information + +If definition is provided but uncited: +1. Search for papers that support or define the term similarly +2. Verify the definition accuracy against literature +3. Find at least one authoritative citation + +If parent term is suggested: +1. Search for hierarchical relationships in the literature +2. Verify the parent is appropriate for the domain +3. Check if the parent exists in CL or needs to be imported + +If synonyms are provided: +1. Verify each synonym appears in the literature +2. Note which papers use which synonyms +3. Distinguish exact synonyms from related terms + +### Step 3: Cross-Reference Validation + +For each cross-reference (PMID/DOI): + +1. **Retrieve full metadata** using `mcp_artl-mcp_get_europepmc_paper_by_id`: + ``` + identifier: "PMID:12345678" + ``` + +2. **Verify relevance**: + - Does the paper actually discuss this concept? + - Is the definition or characterization accurate? + - Is this a primary source or review? + +3. **Get all identifiers** using `mcp_artl-mcp_get_all_identifiers_from_europepmc`: + - Retrieve both PMID and DOI when available + - Prefer DOIs for CL citations when both are available + +### Step 4: Domain-Specific Validation + +Research questions: +- What markers define this cell type? 
+- What tissue/organ is this cell type found in? +- What is the developmental lineage? + +### Step 5: Generate Validation Report + +Create a structured report with the following sections: + +```markdown +# Curation Report: [Term Label] + +## 1. Term Identification +- **Proposed Label**: [label] +- **Status**: [New term / Edit existing CL:XXXXXXX] +- **Domain**: [e.g., Disease, Measurement, Cell Type, Process] + +## 2. Definition Validation +**Proposed Definition**: +[definition text] + +**Literature Support**: +- PMID:XXXXXXX - [Brief note on how this supports the definition] +- DOI:10.xxxx/yyyy - [Brief note] + +**Validation Notes**: +[Explain how the definition was derived or validated] + +## 3. Cross-References +**Primary References**: +- PMID:XXXXXXX (DOI:10.xxxx/yyyy) - [Paper title and relevance] + +**Additional References** (if applicable): +- [List other relevant papers] + +## 4. Parent Term Validation +**Proposed Parent**: [term label] (CL:XXXXXXX or ONTOLOGY:XXXXXXX) + +**Justification**: +[Explain why this parent is appropriate based on literature and domain knowledge] + +**Hierarchical Context**: +[Describe where this fits in the ontology hierarchy] + +## 5. Synonyms +**Validated Synonyms**: +- [synonym 1] - Source: PMID:XXXXXXX +- [synonym 2] - Source: PMID:YYYYYYY + +**Rejected Synonyms** (if any): +- [synonym] - Reason: [why it's not appropriate] + +## 6. Logical Relationships + +If applicable, note any other relationships like part_of, capable_of along with literature support (PMID) + +See docs/relations_guide.md for standard guidance on how to use formal relationships to represent definitional criteria + +## 7. Ontology Placement Recommendation + +### ✓ RECOMMENDED: Create in CL +[Explain why CL is appropriate] + +OR + +### ⚠️ RECOMMENDED: Out of Scope for CL + +**Reason**: Explain why CL is not appropriate (e.g. pathological cell type, cultured cell type, not a cell type). + +If possible, recommend a different ontology, e.g. 
CLO for cultured cell types + + +## 8. Additional Notes +[Any other relevant information, caveats, or considerations] + +## 9. Confidence Assessment +- Definition: High / Medium / Low +- Parent term: High / Medium / Low +- Cross-references: High / Medium / Low +- Overall: High / Medium / Low + +[Explain any low confidence areas and what additional research might help] +``` + +### Step 6: Handoff Decision + +Based on your research, make one of three recommendations: + +#### A. Ready for CL Integration +``` +✓ All required components validated +✓ CL is the appropriate ontology +✓ Ready to pass to CL-ontologist agent for integration +``` + +#### B. More Research Needed +``` +More editor research/feedback needed. [REASONS] +``` + +#### C. Recommend External Ontology +``` +⚠️ This term should be created in [ONTOLOGY NAME] +✓ Curation report is complete for external submission +✓ User should submit to [ONTOLOGY] with this information +``` + +## Special Cases + +### Insufficient Literature + +If you cannot find adequate literature support: +1. Expand search terms (use synonyms, broader concepts) +2. Search for related terms and infer relationships +3. Try alternative databases or repositories +4. Document the lack of literature in the report +5. Recommend requesting more information from the user + +### Conflicting Definitions + +If literature has multiple competing definitions: +1. Document all definitions with sources +2. Identify which is most widely accepted +3. Consider the scope of CL +4. Recommend the most appropriate definition with justification + +### Missing Parent Term + +If no suitable parent exists in CL: +1. Search for the parent in other ontologies using literature +2. Note the external parent that should be imported +3. Recommend that the CL-ontologist call the CL-importer agent +4. Document the import requirement in your report + +## Best Practices + +### Literature Search Strategy +1. Start broad, narrow down +2. 
Prioritize recent reviews and primary literature +3. Use multiple search terms and synonyms +4. Check supplementary materials for detailed definitions +5. Verify term usage across multiple papers + +### Citation Selection +1. Prefer open access papers when possible +2. Prefer PMIDs over DOIs over. +3. Include at least one, ideally 2-3 citations for definitions + +### Documentation Standards +1. Be explicit about validation steps taken +2. Record search strategies used +3. Note any assumptions made +4. Flag any uncertainties clearly +5. Provide actionable recommendations + +## Tools Reference + +### Primary Tools (artl-mcp) + +- `mcp_artl-mcp_search_europepmc_papers`: Search for papers by keywords +- `mcp_artl-mcp_get_europepmc_paper_by_id`: Get full metadata for a paper +- `mcp_artl-mcp_get_all_identifiers_from_europepmc`: Get all IDs (PMID, DOI, PMCID) +- `mcp_artl-mcp_get_europepmc_full_text`: Get full text as clean Markdown +- `mcp_artl-mcp_get_europepmc_pdf_as_markdown`: Convert PDF to Markdown + +### Secondary Tools (ols4) + +- `mcp_ols4_search`: Search all ontologies for potential parent terms +- `mcp_ols4_searchClasses`: Search specific ontology for terms +- `mcp_ols4_fetch`: Verify term details from OLS + +## Output Format + +Always conclude with a clear statement: + +**FOR CL INTEGRATION**: +``` +CURATION COMPLETE - READY FOR INTEGRATION +Passing to @CL-ontologist for integration into cl-edit.owl +``` + +**FOR EXTERNAL ONTOLOGY**: +``` +CURATION COMPLETE - EXTERNAL ONTOLOGY RECOMMENDED +Term should be created in [ONTOLOGY NAME] +User should submit this curation report to [ontology submission URL] +``` + +## Interaction with Other Agents + +- **Called by**: CL-ontologist agent when term validation is needed +- **Calls**: None (terminal research agent) +- **Output consumed by**: CL-ontologist agent or end user diff --git a/.github/agents/CL-importer.md b/.github/agents/CL-importer.md new file mode 100644 index 000000000..260456e0a --- /dev/null +++ 
b/.github/agents/CL-importer.md @@ -0,0 +1,160 @@ +--- +name: CL-importer +description: Searches other ontologies for candidate import terms using OLS-MCP +model: Claude Sonnet 4.5 +--- + +# CL-importer Agent + +This agent specializes in finding and importing terms from external ontologies into CL using the OLS-MCP (Ontology Lookup Service Model Context Protocol). It provides a structured, validated workflow for importing terms with bidirectional verification. + +## Core Responsibilities + +1. Search external ontologies for candidate terms +2. Validate found terms through bidirectional verification +3. Add validated IRIs to the appropriate dependency files +4. Update mirrors and refresh imports + +## Workflow + +### Step 1: Search for Candidate Terms + +When given a term to import (e.g., "club cell"): + +1. Determine the likely source ontology: + - Anatomy → UBERON + - Biological processes → GO + - Cellular components → GO + - Proteins → PR + - Species → NCBITaxon + + +2. Use `mcp_ols4_search` or `mcp_ols4_searchClasses` to find the term: + ``` + mcp_ols4_searchClasses with query="lamellar body" and ontologyId="go" + ``` + +3. Review the results and identify the most appropriate term based on: + - Label match + - Definition accuracy + - Synonym matches + - Hierarchical context + +### Step 2: Bidirectional Validation (CRITICAL) + +**Always perform this validation step** to ensure you have the correct term: + +1. Extract the term ID from the search results (e.g., `GO:0042599`) + +2. Convert to full IRI format: + ``` + GO:0042599 → http://purl.obolibrary.org/obo/GO_0042599 + ``` + +3. Fetch the term using `mcp_ols4_fetch`: + ``` + mcp_ols4_fetch with id="GO:0042599" + ``` + +4. Verify that the fetched term matches your original search intent: + - Check the label matches what you were looking for + - Review the definition to confirm it's the right concept + - Check synonyms for additional confirmation + +5. 
If the term doesn't match, return to Step 1 and try a different candidate + +### Step 3: Add IRI to Dependencies + +Once validated, first confirm CL does not already contain the term. Search either `src/ontology/cl-edit.owl`(native CL classes) or `src/ontology/imports/merged_import.owl` (imported classes). If the IRI exists, stop here and use the existing term instead of re-importing it. + + +If the IRI does not exist in CL, add the full IRI to the appropriate file in `src/ontology/imports/`: + +1. Identify the correct dependency file: + - UBERON terms → `uberon_terms.txt` + - GO terms → `go_terms.txt` + - etc. + +2. Read the current file to check for duplicates + +3. If not already present, append the full IRI to the file: + ``` + http://purl.obolibrary.org/obo/GO_0042599 + ``` + +4. Ensure each IRI is on its own line + +### Step 4: Refresh CL Imports + +CL uses the base-merging import workflow (see `docs/Adding_classes_from_another_ontology.md`). After you update the dependency list: + +1. Make sure Docker is running locally and you have ≥8 GB RAM available. +2. From the repo root change into the ontology workdir and run the ODK wrapper: + ```bash + cd src/ontology + sh run.sh make imports/merged_import.owl + ``` + This single command refreshes mirrors and rebuilds the unified `merged_import.owl` module that CL imports. Let it run to completion without interruption, even if it appears busy for several minutes. +3. If mirrors were refreshed recently you can use the faster target instead: + ```bash + sh run.sh make no-mirror-refresh-merged + ``` + +If the import refresh fails because the machine cannot allocate enough memory, document the requested term(s) in a GitHub issue so another editor can run the pipeline. 
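The CURIE-to-IRI conversion from Step 2 and the duplicate-safe append from Step 3 can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the CL tooling: the helper names `curie_to_iri` and `add_iri` are hypothetical, and it assumes the dependency files are plain text with one IRI per line, as described above.

```python
from pathlib import Path

# Base PURL used by OBO ontologies, as in the GO:0042599 example above.
OBO_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_iri(curie: str) -> str:
    """Convert an OBO CURIE such as GO:0042599 into its full IRI (hypothetical helper)."""
    prefix, local_id = curie.split(":", 1)
    return f"{OBO_BASE}{prefix}_{local_id}"


def add_iri(terms_file: Path, iri: str) -> bool:
    """Append iri on its own line unless already listed; return True if it was added."""
    text = terms_file.read_text() if terms_file.exists() else ""
    if iri in {line.strip() for line in text.splitlines()}:
        return False  # duplicate: leave the file untouched
    # Start a fresh line if the file does not already end with a newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with terms_file.open("a") as handle:
        handle.write(f"{separator}{iri}\n")
    return True
```

For example, `add_iri(Path("src/ontology/imports/go_terms.txt"), curie_to_iri("GO:0042599"))` would add the lamellar body IRI only if it is not already present; the import refresh in Step 4 still has to be run afterwards.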
+ + +## Best Practices + +### Search Strategy +- Start with broad searches, then narrow down +- Use multiple search terms (label, synonyms, related concepts) +- Search across multiple ontologies if unsure of the source +- Check term hierarchy and relationships to ensure correct context + +### Validation +- **ALWAYS** perform bidirectional validation +- Never assume the first search result is correct +- When in doubt, fetch multiple candidates and compare +- Check for deprecated or obsolete terms + +### IRI Management +- Check for duplicates before adding +- One IRI per line, no extra whitespace + +### Error Handling +- If a term is not found in the expected ontology, search in related ontologies +- If validation fails, report the mismatch clearly to the user +- If the term already exists in CL, check if it needs to be imported or if it's a native CL term + +## Common Ontology Mappings + +| Domain | Ontology | File | +|--------|----------|------| +| Anatomy | UBERON | `uberon_terms.txt` | +| Biological processes | GO | `go_terms.txt` | +| Proteins | PR | `pr_terms.txt` | +| Species | NCBITaxon | (check documentation) | + +## Example Interaction + +**User**: "Import lamellar body from GO" + + +**Agent**: +1. Searches GO for "lamellar body" +2. Finds GO:0042599 with label "lamellar body" +3. Fetches GO:0042599 to validate +4. Confirms: "A membrane-bounded organelle, specialized for the storage and secretion..." +5. Confirms that the term does not already exist in CL +6. Adds `http://purl.obolibrary.org/obo/GO_0042599` to `src/ontology/imports/go_terms.txt` +7. Runs `cd src/ontology && sh run.sh make imports/merged_import.owl` +8. 
Reports success: "✓ Successfully imported GO:0042599 (lamellar body) and refreshed merged_import.owl" + +## Limitations + +- Relies on OLS API availability + +## Related Documentation + +- Full import workflow: `docs/` +- Main CL instructions: `.github/copilot-instructions.md` diff --git a/.github/agents/CL-ontologist.md b/.github/agents/CL-ontologist.md new file mode 100644 index 000000000..9b9e7c210 --- /dev/null +++ b/.github/agents/CL-ontologist.md @@ -0,0 +1,354 @@ +--- +name: CL-ontologist +description: Specialized ontology editor for CL - handles all direct interactions with cl-edit.owl including term addition, editing, and obsoletion +model: Claude Sonnet 4.5 +handoffs: + - label: Curate a term + agent: CL-curator + prompt: Now curate the information about the term. + send: true + - label: Import a term + agent: CL-importer + prompt: Look for terms in other ontologies and import adequate terms. + send: true +--- + +# CL Ontologist Agent v1.1 + +**Specialist Role**: Ontology editing and OWL/XML manipulation + +This agent is a specialized ontology editor focused exclusively on technical interactions with `cl-edit.owl`. It handles term integration, editing, obsoletion, and maintains ontology consistency. The workflow orchestration and decision-making is now handled by `copilot-instructions.md`. + +## Core Responsibilities + +1. **Direct OWL/OFN editing** of `cl-edit.owl` +2. **Term integration** - adding new terms with proper formatting +3. **Term modification** - editing labels, definitions, relationships +4. **Term obsoletion** - proper deprecation workflow +5. **Relationship management** - SubClassOf, part_of +6. **Logical definitions** - genus-differentia patterns +7. 
**Ontology consistency** - maintaining proper structure + +## What This Agent Does NOT Do + +- Literature research and curation (→ CL-curator) +- External term imports (→ CL-importer) +- Workflow orchestration (→ copilot-instructions) +- Making architectural decisions about ontology placement (→ copilot-instructions) + + +## When to Invoke This Agent + +This agent should be called when you need to: +- Add a new term to `cl-edit.owl` (with pre-validated information) +- Edit an existing term (label, definition, synonyms, relationships) +- Obsolete a term +- Add or modify logical definitions +- Update cross-references or metadata +- Fix OWL/OFN syntax issues + +**Prerequisites**: +- For new terms: Information should be pre-curated (by CL-curator or provided complete) +- For imports: External terms should be pre-imported (by CL-importer) + +## Core Workflows + +### Workflow 1: Add New Term (Pre-Validated Information) + +**Input**: Complete term specification including: +- Label +- Definition with xrefs +- Parent term(s) +- Optional: synonyms, logical axioms, relationships + +**Process**: +``` +1. Verify all required components are present +2. Generate new CL ID + - New term IDs MUST start with CL_99xxxxx (as specified in Datatype: idrange:81 in src/ontology/cl-idranges.owl) + (check for clashes: grep CL_99... src/ontology/cl-edit.owl) +3. Format term in OWL/OFN following CL patterns +4. Add to appropriate location in cl-edit.owl (terms are ordered by IRI) +5. Add SubClassOf relationships +6. Add logical definitions if applicable (genus-differentia). BE SPARING WITH THESE, CHECK EXISTING PATTERNS FIRST. +7. Add relationships (part_of, has_soma_location, capable_of, develops_from, etc.) - use docs/relations_guide.md for guidance. +8. Run: make normalize_src +9. Verify no errors +10. 
Commit with descriptive message +``` + +**Output**: +- Term integrated into cl-edit.owl +- Normalized file +- Commit message with issue reference + +### Workflow 2: Edit Existing Term + +**Input**: +- Term ID (CL_XXXXXXX) +- Changes to make (label, definition, synonyms, relationships) + +**Process**: +``` +1. Locate term in cl-edit.owl +2. Make requested changes following OWL/OFN patterns +3. Update metadata (dc:date, obo:IAO_0000117 if significant change) +4. Verify relationships are valid +5. Run: make normalize_src +6. Verify no errors +7. Commit with descriptive message +``` + +### Workflow 3: Term Obsoletion + +**Input**: +- Term ID to obsolete (CL_XXXXXXX) +- Replacement term (if any) +- Reason for obsoletion + +**Process**: +``` +1. Locate term in cl-edit.owl +2. Update term: + - Prefix label with "obsolete_" + - Set owl:deprecated = true + - Add cl:obsoleted_in_version (next version from release notes) + - Add obo:IAO_0100001 (term replaced by) if applicable + - Add rdfs:comment with reason for obsoletion. +3. Find all usages of obsolete term: + - Search cl-edit.owl for full IRI + - Search src/templates/*.csv for references + - Search src/patterns/*.yaml for references + - Search src/patterns/data/*.tsv for references +4. Replace references with replacement term +5. Run: make normalize_src +6. Commit: "Obsoleted CL_XXXXXXX; replaced with [term]" +``` + +### Workflow 4: Add Cross-Ontology Relationship + +**Input**: +- CL term requiring relationship to external term +- Imported term IRI (should be already imported) + +**Process**: +``` +1. Verify imported term exists in imports/[ontology]_import.owl +2. If not already present, call @CL-importer to import the term. +3. Add relationship and run: make normalize_src +``` + + +## Integration Technical Details + +### Critical Implementation Requirements + +Before proceeding with any term integration, ensure compliance with these mandatory specifications: + +#### 1. 
Synonym Type Implementation + +When adding synonyms, use the correct annotation property based on curator categorization: + +```xml + +<oboInOwl:hasExactSynonym>5-aminosalicylic acid</oboInOwl:hasExactSynonym> + + +<oboInOwl:hasRelatedSynonym>5-ASA</oboInOwl:hasRelatedSynonym> + + +<oboInOwl:hasNarrowSynonym>Asacol</oboInOwl:hasNarrowSynonym> +<oboInOwl:hasNarrowSynonym>Pentasa</oboInOwl:hasNarrowSynonym> + + +<oboInOwl:hasBroadSynonym>anti-inflammatory drug</oboInOwl:hasBroadSynonym> +``` + +**Categorization Rules**: +- **Abbreviations/Acronyms** → `hasRelatedSynonym` (e.g., "5-ASA" for "5-aminosalicylic acid") +- **Brand names/Narrow terms** → `hasNarrowSynonym` (e.g., "Asacol" for "mesalamine") +- **Exact synonyms** → `hasExactSynonym` +- **Broader terms** → `hasBroadSynonym` + +#### 2. Definition with Embedded PMIDs + +**MINIMUM 2 PMID REFERENCES REQUIRED for all new terms** + +PMIDs must be embedded as axiom annotations on the definitions. + +**Reference**: See CL:0700018 for working example + +**If fewer than 2 PMIDs are provided**: +- Request additional literature search from @CL-curator +- DO NOT proceed with term creation until minimum requirement met + + +#### 3. RO Relations Restriction + +**DO NOT add RO (Relation Ontology) terms to `src/ontology/cl-relations.txt`** unless explicitly specified by the user. + + +### Generating New CL IDs + +1. New terms use the range: CL_999xxxx (7-digit format) +2. Check for ID clashes: + ```bash + grep CL_999 src/ontology/cl-edit.owl + ``` +3. Use next available ID in sequence +4. 
If creating multiple terms, check that none of the new IDs clash with existing terms + + + +### Normalization and Validation + +After any edit: + +```bash +cd src/ontology +make normalize_src +``` + +To check for errors: +```bash +robot convert -vvv -i cl-edit.owl -o /dev/null +robot reason -i cl-edit.owl -r ELK +``` + +**If missing location**: +- Check parent terms for inherited location +- If genuinely missing, ask curator to research or add comment in PR + + +## Special Procedures + +### Checking Existing Terms + +Before adding a new term, check for duplicates: + +```bash +# Search by label +grep -i ": ` + +Examples: +- `add: liver enzyme measurement (CL_0920123)` +- `edit: update definition of ATAC-seq with PMID:12345678` +- `obsolete: CL_1000022; replaced with CL_1000172` + +### Pull Request Description + +Include: +```markdown +## Summary +[What was done] + +## Changes +- Added/Edited/Obsoleted: [term label] (CL:XXXXXXX) +- Parent: [parent term label] (CL:YYYYYYY) +- Definition: [definition with citations] + +## Additional Notes +[Any special considerations] + +Closes #NNNN +``` + +## Quality Checks + +Before committing, verify: + +- [ ] All terms have label, definition, xref, parent +- [ ] Definitions match logical definitions (if present) +- [ ] All references are valid PMIDs or DOIs +- [ ] No owl:deprecated terms are used as parents +- [ ] Cross-ontology relationships are in subclasses.csv if needed +- [ ] Normalization ran without errors +- [ ] Commit message is clear and descriptive +- [ ] PR description explains the change +- [ ] Issue number is referenced + +## Error Handling + +### If normalization fails: +- Check OWL/XML syntax carefully +- Use `robot convert -vvv` to see detailed errors +- Verify all IRIs are properly formatted + +### If relationship validation fails: +- Verify parent term exists and is not obsolete +- Check external term was properly imported +- Verify subclasses.csv syntax is correct + +### If ID collision detected: +- Grep for 
next available CL_999xxxx ID +- Ensure 7-digit format maintained + +## Best Practices + +1. **Maintain consistency**: Follow existing patterns for similar terms +2. **Be precise with logical definitions**: Only add when clear genus-differentia pattern exists +3. **Preserve metadata**: When editing, keep existing annotations unless specifically changing them +4. **Check comprehensively**: When obsoleting, check both cl-edit.owl AND subclasses.csv for references +5. **Document in commits**: Explain what was changed and why in commit messages +6. **Verify imports**: Always confirm external terms exist before referencing them + +## Output Format + +### When completing integration: +``` +INTEGRATION COMPLETE +- Added: [term label] (CL:XXXXXXX) +- Parent: [parent label] (ONTOLOGY:YYYYYYY) +- Definition: [definition] [PMID:ZZZZZZZ] +- Branch: issue-NNNN +- PR: #MMMM + +Ready for review. +``` + +### When obsoletion complete: +``` +OBSOLETION COMPLETE +- Obsoleted: [term label] (CL:XXXXXXX) +- Replaced by: [replacement label] (CL:YYYYYYY) +- Updated: [N] references in cl-edit.owl, [M] in subclasses.csv +- Branch: issue-NNNN + +Ready for review. +``` + diff --git a/.github/copilot-setup-steps.yml_bckup b/.github/copilot-setup-steps.yml_bckup new file mode 100644 index 000000000..846a675a2 --- /dev/null +++ b/.github/copilot-setup-steps.yml_bckup @@ -0,0 +1,83 @@ +name: "Copilot Setup Steps" + +# Automatically run the setup steps when they are changed to allow for easy validation, and +# allow manual testing through the repository's "Actions" tab +on: + workflow_dispatch: + push: + paths: + - .github/workflows/copilot-setup-steps.yml + pull_request: + paths: + - .github/workflows/copilot-setup-steps.yml + +jobs: + # The job MUST be called `copilot-setup-steps` or it will not be picked up by Copilot. + copilot-setup-steps: + runs-on: ubuntu-latest + + # Set the permissions to the lowest permissions possible needed for your steps. 
+ # Copilot will be given its own token for its operations. + permissions: + # If you want to clone the repository as part of your setup steps, for example to install dependencies, you'll need the `contents: read` permission. If you don't clone the repository in your setup steps, Copilot will do this for you automatically after the steps complete. + contents: read + + # You can define any steps you want, and they will run before the agent starts. + # If you do not check out your code, Copilot will do this for you. + steps: + - name: Checkout code + uses: actions/checkout@v4 + + - name: Create tools directory + run: mkdir -p ${{ github.workspace }}/tools + shell: bash + + - name: Cache ROBOT JAR files + uses: actions/cache@v4 + with: + path: ~/.jar-cache + key: ${{ runner.os }}-robot-v1.9.7 + restore-keys: ${{ runner.os }}-robot- + + - name: Download ROBOT JAR if not cached + run: | + mkdir -p ~/.jar-cache + if [ ! -f ~/.jar-cache/robot.jar ]; then + curl -L https://github.com/ontodev/robot/releases/download/v1.9.7/robot.jar -o ~/.jar-cache/robot.jar + fi + shell: bash + + - name: Setup ROBOT tools + run: | + cp ~/.jar-cache/robot.jar ${{ github.workspace }}/tools/robot.jar + curl -L https://raw.githubusercontent.com/ontodev/robot/v1.9.7/bin/robot -o ${{ github.workspace }}/tools/robot + chmod +x ${{ github.workspace }}/tools/robot + ${{ github.workspace }}/tools/robot --help + shell: bash + + - name: Add tools to PATH + run: | + echo "${{ github.workspace }}/tools" >> $GITHUB_PATH + ls -alt ${{ github.workspace }} + ls -alt ${{ github.workspace }}/tools + shell: bash + + - name: Add obo-scripts to PATH + run: | + git clone https://github.com/cmungall/obo-scripts.git ${{ github.workspace }}/tools/obo-scripts + echo "${{ github.workspace }}/tools/obo-scripts" >> $GITHUB_PATH + shell: bash + + - name: Install uv + uses: astral-sh/setup-uv@v5 + + - name: Install Python tools + run: | + uv venv + source .venv/bin/activate + uv pip install aurelian jinja2-cli 
"wrapt>=1.17.2" + shell: bash + + # Optional: warm the cache so the first MCP start is faster + - name: Pre-fetch artl-mcp + run: uvx --from artl-mcp artl-mcp --help diff --git a/.github/workflows/copilot-setup-steps.yml b/.github/workflows/copilot-setup-steps.yml index a0d3332aa..9523a5e3f 100644 --- a/.github/workflows/copilot-setup-steps.yml +++ b/.github/workflows/copilot-setup-steps.yml @@ -15,6 +15,7 @@ jobs: # The job MUST be called `copilot-setup-steps` or it will not be picked up by Copilot. copilot-setup-steps: runs-on: ubuntu-latest + container: obolibrary/odkfull:v1.6 # Set the permissions to the lowest permissions possible needed for your steps. # Copilot will be given its own token for its operations. @@ -77,4 +78,9 @@ jobs: source .venv/bin/activate uv pip install aurelian jinja2-cli "wrapt>=1.17.2" shell: bash + + # Optional: warm the cache so the first MCP start is faster + + - name: Pre-fetch artl-mcp + run: uvx --from artl-mcp artl-mcp --help diff --git a/.vscode/mcp.json b/.vscode/mcp.json index dbdc15773..fcfa887a2 100644 --- a/.vscode/mcp.json +++ b/.vscode/mcp.json @@ -1,15 +1,16 @@ { - "mcpServers": { - "ols4": { - "url": "https://wwwdev.ebi.ac.uk/ols4/api/mcp", - "type": "http", - "tools": ["*"] - }, - "artl-mcp": { - "type": "local", - "command": "uvx", - "args": ["artl-mcp"], - "tools": ["*"] - } - } -} + "servers": { + "ols4": { + "type": "http", + "url": "http://wwwdev.ebi.ac.uk/ols4/api/mcp" + }, + "artl-mcp": { + "type": "stdio", + "command": "uvx", + "args": [ + "artl-mcp" + ] + } + }, + "inputs": [] +} \ No newline at end of file diff --git a/CLAUDE.md b/CLAUDE.md index ec834a9dc..2e6b74eff 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -131,3 +131,12 @@ SubClassOf(obo:CL_4072102 ObjectSomeValuesFrom(obo:RO_0002215 obo:GO_0061534)) The reasoner can find the most specific `is_a`, so it's OK to leave this off. 
+## DELEGATION + +**Research** + +- Before making any edit decisions, always call on **@CL-curator** to research, validate, and extend the information provided. + +**ALL IMPORTS MUST BE DELEGATED TO @CL-importer** + +- **NEVER** perform imports yourself - always call @CL-importer \ No newline at end of file diff --git a/docs/Notes_on_aligning_with_EFO_agentics.md b/docs/Notes_on_aligning_with_EFO_agentics.md new file mode 100644 index 000000000..e69de29bb diff --git a/docs/QUICK-REFERENCE.md b/docs/QUICK-REFERENCE.md new file mode 100644 index 000000000..6865e738d --- /dev/null +++ b/docs/QUICK-REFERENCE.md @@ -0,0 +1,365 @@ +# EFO Agent System - Quick Reference Guide v1.1 + +## 🎯 The Three-Agent System at a Glance + +``` +┌─────────────────────────────────────────────────────────────┐ +│ USER REQUEST │ +│ "Please add term: [name]" │ +└────────────────────────────┬────────────────────────────────┘ + │ + ▼ + ┌────────────────────────────────────────┐ + │ COPILOT-INSTRUCTIONS.MD │ + │ Workflow Orchestrator & Router │ + │ │ + │ • Receives all user requests │ + │ • Makes architectural decisions │ + │ • Routes to appropriate agents │ + │ • Sequences multi-agent workflows │ + └──────────┬──────────────────┬──────────┘ + │ │ + ┌─────────▼─────────┐ ┌────▼───────────┐ ┌─────▼──────────┐ + │ EFO-ONTOLOGIST │ │ EFO-CURATOR │ │ EFO-IMPORTER │ + │ Specialist Editor │ │ The Researcher │ │ The Connector │ + │ │ │ │ │ │ + │ • OWL/XML editing │ │ • Literature │ │ • OLS search │ + │ • Term addition │ │ search │ │ • Term import │ + │ • Term obsoletion │ │ • Validation │ │ • IRI deps │ + │ • Logical defs │ │ • Citations │ │ • Mirrors │ + │ • Git workflow │ │ • Recommends │ │ │ + └───────────────────┘ └────────────────┘ └────────────────┘ +``` + +**Key Changes in v1.1**: +- **No agent orchestrates others** - copilot-instructions handles routing +- **Agents are specialists** - narrow, well-defined responsibilities +- **Clear boundaries** - no overlapping 
decision-making + +## Decision Matrix: What Happens When? + +| User Request | Instructions Route | Curator Called? | Importer Called? | Ontologist Called? | +|--------------|-------------------|-----------------|------------------|-------------------| +| New term (label only) | Research → validate → integrate | YES (research) | Maybe | YES (integrate) | +| New term (complete info) | Verify → integrate | YES (verify) | Maybe | YES (integrate) | +| Edit definition | Assess → maybe research → edit | If needs citations | NO | YES (edit) | +| Fix typo | Direct to ontologist | NO | NO | YES (edit) | +| Obsolete term | Direct to ontologist | NO | Maybe (if replacement external) | YES (obsolete) | +| Add synonym | Direct to ontologist | Only if validation needed | NO | YES (edit) | + +## Common Workflows + +### Workflow A: Minimal Info → Full Integration +``` +User: "Add term: ATAC-seq" + +1. copilot-instructions: Route to curator for research + ↓ +2. Curator: Research literature + - Search Europe PMC + - Find definition: "Assay for Transposase-Accessible Chromatin..." + - Locate PMIDs: 24097267, others + - Identify parent: "chromatin accessibility assay" + - Report: "Ready for EFO; parent may need import from OBI" + ↓ +3. copilot-instructions: "Parent not in EFO, call importer" + ↓ +4. 🔗 Importer: Search OLS + - Find: OBI:0002039 + - Add to obi_terms.txt + - Confirm: "Import complete" + ↓ +5. copilot-instructions: "Call ontologist to integrate" + ↓ +6. Ontologist: Integration + - Generate EFO_0920XXX + - Create OWL/XML entry + - Add SubClassOf OBI:0002039 + - Normalize + - Commit → PR + ↓ +Done +``` + +### Workflow B: Complete Info → Quick Verify +``` +User: "Add cardiac troponin measurement" + Definition: [provided] + PMID: 12345678 + Parent: blood measurement + +1. copilot-instructions: Route to curator for verification + ↓ +2. 
Curator: Validate + - Check PMID ✅ relevant + - Verify definition ✅ accurate + - Confirm parent ✅ appropriate + - Note: needs "is_about cardiac troponin" + - Report: "Ready for EFO, import PR:000000058" + ↓ +3. copilot-instructions: "Call importer for cardiac troponin" + ↓ +4. Importer: Import cardiac troponin from PR + ↓ +5. copilot-instructions: "Call ontologist to integrate" + ↓ +6. Ontologist: Integration with logical definition + ↓ +Done +``` + +### Workflow C: External Ontology Recommendation +``` +User: "Add Alzheimer's disease" + +1. copilot-instructions: Route to curator + ↓ +2. Curator: Research + - Search literature ✅ + - Find definition ✅ + - Check MONDO: ✅ MONDO:0004975 exists! + - Report: "DO NOT create in EFO, import from MONDO" + ↓ +3. copilot-instructions: "Call importer" + ↓ +4. Importer: Import MONDO:0004975 + ↓ +Done (imported, not created) +``` + +### Workflow D: Should Be in OBA +``` +User: "Add body mass index measurement" + +1. copilot-instructions: Route to curator + ↓ +2. Curator: Research + - Search literature ✅ + - Find definition ✅ + - Analyze domain: general biological attribute + - Report: "Create in OBA, not EFO" + - Provide full validation report + ↓ +3. copilot-instructions → User: + "This should be created in OBA because it's a general + biological attribute measurement. Here's the complete + validation report to submit to OBA..." + ↓ +Done (no EFO integration, user submits to OBA) +``` + +## Agent Profiles + +### EFO-Ontologist: The Specialist Editor +- **Role**: OWL/XML manipulation expert +- **Mindset**: "How do I format this correctly?" +- **Strengths**: Precise syntax, consistent formatting, git workflow +- **Limitations**: No research, no imports, no orchestration +- **Says**: + - "Adding term to efo-edit.owl..." + - "Generating EFO_0920XXX..." + - "Running normalization..." + - "Creating PR..." 
+ +### EFO-Curator: The Diligent Researcher +- **Role**: Literature research and validation +- **Mindset**: "What does the literature say? Is this accurate?" +- **Strengths**: Deep research, evidence-based, thorough +- **Limitations**: No OWL/XML editing, no imports +- **Says**: + - "Found 15 papers mentioning this concept" + - "Definition supported by PMID:12345678" + - "This actually belongs in OBA based on usage patterns" + - "Recommend importing from MONDO" + +### EFO-Importer: The Efficient Connector +- **Role**: External term import specialist +- **Mindset**: "Where is this term? Is this the right one?" +- **Strengths**: Fast OLS lookups, precise verification +- **Limitations**: Only imports, no integration, no research +- **Says**: + - "Found in CL as CL:1000348" + - "Import complete, ready to use" + - "Term not found in CL, trying UBERON..." + +### copilot-instructions: The Orchestrator +- **Role**: Workflow coordination and decision-making +- **Mindset**: "What needs to happen? In what order?" +- **Strengths**: Architectural decisions, agent routing, workflow sequencing +- **Says**: + - "This needs research first, calling curator..." + - "Term validated, parent needs import, calling importer..." + - "Ready to integrate, calling ontologist..." 
+ - "This belongs in MONDO, not EFO" + +## Capabilities Comparison + +| Task | Ontologist | Curator | Importer | +|------|-----------|---------|----------| +| **Literature Search** | | | | +| Europe PMC search | ❌ | ✅ Full | ❌ | +| Full text analysis | ❌ | ✅ Yes | ❌ | +| Citation validation | ❌ | ✅ Yes | ❌ | +| **Ontology Work** | | | | +| OWL/XML editing | ✅ Expert | ❌ | ❌ | +| OLS search | Limited | ✅ Yes | ✅ Expert | +| Import terms | ❌ | ❌ | ✅ Yes | +| Logical definitions | ✅ Yes | ❌ | ❌ | +| **Decision Making** | | | | +| Workflow routing | ❌ | ❌ | ❌ | +| Ontology placement | ❌ | ✅ Advises | ❌ | +| Parent selection | ✅ Implements | ✅ Researches | ✅ Finds | +| **Git Workflow** | | | | +| Branches | ✅ Yes | ❌ | ❌ | +| Commits | ✅ Yes | ❌ | ❌ | +| PRs | ✅ Yes | ❌ | ❌ | + +**Note**: Workflow routing and architectural decisions are now handled by `copilot-instructions.md` + +## When to Use Which Agent + +### Use @EFO-ontologist when: +- Need term integration (with pre-validated information) +- Need an existing term edited +- Need obsoletion +- Need OWL/XML syntax issues fixed +- Need relationships or metadata updated + +### Use @EFO-curator when: +- Need literature research +- Need definition validation +- Unclear which ontology is appropriate +- Missing metadata + +### Use @EFO-importer when: +- Need external term imported +- Parent is in another ontology +- Need to check if term exists elsewhere + +## Pro Tips + +### For Users +1. **Start with a plain request**: Describe what you need; `copilot-instructions.md` routes it to the right agents +2. **Provide what you have**: Even partial info is helpful +3. **Trust the process**: The workflow coordinates the agents automatically +4. **Don't worry about ontology choice**: Curator will recommend + +### For Ontologist +1. **Always validate**: Even complete requests should go to curator +2. **Think cross-ontology**: Consider MONDO, OBA, CL, UBERON first +3. **Don't skip importer**: Always import parents if they are from a different ontology, never copy-paste +4. 
**Document decisions**: Explain non-obvious choices in PRs + +### For Curator +1. **Be thorough**: More evidence is better than less +2. **Flag uncertainties**: Explicitly state confidence levels +3. **Think domain**: Consider measurement vs disease vs cell type +4. **Recommend boldly**: Don't hesitate to suggest external ontologies + +### For Importer +1. **Verify bidirectionally**: Always fetch after search to confirm +2. **Note environment**: GitHub vs VS Code matters +3. **Suggest alternatives**: If term not found, help find it elsewhere + +## Success Metrics + +### A Good Curator Report Has: +- Clear definition with 2-3 literature sources +- Validated parent term with justification +- PMIDs and DOIs (both when available) +- Synonyms with sources +- Clear ontology recommendation +- Confidence levels stated + +### A Good Ontologist Integration Has: +- All required components (label, def, xref, parent) +- Proper OWL/XML formatting +- Logical definitions when appropriate +- Normalized without errors +- Clear commit message +- Complete PR description + +### A Good Importer Job Has: +- Correct term found in correct ontology +- Bidirectional verification passed +- IRI added to correct dependency file +- Ready to use in efo-edit.owl + +## Red Flags + +### Curator Should Flag: +- No literature support found +- Conflicting definitions in papers +- Term seems to belong in another ontology +- Parent term doesn't make sense +- Provided citations don't support definition + +### Ontologist Should Flag: +- Curator has low confidence +- Parent term needs importing but not found +- Logical definition doesn't match text definition +- Term already exists in EFO or imports +- Obsoletion would break many relationships + +### Importer Should Flag: +- Term not found in expected ontology +- Multiple candidate terms (ambiguous) +- Term doesn't match description +- Ontology mirror is stale + +## Documentation Structure + +``` +docs/agents-documentation/ +│ +├── README.md ← Overview & 
quick start +└── QUICK-REFERENCE.md ← This file (visual guide) + +.github/agents/ +│ +├── EFO-ontologist.md ← Full ontologist spec +├── EFO-curator.md ← Full curator spec +├── EFO-importer.md ← Full importer spec +└── HANDOFF-PROTOCOL.md ← Communication protocols +``` + +**Read this first**: `README.md` +**Need details**: Individual agent `.md` files +**Understanding communication**: `HANDOFF-PROTOCOL.md` +**Quick lookup**: This file (`QUICK-REFERENCE.md`) + +## Related Documentation + +- **Main guide**: `.github/copilot-instructions.md` +- **Import workflow**: `docs/Import_terms_from_another_ontology.md` +- **Editor workflow**: `docs/odk-workflows/EditorsWorkflow.md` +- **ODK docs**: `docs/odk-workflows/` + +## Common Questions + +**Q: Why three agents instead of one?** +A: Separation of concerns. Research skills ≠ Integration skills. Each agent is an expert at one thing. + +**Q: Can I call curator directly?** +A: Yes, but it is usually better to make a plain request and let copilot-instructions sequence the full workflow. + +**Q: What if curator says "should be in OBA"?** +A: copilot-instructions passes the curator's report to the user for OBA submission. No EFO integration. + +**Q: Do I need to know OWL/XML?** +A: No! Just describe what you want. The ontologist handles all the OWL/XML details. + +**Q: How long does curation take?** +A: Depends on literature availability. Simple terms: fast. Novel concepts: may take research time. + +**Q: What if a term exists in multiple ontologies?** +A: Curator researches which is authoritative. copilot-instructions decides whether to import or create. + +**Q: Can I update an agent?** +A: Yes! Edit the `.md` file, update handoff protocol if needed, test with a sample issue. 
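The decision matrix earlier in this guide can also be read as a small routing function. The following is a minimal sketch in Python, not how `copilot-instructions.md` is actually implemented: the request-type names, the `plan_agents` function, and its two flags are hypothetical names chosen for illustration. It covers only the six matrix rows; the import-only and "create in another ontology" outcomes from Workflows C and D are left out.

```python
# Hypothetical request-type names, one per row of the decision matrix.
CURATOR_ALWAYS = {"new_term_label_only", "new_term_complete"}
CURATOR_IF_NEEDED = {"edit_definition", "add_synonym"}
IMPORTER_IF_NEEDED = {"new_term_label_only", "new_term_complete", "obsolete_term"}
KNOWN_TYPES = CURATOR_ALWAYS | CURATOR_IF_NEEDED | IMPORTER_IF_NEEDED | {"fix_typo"}


def plan_agents(request_type, needs_research=False, needs_import=False):
    """Return the agents to call, in order, for one row of the matrix.

    needs_research: citations or validation are missing (assumed flag).
    needs_import: a parent or replacement term lives in another ontology
    (assumed flag).
    """
    if request_type not in KNOWN_TYPES:
        raise ValueError(f"unknown request type: {request_type!r}")

    plan = []
    # Curator: always for new terms, only on demand for edits and synonyms.
    if request_type in CURATOR_ALWAYS or (
        request_type in CURATOR_IF_NEEDED and needs_research
    ):
        plan.append("curator")
    # Importer: a "Maybe" in the matrix, so it depends on needs_import.
    if request_type in IMPORTER_IF_NEEDED and needs_import:
        plan.append("importer")
    # Ontologist: called last in every row of the matrix.
    plan.append("ontologist")
    return plan


if __name__ == "__main__":
    print(plan_agents("new_term_label_only", needs_import=True))
    print(plan_agents("fix_typo"))
```

Keeping the rows as plain sets makes it easy to compare the sketch against the table when a row changes.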
+ + +--- + +Last updated: 2025-11-19 +Version: 1.1 diff --git a/docs/README.md b/docs/README.md new file mode 100644 index 000000000..d09d4f9e7 --- /dev/null +++ b/docs/README.md @@ -0,0 +1,409 @@ +# EFO Agent System - Overview + +This directory contains the specifications for three specialized agents that work together to manage the Experimental Factor Ontology (EFO). + +## Agent Architecture v1.1 + +### Three-Agent System with Workflow Orchestration + +``` + ┌─────────────────┐ + │ User Request │ + └────────┬────────┘ + │ + ┌────────▼────────────┐ + │copilot-instructions │ ◄─── Workflow Orchestrator + │ (Decision Logic) │ & Decision Maker + └────┬───────┬────────┘ + │ │ + ┌───────────────┼───────┼──────────────┐ + │ │ │ │ + ┌────▼─────┐ ┌──────▼───┐ ┌─▼──────────┐ │ + │ EFO- │ │ EFO- │ │ EFO- │ │ + │ontologist│ │ curator │ │ importer │ │ + │(Editor) │ │(Research)│ │ (Import) │ │ + └──────────┘ └──────────┘ └────────────┘ │ + │ │ │ │ + └───────────────┴────────────┴─────────┘ + │ + Shared Context +``` + +**Key Changes in v1.1**: +- Workflow orchestration moved to `copilot-instructions.md` +- Decision logic (ontology placement, agent routing) centralized +- Agents are now narrow specialists with clear boundaries +- No agent-to-agent orchestration - all coordinated by instructions + +## The Agents + +### 1. 
EFO-ontologist (Specialist Editor) v1.1 +**File**: `.github/agents/EFO-ontologist.md` + +**Role**: OWL/XML editing specialist +- Handles all direct manipulation of `efo-edit.owl` +- Adds new terms (with pre-validated information) +- Edits existing terms +- Obsoletes terms following proper workflow +- Manages logical definitions and relationships +- Maintains ontology consistency + +**What it does NOT do**: +- Literature research (→ EFO-curator) +- External term imports (→ EFO-importer) +- Workflow orchestration (→ copilot-instructions) +- Architectural decisions (→ copilot-instructions) + +**When to invoke**: +- Add/edit/obsolete terms in efo-edit.owl +- Fix OWL/XML syntax issues +- Update relationships or metadata + +**Prerequisites**: +- New terms need pre-validated information +- External terms must be pre-imported + +**Key capabilities**: +- OWL/XML formatting +- Term integration +- Logical definitions +- Relationship management +- Git workflow + +### 2. EFO-curator (Research Specialist) +**File**: `.github/agents/EFO-curator.md` + +**Role**: Literature research and validation +- Deep literature searches using artl-mcp +- Validates term components (label, definition, xrefs, parent) +- Generates comprehensive validation reports +- Provides evidence-based recommendations +- Domain-specific expertise + +**When to invoke**: +- New term needs research/validation +- Definition requires literature support +- Parent term relationship unclear +- Ontology placement needs research + +**Key capabilities**: +- Europe PMC full-text search +- Citation validation +- Evidence gathering +- Domain expertise +- Structured reporting + +### 3. 
EFO-importer (Import Specialist) +**File**: `.github/agents/EFO-importer.md` + +**Role**: External ontology term importer +- Searches OLS for terms in external ontologies +- Bidirectional verification of term identity +- Adds IRIs to dependency files +- Updates mirrors and regenerates imports + +**When to invoke**: +- Parent term exists in external ontology +- Need to import related terms +- Cross-ontology relationships needed + +**Key capabilities**: +- OLS search +- IRI validation +- Dependency file management +- Import generation + +## Workflow Orchestration + +**Location**: `.github/copilot-instructions.md` + +The copilot instructions file now handles: +- Initial request triage +- Ontology placement decisions (EFO vs MONDO vs OBA vs CL, etc.) +- Agent invocation routing +- Workflow sequencing +- Quality assurance checks + +**Decision patterns**: +- Standard diseases → MONDO import +- General measurements → OBA consideration +- Experimental assays → EFO +- Cell types → CL import +- Anatomical entities → UBERON import + +## Handoff Protocol + +**File**: `HANDOFF-PROTOCOL.md` + +Defines: +- Communication patterns between agents +- Request/response formats +- Multi-agent workflows +- Error handling +- State tracking + +**Key patterns**: +1. **New term (needs research)**: copilot-instructions → Curator → Ontologist +2. **New term (pre-validated)**: copilot-instructions → Ontologist +3. **Import needed**: copilot-instructions → Importer → Ontologist +4. **External ontology**: Curator → User (no integration) +5. **Simple edit**: Ontologist only + +## Quick Start + +### For New Term Requests + +For basic requests, describe what you need: + +```markdown +Please add a new term: +- Label: [term name] +- Definition: [if you have one] +- Parent: [if you know it] +- References: [if you have any] +``` + +The workflow will: +1. Assess what you've provided (copilot-instructions) +2. Call @EFO-curator to fill gaps or validate +3. Call @EFO-importer if external terms needed +4. 
Call @EFO-ontologist to integrate into EFO +5. Create a PR for review + +Or invoke agents directly: + +```markdown +@EFO-curator please research [term name] +@EFO-importer please import [term name] from MONDO +@EFO-ontologist please add this validated term to efo-edit.owl +``` + +### For Editing Existing Terms + +```markdown +Please edit [term name] (EFO:XXXXXXX): +- [Describe the change needed] +``` + +Or directly: +```markdown +@EFO-ontologist edit EFO:XXXXXXX to update the definition +``` + +### For Obsoleting Terms + +```markdown +Please obsolete [term name] (EFO:XXXXXXX) +Replacement: [term name] (EFO:YYYYYYY) +Reason: [why obsoleting] +``` + +Or directly: +```markdown +@EFO-ontologist obsolete EFO:XXXXXXX, replaced by EFO:YYYYYYY +``` + +## Agent Capabilities Matrix + +| Capability | Ontologist | Curator | Importer | +|-----------|-----------|---------|----------| +| Literature search | ❌ | ✅ | ❌ | +| OWL/XML editing | ✅ | ❌ | ❌ | +| OLS search | Limited | Limited | ✅ | +| Definition validation | ❌ | ✅ | ❌ | +| Parent term import | ❌ | ❌ | ✅ | +| Ontology placement advisory | ❌ | ✅ | ❌ | +| Git workflow | ✅ | ❌ | ❌ | +| Term integration | ✅ | ❌ | ❌ | +| Workflow orchestration | ❌ | ❌ | ❌ | + +**Note**: Workflow orchestration and architectural decisions now handled by `copilot-instructions.md` + +## Tools Used + +### artl-mcp (Literature Research) +Used by: **Curator** +- `search_europepmc_papers`: Find papers by keywords +- `get_europepmc_paper_by_id`: Get metadata for specific papers +- `get_all_identifiers_from_europepmc`: Get PMIDs, DOIs, PMCIDs +- `get_europepmc_full_text`: Get full text as Markdown +- `get_europepmc_pdf_as_markdown`: Convert PDF to Markdown + +### ols4-mcp (Ontology Lookup) +Used by: **All agents** +- `mcp_ols4_search`: Search all ontologies +- `mcp_ols4_searchClasses`: Search specific ontology +- `mcp_ols4_fetch`: Get term details +- `mcp_ols4_getAncestors`: Get term hierarchy +- `mcp_ols4_getDescendants`: Get child terms + +### 
Standard Tools +- `grep_search`, `file_search`: Find terms in files +- `read_file`, `replace_string_in_file`: Edit ontology +- `run_in_terminal`: Execute make commands +- `manage_todo_list`: Track multi-step workflows + +## Workflow Examples + +### Example 1: Minimal Information +``` +User: "Add term: ATAC-seq" + ↓ +copilot-instructions: Triage → Call curator for research + ↓ +Curator: Research literature → Generate report + ↓ +copilot-instructions: Decide parent needs import → Call importer + ↓ +Importer: Import parent from OBI + ↓ +copilot-instructions: Call ontologist to integrate + ↓ +Ontologist: Add to efo-edit.owl → Create PR +``` + +### Example 2: Complete Information +``` +User: "Add cardiac measurement with definition, PMID, parent" + ↓ +copilot-instructions: Triage → Call curator to verify + ↓ +Curator: Verify citations → Validate parent → Confirm EFO placement + ↓ +copilot-instructions: Call ontologist to integrate + ↓ +Ontologist: Add term → Create PR +``` + +### Example 3: Should Be External +``` +User: "Add general disease term" + ↓ +copilot-instructions: Triage → Call curator for research + ↓ +Curator: Research → Recommend MONDO (not EFO) + ↓ +copilot-instructions: Acknowledge → Inform user (no integration) + ↓ +User: Submit to MONDO with curator's report +``` + +## Decision Trees + +### Should I create one agent or two? + +**Two agents is better because**: +✅ Separation of concerns (research vs integration) +✅ Curator can be called for external submissions too +✅ Different expertise required (literature vs OWL/XML) +✅ Easier to maintain and improve each +✅ Clear handoff points + +### Which agent do I call? + +``` +Are you a user? → @EFO-ontologist +Are you the ontologist needing validation? → @EFO-curator +Are you the ontologist needing imports? → @EFO-importer +Are you the curator? → Response to @EFO-ontologist +Are you the importer? 
→ Response to @EFO-ontologist +``` + +## File Structure + +``` +.github/agents/ +├── EFO-ontologist.md ← Specialist editor agent +├── EFO-curator.md ← Research & validation agent +├── EFO-importer.md ← Import specialist agent (existing) +└── HANDOFF-PROTOCOL.md ← Communication protocols +``` + +## Maintenance + +### Updating Agent Specifications + +When updating an agent: +1. Edit the relevant agent's `.md` file +2. Update `HANDOFF-PROTOCOL.md` if communication patterns change +3. Update this README if capabilities change +4. Test the workflow with a sample issue + +### Adding New Capabilities + +When adding new tools or workflows: +1. Determine which agent should handle it +2. Update that agent's specification +3. Update the handoff protocol if it involves multiple agents +4. Add it to the capabilities matrix in this README + +### Common Issues + +**Agent not finding terms**: +- Check OLS is accessible +- Verify term exists in expected ontology +- Try alternative search terms + +**Literature search returns nothing**: +- Try broader search terms +- Search for related concepts +- Check alternative spellings/synonyms + +**Import fails**: +- Verify term exists in source ontology +- Check IRI format +- Ensure mirrors are up to date + +## Best Practices + +### For Users +- Provide as much information as you have +- Include relevant PMIDs or papers if known +- Mention domain context (disease, measurement, etc.) 
+- Reference related existing terms if applicable + +### For Agent Development +- Keep agents focused on their core competency +- Use structured communication formats +- Always validate before integrating +- Document decisions in commit messages and PRs +- Use TODO lists for multi-step workflows + +### For Ontology Curation +- Always require literature support +- Verify parent relationships make sense +- Check for existing terms before creating new ones +- Consider external ontologies for general concepts +- Maintain consistency with existing patterns + +## Testing the System + +To test the agent system: + +1. **Simple test**: "Add synonym 'XYZ' to term ABC" + - Should: Ontologist only + +2. **Medium test**: "Add new term: [label only]" + - Should: Ontologist → Curator → (maybe Importer) → Ontologist + +3. **Complex test**: "Add new measurement with is_about relationship" + - Should: All three agents, full validation, logical definition + +4. **Edge test**: "Add general anatomical term" + - Should: Ontologist → Curator → Recommend UBERON + +## Support + +For questions about: +- **Agent behavior**: See individual agent `.md` files +- **Communication**: See `HANDOFF-PROTOCOL.md` +- **Ontology editing**: See main `copilot-instructions.md` +- **Import process**: See `docs/Import_terms_from_another_ontology.md` + +## Version History + +- **v1.0** (2025-01-06): Initial three-agent system + - EFO-ontologist (orchestrator) + - EFO-curator (researcher) + - EFO-importer (existing, connector) + - Handoff protocol established diff --git a/docs/agents-documentation/QUICK-REFERENCE.md b/docs/agents-documentation/QUICK-REFERENCE.md new file mode 100644 index 000000000..6865e738d --- /dev/null +++ b/docs/agents-documentation/QUICK-REFERENCE.md @@ -0,0 +1,365 @@ +# EFO Agent System - Quick Reference Guide v1.1 + +## 🎯 The Three-Agent System at a Glance + +``` +┌─────────────────────────────────────────────────────────────┐ +│ USER REQUEST │ +│ "Please add term: [name]" │ 
+└────────────────────────────┬────────────────────────────────┘ + │ + ▼ + ┌────────────────────────────────────────┐ + │ COPILOT-INSTRUCTIONS.MD │ + │ Workflow Orchestrator & Router │ + │ │ + │ • Receives all user requests │ + │ • Makes architectural decisions │ + │ • Routes to appropriate agents │ + │ • Sequences multi-agent workflows │ + └──────────┬──────────────────┬──────────┘ + │ │ + ┌─────────▼─────────┐ ┌────▼───────────┐ ┌─────▼──────────┐ + │ EFO-ONTOLOGIST │ │ EFO-CURATOR │ │ EFO-IMPORTER │ + │ Specialist Editor │ │ The Researcher │ │ The Connector │ + │ │ │ │ │ │ + │ • OWL/XML editing │ │ • Literature │ │ • OLS search │ + │ • Term addition │ │ search │ │ • Term import │ + │ • Term obsoletion │ │ • Validation │ │ • IRI deps │ + │ • Logical defs │ │ • Citations │ │ • Mirrors │ + │ • Git workflow │ │ • Recommends │ │ │ + └───────────────────┘ └────────────────┘ └────────────────┘ +``` + +**Key Changes in v1.1**: +- **No agent orchestrates others** - copilot-instructions handles routing +- **Agents are specialists** - narrow, well-defined responsibilities +- **Clear boundaries** - no overlapping decision-making + +## Decision Matrix: What Happens When? + +| User Request | Instructions Route | Curator Called? | Importer Called? | Ontologist Called? 
| +|--------------|-------------------|-----------------|------------------|-------------------| +| New term (label only) | Research → validate → integrate | YES (research) | Maybe | YES (integrate) | +| New term (complete info) | Verify → integrate | YES (verify) | Maybe | YES (integrate) | +| Edit definition | Assess → maybe research → edit | If needs citations | NO | YES (edit) | +| Fix typo | Direct to ontologist | NO | NO | YES (edit) | +| Obsolete term | Direct to ontologist | NO | Maybe (if replacement external) | YES (obsolete) | +| Add synonym | Direct to ontologist | Only if validation needed | NO | YES (edit) | + +## Common Workflows + +### Workflow A: Minimal Info → Full Integration +``` +User: "Add term: ATAC-seq" + +1. copilot-instructions: Route to curator for research + ↓ +2. Curator: Research literature + - Search Europe PMC + - Find definition: "Assay for Transposase-Accessible Chromatin..." + - Locate PMIDs: 24097267, others + - Identify parent: "chromatin accessibility assay" + - Report: "Ready for EFO; parent may need import from OBI" + ↓ +3. copilot-instructions: "Parent not in EFO, call importer" + ↓ +4. 🔗 Importer: Search OLS + - Find: OBI:0002039 + - Add to obi_terms.txt + - Confirm: "Import complete" + ↓ +5. copilot-instructions: "Call ontologist to integrate" + ↓ +6. Ontologist: Integration + - Generate EFO_0920XXX + - Create OWL/XML entry + - Add SubClassOf OBI:0002039 + - Normalize + - Commit → PR + ↓ +Done +``` + +### Workflow B: Complete Info → Quick Verify +``` +User: "Add cardiac troponin measurement" + Definition: [provided] + PMID: 12345678 + Parent: blood measurement + +1. copilot-instructions: Route to curator for verification + ↓ +2. Curator: Validate + - Check PMID ✅ relevant + - Verify definition ✅ accurate + - Confirm parent ✅ appropriate + - Note: needs "is_about cardiac troponin" + - Report: "Ready for EFO, import PR:000000058" + ↓ +3. copilot-instructions: "Call importer for cardiac troponin" + ↓ +4. 
Importer: Import cardiac troponin from PR + ↓ +5. copilot-instructions: "Call ontologist to integrate" + ↓ +6. Ontologist: Integration with logical definition + ↓ +Done +``` + +### Workflow C: External Ontology Recommendation +``` +User: "Add Alzheimer's disease" + +1. copilot-instructions: Route to curator + ↓ +2. Curator: Research + - Search literature ✅ + - Find definition ✅ + - Check MONDO: ✅ MONDO:0004975 exists! + - Report: "DO NOT create in EFO, import from MONDO" + ↓ +3. copilot-instructions: "Call importer" + ↓ +4. Importer: Import MONDO:0004975 + ↓ +Done (imported, not created) +``` + +### Workflow D: Should Be in OBA +``` +User: "Add body mass index measurement" + +1. copilot-instructions: Route to curator + ↓ +2. Curator: Research + - Search literature ✅ + - Find definition ✅ + - Analyze domain: general biological attribute + - Report: "Create in OBA, not EFO" + - Provide full validation report + ↓ +3. copilot-instructions → User: + "This should be created in OBA because it's a general + biological attribute measurement. Here's the complete + validation report to submit to OBA..." + ↓ +Done (no EFO integration, user submits to OBA) +``` + +## Agent Profiles + +### EFO-Ontologist: The Specialist Editor +- **Role**: OWL/XML manipulation expert +- **Mindset**: "How do I format this correctly?" +- **Strengths**: Precise syntax, consistent formatting, git workflow +- **Limitations**: No research, no imports, no orchestration +- **Says**: + - "Adding term to efo-edit.owl..." + - "Generating EFO_0920XXX..." + - "Running normalization..." + - "Creating PR..." + +### EFO-Curator: The Diligent Researcher +- **Role**: Literature research and validation +- **Mindset**: "What does the literature say? Is this accurate?" 
+- **Strengths**: Deep research, evidence-based, thorough +- **Limitations**: No OWL/XML editing, no imports +- **Says**: + - "Found 15 papers mentioning this concept" + - "Definition supported by PMID:12345678" + - "This actually belongs in OBA based on usage patterns" + - "Recommend importing from MONDO" + +### EFO-Importer: The Efficient Connector +- **Role**: External term import specialist +- **Mindset**: "Where is this term? Is this the right one?" +- **Strengths**: Fast OLS lookups, precise verification +- **Limitations**: Only imports, no integration, no research +- **Says**: + - "Found in CL as CL:1000348" + - "Import complete, ready to use" + - "Term not found in CL, trying UBERON..." + +### copilot-instructions: The Orchestrator +- **Role**: Workflow coordination and decision-making +- **Mindset**: "What needs to happen? In what order?" +- **Strengths**: Architectural decisions, agent routing, workflow sequencing +- **Says**: + - "This needs research first, calling curator..." + - "Term validated, parent needs import, calling importer..." + - "Ready to integrate, calling ontologist..." 
+ - "This belongs in MONDO, not EFO" + +## Capabilities Comparison + +| Task | Ontologist | Curator | Importer | +|------|-----------|---------|----------| +| **Literature Search** | | | | +| Europe PMC search | ❌ | ✅ Full | ❌ | +| Full text analysis | ❌ | ✅ Yes | ❌ | +| Citation validation | ❌ | ✅ Yes | ❌ | +| **Ontology Work** | | | | +| OWL/XML editing | ✅ Expert | ❌ | ❌ | +| OLS search | Limited | ✅ Yes | ✅ Expert | +| Import terms | ❌ | ❌ | ✅ Yes | +| Logical definitions | ✅ Yes | ❌ | ❌ | +| **Decision Making** | | | | +| Workflow routing | ❌ | ❌ | ❌ | +| Ontology placement | ❌ | ✅ Advises | ❌ | +| Parent selection | ✅ Implements | ✅ Researches | ✅ Finds | +| **Git Workflow** | | | | +| Branches | ✅ Yes | ❌ | ❌ | +| Commits | ✅ Yes | ❌ | ❌ | +| PRs | ✅ Yes | ❌ | ❌ | + +**Note**: Workflow routing and architectural decisions are now handled by `copilot-instructions.md` + +## When to Use Which Agent + +### Use @EFO-ontologist when: +- Need term integration (with pre-validated information) +- Need an existing term edited +- Need obsoletion +- Need OWL/XML syntax issues fixed +- Need relationships or metadata updated + +### Use @EFO-curator when: +- Need literature research +- Need definition validation +- Unclear which ontology is appropriate +- Missing metadata + +### Use @EFO-importer when: +- Need external term imported +- Parent is in another ontology +- Need to check if term exists elsewhere + +## Pro Tips + +### For Users +1. **Start with a plain request**: Describe what you need; `copilot-instructions.md` routes it to the right agents +2. **Provide what you have**: Even partial info is helpful +3. **Trust the process**: The workflow coordinates the agents automatically +4. **Don't worry about ontology choice**: Curator will recommend + +### For Ontologist +1. **Always validate**: Even complete requests should go to curator +2. **Think cross-ontology**: Consider MONDO, OBA, CL, UBERON first +3. **Don't skip importer**: Always import parents if they are from a different ontology, never copy-paste +4. 
**Document decisions**: Explain non-obvious choices in PRs + +### For Curator +1. **Be thorough**: More evidence is better than less +2. **Flag uncertainties**: Explicitly state confidence levels +3. **Think domain**: Consider measurement vs disease vs cell type +4. **Recommend boldly**: Don't hesitate to suggest external ontologies + +### For Importer +1. **Verify bidirectionally**: Always fetch after search to confirm +2. **Note environment**: GitHub vs VS Code matters +3. **Suggest alternatives**: If term not found, help find it elsewhere + +## Success Metrics + +### A Good Curator Report Has: +- Clear definition with 2-3 literature sources +- Validated parent term with justification +- PMIDs and DOIs (both when available) +- Synonyms with sources +- Clear ontology recommendation +- Confidence levels stated + +### A Good Ontologist Integration Has: +- All required components (label, def, xref, parent) +- Proper OWL/XML formatting +- Logical definitions when appropriate +- Normalized without errors +- Clear commit message +- Complete PR description + +### A Good Importer Job Has: +- Correct term found in correct ontology +- Bidirectional verification passed +- IRI added to correct dependency file +- Ready to use in efo-edit.owl + +## Red Flags + +### Curator Should Flag: +- No literature support found +- Conflicting definitions in papers +- Term seems to belong in another ontology +- Parent term doesn't make sense +- Provided citations don't support definition + +### Ontologist Should Flag: +- Curator has low confidence +- Parent term needs importing but not found +- Logical definition doesn't match text definition +- Term already exists in EFO or imports +- Obsoletion would break many relationships + +### Importer Should Flag: +- Term not found in expected ontology +- Multiple candidate terms (ambiguous) +- Term doesn't match description +- Ontology mirror is stale + +## Documentation Structure + +``` +docs/agents-documentation/ +│ +├── README.md ← Overview & 
quick start +└── QUICK-REFERENCE.md ← This file (visual guide) + +.github/agents/ +│ +├── EFO-ontologist.md ← Full ontologist spec +├── EFO-curator.md ← Full curator spec +├── EFO-importer.md ← Full importer spec +└── HANDOFF-PROTOCOL.md ← Communication protocols +``` + +**Read this first**: `README.md` +**Need details**: Individual agent `.md` files +**Understanding communication**: `HANDOFF-PROTOCOL.md` +**Quick lookup**: This file (`QUICK-REFERENCE.md`) + +## Related Documentation + +- **Main guide**: `.github/copilot-instructions.md` +- **Import workflow**: `docs/Import_terms_from_another_ontology.md` +- **Editor workflow**: `docs/odk-workflows/EditorsWorkflow.md` +- **ODK docs**: `docs/odk-workflows/` + +## Common Questions + +**Q: Why three agents instead of one?** +A: Separation of concerns. Research skills ≠ Integration skills. Each agent is an expert at one thing. + +**Q: Can I call curator directly?** +A: Yes, but it is usually better to make a plain request and let copilot-instructions sequence the full workflow. + +**Q: What if curator says "should be in OBA"?** +A: copilot-instructions passes the curator's report to the user for OBA submission. No EFO integration. + +**Q: Do I need to know OWL/XML?** +A: No! Just describe what you want. The ontologist handles all the OWL/XML details. + +**Q: How long does curation take?** +A: Depends on literature availability. Simple terms: fast. Novel concepts: may take research time. + +**Q: What if a term exists in multiple ontologies?** +A: Curator researches which is authoritative. copilot-instructions decides whether to import or create. + +**Q: Can I update an agent?** +A: Yes! Edit the `.md` file, update handoff protocol if needed, test with a sample issue. 
+ + +--- + +Last updated: 2025-11-19 +Version: 1.1 diff --git a/docs/agents-documentation/README.md b/docs/agents-documentation/README.md new file mode 100644 index 000000000..d09d4f9e7 --- /dev/null +++ b/docs/agents-documentation/README.md @@ -0,0 +1,409 @@ +# EFO Agent System - Overview + +This directory contains the specifications for three specialized agents that work together to manage the Experimental Factor Ontology (EFO). + +## Agent Architecture v1.1 + +### Three-Agent System with Workflow Orchestration + +``` + ┌─────────────────┐ + │ User Request │ + └────────┬────────┘ + │ + ┌────────▼────────────┐ + │copilot-instructions │ ◄─── Workflow Orchestrator + │ (Decision Logic) │ & Decision Maker + └────┬───────┬────────┘ + │ │ + ┌───────────────┼───────┼──────────────┐ + │ │ │ │ + ┌────▼─────┐ ┌──────▼───┐ ┌─▼──────────┐ │ + │ EFO- │ │ EFO- │ │ EFO- │ │ + │ontologist│ │ curator │ │ importer │ │ + │(Editor) │ │(Research)│ │ (Import) │ │ + └──────────┘ └──────────┘ └────────────┘ │ + │ │ │ │ + └───────────────┴────────────┴─────────┘ + │ + Shared Context +``` + +**Key Changes in v1.1**: +- Workflow orchestration moved to `copilot-instructions.md` +- Decision logic (ontology placement, agent routing) centralized +- Agents are now narrow specialists with clear boundaries +- No agent-to-agent orchestration - all coordinated by instructions + +## The Agents + +### 1. 
EFO-ontologist (Specialist Editor) v1.1 +**File**: `.github/agents/EFO-ontologist.md` + +**Role**: OWL/XML editing specialist +- Handles all direct manipulation of `efo-edit.owl` +- Adds new terms (with pre-validated information) +- Edits existing terms +- Obsoletes terms following proper workflow +- Manages logical definitions and relationships +- Maintains ontology consistency + +**What it does NOT do**: +- Literature research (→ EFO-curator) +- External term imports (→ EFO-importer) +- Workflow orchestration (→ copilot-instructions) +- Architectural decisions (→ copilot-instructions) + +**When to invoke**: +- Add/edit/obsolete terms in efo-edit.owl +- Fix OWL/XML syntax issues +- Update relationships or metadata + +**Prerequisites**: +- New terms need pre-validated information +- External terms must be pre-imported + +**Key capabilities**: +- OWL/XML formatting +- Term integration +- Logical definitions +- Relationship management +- Git workflow + +### 2. EFO-curator (Research Specialist) +**File**: `.github/agents/EFO-curator.md` + +**Role**: Literature research and validation +- Deep literature searches using artl-mcp +- Validates term components (label, definition, xrefs, parent) +- Generates comprehensive validation reports +- Provides evidence-based recommendations +- Domain-specific expertise + +**When to invoke**: +- New term needs research/validation +- Definition requires literature support +- Parent term relationship unclear +- Ontology placement needs research + +**Key capabilities**: +- Europe PMC full-text search +- Citation validation +- Evidence gathering +- Domain expertise +- Structured reporting + +### 3. 
EFO-importer (Import Specialist) +**File**: `.github/agents/EFO-importer.md` + +**Role**: External ontology term importer +- Searches OLS for terms in external ontologies +- Bidirectional verification of term identity +- Adds IRIs to dependency files +- Updates mirrors and regenerates imports + +**When to invoke**: +- Parent term exists in external ontology +- Need to import related terms +- Cross-ontology relationships needed + +**Key capabilities**: +- OLS search +- IRI validation +- Dependency file management +- Import generation + +## Workflow Orchestration + +**Location**: `.github/copilot-instructions.md` + +The copilot instructions file now handles: +- Initial request triage +- Ontology placement decisions (EFO vs MONDO vs OBA vs CL, etc.) +- Agent invocation routing +- Workflow sequencing +- Quality assurance checks + +**Decision patterns**: +- Standard diseases → MONDO import +- General measurements → OBA consideration +- Experimental assays → EFO +- Cell types → CL import +- Anatomical entities → UBERON import + +## Handoff Protocol + +**File**: `HANDOFF-PROTOCOL.md` + +Defines: +- Communication patterns between agents +- Request/response formats +- Multi-agent workflows +- Error handling +- State tracking + +**Key patterns**: +1. **New term (needs research)**: copilot-instructions → Curator → Ontologist +2. **New term (pre-validated)**: copilot-instructions → Ontologist +3. **Import needed**: copilot-instructions → Importer → Ontologist +4. **External ontology**: Curator → User (no integration) +5. **Simple edit**: Ontologist only + +## Quick Start + +### For New Term Requests + +For basic requests, describe what you need: + +```markdown +Please add a new term: +- Label: [term name] +- Definition: [if you have one] +- Parent: [if you know it] +- References: [if you have any] +``` + +The workflow will: +1. Assess what you've provided (copilot-instructions) +2. Call @EFO-curator to fill gaps or validate +3. Call @EFO-importer if external terms needed +4. 
Call @EFO-ontologist to integrate into EFO +5. Create a PR for review + +Or invoke agents directly: + +```markdown +@EFO-curator please research [term name] +@EFO-importer please import [term name] from MONDO +@EFO-ontologist please add this validated term to efo-edit.owl +``` + +### For Editing Existing Terms + +```markdown +Please edit [term name] (EFO:XXXXXXX): +- [Describe the change needed] +``` + +Or directly: +```markdown +@EFO-ontologist edit EFO:XXXXXXX to update the definition +``` + +### For Obsoleting Terms + +```markdown +Please obsolete [term name] (EFO:XXXXXXX) +Replacement: [term name] (EFO:YYYYYYY) +Reason: [why obsoleting] +``` + +Or directly: +```markdown +@EFO-ontologist obsolete EFO:XXXXXXX, replaced by EFO:YYYYYYY +``` + +## Agent Capabilities Matrix + +| Capability | Ontologist | Curator | Importer | +|-----------|-----------|---------|----------| +| Literature search | ❌ | ✅ | ❌ | +| OWL/XML editing | ✅ | ❌ | ❌ | +| OLS search | Limited | Limited | ✅ | +| Definition validation | ❌ | ✅ | ❌ | +| Parent term import | ❌ | ❌ | ✅ | +| Ontology placement advisory | ❌ | ✅ | ❌ | +| Git workflow | ✅ | ❌ | ❌ | +| Term integration | ✅ | ❌ | ❌ | +| Workflow orchestration | ❌ | ❌ | ❌ | + +**Note**: Workflow orchestration and architectural decisions now handled by `copilot-instructions.md` + +## Tools Used + +### artl-mcp (Literature Research) +Used by: **Curator** +- `search_europepmc_papers`: Find papers by keywords +- `get_europepmc_paper_by_id`: Get metadata for specific papers +- `get_all_identifiers_from_europepmc`: Get PMIDs, DOIs, PMCIDs +- `get_europepmc_full_text`: Get full text as Markdown +- `get_europepmc_pdf_as_markdown`: Convert PDF to Markdown + +### ols4-mcp (Ontology Lookup) +Used by: **All agents** +- `mcp_ols4_search`: Search all ontologies +- `mcp_ols4_searchClasses`: Search specific ontology +- `mcp_ols4_fetch`: Get term details +- `mcp_ols4_getAncestors`: Get term hierarchy +- `mcp_ols4_getDescendants`: Get child terms + +### 
Standard Tools +- `grep_search`, `file_search`: Find terms in files +- `read_file`, `replace_string_in_file`: Edit ontology +- `run_in_terminal`: Execute make commands +- `manage_todo_list`: Track multi-step workflows + +## Workflow Examples + +### Example 1: Minimal Information +``` +User: "Add term: ATAC-seq" + ↓ +copilot-instructions: Triage → Call curator for research + ↓ +Curator: Research literature → Generate report + ↓ +copilot-instructions: Decide parent needs import → Call importer + ↓ +Importer: Import parent from OBI + ↓ +copilot-instructions: Call ontologist to integrate + ↓ +Ontologist: Add to efo-edit.owl → Create PR +``` + +### Example 2: Complete Information +``` +User: "Add cardiac measurement with definition, PMID, parent" + ↓ +copilot-instructions: Triage → Call curator to verify + ↓ +Curator: Verify citations → Validate parent → Confirm EFO placement + ↓ +copilot-instructions: Call ontologist to integrate + ↓ +Ontologist: Add term → Create PR +``` + +### Example 3: Should Be External +``` +User: "Add general disease term" + ↓ +copilot-instructions: Triage → Call curator for research + ↓ +Curator: Research → Recommend MONDO (not EFO) + ↓ +copilot-instructions: Acknowledge → Inform user (no integration) + ↓ +User: Submit to MONDO with curator's report +``` + +## Decision Trees + +### Should I create one agent or two? + +**Two agents is better because**: +✅ Separation of concerns (research vs integration) +✅ Curator can be called for external submissions too +✅ Different expertise required (literature vs OWL/XML) +✅ Easier to maintain and improve each +✅ Clear handoff points + +### Which agent do I call? + +``` +Are you a user? → @EFO-ontologist +Are you the ontologist needing validation? → @EFO-curator +Are you the ontologist needing imports? → @EFO-importer +Are you the curator? → Response to @EFO-ontologist +Are you the importer? 
→ Response to @EFO-ontologist +``` + +## File Structure + +``` +.github/agents/ +├── EFO-ontologist.md ← OWL/XML editing specialist agent +├── EFO-curator.md ← Research & validation agent +├── EFO-importer.md ← Import specialist agent (existing) +└── HANDOFF-PROTOCOL.md ← Communication protocols +``` + +## Maintenance + +### Updating Agent Specifications + +When updating an agent: +1. Edit the relevant agent's `.md` file +2. Update `HANDOFF-PROTOCOL.md` if communication patterns change +3. Update this README if capabilities change +4. Test the workflow with a sample issue + +### Adding New Capabilities + +When adding new tools or workflows: +1. Determine which agent should handle it +2. Update that agent's specification +3. Update the handoff protocol if it involves multiple agents +4. Add it to the capabilities matrix in this README + +### Common Issues + +**Agent not finding terms**: +- Check that OLS is accessible +- Verify the term exists in the expected ontology +- Try alternative search terms + +**Literature search returns nothing**: +- Try broader search terms +- Search for related concepts +- Check alternative spellings/synonyms + +**Import fails**: +- Verify the term exists in the source ontology +- Check IRI format +- Ensure mirrors are up to date + +## Best Practices + +### For Users +- Provide as much information as you have +- Include relevant PMIDs or papers if known +- Mention domain context (disease, measurement, etc.) 
+- Reference related existing terms if applicable + +### For Agent Development +- Keep agents focused on their core competency +- Use structured communication formats +- Always validate before integrating +- Document decisions in commit messages and PRs +- Use TODO lists for multi-step workflows + +### For Ontology Curation +- Always require literature support +- Verify parent relationships make sense +- Check for existing terms before creating new ones +- Consider external ontologies for general concepts +- Maintain consistency with existing patterns + +## Testing the System + +To test the agent system: + +1. **Simple test**: "Add synonym 'XYZ' to term ABC" + - Should: Ontologist only + +2. **Medium test**: "Add new term: [label only]" + - Should: Ontologist → Curator → (maybe Importer) → Ontologist + +3. **Complex test**: "Add new measurement with is_about relationship" + - Should: All three agents, full validation, logical definition + +4. **Edge test**: "Add general anatomical term" + - Should: Ontologist → Curator → Recommend UBERON + +## Support + +For questions about: +- **Agent behavior**: See individual agent `.md` files +- **Communication**: See `HANDOFF-PROTOCOL.md` +- **Ontology editing**: See main `copilot-instructions.md` +- **Import process**: See `docs/Import_terms_from_another_ontology.md` + +## Version History + +- **v1.0** (2025-01-06): Initial three-agent system + - EFO-ontologist (orchestrator) + - EFO-curator (researcher) + - EFO-importer (existing, connector) + - Handoff protocol established
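As a rough illustration of the decision patterns listed under Workflow Orchestration, the placement logic could be sketched as a simple lookup table. This is a minimal sketch, not part of the agent system: the category strings, the `Placement` type, the action labels, and the `suggest_placement` helper are all hypothetical names introduced here. Only the category-to-ontology pairs come from this README.

```python
from typing import NamedTuple, Optional


class Placement(NamedTuple):
    ontology: str  # target ontology named in the README's decision patterns
    action: str    # "import", "consider", or "create" (labels are assumptions)


# Keys are hypothetical category labels; the ontology targets mirror the
# "Decision patterns" list in this README.
DECISION_PATTERNS = {
    "standard disease": Placement("MONDO", "import"),
    "general measurement": Placement("OBA", "consider"),
    "experimental assay": Placement("EFO", "create"),
    "cell type": Placement("CL", "import"),
    "anatomical entity": Placement("UBERON", "import"),
}


def suggest_placement(category: str) -> Optional[Placement]:
    """Return the suggested placement, or None when no pattern applies."""
    # Normalize case and surrounding whitespace before the lookup.
    return DECISION_PATTERNS.get(category.strip().lower())


if __name__ == "__main__":
    for category in ("Cell type", "experimental assay", "protein complex"):
        print(category, "->", suggest_placement(category))
```

A category with no matching pattern returns `None`, which in the workflow described above would correspond to sending the request to the curator for placement research rather than guessing.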