Incorporate Waza for standardized skill development and evaluation
Summary
This issue proposes integrating Waza (Microsoft's CLI/framework for agent skills) into openstudio-mcp to standardize skill creation, evaluation, and improvement processes. This will enhance skill quality, enable automated validation, and provide measurable improvements in our skill ecosystem.
Benefits of Incorporating Waza
1. Standardized Skill Creation
- Consistent skill structure with proper YAML frontmatter
- Automatic scaffolding of skills and evaluation suites
- Reduced onboarding friction for new contributors
2. Automated Evaluation & Validation
- Generate test cases from skill definitions
- Run standardized evaluations with multiple models
- CI/CD integration for automated skill validation
- Skill readiness checking (compliance, token budget, spec adherence)
3. Quality Improvement
- LLM-as-judge quality assessment across multiple dimensions
- Iterative skill improvement with Copilot suggestions
- Token optimization for MCP host constraints
- Comparative analysis across models and versions
4. Enhanced Collaboration
- Standardized reporting and visualization
- Historical tracking of evaluation results
- Cloud storage for team result sharing
- Session logging for debugging complex interactions
5. Data-Driven Development
- Metrics on skill performance over time
- Coverage analysis of skills vs. evaluations
- Regression detection through result comparison
- Evidence-based skill improvement decisions
Detailed Implementation Plan
Phase 1: Foundation & Pilot Skill (Weeks 1-2)
- Install waza CLI in development environment
- Select pilot skill (e.g., `add-hvac`) for initial integration
- Migrate pilot skill to waza-standardized format:
  - Use `waza new skill add-hvac` to create standardized structure
  - Generate evaluation suite with `waza new eval add-hvac`
  - Migrate existing workflow instructions to SKILL.md frontmatter
  - Establish baseline evaluation with `waza run`
- Document pilot process for team adoption
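As a complement to the migration step above, a small pre-check could confirm that a migrated SKILL.md actually carries frontmatter before waza is run against it. The sketch below is illustrative only: it assumes the frontmatter is a leading `---` delimited block of flat `key: value` lines and that `name` and `description` are the required fields, which follows the common agent-skill convention but has not been verified against waza's own rules. The function names are hypothetical.

```python
# Assumption: these are the required frontmatter fields; adjust to whatever
# waza's readiness check actually enforces.
REQUIRED_KEYS = ("name", "description")


def read_frontmatter(text: str) -> dict[str, str]:
    """Parse a leading '---' delimited block of flat 'key: value' lines.

    Returns an empty dict when the block is absent or never closed.
    Nested YAML is deliberately not supported in this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        # Skip indented lines and comments; only top-level keys are recorded.
        if sep and key.strip() and not key.startswith((" ", "\t", "#")):
            fields[key.strip()] = value.strip().strip("\"'")
    return {}  # no closing delimiter found


def missing_keys(text: str) -> list[str]:
    """List the required keys that are absent or empty in the frontmatter."""
    fields = read_frontmatter(text)
    return [key for key in REQUIRED_KEYS if not fields.get(key)]
```

Calling `missing_keys(Path("skills/add-hvac/SKILL.md").read_text())` on the pilot skill would return an empty list once both fields are filled in. This is a convenience check only and does not replace `waza check`.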
Phase 2: Tooling & CI Integration (Weeks 3-4)
- Create a `.waza.yaml` configuration file for openstudio-mcp
- Add waza evaluation to GitHub Actions workflow:

      - name: Run skill evaluations
        run: waza run evals/<skill-name>/eval.yaml --format github-comment
      - name: Check skill readiness
        run: waza check skills/<name> || exit 1
      - name: Token budget validation
        run: waza tokens compare main --skills --threshold 10 --strict --format json
- Integrate skill quality checks into PR validation
- Set up result storage configuration (local initially, optional cloud)
- Create contributor documentation for waza workflows
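To make the token budget gate concrete for reviewers, here is a minimal sketch of the kind of comparison a threshold of 10 might express. It assumes the threshold is a percentage growth in token count relative to the base branch; how waza actually interprets `--threshold` has not been confirmed here, and the function name is hypothetical.

```python
def exceeds_threshold(base_tokens: int, head_tokens: int,
                      threshold_percent: float = 10.0) -> bool:
    """Return True when head grew by more than threshold_percent over base.

    Written as a cross-multiplication so no division is needed and a base of
    zero tokens is handled: any growth from zero counts as over budget.
    """
    return (head_tokens - base_tokens) * 100 > threshold_percent * base_tokens


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from openstudio-mcp skills.
    for base, head in [(1000, 1100), (1000, 1101), (1000, 900)]:
        print(base, head, exceeds_threshold(base, head))
```

Under this reading, a skill that grows from 1000 to 1100 tokens sits exactly at the limit and passes, while any larger increase would fail the PR check.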
Phase 3: Full Migration & Advanced Features (Weeks 5-6)
- Migrate all existing skills to waza format using a systematic approach:
  - For each skill, run `waza new eval <skill-name>` to generate an evaluation
  - Manually transfer workflow content to SKILL.md
  - Verify equivalence with existing eval.md files
- Implement advanced features:
  - Session logging for complex skill debugging
  - Cross-model comparison capabilities
  - Skill coverage analysis with `waza coverage`
  - Token optimization suggestions
- Establish skill quality baseline metrics
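During the migration it may help to track which skills still lack an evaluation suite. The sketch below assumes the layout implied by the CI commands in this issue (`skills/<name>` directories and `evals/<skill-name>/eval.yaml` files); it is a rough progress tracker, not a reimplementation of `waza coverage`, and the function name is hypothetical.

```python
from pathlib import Path


def skills_without_evals(root: Path) -> list[str]:
    """Return names of skill directories that have no evals/<name>/eval.yaml.

    Assumes skills live under root/skills and evaluation suites under
    root/evals, mirroring the paths used in the proposed workflow steps.
    """
    skills_dir = root / "skills"
    if not skills_dir.is_dir():
        return []
    missing = []
    for skill in sorted(p for p in skills_dir.iterdir() if p.is_dir()):
        if not (root / "evals" / skill.name / "eval.yaml").is_file():
            missing.append(skill.name)
    return missing


if __name__ == "__main__":
    # Run from the repository root to list skills still awaiting migration.
    for name in skills_without_evals(Path(".")):
        print(f"no evaluation suite yet: {name}")
```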
Phase 4: Optimization & Governance (Ongoing)
- Regular skill health checks via automated workflows
- Continuous improvement based on evaluation results
- Community contribution guidelines updated with waza processes
- Periodic review of waza configuration and thresholds
- Knowledge sharing sessions on effective skill development
Success Metrics
- Reduction in skill creation time for new contributors
- Increase in skill evaluation pass rates
- Decrease in token usage per skill while maintaining functionality
- Improved consistency in skill structure and documentation
- Faster identification of skill regressions through automated testing
- Enhanced contributor satisfaction with standardized processes
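The pass-rate and regression metrics above could be computed from two evaluation runs roughly as follows. This is a sketch under stated assumptions: each run is represented as a plain mapping from task name to pass/fail, which is a hypothetical format chosen for illustration and not waza's actual result schema.

```python
def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of tasks that passed; 0.0 for an empty run."""
    return sum(results.values()) / len(results) if results else 0.0


def find_regressions(baseline: dict[str, bool],
                     current: dict[str, bool]) -> list[str]:
    """Tasks that passed in the baseline run and fail in the current run.

    Tasks missing from the current run are not counted as regressions here;
    a stricter policy could treat them as failures.
    """
    return sorted(
        task for task, passed in baseline.items()
        if passed and current.get(task) is False
    )


if __name__ == "__main__":
    # Hypothetical task names, for illustration only.
    baseline = {"sizing": True, "zoning": True, "schedules": False}
    current = {"sizing": True, "zoning": False, "schedules": True}
    print(find_regressions(baseline, current))
    print(pass_rate(current))
```

Tracking both numbers matters because an unchanged overall pass rate can hide a regression that is offset by a fix elsewhere, as in the example data.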
Dependencies & Considerations
- Requires Go 1.26+ for waza installation (already met in dev environment)
- Need to adapt waza templates to match openstudio-mcp's specific skill format
- Initial time investment for skill migration (estimated 15-30 mins per skill)
- Training required for team on new workflows
- Optional Azure storage configuration for team result sharing (can start local)
Next Steps for Discussion
- Confirm interest in pursuing this integration
- Select pilot skill for initial implementation
- Determine timeline and resource allocation
- Decide on cloud storage configuration preferences
- Establish review process for migrated skills
This implementation would transform openstudio-mcp's skill development from informal practices to a standardized, measurable, and continuously improvable process aligned with industry best practices for AI agent skill development.