Skip to content

Add HiveMind gateway integration#16

Open
pookNast wants to merge 29 commits into
hexo-ai:mainfrom
pookNast:hivemind-split
Open

Add HiveMind gateway integration#16
pookNast wants to merge 29 commits into
hexo-ai:mainfrom
pookNast:hivemind-split

Conversation

@pookNast

@pookNast pookNast commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Adds a HiveMind/local-LLM reference target agent template
  • Wires --hivemind CLI selection into orchestration
  • Adds HiveMind endpoint/model configuration via environment overrides

Test plan

  • ruff check /home/pook/sia/sia /home/pook/sia/tests
  • python -m pytest tests -q locally on the split branch

Split out from #15 per review feedback.

🤖 Generated with OpenClaude

pookNast and others added 29 commits May 29, 2026 18:34
Extract hardcoded model names, timeouts, limits, and defaults
into a single config module with environment variable overrides.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace 8 inline hardcoded defaults with Config.* references.
No behavioral change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hardcoded truncation limits and default models with
config imports. No behavioral change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Environment variables (SIA_META_MODEL, SIA_TASK_MODEL, etc.) now
override defaults with lower priority than explicit CLI flags.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace broad except Exception with specific exception types
(JSONDecodeError, OSError, SubprocessError) for better error
diagnosis. Keep a safety-net handler at the generation loop level.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace broad except Exception with specific exception types
(OSError, JSONDecodeError, RuntimeError) for better error handling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace shell=True piped tee with direct subprocess.run(arg_list)
and file write. Adds configurable timeout via Config.EVAL_TIMEOUT.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace shell=True piped tee with Popen streaming stdout to both
console and log file simultaneously. Removes bash dependency.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When --sandbox docker is specified, target agents run inside a
Docker container with read-only dataset mount, read-write working
directory, no network access, and resource limits.

Default is sandbox=none (current behavior unchanged).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Document the security model, execution modes, bypassPermissions
rationale, and Docker sandbox usage.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Section 1 of main() extracted into a load_task_files() function
returning a TaskFiles dataclass. Improves readability and testability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Section 2 of main() extracted into setup_run_directory() and
_create_venv() helper functions. Returns RunSetup dataclass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Meta-agent and feedback-agent prompt templates extracted into
dedicated builder functions for maintainability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The 400+ line generation loop extracted into run_generation() and
helper functions. main() is now a thin orchestrator.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Smoke tests for --help, missing args, and invalid task name.
Also adds __main__.py to support `python -m sia` invocation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _safe_read_file() and _safe_load_json() helpers that enforce
Config.MAX_CONTEXT_FILE_SIZE limits. Prevents unbounded memory usage
from oversized execution logs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enforce Config.MAX_EXECUTION_LOG_SIZE on trajectory files and
results.json. Oversized files are skipped with a warning instead
of loading into memory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace remaining magic number truncation limits with
Config constants for discoverability and tuning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test default values, env var overrides, and invalid value fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mock subprocess tests for skipped, success, failure, and timeout
scenarios.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test target agent execution and context tracking with mocked
subprocess. Verifies directory structure and context.md creation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify feedback agent invocation, directory creation across
generations, and context.md tracking.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify Docker command construction, mount flags, and sandbox
mode selection with mocked subprocess.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Test _safe_read_file and _safe_load_json with files at, above,
and below size limits.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix 13 ruff lint errors (unused imports, f-string, sorted imports)
- Change DEFAULT_MAX_TURNS/CONTEXT_SUMMARY_MAX_TURNS from str to int
- Replace sys.exit(1) with graceful return in _run_target_agent
- Add truncation to single-trajectory JSON in feedback prompt
- Replace print() with logger in context_manager
- Mock _generate_llm_summary in test_multiple_generations_track_deltas

All 61 tests passing, ruff clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add HiveMind config fields (endpoint, model, task_model) to Config
- Add --hivemind CLI flag to route target agents through local LLMs
- Add reference_target_agent_hivemind.py template using OpenAI-compatible API
- Support SIA_HIVEMIND_ENDPOINT and SIA_HIVEMIND_MODEL env var overrides

Usage: sia --task gpqa --max_gen 3 --hivemind
Routes target agents to qwen3.6-27b via HiveMind (:8400),
meta/feedback agents still use Claude for quality reasoning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address the reviewer-requested security contact email and clean up Ruff lint failures so the hardening PR can pass CI.

Co-Authored-By: OpenClaude (gpt-5.5) <openclaude@gitlawb.com>
- Add HiveMind config fields (endpoint, model, task_model) to Config
- Add --hivemind CLI flag to route target agents through local LLMs
- Add reference_target_agent_hivemind.py template using OpenAI-compatible API
- Support SIA_HIVEMIND_ENDPOINT and SIA_HIVEMIND_MODEL env var overrides

Usage: sia --task gpqa --max_gen 3 --hivemind
Routes target agents to qwen3.6-27b via HiveMind (:8400),
meta/feedback agents still use Claude for quality reasoning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant