Skip to content

feat: add local Ollama VLM input plugin for multimodal visual reasoning#2467

Open
Wanbogang wants to merge 2 commits intoOpenMind:mainfrom
Wanbogang:feat/vlm-ollama-local
Open

feat: add local Ollama VLM input plugin for multimodal visual reasoning#2467
Wanbogang wants to merge 2 commits intoOpenMind:mainfrom
Wanbogang:feat/vlm-ollama-local

Conversation

@Wanbogang
Copy link
Collaborator

Summary

Adds VLM_Ollama_Local, a new input plugin for offline visual reasoning
using a locally running Ollama multimodal model (e.g., llava, moondream).

This addresses the multimodal gap noted in config/ollama.json5 where
llava was listed as a supported model but no visual input plugin existed.

Changes

  • src/inputs/plugins/vlm_ollama_local.py — new plugin
  • tests/inputs/plugins/test_vlm_ollama_local.py — 22 tests, 100% coverage

Usage

ollama pull llava
ollama serve
agent_inputs: [
  {
    type: "VLM_Ollama_Local",
    config: {
      model: "llava",
      prompt: "Briefly describe what you see.",
    },
  },
],

Add VLM_Ollama_Local plugin that captures webcam frames and sends them
to a locally running Ollama instance (e.g., llava, moondream) for
offline visual reasoning without cloud dependency.

- Follows existing FuserInput pattern (vlm_local_yolo, vlm_coco_local)
- Uses aiohttp to POST base64-encoded frames to Ollama /api/chat
- Supports any Ollama multimodal model via config (default: llava)
- Gracefully handles camera failures, API errors, and timeouts
- 22 tests with 100% coverage

Addresses the llava multimodal gap noted in config/ollama.json5
@Wanbogang Wanbogang requested review from a team as code owners March 12, 2026 05:33
@github-actions github-actions bot added robotics Robotics code changes python Python code tests Test files labels Mar 12, 2026
@codecov
Copy link

codecov bot commented Mar 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

python Python code robotics Robotics code changes tests Test files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant