VLM-enriched object detection based navigation and fixed semantic navigation #967
base: dev
Conversation
Greptile Overview
Greptile Summary
This PR adds VLM-enriched object detection for precise navigation and fixes performance issues in semantic navigation.
Key Changes
New Navigation Skills:
- navigate_to_detected_object(): Navigate to objects detected by the vision system using an ObjectDB lookup
- list_detected_objects(): Query and list all detected objects for situational awareness
VLM Enrichment:
- ObjectDB now uses VLM to enrich YOLO labels (e.g., "person" → "person in white shirt")
- VLM enrichment occurs on object creation and re-enriches every 10 detections
- Fallback to YOLO labels when VLM fails or is disabled
Performance Fixes:
- RPC methods now return lightweight dicts instead of full Object3D objects (fixes 120s timeouts)
- SpatialMemory no longer fetches images from visual memory during queries
- Commented out verbose logging
Frame ID Changes:
- Changed TF frame from "map" to "world" across navigation and spatial perception modules
New Blueprint:
- Added agentic_detection blueprint combining detection, spatial memory, and agent capabilities
Critical Issue
The Object3D initialization never sets yolo_label from detection.name, causing the entire VLM enrichment and object tracking system to fail with None values.
Confidence Score: 1/5
- Critical bug will cause object detection and navigation to fail completely
- Object3D.__init__ never initializes yolo_label from detection.name, leaving it as None. This breaks: (1) object-tracking matching logic that compares yolo_label, (2) VLM enrichment that uses yolo_label in prompts and fallbacks, (3) object name assignment that falls back to yolo_label. The entire object detection navigation feature will fail.
- dimos/perception/detection/moduleDB.py - Missing yolo_label initialization in Object3D.__init__
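A minimal sketch of the fix the review is asking for; the class and field names follow the review, but the exact constructor signature is assumed:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Detection:
    name: str  # raw YOLO class label, e.g. "person"

class Object3D:
    def __init__(self, detection: Detection):
        # The missing line: seed yolo_label from the detection so that
        # tracking, VLM prompts, and fallbacks never see None
        self.yolo_label: Optional[str] = detection.name
        self.name: Optional[str] = None  # later overwritten by the VLM-enriched label

obj = Object3D(Detection(name="person"))
```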
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| dimos/perception/detection/moduleDB.py | 1/5 | Critical bug: Object3D.__init__ never initializes yolo_label from detection.name, causing None values throughout VLM enrichment and object tracking logic |
| dimos/agents/skills/navigation.py | 4/5 | Added navigate_to_detected_object and list_detected_objects skills. Changed frame_id from "map" to "world" throughout |
| dimos/perception/spatial_perception.py | 4/5 | Changed SpatialMemory base class to SkillModule, updated frame_id from "map" to "world", added 1-second sleep in start(), improved error handling |
| dimos/agents_deprecated/memory/spatial_vector_db.py | 5/5 | Commented out image fetching and logging to fix RPC timeouts (performance optimization) |
| dimos/robot/all_blueprints.py | 5/5 | Added new blueprint "unitree-go2-agentic-detection" to registry |
| dimos/robot/unitree_webrtc/unitree_go2_blueprints.py | 5/5 | Added agentic_detection blueprint combining detection, spatial_memory, utilization, llm_agent, and common agentic components |
Sequence Diagram
sequenceDiagram
participant User
participant Agent
participant NavigationSkill
participant ObjectDB
participant VLM
participant Navigator
User->>Agent: "Navigate to person in white shirt"
Agent->>NavigationSkill: navigate_to_detected_object("person in white shirt")
NavigationSkill->>ObjectDB: lookup("person in white shirt")
Note over ObjectDB: Search objects by name<br/>(matches VLM-enriched labels)
alt Object found
ObjectDB-->>NavigationSkill: [{"name": "person in white shirt", "pos_x": 1.2, "pos_y": 3.4, ...}]
NavigationSkill->>Navigator: set_goal(PoseStamped)
Navigator-->>NavigationSkill: Navigation complete
NavigationSkill-->>Agent: "Successfully navigated to 'person in white shirt'"
else Object not found
ObjectDB-->>NavigationSkill: []
NavigationSkill-->>Agent: "No detected object matching..."
end
Note over User,Navigator: Parallel: Detection & Enrichment Flow
participant Camera
participant Detection3D
Camera->>Detection3D: Image + PointCloud
Detection3D->>ObjectDB: Detection3DPC (name="person")
alt New object
ObjectDB->>ObjectDB: create_new_object()
Note over ObjectDB: BUG: yolo_label not set!<br/>Should be: yolo_label = detection.name
ObjectDB->>VLM: _enrich_with_vlm(obj)
Note over VLM: Uses obj.yolo_label (None!)<br/>in prompt and fallbacks
VLM-->>ObjectDB: rich_label
ObjectDB->>ObjectDB: Store as Object3D<br/>(name=rich_label, yolo_label=None)
else Existing object
ObjectDB->>ObjectDB: add_to_object()
alt Every 10 detections
ObjectDB->>VLM: Re-enrich
VLM-->>ObjectDB: Updated rich_label
end
end
dimos/agents/skills/navigation.py
Outdated
```python
except Exception:
    return f"Error querying ObjectDB for '{object_name}'"
```
[P3] Bare except Exception: swallows the actual error. Consider logging the exception for debugging: except Exception as e: logger.error(f"Error querying ObjectDB: {e}") before returning the error message.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
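A sketch of the suggested pattern, with a stand-in for the real ObjectDB RPC call:

```python
import logging

logger = logging.getLogger(__name__)

def query_object_db(object_name: str):
    # Stand-in for the real ObjectDB RPC, which may raise on timeout etc.
    raise RuntimeError("RPC timeout")

def lookup_object(object_name: str) -> str:
    try:
        query_object_db(object_name)
        return f"Found '{object_name}'"
    except Exception as e:
        # Log the underlying error instead of silently swallowing it
        logger.error(f"Error querying ObjectDB for '{object_name}': {e}")
        return f"Error querying ObjectDB for '{object_name}'"
```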
Additional Comments (1)
Greptile Overview
Greptile Summary
This PR introduces VLM-enriched object detection for navigation, enabling the robot to navigate to specific objects with rich descriptions (e.g., "person in white shirt" instead of just "person"). Key improvements include:
New Features:
- VLM enrichment pipeline that enhances YOLO labels with detailed descriptions using Qwen VLM
- Two new navigation skills: navigate_to_detected_object() for object-based navigation and list_detected_objects() for environment awareness
- New agentic_detection blueprint combining detection, navigation, spatial memory, and agent capabilities
- Lightweight RPC methods (lookup, get_all_detected_objects) that return dicts instead of full Object3D instances
Performance Fixes:
- RPC timeout fix: returning lightweight dicts reduces response time from 120s to 0.2s
- SpatialMemory improvements: TF timing fixes and image fetching removal reduce latency from 120s to 0.15s
- Reduced logging verbosity in spatial memory components
Architecture Changes:
- Frame reference migration: Changes from "map" frame to "world" frame across navigation and spatial perception
- SpatialMemory now extends SkillModule instead of Module
- Object tracking now separates YOLO labels (for matching) from VLM labels (for display)
The changes enable more precise object-based navigation by allowing agents to distinguish between similar objects based on visual characteristics.
Confidence Score: 1/5
- Critical runtime error will occur when agent_encode() is called
- The PR contains a P0 blocking bug in moduleDB.py (line 311) where len(obj.detections) is called on an integer field. This will cause a TypeError at runtime whenever agent_encode() is invoked. Additionally, there are P1 type-safety issues with potential None returns that violate the function contract. While the VLM enrichment feature and performance improvements are valuable, the critical bug must be fixed before merging.
- dimos/perception/detection/moduleDB.py requires immediate attention due to the TypeError bug in agent_encode()
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| dimos/agents/skills/navigation.py | 3/5 | Adds two new navigation skills (navigate_to_detected_object, list_detected_objects) and changes frame references from "map" to "world". Contains typos in error messages. |
| dimos/perception/detection/moduleDB.py | 1/5 | Major changes: VLM enrichment for object labels, new RPC methods (lookup, get_all_detected_objects), and lightweight dict returns. Critical bug: len() called on int in agent_encode(). |
| dimos/perception/spatial_perception.py | 4/5 | Changes SpatialMemory to extend SkillModule, adds 1s startup delay, changes frame references from "map" to "world", reduces logging verbosity. Minor warning message inconsistency. |
| dimos/agents_deprecated/memory/spatial_vector_db.py | 4/5 | Comments out logging and image retrieval from visual memory to improve performance. |
| dimos/robot/unitree_webrtc/unitree_go2_blueprints.py | 5/5 | Adds new agentic_detection blueprint combining detection with navigation, spatial memory, and agent capabilities. |
Sequence Diagram
sequenceDiagram
participant Agent
participant NavigationSkill
participant ObjectDB
participant VLM as Qwen VLM
participant Detection as YOLO Detection
participant Nav as Navigation
Detection->>ObjectDB: Detection3DPC (YOLO label)
ObjectDB->>ObjectDB: Check closest_object()
alt New Object
ObjectDB->>VLM: _enrich_with_vlm(obj)
VLM-->>ObjectDB: Rich description
ObjectDB->>ObjectDB: Create Object3D<br/>(yolo_label + vlm_label)
else Existing Object (every 10 detections)
ObjectDB->>ObjectDB: Add to existing
ObjectDB->>VLM: _enrich_with_vlm(obj)
VLM-->>ObjectDB: Updated description
end
Agent->>NavigationSkill: "Navigate to person in white"
NavigationSkill->>ObjectDB: lookup("person in white")
ObjectDB-->>NavigationSkill: [dict with pos_x, pos_y, frame_id]
NavigationSkill->>NavigationSkill: Create PoseStamped
NavigationSkill->>Nav: _navigate_to(goal_pose)
Nav-->>NavigationSkill: Success/Failure
NavigationSkill-->>Agent: "Successfully navigated to..."
Agent->>NavigationSkill: "What do you see?"
NavigationSkill->>ObjectDB: get_all_detected_objects()
ObjectDB-->>NavigationSkill: [list of dicts]
NavigationSkill-->>Agent: "I've detected N objects..."
```diff
 # we need at least 3 detectieons to consider it a valid object
 # for this to be serious we need a ratio of detections within the window of observations
-if len(obj.detections) < 4:  # type: ignore[arg-type]
+if len(obj.detections) < 4:
```
[P0] obj.detections is an int, not a list. Calling len() on it will cause TypeError: object of type 'int' has no len() at runtime. Should be if obj.detections < 4: instead.
Suggested change:
```diff
-if len(obj.detections) < 4:
+if obj.detections < 4:
```
```python
if not self.enable_vlm_enrichment or self.vlm_model is None:
    return obj.yolo_label  # Fall back to YOLO label
```
[P1] Function signature promises str return, but returns obj.yolo_label which is typed as str | None. If yolo_label is None (possible per field definition), this violates the type contract and could cause issues in callers expecting a string.
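One way to honor the `str` contract, sketched with an assumed default label:

```python
from typing import Optional

def fallback_label(yolo_label: Optional[str]) -> str:
    # Guarantee a str return even if yolo_label was never initialized
    return yolo_label if yolo_label is not None else "unknown object"
```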
```python
logger.warning(f"No image for {obj.track_id}, using YOLO label")
return obj.yolo_label
```

```python
prompt = f"Describe this {obj.yolo_label} in detail. Include color, appearance, and distinguishing features. Keep it concise (under 10 words)."
```
[P2] If obj.yolo_label is None, the prompt becomes "Describe this None in detail...", which is semantically incorrect. Should handle None case or assert yolo_label is not None before this point.
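A None-safe version of the prompt construction; the generic "object" fallback is an assumption, not the PR's behavior:

```python
from typing import Optional

def build_prompt(yolo_label: Optional[str]) -> str:
    # Avoid "Describe this None in detail..." when the label is missing
    label = yolo_label if yolo_label else "object"
    return (
        f"Describe this {label} in detail. Include color, appearance, "
        f"and distinguishing features. Keep it concise (under 10 words)."
    )
```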
```python
    return

if tf is None:
    logger.warning("No TF transform (map -> base_link) available")
```
[P3] Warning message says "(map -> base_link)" but code at line 223 actually requests "world" to "base_link" transform. Message should match the actual frame names.
Suggested change:
```diff
-logger.warning("No TF transform (map -> base_link) available")
+logger.warning("No TF transform (world -> base_link) available")
```
Additional Comments (2)
dimos/agents/skills/navigation.py
Outdated
```python
return f"Failed to reach '{object_name}'. Navigation was cancelled or failed."

@skill()
def list_detected_objects(self) -> str:
```
This should be in the ObjectDBModule; otherwise, if you have a blueprint without that module, it will fail.
dimos/agents/skills/navigation.py
Outdated
```python
# Get objects
try:
    get_all_rpc = self.get_rpc_calls("ObjectDBModule.get_all_detected_objects")
```
Yeah, see here: this is a hardcoded ObjectDBModule reference. Not good practice; it's easier to just put this skill in there directly.
```python
    agentic,
    MCPModule.blueprint(),
)
agentic_detection = autoconnect(
```
good
```python
    return "\n".join(lines)

def navigate_with_text(self, query: str) -> str:
```
Suggested change:
```diff
+@skill()
 def navigate_with_text(self, query: str) -> str:
```
…itance
- Add DetectedObjectInfo dataclass for passing object data to navigation (can move to dimos.msgs if needed)
- Improve lookup with hybrid substring/token matching
- Clean up skill logging
@greptile
Greptile Overview
Greptile Summary
Overview
This PR adds VLM-enriched object detection for precise robot navigation. Objects detected by YOLO are enhanced with rich descriptions from a Vision Language Model (e.g., "person" → "person in white shirt"), enabling navigation to specific objects like "go to the red coffee mug" or "navigate to person in white."
Key Features:
Architecture:
Critical Issues Found
🔴 Object tracking completely broken (moduleDB.py)
🔴 Navigation always reports failure (navigation.py): type mismatch in the result check
🔴 TypeError in object listing (moduleDB.py): calling len() on an integer field
Performance Improvements
The RPC optimizations are well-executed:
Confidence Score: 1/5
Important Files Changed
File Analysis
Sequence Diagram
sequenceDiagram
participant Agent as LLM Agent
participant NavSkill as NavigationSkillContainer
participant ObjectDB as ObjectDBModule
participant VLM as QwenVlModel
participant Detection as Detection3DModule
participant Nav as NavigationInterface
Note over Agent,Nav: VLM-Enriched Object Detection Navigation
Detection->>ObjectDB: ImageDetections3DPC
ObjectDB->>ObjectDB: add_detection(detection)
ObjectDB->>ObjectDB: create_new_object(detection)
ObjectDB->>VLM: query(image, prompt)
VLM-->>ObjectDB: rich_label (e.g., "person in white shirt")
ObjectDB->>ObjectDB: Store object with yolo_label + vlm_label
Agent->>NavSkill: navigate_to_detected_object("person in white")
NavSkill->>ObjectDB: lookup("person in white")
ObjectDB->>ObjectDB: Match objects by name/label
ObjectDB-->>NavSkill: [DetectedObjectInfo]
NavSkill->>NavSkill: Create goal_pose from obj.pose
NavSkill->>Nav: set_goal(goal_pose)
loop Poll navigation state
NavSkill->>Nav: get_state()
Nav-->>NavSkill: NavigationState
end
NavSkill->>Nav: is_goal_reached()
Nav-->>NavSkill: bool (success/failure)
NavSkill-->>Agent: "Successfully navigated to 'person in white shirt'"
Note over Agent,NavSkill: Alternative: List Detected Objects
Agent->>ObjectDB: list_detected_objects()
ObjectDB-->>Agent: "I've detected 5 object(s):\n- person in white shirt (12 detections)\n- red coffee mug (8 detections)"
5 files reviewed, 5 comments
dimos/agents/skills/navigation.py
Outdated
```python
result = self._navigate_to(goal_pose)

if result == "SUCCESS":
```
Type mismatch: _navigate_to() returns bool (True/False), but this code compares it to the string "SUCCESS". This condition will always be False, causing navigation to always report failure even when successful.
The method should check if result: instead of if result == "SUCCESS":.
Suggested change:
```diff
-if result == "SUCCESS":
+if result:
```
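A tiny repro of why the comparison always fails; the navigator here is a stub:

```python
def _navigate_to(goal_pose) -> bool:
    # Stub standing in for the real navigator, which returns a bool
    return True

result = _navigate_to(None)
buggy_check = (result == "SUCCESS")  # a bool never equals a string
fixed_check = bool(result)
```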
```diff
 # we need at least 3 detectieons to consider it a valid object
 # for this to be serious we need a ratio of detections within the window of observations
-if len(obj.detections) < 4:  # type: ignore[arg-type]
+if len(obj.detections) < 4:
```
obj.detections is an int, not a collection, so calling len() on it will raise TypeError: object of type 'int' has no len().
Suggested change:
```diff
-if len(obj.detections) < 4:
+if obj.detections < 4:
```
```python
"""Convert Object3D to DetectedObjectInfo."""
try:
    return DetectedObjectInfo(
        track_id=obj.track_id,
```
DetectedObjectInfo expects track_id: str (non-optional), but obj.track_id can be None (type is str | None from line 56). If track_id is None, this will cause a type error when creating the dataclass instance.
While the exception handler at line 366 will catch this and return None, objects with None track_ids should be filtered out before calling _to_detection_info() to avoid unnecessary exceptions.
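The suggested pre-filter, sketched with a minimal stand-in type:

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TrackedObject:
    track_id: Optional[str]

def objects_with_track_ids(objects: List[TrackedObject]) -> List[TrackedObject]:
    # Skip objects whose track_id is None instead of relying on the
    # except handler in _to_detection_info() to discard them
    return [o for o in objects if o.track_id is not None]
```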
```python
    return

if tf is None:
    logger.warning("No TF transform (map -> base_link) available")
```
Error message refers to "map -> base_link" but the code at line 223 actually queries "world -> base_link". The error message should be updated to match the actual frame IDs being used.
Suggested change:
```diff
-logger.warning("No TF transform (map -> base_link) available")
+logger.warning("No TF transform (world -> base_link) available")
```
Additional Comments (1)
```diff
 if not self._skill_started:
     raise ValueError(f"{self} has not been started.")
-tf = self.tf.get("map", "base_link", time_tolerance=2.0)
+tf = self.tf.get("world", "base_link", time_tolerance=2.0)
```
This was a breaking change that I think was never tested or caught.
```python
semantic map for a location matching the description.
CALL THIS SKILL FOR ONE SUBJECT AT A TIME. For example: "Go to the person wearing a blue shirt in the living room",
you should call this skill twice, once for the person wearing a blue shirt and once for the living room.

def navigate_to_detected_object(self, object_name: str) -> str:
```
object_name can't just be a string; it needs a proper type.
```python
if not self._skill_started:
    raise ValueError(f"{self} has not been started.")

lookup_rpc = self.get_rpc_calls("ObjectDBModule.lookup")
```
@paul-nechifor we need to fix this RPC handling for modules; it's super weird and hard-coded.
```python
goal_pose = PoseStamped(
    position=obj.pose.position,
    orientation=Quaternion(),
```
Let's calculate goal_pose via a separate method. This is the goal point, which is just a position. goal_pose requires /global_map, /odom, and the /goal_point; it picks the closest point within some buffer (accounting for the size of the robot) that is not in collision, and then points the right way using trig.
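A rough sketch of the orientation part of that idea: stop short of the target and face it using atan2. This omits the collision check against /global_map, and all names here are illustrative:

```python
import math
from dataclasses import dataclass

@dataclass
class Pose2D:
    x: float
    y: float
    yaw: float  # radians

def compute_goal_pose(robot: Pose2D, target_x: float, target_y: float,
                      standoff: float = 0.5) -> Pose2D:
    """Stop `standoff` meters short of the target, oriented toward it."""
    dx, dy = target_x - robot.x, target_y - robot.y
    dist = math.hypot(dx, dy)
    yaw = math.atan2(dy, dx)  # face the target
    if dist <= standoff:
        # Already within the buffer: keep position, just turn toward the target
        return Pose2D(robot.x, robot.y, yaw)
    ratio = (dist - standoff) / dist
    return Pose2D(robot.x + dx * ratio, robot.y + dy * ratio, yaw)
```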
```python
if success_msg:
    return success_msg

logger.info(f"No tagged location found for {query}")
```
revert
```python
goal_pose = PoseStamped(
    position=make_vector3(*robot_location.position),
    orientation=Quaternion.from_euler(Vector3(*robot_location.rotation)),
    frame_id="map",
```
bug fix from frame thing
```diff
 )
-logger.info(f"Added image vector {vector_id} with metadata: {metadata}")
+#logger.info(f"Added image vector {vector_id} with metadata: {metadata}")
```
revert
```python
logger = setup_logger()

@dataclass
class DetectedObjectInfo:
```
Use the Detection2D type, or extend it (maybe, but probably not) with something super minimal.
```python
    self._vlm_model = QwenVlModel()
    return self._vlm_model

def _enrich_with_vlm(self, obj: Object3D) -> str:
```
This can't be here. It needs to be in the VLM code in models/vl.
read base.py
Changes