@hannesrudolph hannesrudolph commented Oct 31, 2025

Summary
This upgrades the in‑chat browsing experience with persistent sessions, clearer feedback, a dedicated browser panel, and more natural action descriptions.

Extension.Development.Host.rc10.2025-11-05.13-47-15.mp4

What's new

  • Persistent Browser Sessions
    • The browser stays open across steps so you can send follow‑ups without relaunching.
    • You’ll see a "Browser Session" header and a "Session started" note when active.
  • Dedicated Browser Session panel
    • Open a full‑size view when you need more space, while keeping the chat context in view.
  • Live, readable action feed
    • Actions are presented in plain language: Launch, Click, Type, Press, Hover, Scroll.
    • Keyboard events now appear as "Press Enter" or "Press Esc" for easier scanning.
    • Broader keyboard coverage: navigation keys and common shortcuts are supported for more natural control.
  • Inline console logs
    • Console output is surfaced inline with a clear "No new logs" state.
    • Noise-reduced by default: only entries added since the previous step are shown, so repeated output is not displayed again.
    • Filter by type (Errors, Warnings, Logs) so you can focus on what matters.
  • Clear session controls
    • A prominent Disconnect/Close control makes it easy to end a session when you’re done.
  • Interactive in-session controls
    • Follow-ups attach to the active session so you can guide the assistant mid-flow without restarting.
    • Suggested follow-ups appear inline to keep momentum.
  • More accurate interactions
    • Improved click, scroll, and hover reliability across screen sizes with a consistent preview aspect ratio.
  • Seamless follow‑ups
    • Keep chatting while the session is open; the assistant continues from the same context.
  • Fully localized
    • New labels and action text are translated across all supported languages.
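
As an illustration of the plain-language action feed described above, here is a minimal sketch of how raw actions might be mapped to labels such as "Press Esc" or "Hover (x, y)". The `BrowserAction` shape, the `KEY_LABELS` map, and `describeAction` are hypothetical names chosen for this example; they are not taken from the PR's code.

```typescript
// Hypothetical action shape for illustration; the PR's real types may differ.
type BrowserAction =
  | { kind: "launch"; url: string }
  | { kind: "click"; x: number; y: number }
  | { kind: "hover"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "press"; key: string }
  | { kind: "scroll"; direction: "up" | "down" };

// Short display labels for a few key names (assumed mapping, e.g. "Escape" -> "Esc").
const KEY_LABELS = new Map<string, string>([
  ["Escape", "Esc"],
  ["ArrowUp", "Arrow Up"],
  ["ArrowDown", "Arrow Down"],
]);

// Turn one action into the short text shown in the action feed.
function describeAction(action: BrowserAction): string {
  switch (action.kind) {
    case "launch":
      return `Launch ${action.url}`;
    case "click":
      return `Click (${action.x}, ${action.y})`;
    case "hover":
      return `Hover (${action.x}, ${action.y})`;
    case "type":
      return `Type "${action.text}"`;
    case "press":
      // Fall back to the raw key name when no short label is defined.
      return `Press ${KEY_LABELS.get(action.key) ?? action.key}`;
    case "scroll":
      return `Scroll ${action.direction}`;
  }
}
```

In a localized build the literal strings would presumably come from the translation files rather than being hard-coded as they are in this sketch.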

What you'll notice in the UI

  • "Browser Session" appears in chat when a session is active.
  • A "Session started" status line confirms the start.
  • Follow-up suggestions appear inside the Browser Session row when active.
  • Keyboard actions are summarized clearly (e.g., "Press Tab", "Shift+Tab", "Arrow keys").
  • New action wording like "Press Enter" or "Hover (x, y)".
  • Console Logs are visible inline, with a "No new logs" indicator and a noise‑reduced view that shows only new entries since the last step.
  • Type filters (All, Errors, Warnings, Logs) above the log list to quickly narrow the feed.
  • A quick Disconnect button to end the session.
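
The console log behaviour listed above (show only new entries, filter by type, fall back to "No new logs") can be sketched as three small pure functions. This assumes the log list is append-only between steps; `LogEntry`, `newEntriesSince`, `filterLogs`, and `renderLogs` are hypothetical names for this example and not the PR's actual implementation.

```typescript
type LogType = "error" | "warning" | "log";
type LogFilter = "all" | LogType;

interface LogEntry {
  type: LogType;
  text: string;
}

// Return only the entries appended after the first `seenCount` entries.
// Assumes logs are append-only between steps.
function newEntriesSince(allLogs: LogEntry[], seenCount: number): LogEntry[] {
  return allLogs.slice(Math.min(seenCount, allLogs.length));
}

// Apply the type filter; "all" leaves the list unchanged.
function filterLogs(entries: LogEntry[], filter: LogFilter): LogEntry[] {
  return filter === "all" ? entries : entries.filter((e) => e.type === filter);
}

// Render the visible entries, or the empty-state text when there are none.
function renderLogs(entries: LogEntry[]): string {
  if (entries.length === 0) return "No new logs";
  return entries.map((e) => `[${e.type}] ${e.text}`).join("\n");
}
```

A caller would remember how many entries it had already shown after each step and pass that count as `seenCount` on the next one.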

Important

Enhances in-chat browsing with persistent sessions, improved UI, and new tests across multiple components.

  • Behavior:
    • Introduces persistent browser sessions in presentAssistantMessage() in presentAssistantMessage.ts.
    • Adds a dedicated browser session panel in BrowserSessionPanelManager.ts.
    • Implements live action feed with plain language descriptions in BrowserSessionRow.tsx.
    • Adds inline console logs with filtering options in BrowserSessionRow.tsx.
    • Provides clear session controls and interactive in-session controls in Task.ts.
    • Improves interaction accuracy for click, scroll, and hover in BrowserSession.ts.
    • Fully localizes new labels and action text in chat.json files.
  • UI Components:
    • Updates TaskHeader.tsx to show browser session status.
    • Adds BrowserActionRow.tsx and BrowserSessionStatusRow.tsx for action and status display.
    • Modifies ChatView.tsx to integrate new browser session components.
  • Testing:
    • Adds tests for aspect ratio handling in BrowserSessionRow.aspect-ratio.spec.tsx.
    • Adds tests for disconnect button visibility in BrowserSessionRow.disconnect-button.spec.tsx.
    • Adds tests for follow-up rendering in ChatView.followup-in-session.spec.tsx.
  • Configuration:
    • Updates vite.config.ts to include new entry points for browser panel.

This description was created by Ellipsis for 5a581b9.

Copilot AI review requested due to automatic review settings October 31, 2025 01:29
@hannesrudolph requested review from cte and jr as code owners Oct 31, 2025 01:29
@dosubot (bot) added the size:XXL (this PR changes 1000+ lines, ignoring generated files) and UI/UX labels Oct 31, 2025

roomote bot commented Oct 31, 2025

Reviewed commit 8f229ae. This commit fixes the parameter mismatch bug identified in the previous review.

Issues to Address

  • Race condition in browser session state callback - The onStateChange callback in Task.ts (lines 377-398) fires async operations from a synchronous callback without awaiting them. If browser launch and close happen in rapid succession, panel operations could interleave incorrectly (e.g., the panel opening after the session has already closed).

  • Missing size parameter in launch action - The launch action message (browserActionTool.ts lines 118-126) only includes action and text, while other actions include size in their payload. This inconsistency could cause issues if UI components expect a uniform structure.

  • ResizeObserver cleanup missing mounted check - Fixed by adding a mounted flag to prevent state updates after component unmount.

  • Coordinate format mismatch - The browser_action message stores original coordinates with image dimensions (e.g., 450,300@1024x768), but execution uses scaled coordinates (e.g., 450,300). This forces UI to re-implement scaling logic, creating maintenance burden and risk of inconsistency.

  • Wrong parameter passed to getBrowserActionText - Fixed in commit 8f229ae by correctly passing action.executedCoordinate instead of mousePosition.
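
One common way to avoid the interleaving described in the race-condition item is to chain each state change onto a single promise queue, so a close can only start after the preceding open has finished. The sketch below is a generic pattern, not the fix applied in this PR; `Panel`, `SessionState`, and `createSerializedHandler` are hypothetical names.

```typescript
type SessionState = "open" | "closed";

// Hypothetical stand-in for whatever object opens and closes the panel.
interface Panel {
  open(): Promise<void>;
  close(): Promise<void>;
}

// Returns a handler that runs panel operations strictly one after another,
// in the order the state changes were reported.
function createSerializedHandler(panel: Panel): (state: SessionState) => Promise<void> {
  let queue: Promise<void> = Promise.resolve();

  return (state) => {
    const run = queue.then(() => (state === "open" ? panel.open() : panel.close()));
    // Keep the chain usable even if one operation rejects.
    queue = run.catch(() => undefined);
    return run;
  };
}
```

With this handler, a launch immediately followed by a close results in open finishing before close begins, regardless of how long each operation takes.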


@hannesrudolph added the "Issue/PR - Triage" label (New issue. Needs quick review to confirm validity and assign labels.) Oct 31, 2025
Copilot AI left a comment

Pull Request Overview

This PR adds a new Browser Session panel feature that provides a dedicated UI for viewing and controlling browser automation sessions. Key improvements include:

  • New standalone browser session panel with navigation controls
  • Enhanced coordinate scaling for accurate click/hover actions on downscaled screenshots
  • New keyboard press action support
  • Improved browser session lifecycle management
  • Real-time browser session status tracking
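
To make the coordinate scaling item concrete, here is a minimal sketch that parses a coordinate in the `x,y@WIDTHxHEIGHT` form mentioned earlier in this conversation (e.g., 450,300@1024x768) and maps it from screenshot space to viewport space. The function names and the rounding choice are assumptions for illustration; the actual logic in browserActionTool.ts may differ.

```typescript
interface Size {
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Parse "x,y@WIDTHxHEIGHT", e.g. "450,300@1024x768".
function parseCoordinate(raw: string): { point: Point; image: Size } {
  const match = /^\s*(\d+)\s*,\s*(\d+)\s*@\s*(\d+)\s*x\s*(\d+)\s*$/.exec(raw);
  if (!match) {
    throw new Error(`Invalid coordinate: ${raw}`);
  }
  const point = { x: Number(match[1]), y: Number(match[2]) };
  const image = { width: Number(match[3]), height: Number(match[4]) };
  if (image.width === 0 || image.height === 0) {
    throw new Error(`Image dimensions must be non-zero: ${raw}`);
  }
  return { point, image };
}

// Map a point in screenshot pixels to the corresponding viewport pixel.
function scaleToViewport(point: Point, image: Size, viewport: Size): Point {
  return {
    x: Math.round((point.x / image.width) * viewport.width),
    y: Math.round((point.y / image.height) * viewport.height),
  };
}
```

For example, a click at 450,300 on a 1024x768 screenshot maps to (900, 600) in a 2048x1536 viewport, since both axes scale by a factor of two.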

Reviewed Changes

Copilot reviewed 54 out of 54 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| `webview-ui/vite.config.ts` | Adds `browser-panel.html` as a new build entry point |
| `webview-ui/src/i18n/locales/*/chat.json` | Adds translations for new browser session UI labels (session, press, hover actions) |
| `webview-ui/src/components/chat/BrowserSessionRow.tsx` | Major refactor: adds full browser-like UI with navigation, toolbar, and improved screenshot display |
| `webview-ui/src/components/chat/BrowserActionRow.tsx` | New component to display browser actions inline in chat with auto-panel-opening logic |
| `webview-ui/src/components/browser-session/*` | New components for standalone browser session panel |
| `src/services/browser/BrowserSession.ts` | Adds `press()` method, cursor visualization, viewport tracking, and state change callbacks |
| `src/core/tools/browserActionTool.ts` | Implements coordinate scaling from screenshot to viewport dimensions |
| `src/core/webview/BrowserSessionPanelManager.ts` | New manager for browser panel lifecycle and communication |
| `src/shared/*Message.ts` | Adds new message types for browser panel communication |

@roomote roomote bot left a comment

Review complete. I found 3 issues that should be addressed before approval. Please see the inline comments and checklist above.

@daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Nov 4, 2025
@hannesrudolph added the "PR - Needs Preliminary Review" label and removed the "Issue/PR - Triage" label Nov 4, 2025
@hannesrudolph moved this from PR [Needs Prelim Review] to PR [Draft / In Progress] in Roo Code Roadmap Nov 5, 2025
@hannesrudolph moved this from PR [Draft / In Progress] to PR [Needs Prelim Review] in Roo Code Roadmap Nov 6, 2025
…ositives; chore(webview): remove unused imports in ChatView
… consumes it; remove fragile session-status parsing; ensure ResizeObserver cleanup; all tests passing (#8941)
Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>