Ensure summary message is returned in non-streaming mode #416

Open
johnkrussell wants to merge 1 commit into nlweb-ai:main from johnkrussell:patch-1

Fix: Ensure summary message is returned in non-streaming mode

Overview

This PR fixes an inconsistency between streaming and non-streaming responses in NLWeb: the summary generated during post-ranking is not returned when streaming=false.

Currently, the summary is only sent via:

await self.handler.send_stream_event(...)

This works correctly in streaming mode, but when streaming is disabled, the summary is not included in the final response payload because it is never added to the handler’s message collection.


Change

Adds a call to:

await self.handler.send_message(message)

in post_ranking.py when emitting the summary.

This ensures the summary is:

  • Included in the message list (handler.messages)
  • Returned in non-streaming responses
  • Still sent correctly in streaming mode (no regression)

Why this is needed

When streaming=false, /ask returns the accumulated messages from handler.messages.

However:

  • The summary is only streamed
  • It is never persisted as a message
  • Therefore it is missing from the final JSON response

This results in inconsistent behaviour:

Mode              Summary returned
streaming=true    Yes
streaming=false   No

This PR resolves that inconsistency.
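
The mechanism described above can be illustrated with a small, self-contained sketch. It is not NLWeb's code: ToyHandler, emit_summary, run, and the message fields are hypothetical stand-ins, and the behaviour of the two methods is assumed only from this description (send_stream_event reaches the client only while streaming, send_message records the message for the final payload). Whether the real send_message also writes to the stream is not modelled here.

```python
import asyncio


class ToyHandler:
    """Hypothetical stand-in for the NLWeb handler; not the real class."""

    def __init__(self, streaming):
        self.streaming = streaming
        self.messages = []  # what a non-streaming /ask response is built from
        self.streamed = []  # stand-in for events written to the client stream

    async def send_stream_event(self, message):
        # Assumption: the event only reaches the client when streaming is on.
        if self.streaming:
            self.streamed.append(message)

    async def send_message(self, message):
        # Assumption: the message is recorded for the final response payload.
        self.messages.append(message)


async def emit_summary(handler, message, fixed):
    # Before the fix only the first call is made; the fix adds the second.
    await handler.send_stream_event(message)
    if fixed:
        await handler.send_message(message)


def run(streaming, fixed):
    handler = ToyHandler(streaming)
    # Hypothetical message shape, for illustration only.
    summary = {"message_type": "summary", "content": "example summary"}
    asyncio.run(emit_summary(handler, summary, fixed))
    return handler
```

In this toy model, run(streaming=False, fixed=False) leaves handler.messages empty, which corresponds to the missing summary, while run(streaming=False, fixed=True) records exactly one summary message.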


Implementation

Minimal change in:

post_ranking.py

Before

await self.handler.send_stream_event(message)

After

await self.handler.send_stream_event(message)
await self.handler.send_message(message)

Design considerations

  • No changes to API shape
  • No new parameters or config
  • No changes to retrieval or LLM providers
  • No additional abstractions introduced
  • Keeps behaviour consistent across streaming modes

This aligns with NLWeb’s goal of keeping the repo simple and low in complexity.


Testing

Tested manually using:

/ask?query=cake&site=Recipe&mode=summarize&streaming=true
/ask?query=cake&site=Recipe&mode=summarize&streaming=false
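
For anyone repeating the manual check, the two requests differ only in the streaming flag. A minimal helper for building them might look like the sketch below; BASE_URL and ask_url are hypothetical names, and the host and port are assumptions about a local setup rather than anything stated in this PR.

```python
from urllib.parse import urlencode

# Assumption: a locally running server; adjust host and port to your setup.
BASE_URL = "http://localhost:8000"


def ask_url(streaming, base_url=BASE_URL):
    """Build the /ask URL used in the manual test for the given streaming mode."""
    params = {
        "query": "cake",
        "site": "Recipe",
        "mode": "summarize",
        "streaming": "true" if streaming else "false",
    }
    return f"{base_url}/ask?{urlencode(params)}"


if __name__ == "__main__":
    print(ask_url(True))
    print(ask_url(False))
```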

Before fix

  • Streaming: summary present
  • Non-streaming: summary missing

After fix

  • Streaming: summary present
  • Non-streaming: summary present

Also verified:

  • No duplicate summary in streaming mode
  • No impact to result batching or other message types

Notes

This change does not alter how summaries are generated; it only ensures they are returned consistently regardless of transport mode.


Checklist

  • Minimal, focused change
  • No breaking changes
  • No new dependencies
  • Tested in both streaming and non-streaming modes
  • Aligns with NLWeb contribution principles

CLA

I understand that this contribution requires agreeing to the Microsoft CLA and will follow the instructions from the CLA bot if prompted.

Summary messages are not currently served back to the browser when streaming mode is turned off.