[Bug] JSONDecodeError "Extra data" when parsing streamed raw JSON lines (Agent Engine stream_query) #2042

Summary

google-genai fails to parse streaming responses from Vertex AI Agent Engine (stream_query), raising JSONDecodeError: Extra data. The raw response handed to the JSON parser contains multiple JSON objects concatenated together.

Environment

  • google-genai: 1.62.0
  • google-adk: 1.22.0
  • google-cloud-aiplatform: 1.136.0
  • httpx: 0.28.1
  • Python: 3.13.12
  • Backend: Vertex AI Agent Engine
  • API: stream_query with ?alt=sse
  • Observed Content-Type: application/json

Call Chain

google-cloud-aiplatform (vertexai) uses google-genai as its internal HTTP client. The error occurs inside google-genai when parsing the streamed response:

app code
  → vertexai.AgentEngine.stream_query()                       # google-cloud-aiplatform
    → vertexai._genai.agent_engines._stream_query()            # google-cloud-aiplatform
      → google.genai._api_client.request_streamed()            # google-genai
        → HttpResponse.segments()                              # google-genai
          → HttpResponse._iter_response_stream()  ← brace-count buffering bug
          → HttpResponse._load_json_from_response()  ← JSONDecodeError raised here

Observed Request/Response (sanitized app-side logs)

Source: application-side logs (not genai internal logs).
Interpretation: Although ?alt=sse is requested, the backend responds with Content-Type: application/json and chunked transfer encoding. The raw response does not consist of SSE "data:" lines; it consists of plain JSON lines, one JSON object per line.

[INFO] response_content_type=application/json
[INFO] response_headers={'content-type': 'application/json', 'transfer-encoding': 'chunked', 'date': 'REDACTED', 'server': 'scaffolding on HTTPServer2'}
[INFO] request method=POST url=https://redacted-aiplatform.googleapis.com/v1beta1/projects/REDACTED/locations/REDACTED/reasoningEngines/REDACTED:streamQuery?alt=sse accept=*/*
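
For illustration, a line-oriented reader that tolerates both framings could look like the sketch below. It assumes each event arrives as one complete JSON object on a single line, which matches the logs in this report but is not guaranteed by the backend. parse_stream_line is a hypothetical helper, not part of google-genai, and it does not handle other SSE fields such as "event:" or "id:".

```python
import json


def parse_stream_line(line: str):
    """Decode one stream line; return None for blank lines and SSE comments.

    Hypothetical helper for illustration only. Assumes one complete JSON
    object per line, either raw or prefixed with "data:".
    """
    text = line.strip()
    if not text or text.startswith(":"):
        # Blank separator line or SSE comment line: nothing to decode.
        return None
    if text.startswith("data:"):
        # SSE framing: drop the field name and keep the payload.
        text = text[len("data:"):].strip()
    # json.loads understands string literals, so braces inside
    # string values do not affect where the object ends.
    return json.loads(text)


print(parse_stream_line('data: {"id": "evt1"}'))
print(parse_stream_line('{"id": "evt2", "value": "brace {"}'))
```

Because each line is decoded on its own, an unmatched brace inside a string value cannot cause the next event to be merged into the same buffer.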

Evidence of concatenated JSON (sanitized app-side logs)

Source: application-side logs (not genai internal logs).
These logs are from read-only instrumentation (a wrapper around HttpResponse._iter_response_stream that only observes and logs without modifying behavior) that records each line's brace delta ({ count minus } count) and running balance. concat=True means the line contains a }{ boundary indicating concatenated JSON objects.

Interpretation: The first raw JSON line is large and has delta=+1 (it contains one more { than }), so the brace counter does not return to zero at the end of that line. The next JSON event is then appended to the same buffer, producing a }{ boundary, and json.loads fails with Extra data.

[WARNING] raw_line len=143547 bytes=143547 concat=True delta=1 balance=1 sample={"content": {"parts": [{"function_response": ...}] ...}}
[ERROR] json_decode_error response_len=152621 response_bytes=152621 concat=True concat_ctx=[REDACTED]}{[REDACTED]
lines=idx=1 kind=raw len=143547 bytes=143547 concat=True delta=1 balance=1 ...
| idx=2 kind=raw len=9074 bytes=9074 concat=False delta=0 balance=1 ...
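
The failure mode in these logs can be modeled with a few lines of Python. This is a simplified sketch of brace-count buffering, written for this report; it is not the actual code of HttpResponse._iter_response_stream, and naive_brace_buffer is a hypothetical name.

```python
import json


def naive_brace_buffer(lines):
    """Simplified model of brace-count buffering (not the library's code)."""
    buffer = ""
    balance = 0
    for line in lines:
        # Every brace is counted, including braces inside JSON string values.
        balance += line.count("{") - line.count("}")
        buffer += line
        if balance == 0:
            yield buffer
            buffer = ""
    if buffer:
        # Whatever is left at end of stream is flushed as a single chunk.
        yield buffer


# The first event has an unmatched "{" inside a string, so its delta is +1.
event1 = json.dumps({"value": "log line with unmatched brace {", "id": "evt1"})
event2 = json.dumps({"text": "summary", "id": "evt2"})

chunks = list(naive_brace_buffer([event1, event2]))
print(len(chunks))  # 1: both events were merged into one chunk

try:
    json.loads(chunks[0])
except json.JSONDecodeError as exc:
    print(exc.msg)  # Extra data
```

The balance stays at 1 after the first line, so the second event is appended to the same buffer and the merged chunk contains a }{ boundary.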

Error Message (actual)

google.genai.errors.UnknownApiResponseError: Failed to parse response as JSON.
Raw response: {"content": {"parts": [{"function_response": ...}] ... }{"model_version": "gemini-2.5-pro", ...}

Stack Trace (excerpt)

... _api_client.py:1404 request_streamed
... _api_client.py:284 segments
... _api_client.py:447 _load_json_from_response
json.decoder.JSONDecodeError: Extra data

Reproduction Steps

  1. Deploy an Agent Engine with tools that return large responses containing unbalanced { or } characters inside string values (e.g., internal wiki pages or documents with logs/JSON snippets).
  2. Invoke stream_query().
  3. When tool output is large, the response is Content-Type: application/json and appears as raw JSON lines.
  4. Parsing fails with JSONDecodeError: Extra data.

Minimal Repro (standalone)

This reproduces the same failure locally without Vertex AI by feeding raw JSON lines into google.genai._api_client.HttpResponse.
Although it uses internal classes, it exercises the same path used by request_streamed() -> HttpResponse.segments(),
which is where stream_query() fails in production when the backend returns raw JSON lines with Content-Type: application/json.

#!/usr/bin/env python3
"""Minimal local reproduction using google-genai internals.

This feeds raw (non "data: ") lines into google.genai._api_client.HttpResponse
so its brace-count buffering is exercised. Braces inside JSON strings unbalance
the counter, leading to concatenated JSON and a json.loads "Extra data" error.
"""

from __future__ import annotations

import json

import httpx
from google.genai import _api_client as genai_api_client
from google.genai import errors


def main() -> None:
    # Line1 simulates a large tool response JSON that contains an extra "{"
    # inside a string. This unbalances the brace counter.
    line1 = json.dumps(
        {
            "content": {
                "parts": [
                    {
                        "function_response": {
                            "id": "tool-1",
                            "name": "get_document",
                            "response": {
                                "id": "12345",
                                "title": "example",
                                "content": {
                                    "value": "log line with unmatched brace {",
                                    "format": "markdown",
                                },
                            },
                        }
                    }
                ]
            },
            "id": "evt1",
        }
    )

    # Line2 simulates the next JSON event.
    line2 = json.dumps(
        {
            "model_version": "gemini-2.5-pro",
            "content": {"parts": [{"text": "summary"}]},
            "id": "evt2",
        }
    )

    lines = [line1, line2]
    content = ("\n".join(lines)).encode("utf-8")
    response = httpx.Response(200, content=content)
    http_response = genai_api_client.HttpResponse(response_stream=response, headers={})

    print("Raw line lengths:", [len(line) for line in lines])
    print("Brace deltas:", [line.count("{") - line.count("}") for line in lines])
    print("Attempting to parse via google.genai._api_client.HttpResponse.segments()")

    try:
        # This should raise UnknownApiResponseError due to concatenated JSON.
        list(http_response.segments())
    except errors.UnknownApiResponseError as exc:
        print("Caught UnknownApiResponseError")
        print(f"Message: {exc}")
        if exc.__cause__:
            print(f"Cause: {exc.__cause__}")
    else:
        print("No error raised (unexpected).")


if __name__ == "__main__":
    main()

Install:

pip install google-genai==1.62.0 httpx==0.28.1

Run:

python repro_genai_json_decode_error.py

Output:

Raw line lengths: [231, 94]
Brace deltas: [1, 0]
Attempting to parse via google.genai._api_client.HttpResponse.segments()
Caught UnknownApiResponseError
Message: Failed to parse response as JSON. Raw response: {"content": {"parts": [{"function_response": {"id": "tool-1", "name": "get_document", "response": {"id": "12345", "title": "example", "content": {"value": "log line with unmatched brace {", "format": "markdown"}}}}]}, "id": "evt1"}{"model_version": "gemini-2.5-pro", "content": {"parts": [{"text": "summary"}]}, "id": "evt2"}
Cause: Extra data: line 1 column 232 (char 231)

Actual Behavior

google-genai raises UnknownApiResponseError, caused by JSONDecodeError: Extra data, when the brace-count buffering merges multiple JSON objects into a single chunk before parsing.

Expected Behavior

Streaming responses should be parsed successfully, or concatenated JSON objects should be split safely.
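
One possible way to split concatenated objects safely is the standard library's json.JSONDecoder.raw_decode, which parses one JSON value and returns the index where it ended. The sketch below only illustrates that approach and is not a proposed patch; split_concatenated_json is a hypothetical name, and the function assumes the whole buffer is already available as text.

```python
import json


def split_concatenated_json(text: str):
    """Yield each JSON value found in a string of back-to-back JSON documents."""
    decoder = json.JSONDecoder()
    index = 0
    length = len(text)
    while index < length:
        # raw_decode does not skip leading whitespace, so do it here.
        while index < length and text[index].isspace():
            index += 1
        if index >= length:
            break
        # raw_decode follows JSON grammar, so braces inside string
        # values do not affect where the object ends.
        obj, index = decoder.raw_decode(text, index)
        yield obj


raw = '{"value": "unmatched brace {", "id": "evt1"}{"id": "evt2"}\n{"id": "evt3"}'
events = list(split_concatenated_json(raw))
print([event["id"] for event in events])  # ['evt1', 'evt2', 'evt3']
```

Unlike brace counting, this relies on the JSON parser itself to find object boundaries, so it works whether or not objects are separated by newlines. A truncated final object would still raise json.JSONDecodeError, which a streaming caller would need to treat as "wait for more data".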

Labels

  • priority: p2 (Moderately-important priority. Fix may not be included in next release.)
  • type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)
