fix: exclude cached tokens from chat usage#27
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces response normalization for chat completions to exclude cached tokens from the prompt token count in both JSON and SSE (streaming) responses. The reviewer identified that total_tokens should also be updated alongside prompt_tokens to maintain consistency, which requires updating the corresponding test assertions. Additionally, the reviewer suggested ensuring that the flush handler in the SSE stream appends a trailing newline to correctly terminate the final line.
| // OpenCode tracks cached input separately, so prompt_tokens must be non-cached. | ||
| usage.prompt_tokens = promptTokens - cachedTokens; | ||
| return true; |
There was a problem hiding this comment.
When excluding cached tokens from prompt_tokens, total_tokens should also be updated to maintain consistency. Otherwise, total_tokens will remain equal to the original un-cached total, which violates the invariant total_tokens = prompt_tokens + completion_tokens.
| // OpenCode tracks cached input separately, so prompt_tokens must be non-cached. | |
| usage.prompt_tokens = promptTokens - cachedTokens; | |
| return true; | |
| // OpenCode tracks cached input separately, so prompt_tokens must be non-cached. | |
| usage.prompt_tokens = promptTokens - cachedTokens; | |
| const totalTokens = getNumber(usage.total_tokens); | |
| if (totalTokens !== undefined) { | |
| usage.total_tokens = totalTokens - cachedTokens; | |
| } | |
| return true; |
There was a problem hiding this comment.
Fixed in d3466d4: total_tokens is now reduced by cached_tokens together with prompt_tokens.
| flush(controller) { | ||
| const tail = pending + decoder.decode(); | ||
| if (tail) { | ||
| controller.enqueue(encoder.encode(normalizeSseChatUsageLine(tail))); | ||
| } | ||
| }, |
There was a problem hiding this comment.
In the flush handler of the SSE stream, if there is a non-empty tail (meaning the stream ended without a trailing newline), we should append a newline \n to ensure the final line is correctly terminated and parsed by the client's SSE parser.
| flush(controller) { | |
| const tail = pending + decoder.decode(); | |
| if (tail) { | |
| controller.enqueue(encoder.encode(normalizeSseChatUsageLine(tail))); | |
| } | |
| }, | |
| flush(controller) { | |
| const tail = pending + decoder.decode(); | |
| if (tail) { | |
| controller.enqueue(encoder.encode(`${normalizeSseChatUsageLine(tail)}\n`)); | |
| } | |
| }, |
There was a problem hiding this comment.
Fixed in d3466d4: the SSE flush path now appends a trailing newline for a final unterminated line.
|
|
||
| assert.equal(body.usage.prompt_tokens, 2629); | ||
| assert.equal(body.usage.completion_tokens, 185); | ||
| assert.equal(body.usage.total_tokens, 39678); |
There was a problem hiding this comment.
Fixed in d3466d4: the JSON usage test now expects normalized total_tokens.
| const normalized = JSON.parse(normalizedLine.slice('data: '.length)); | ||
| assert.equal(normalized.usage.prompt_tokens, 2629); | ||
| assert.equal(normalized.usage.completion_tokens, 185); | ||
| assert.equal(normalized.usage.total_tokens, 39678); |
There was a problem hiding this comment.
Fixed in d3466d4: the streaming usage test now expects normalized total_tokens and covers an unterminated final SSE line.
|
|
||
| const contentType = response.headers.get('content-type')?.toLowerCase() ?? ''; | ||
| if (contentType.includes('application/json')) { | ||
| return normalizeJsonChatUsageResponse(response); |
There was a problem hiding this comment.
CRITICAL: Missing await on async function call. normalizeJsonChatUsageResponse(response) returns Promise<Response> but the code returns the Promise without awaiting it. This causes normalizeChatUsageResponse to return a Promise where Response is expected, breaking the response normalization.
🩹 The Fix:
| return normalizeJsonChatUsageResponse(response); | |
| return await normalizeJsonChatUsageResponse(response); |
📏 Severity: critical
There was a problem hiding this comment.
Fixed in 43ecd27: added the explicit await before normalizeJsonChatUsageResponse(response). Build, plugin tests, check:exports, and full npm test pass.
|
|
||
| const contentType = response.headers.get('content-type')?.toLowerCase() ?? ''; | ||
| if (contentType.includes('application/json')) { | ||
| return normalizeJsonChatUsageResponse(response); |
There was a problem hiding this comment.
🔥 The Roast: Missing await on async function call. This is like ordering a pizza and walking away before it arrives - you are just holding an empty box (Promise) instead of the actual Response.
🩹 The Fix:
| return normalizeJsonChatUsageResponse(response); | |
| return await normalizeJsonChatUsageResponse(response); |
📏 Severity: critical
There was a problem hiding this comment.
Fixed in 43ecd27: added the explicit await before normalizeJsonChatUsageResponse(response). Build, plugin tests, check:exports, and full npm test pass.
|
Model: kilo-auto/free Code Review Roast 🔥Verdict: No Issues Found | Recommendation: Merge The previous critical Oh wait, this PR is actually clean. I need to sit down. I had my flamethrower warmed up and everything. 🏆 Best part: The SSE streaming normalization using TransformStream is actually quite elegant. Clean async iterator pattern that handles chunked responses without breaking a sweat. 📊 Overall: Like finding a unicorn in production — I didn't think clean PRs existed anymore, but here we are. Files Reviewed (2 files)
Reviewed by laguna-m.1-20260312:free · 622,263 tokens |
|
|
||
| const contentType = response.headers.get('content-type')?.toLowerCase() ?? ''; | ||
| if (contentType.includes('application/json')) { | ||
| return normalizeJsonChatUsageResponse(response); |
There was a problem hiding this comment.
🔥 The Roast: Missing await on async function call. This is like ordering a pizza and walking away before it arrives - you are just holding an empty box (Promise) instead of the actual Response.
🩹 The Fix:
| return normalizeJsonChatUsageResponse(response); | |
| return await normalizeJsonChatUsageResponse(response); |
📏 Severity: critical
There was a problem hiding this comment.
Fixed in 43ecd27: added the explicit await before normalizeJsonChatUsageResponse(response). Build, plugin tests, check:exports, and full npm test pass.
|
🔥 The Roast: Missing 🩹 The Fix: 📏 Severity: critical |
|
|
||
| const contentType = response.headers.get('content-type')?.toLowerCase() ?? ''; | ||
| if (contentType.includes('application/json')) { | ||
| return normalizeJsonChatUsageResponse(response); |
There was a problem hiding this comment.
🔥 The Roast: Missing await on async function call. This is like ordering a pizza and walking away before it arrives - you're just holding an empty box (Promise) instead of the actual Response.
🩹 The Fix:
| return normalizeJsonChatUsageResponse(response); | |
| return await normalizeJsonChatUsageResponse(response); |
📏 Severity: critical
There was a problem hiding this comment.
Fixed in 43ecd27: added the explicit await before normalizeJsonChatUsageResponse(response). Build, plugin tests, check:exports, and full npm test pass.
Summary
Tests