fix: exclude cached tokens from chat usage #27

Open
makcimbx wants to merge 3 commits into Alph4d0g:main from makcimbx:fix/cached-token-usage
Conversation

@makcimbx

Summary

  • Normalize OmniRoute chat completion usage so prompt_tokens excludes cached_tokens when OpenCode tracks cached input separately.
  • Apply the fix to JSON and streaming chat completion responses while preserving non-usage chunks.
  • Add regression coverage for JSON and SSE cached-token accounting.

Tests

  • npm run build
  • npm test
  • npm run check:exports

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces response normalization for chat completions to exclude cached tokens from the prompt token count in both JSON and SSE (streaming) responses. The reviewer identified that total_tokens should also be updated alongside prompt_tokens to maintain consistency, which requires updating the corresponding test assertions. Additionally, the reviewer suggested ensuring that the flush handler in the SSE stream appends a trailing newline to correctly terminate the final line.

Comment thread src/plugin.ts
Comment on lines +1034 to +1036
// OpenCode tracks cached input separately, so prompt_tokens must be non-cached.
usage.prompt_tokens = promptTokens - cachedTokens;
return true;
Contributor

medium

When excluding cached tokens from prompt_tokens, total_tokens should also be updated to maintain consistency. Otherwise, total_tokens still includes the cached tokens, which violates the invariant total_tokens = prompt_tokens + completion_tokens.

Suggested change
- // OpenCode tracks cached input separately, so prompt_tokens must be non-cached.
- usage.prompt_tokens = promptTokens - cachedTokens;
- return true;
+ // OpenCode tracks cached input separately, so prompt_tokens must be non-cached.
+ usage.prompt_tokens = promptTokens - cachedTokens;
+ const totalTokens = getNumber(usage.total_tokens);
+ if (totalTokens !== undefined) {
+   usage.total_tokens = totalTokens - cachedTokens;
+ }
+ return true;

Author

Fixed in d3466d4: total_tokens is now reduced by cached_tokens together with prompt_tokens.
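
The accounting discussed in this thread can be sketched as a small standalone function. This is only an illustration, not the code in src/plugin.ts: the `ChatUsage` shape, the `prompt_tokens_details.cached_tokens` location (an OpenAI-style layout), the guard conditions, and the name `excludeCachedTokens` are all assumptions, and `getNumber` is a stand-in for the helper the review refers to. The figures are the ones from the test assertions; the 39493 input is inferred from them (2629 + 36864).

```typescript
// Hypothetical usage shape; the cached_tokens location is an assumption.
interface ChatUsage {
  prompt_tokens?: number;
  completion_tokens?: number;
  total_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

// Stand-in for the getNumber helper mentioned in the review.
function getNumber(value: unknown): number | undefined {
  return typeof value === 'number' && Number.isFinite(value) ? value : undefined;
}

// Mutates usage in place and returns true only when something changed.
function excludeCachedTokens(usage: ChatUsage): boolean {
  const promptTokens = getNumber(usage.prompt_tokens);
  const cachedTokens = getNumber(usage.prompt_tokens_details?.cached_tokens);
  if (promptTokens === undefined || cachedTokens === undefined) return false;
  if (cachedTokens <= 0 || cachedTokens > promptTokens) return false;

  usage.prompt_tokens = promptTokens - cachedTokens;
  // Keep total_tokens = prompt_tokens + completion_tokens.
  const totalTokens = getNumber(usage.total_tokens);
  if (totalTokens !== undefined) {
    usage.total_tokens = totalTokens - cachedTokens;
  }
  return true;
}

const usage: ChatUsage = {
  prompt_tokens: 39493,
  completion_tokens: 185,
  total_tokens: 39678,
  prompt_tokens_details: { cached_tokens: 36864 },
};
const changed = excludeCachedTokens(usage);
// usage.prompt_tokens is now 2629 and usage.total_tokens is 2814.
```

Leaving completion_tokens untouched means the invariant holds after the subtraction: 2629 + 185 = 2814.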

Comment thread src/plugin.ts
Comment on lines +977 to +982
flush(controller) {
const tail = pending + decoder.decode();
if (tail) {
controller.enqueue(encoder.encode(normalizeSseChatUsageLine(tail)));
}
},
Contributor

medium

In the flush handler of the SSE stream, if there is a non-empty tail (meaning the stream ended without a trailing newline), we should append a newline \n to ensure the final line is correctly terminated and parsed by the client's SSE parser.

Suggested change
- flush(controller) {
-   const tail = pending + decoder.decode();
-   if (tail) {
-     controller.enqueue(encoder.encode(normalizeSseChatUsageLine(tail)));
-   }
- },
+ flush(controller) {
+   const tail = pending + decoder.decode();
+   if (tail) {
+     controller.enqueue(encoder.encode(`${normalizeSseChatUsageLine(tail)}\n`));
+   }
+ },

Author

Fixed in d3466d4: the SSE flush path now appends a trailing newline for a final unterminated line.
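
The buffering behavior at issue here can be sketched without a stream at all. The version below is a simplified assumption, not the PR's implementation: it works on strings rather than decoded bytes, and `createLineBuffer` and `mapLine` are hypothetical names, with `mapLine` standing in for normalizeSseChatUsageLine.

```typescript
// Hypothetical string-based stand-in for the transform/flush pair.
function createLineBuffer(mapLine: (line: string) => string) {
  let pending = '';
  return {
    // Emits every complete line in the chunk and keeps the unterminated rest.
    push(chunk: string): string {
      const lines = (pending + chunk).split('\n');
      pending = lines.pop() ?? '';
      return lines.map((line) => `${mapLine(line)}\n`).join('');
    },
    // Emits the leftover tail, terminated with a newline, if there is one.
    flush(): string {
      const tail = pending;
      pending = '';
      return tail ? `${mapLine(tail)}\n` : '';
    },
  };
}

const buffer = createLineBuffer((line) => line);
const first = buffer.push('data: {"a":1}\ndata: {"b"'); // 'data: {"a":1}\n'
const second = buffer.push(':2}'); // '' (line is still unterminated)
const last = buffer.flush(); // 'data: {"b":2}\n'
```

Every emitted line ends in a newline regardless of whether it left through `push` or `flush`, so a downstream parser sees the same framing in both cases.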

Comment thread test/plugin.test.mjs Outdated

assert.equal(body.usage.prompt_tokens, 2629);
assert.equal(body.usage.completion_tokens, 185);
assert.equal(body.usage.total_tokens, 39678);
Contributor

medium

Update the assertion to reflect the normalized total_tokens value (39678 - 36864 = 2814).

Suggested change
- assert.equal(body.usage.total_tokens, 39678);
+ assert.equal(body.usage.total_tokens, 2814);

Author

Fixed in d3466d4: the JSON usage test now expects normalized total_tokens.

Comment thread test/plugin.test.mjs Outdated
const normalized = JSON.parse(normalizedLine.slice('data: '.length));
assert.equal(normalized.usage.prompt_tokens, 2629);
assert.equal(normalized.usage.completion_tokens, 185);
assert.equal(normalized.usage.total_tokens, 39678);
Contributor

medium

Update the assertion to reflect the normalized total_tokens value (39678 - 36864 = 2814).

Suggested change
- assert.equal(normalized.usage.total_tokens, 39678);
+ assert.equal(normalized.usage.total_tokens, 2814);

Author

Fixed in d3466d4: the streaming usage test now expects normalized total_tokens and covers an unterminated final SSE line.
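
For the streaming case, a per-line normalizer might look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's normalizeSseChatUsageLine: the function and type names are hypothetical, the `prompt_tokens_details.cached_tokens` location and the `[DONE]` sentinel follow OpenAI-style streams, and anything that is not a `data: ` line carrying a usage object is passed through unchanged.

```typescript
const DATA_PREFIX = 'data: ';

// Hypothetical usage shape; where cached_tokens lives is an assumption.
interface Usage {
  prompt_tokens: number;
  total_tokens?: number;
  prompt_tokens_details?: { cached_tokens?: number };
}

function hasUsage(value: unknown): value is { usage: Usage } {
  if (typeof value !== 'object' || value === null) return false;
  const usage = (value as { usage?: unknown }).usage;
  return (
    typeof usage === 'object' &&
    usage !== null &&
    typeof (usage as { prompt_tokens?: unknown }).prompt_tokens === 'number'
  );
}

function normalizeSseLine(line: string): string {
  if (!line.startsWith(DATA_PREFIX)) return line; // comments, blank lines, other fields
  let parsed: unknown;
  try {
    parsed = JSON.parse(line.slice(DATA_PREFIX.length));
  } catch {
    return line; // e.g. a "[DONE]" sentinel, which is not valid JSON
  }
  if (!hasUsage(parsed)) return line; // ordinary content chunks

  const cached = parsed.usage.prompt_tokens_details?.cached_tokens;
  if (typeof cached !== 'number' || cached <= 0) return line;

  parsed.usage.prompt_tokens -= cached;
  if (typeof parsed.usage.total_tokens === 'number') {
    parsed.usage.total_tokens -= cached;
  }
  return DATA_PREFIX + JSON.stringify(parsed);
}
```

Returning the original string for every pass-through case avoids re-serializing chunks that carry no usage data, so their bytes stay exactly as the upstream sent them.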

Contributor

@kilo-code-bot kilo-code-bot Bot left a comment

Review complete.

Comment thread src/plugin.ts Outdated

const contentType = response.headers.get('content-type')?.toLowerCase() ?? '';
if (contentType.includes('application/json')) {
return normalizeJsonChatUsageResponse(response);
Contributor

CRITICAL: Missing await on async function call. normalizeJsonChatUsageResponse(response) returns Promise<Response> but the code returns the Promise without awaiting it. This causes normalizeChatUsageResponse to return a Promise where Response is expected, breaking the response normalization.

🩹 The Fix:

Suggested change
- return normalizeJsonChatUsageResponse(response);
+ return await normalizeJsonChatUsageResponse(response);

📏 Severity: critical

Author

Fixed in 43ecd27: added the explicit await before normalizeJsonChatUsageResponse(response). Build, plugin tests, check:exports, and full npm test pass.
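
The excerpt does not show whether the enclosing normalizeChatUsageResponse is declared async. As a general property of async functions, `return promise` and `return await promise` both give the caller the same resolved value; the practical difference is that only the awaited form lets a surrounding try/catch observe a rejection. A minimal sketch of that difference follows, with entirely hypothetical function names:

```typescript
// Hypothetical stand-in for an async normalizer that fails.
async function failingNormalize(): Promise<string> {
  throw new Error('normalize failed');
}

async function withoutAwait(): Promise<string> {
  try {
    // The promise is handed to the caller as-is, so its rejection
    // bypasses this try/catch.
    return failingNormalize();
  } catch {
    return 'fallback';
  }
}

async function withAwait(): Promise<string> {
  try {
    // Awaiting here surfaces the rejection inside the try block.
    return await failingNormalize();
  } catch {
    return 'fallback';
  }
}
```

Calling `withAwait()` resolves to `'fallback'`, while `withoutAwait()` rejects with the original error. The explicit await therefore matters most when the call sits inside error handling.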

Comment thread src/plugin.ts Outdated

const contentType = response.headers.get('content-type')?.toLowerCase() ?? '';
if (contentType.includes('application/json')) {
return normalizeJsonChatUsageResponse(response);
Contributor

🔥 The Roast: Missing await on async function call. This is like ordering a pizza and walking away before it arrives - you are just holding an empty box (Promise) instead of the actual Response.

🩹 The Fix:

Suggested change
- return normalizeJsonChatUsageResponse(response);
+ return await normalizeJsonChatUsageResponse(response);

📏 Severity: critical

Author

Fixed in 43ecd27: added the explicit await before normalizeJsonChatUsageResponse(response). Build, plugin tests, check:exports, and full npm test pass.

@kilo-code-bot
Contributor

kilo-code-bot Bot commented May 26, 2026

Model: kilo-auto/free

Code Review Roast 🔥

Verdict: No Issues Found | Recommendation: Merge

The previous critical await issue on line 927 has been fixed. The code now properly awaits the async normalizeJsonChatUsageResponse function.

Oh wait, this PR is actually clean. I need to sit down. I had my flamethrower warmed up and everything.

🏆 Best part: The SSE streaming normalization using TransformStream is actually quite elegant. Clean async iterator pattern that handles chunked responses without breaking a sweat.

📊 Overall: Like finding a unicorn in production — I didn't think clean PRs existed anymore, but here we are.

Files Reviewed (2 files)
  • src/plugin.ts - Previously had 1 critical issue (fixed)
  • test/plugin.test.mjs - No issues (tests are well-written)

Reviewed by laguna-m.1-20260312:free · 622,263 tokens

@kilo-code-bot
Contributor

kilo-code-bot Bot commented May 26, 2026

🔥 The Roast: Missing await on async function call. This is like ordering a pizza and walking away before it arrives - you're just holding an empty box (Promise) instead of the actual Response.

🩹 The Fix:

    return await normalizeJsonChatUsageResponse(response);

📏 Severity: critical
