fix(tokenizer): resolve prepend_id scope issue in encode() #421

Open
heloanc wants to merge 1 commit into karpathy:master from heloanc:fix/tokenizer-encode-prepend-scope

heloanc commented Mar 26, 2026

Summary

Fix NameError and ValueError in Tokenizer.encode() when prepend is a string.

Problem

  1. prepend_id is assigned inside if prepend is not None: but referenced later in a nested block. If encode_single_token() raises, prepend_id is never defined
  2. encode_single_token() raises ValueError when the prepend string encodes to more than one token
  3. The inner guard checks prepend instead of prepend_id, which masks a failed prepend_id computation
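The scoping behavior behind item 1 can be shown in isolation. This is a minimal sketch, not the project's code: single_token() and bind_prepend_id() are hypothetical stand-ins, and the swallowed exception exists only so the unbound name can be observed. The point is that an assignment whose right-hand side raises never binds the name, and reading an unbound local raises UnboundLocalError, which is a subclass of NameError.

```python
def single_token(text):
    """Hypothetical stand-in for a single-token lookup (not the real encoder)."""
    if len(text) != 1:
        raise ValueError(f"{text!r} is not a single token")
    return ord(text)


def bind_prepend_id(prepend):
    """Return prepend_id, or the name of the error raised when reading it."""
    try:
        if prepend is not None:
            # The name is bound only if single_token() returns normally.
            prepend_id = single_token(prepend)
    except ValueError:
        pass  # swallowed here purely to expose the unbound name below
    try:
        return prepend_id
    except NameError as exc:  # UnboundLocalError is a subclass of NameError
        return type(exc).__name__


print(bind_prepend_id("a"))
print(bind_prepend_id("ab"))
```

With a one-character argument the lookup succeeds and the id is returned. With a longer argument the assignment never completes, so the later read fails.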

Fix

  • Initialize prepend_id = None before conditional blocks
  • Wrap encode_single_token() in try/except with fallback to encode_ordinary()
  • Check prepend_id is not None instead of prepend is not None for insertion
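The three changes above might look roughly like the sketch below. It is an illustration under stated assumptions, not the diff in this PR: StubEncoding is a hypothetical character-level encoder standing in for the real one, the batch (list) input path is omitted, and normalizing the prepend to a list (prepend_ids) is one assumed way to insert a multi-token fallback. The stub raises ValueError as described in this PR; tiktoken's docstring documents KeyError for encode_single_token(), so the sketch catches both.

```python
class StubEncoding:
    """Hypothetical stand-in for the real encoder, for illustration only."""

    def __init__(self):
        self.vocab = {"<bos>": 0, "a": 1, "b": 2}

    def encode_single_token(self, text):
        if text not in self.vocab:
            raise ValueError(f"{text!r} is not a single token")
        return self.vocab[text]

    def encode_ordinary(self, text):
        # Character-level encoding; enough to exercise the fallback path.
        return [self.vocab[ch] for ch in text]


class Tokenizer:
    def __init__(self, enc):
        self.enc = enc

    def encode(self, text, prepend=None):
        prepend_ids = None  # always bound, so later checks cannot hit NameError
        if prepend is not None:
            if isinstance(prepend, int):
                prepend_ids = [prepend]
            else:
                try:
                    prepend_ids = [self.enc.encode_single_token(prepend)]
                except (KeyError, ValueError):
                    # Assumed fallback: encode a multi-token prepend string.
                    prepend_ids = self.enc.encode_ordinary(prepend)
        ids = self.enc.encode_ordinary(text)
        if prepend_ids is not None:  # guard on the computed ids, not on prepend
            ids = prepend_ids + ids
        return ids


tok = Tokenizer(StubEncoding())
print(tok.encode("ab", prepend="<bos>"))  # single-token prepend
print(tok.encode("ab", prepend="ba"))     # multi-token prepend uses the fallback
```

Guarding on the computed value rather than on the prepend argument means the insertion step only runs when there is actually something to insert.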

Fixes #348

Initialize prepend_id before conditional blocks to prevent NameError,
and add try/except fallback for multi-token prepend strings that cause
encode_single_token to raise ValueError.

Fixes karpathy#348

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Agolid pushed a commit to Agolid/autoresearch that referenced this pull request Mar 27, 2026
- Initialize prepend_id = None before conditional blocks
- Wrap encode_single_token() in try/except with fallback to encode_ordinary()
- Check prepend_id is not None instead of prepend for insertion
- Prevents NameError when encode_single_token() raises
- Handles multi-token prepend strings gracefully

Fixes karpathy#421
Development

Successfully merging this pull request may close these issues.

[Bug] Tokenizer.encode: prepend_id scope issue causes NameError when prepend is string