jinzhuer

What

  • Remove the conflicting <|reserved_200018|> (duplicate of <|endofprompt|> id=200018).
  • Skip already-used ids when generating reserved_* specials.
  • Add tests/test_token_ids_unique.py to ensure token-id uniqueness.
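The "skip already-used ids" step can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `add_reserved_specials` and the small id range are hypothetical, and only `<|endofprompt|>` = 200018 comes from the description above.

```python
def add_reserved_specials(named, start, stop):
    """Return a copy of `named` plus <|reserved_N|> entries for unused ids.

    Hypothetical helper for illustration: ids in [start, stop) that already
    belong to a named special token are skipped instead of being reserved.
    """
    used_ids = set(named.values())
    specials = dict(named)
    for token_id in range(start, stop):
        if token_id in used_ids:
            # This id is already taken by a named token; do not reserve it.
            continue
        specials[f"<|reserved_{token_id}|>"] = token_id
    return specials


# Illustrative range only; the real encoding covers far more ids.
named = {"<|endofprompt|>": 200018}
specials = add_reserved_specials(named, 200016, 200021)
```

With this approach `<|reserved_200018|>` is never generated, so the named token keeps sole ownership of its id.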

Why

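Special token ids must be unique: each id should map to exactly one token string. With both <|endofprompt|> and <|reserved_200018|> assigned id 200018, the id-to-token mapping is no longer one-to-one.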
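A uniqueness check along these lines would catch the collision. This is a sketch under assumptions: `find_duplicate_ids` is a hypothetical helper, and the dictionaries are tiny stand-ins for the real special-token table, not the contents of tests/test_token_ids_unique.py.

```python
from collections import defaultdict


def find_duplicate_ids(special_tokens):
    """Map each id shared by more than one token to the tokens that use it."""
    tokens_by_id = defaultdict(list)
    for token, token_id in special_tokens.items():
        tokens_by_id[token_id].append(token)
    return {
        token_id: sorted(tokens)
        for token_id, tokens in tokens_by_id.items()
        if len(tokens) > 1
    }


# Stand-in for the table before the fix: two tokens share id 200018.
before = {
    "<|endofprompt|>": 200018,
    "<|reserved_200017|>": 200017,
    "<|reserved_200018|>": 200018,
}
# Stand-in for the table after the fix: the redundant entry is gone.
after = {k: v for k, v in before.items() if k != "<|reserved_200018|>"}
```

A test can then assert that `find_duplicate_ids(...)` returns an empty dict, which also reports the offending token names when it fails.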

Related Issue

Fixes #457

Tests

The new test fails before this change and passes after.
All existing tests pass.

Compatibility

No functional change for existing named tokens: the only difference is that the redundant <|reserved_200018|> entry is removed, and <|endofprompt|> keeps id 200018.

@jinzhuer jinzhuer force-pushed the fix/special-ids-unique branch from b38b4ee to a825d1a Compare October 13, 2025 12:12

Development

Successfully merging this pull request may close these issues.

bug: o200k_harmony duplicates special token id 200018