HappyTokenizer

Interactive GPT Tokenizer Visualization Tool

Explore how different GPT models tokenize your text in real-time. Privacy-focused, client-side processing.

Features

Real-time Visualization - See your text being tokenized as you type
Multiple Model Support - Compare tokenization across all OpenAI GPT models
Cost Optimization - Understand token counts to optimize your API usage
Privacy First - All processing happens in your browser, no data sent to servers
Lightning Fast - Instant tokenization with OpenAI's official tiktoken library
Comprehensive Models - Support for 60+ models including GPT-4o, GPT-3.5, and code models

Why Tokenization Matters

When using OpenAI's API, you're charged per token, not per word. Understanding tokenization helps you:

Reduce API costs by writing efficient prompts
Avoid hitting token limits in your requests
Optimize model performance by understanding how text is processed
Compare different models for your specific use case

Example: "Hello, world!" might be:

3 tokens in GPT-4o
4 tokens in GPT-3.5-turbo
This directly affects your API costs!

Getting Started

Quick Start

Visit happytokenizer.com and start typing in the text area. That's it!

Using the Tool

Enter or paste your text in the main text area
Select a model from the dropdown to see how it tokenizes your text
View token breakdown including token IDs and character mappings
Compare models to find the most cost-effective option for your needs

Supported Models

Modern Models (o200k_base)

GPT-4o series (gpt-4o, gpt-4o-mini, etc.)
Latest OpenAI models with optimized encoding

Chat Models (cl100k_base)

GPT-3.5 series (gpt-3.5-turbo, gpt-3.5-turbo-16k)
GPT-4 series (gpt-4, gpt-4-32k)

Code Models (p50k_base)

Code-davinci series
Optimized for code generation

And Many More...

Legacy models (r50k_base)
Edit models (p50k_edit)
Open-source models (o200k_harmony)

View the complete list at happytokenizer.com/models

For Developers

Local Development

Want to run HappyTokenizer locally or contribute? Here's how:

# Clone the repository
git clone https://github.com/yourusername/happytokenizer.git
cd happytokenizer

# Install dependencies
pnpm install

# Start development server
pnpm dev

# Build for production
pnpm build

Tech Stack

Framework: Astro with React
Styling: Tailwind CSS with custom design system
Tokenization: OpenAI's official tiktoken library
Type Safety: TypeScript throughout
Build Tool: Vite

Architecture

src/
├── components/
│   ├── react/          # React components with client-side logic
│   └── Footer.astro    # Footer component
├── layouts/
│   └── MainLayout.astro  # SEO-optimized base layout
├── pages/
│   ├── index.astro     # Main tokenizer page
│   └── models.astro    # Complete model manifest
└── utils/
    └── modelEncodings.ts  # Model configurations

Privacy & Security

HappyTokenizer is designed with privacy as a core principle:

No data sent to servers - All processing happens locally
No tracking or analytics - Your data stays yours
Open source - Transparent and auditable code
No cookies - Zero tracking technologies

Contributing

We welcome contributions! Here's how you can help:

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Make your changes and test them
Commit your changes: git commit -m 'Add amazing feature'
Push to the branch: git push origin feature/amazing-feature
Open a Pull Request

Areas for Contribution

UI/UX improvements - Better visualization ideas
Mobile enhancements - Responsive design improvements
Performance - Faster tokenization or loading
Documentation - Better guides and examples
Bug fixes - Report and fix issues

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

OpenAI for the tiktoken library
The Astro team for the amazing framework
All contributors who help improve this tool

Contact

_{Built for the AI community}

Name		Name	Last commit message	Last commit date
Latest commit History 91 Commits
.vscode		.vscode
public		public
src		src
.env.example		.env.example
.gitignore		.gitignore
AGENTS.md		AGENTS.md
CLAUDE.md		CLAUDE.md
LICENSE		LICENSE
README.md		README.md
astro.config.mjs		astro.config.mjs
components.json		components.json
eslint.config.ts		eslint.config.ts
future_plan.md		future_plan.md
og.jpeg		og.jpeg
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
renovate.json		renovate.json
tsconfig.json		tsconfig.json
wrangler.jsonc		wrangler.jsonc

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

HappyTokenizer

Features

Why Tokenization Matters

Getting Started

Quick Start

Using the Tool

Supported Models

Modern Models (o200k_base)

Chat Models (cl100k_base)

Code Models (p50k_base)

And Many More...

For Developers

Local Development

Tech Stack

Architecture

Privacy & Security

Contributing

Areas for Contribution

License

Acknowledgments

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

HappyTokenizer

Features

Why Tokenization Matters

Getting Started

Quick Start

Using the Tool

Supported Models

Modern Models (o200k_base)

Chat Models (cl100k_base)

Code Models (p50k_base)

And Many More...

For Developers

Local Development

Tech Stack

Architecture

Privacy & Security

Contributing

Areas for Contribution

License

Acknowledgments

Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages