Tokens are the fundamental units that LLMs process. Instead of working with raw text (characters or whole words), LLMs convert input text into a sequence of numeric IDs called tokens using a model-specific tokenizer.

A single token typically represents one of the following:

a common word (“hello”)

a subword (“un” + “derstanding”)

a punctuation mark or space

occasionally, a single character (for rare or unseen text)

As a rule of thumb, 1 token ≈ 4 characters of English text (the actual ratio varies by language and tokenizer).
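The rule of thumb above can be turned into a quick estimator. This is only a rough sketch: the function name `estimate_tokens` and its `chars_per_token` parameter are made up for illustration, and the result is an approximation, not a substitute for running the model's actual tokenizer.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count from the character count.

    Assumes the ~4 characters per token heuristic for English text;
    other languages and tokenizers can deviate substantially.
    """
    if not text:
        return 0
    # Round up so that any non-empty text counts as at least one token.
    return max(1, math.ceil(len(text) / chars_per_token))


if __name__ == "__main__":
    prompt = "Tokens are the fundamental units that LLMs process."
    print(len(prompt), "characters, about", estimate_tokens(prompt), "tokens")
```

An estimate like this is good enough for early budgeting; for billing or context-limit checks, count with the provider's own tokenizer.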

How Tokenization Works

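Encoding – Input text is split into chunks that exist in the model’s fixed vocabulary (typically 32k–200k entries). The exact procedure depends on the tokenizer: some greedily match the longest available chunk, while byte-pair-encoding tokenizers apply a learned sequence of merge rules.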

Lookup – Each chunk is replaced by its numeric ID from the vocabulary.

Processing – The LLM performs all computation on these numbers (not on text).

Decoding – Output token IDs are converted back into human-readable text.
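The encode, lookup, and decode steps above can be sketched with a toy tokenizer. Everything here is hypothetical: the tiny vocabulary, the IDs, and the greedy longest-match strategy are chosen for illustration only. Production tokenizers use far larger learned vocabularies, often apply merge rules instead of pure longest-match, and usually fall back to bytes rather than raising an error on unknown characters.

```python
# Toy vocabulary (hypothetical): a few multi-character chunks plus single letters.
TOKENS = ["hello", "un", "derstanding", " ", "!"] + list("abcdefghijklmnopqrstuvwxyz")
TOKEN_TO_ID = {tok: i for i, tok in enumerate(TOKENS)}
ID_TO_TOKEN = {i: tok for tok, i in TOKEN_TO_ID.items()}
MAX_TOKEN_LEN = max(len(tok) for tok in TOKENS)


def encode(text: str) -> list[int]:
    """Greedy longest-match encoding: text -> list of token IDs."""
    ids = []
    pos = 0
    while pos < len(text):
        # Try the longest candidate chunk first, then shrink it.
        for length in range(min(MAX_TOKEN_LEN, len(text) - pos), 0, -1):
            chunk = text[pos:pos + length]
            if chunk in TOKEN_TO_ID:
                ids.append(TOKEN_TO_ID[chunk])  # lookup step
                pos += length
                break
        else:
            # No chunk matched; this toy tokenizer has no byte fallback.
            raise ValueError(f"character {text[pos]!r} is not in the toy vocabulary")
    return ids


def decode(ids: list[int]) -> str:
    """Decoding: list of token IDs -> text."""
    return "".join(ID_TO_TOKEN[i] for i in ids)


if __name__ == "__main__":
    ids = encode("hello understanding!")
    print(ids)          # [0, 3, 1, 2, 4]
    print(decode(ids))  # hello understanding!
```

Here “understanding” becomes the two chunks “un” and “derstanding” because no longer chunk starting at that position exists in the toy vocabulary. The model itself would only ever see the ID list.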

Different providers use different tokenizers:

OpenAI: tiktoken (cl100k_base and o200k_base encodings)

Anthropic: custom

Google: SentencePiece-based

Meta: Llama tokenizer

The same prompt can therefore consume very different token counts across models.

For example, “Hello world!” is 3–11 tokens depending on the model.
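To see why counts diverge, it can help to compare a few deliberately simple splitting schemes on the same text. The three functions below are illustrative stand-ins, not any provider's tokenizer: one splits into words and punctuation, one into characters, and one into UTF-8 bytes.

```python
import re


def word_level(text: str) -> list[str]:
    """Split into words and standalone punctuation marks, dropping whitespace."""
    return re.findall(r"\w+|[^\w\s]", text)


def char_level(text: str) -> list[str]:
    """Treat every character (including spaces) as one token."""
    return list(text)


def byte_level(text: str) -> list[int]:
    """Treat every UTF-8 byte as one token."""
    return list(text.encode("utf-8"))


if __name__ == "__main__":
    for sample in ["Hello world!", "na\u00efve caf\u00e9"]:
        print(
            repr(sample),
            "words:", len(word_level(sample)),
            "chars:", len(char_level(sample)),
            "bytes:", len(byte_level(sample)),
        )
```

Under these toy schemes, “Hello world!” yields 3, 12, and 12 tokens respectively, and the accented sample costs more at the byte level than at the character level. Real subword tokenizers land somewhere between the extremes, depending on how well their vocabulary covers the text.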

Tokens are the real currency and the real constraint of modern LLMs. Understanding tokenization is no longer optional: it is a core skill for both engineering effectiveness and financial predictability when building or scaling LLM-powered products.