Tokens are the fundamental units that LLMs process. Instead of working with raw text (characters or whole words), LLMs convert input text into a sequence of numeric IDs called tokens using a model-specific tokenizer.
A single token typically represents:
- a common word (“hello”)
- a subword (“un” + “derstanding”)
- a punctuation mark or a space
- occasionally, a single character (for rare or unseen text)

As a rule of thumb, 1 token ≈ 4 characters of English text (the actual ratio varies by language and tokenizer).
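That rule of thumb is handy for quick cost and context-window estimates before calling a real tokenizer. A minimal sketch (the function name and default ratio are illustrative, not any library's API):

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token-count estimate using the ~4-characters-per-token rule of thumb.

    Only an approximation: real counts depend on the model's tokenizer.
    """
    return math.ceil(len(text) / chars_per_token)

print(estimate_tokens("Hello, world!"))  # 13 characters -> estimates 4 tokens
```

For billing-accurate numbers you would still call the provider's actual tokenizer; this is only a pre-flight sanity check.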


How Tokenization Works
Encoding – Input text is greedily split into the largest possible chunks that exist in the model’s fixed vocabulary (typically 32k–200k entries).
Lookup – Each chunk is replaced by its numeric ID from the vocabulary.
Processing – The LLM performs all computation on these numbers (not on text).
Decoding – Output token IDs are converted back into human-readable text.
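The four steps above can be sketched with a toy greedy tokenizer. The vocabulary here is invented for illustration; real tokenizers learn their vocabularies from data with algorithms such as BPE or SentencePiece:

```python
# Toy vocabulary (hypothetical). Real vocabularies hold 32k-200k entries.
VOCAB = ["understand", "under", "stand", "un", "ing", "hello", "!", " "]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}
ID_TO_TOKEN = {i: tok for tok, i in TOKEN_TO_ID.items()}

def encode(text: str) -> list[int]:
    """Encoding + Lookup: greedily match the longest vocabulary entry at each position."""
    ids, pos = [], 0
    while pos < len(text):
        for end in range(len(text), pos, -1):  # try the longest chunk first
            chunk = text[pos:end]
            if chunk in TOKEN_TO_ID:
                ids.append(TOKEN_TO_ID[chunk])
                pos = end
                break
        else:
            raise ValueError(f"no vocabulary entry covers text at position {pos}")
    return ids

def decode(ids: list[int]) -> str:
    """Decoding: map token IDs back to strings and concatenate."""
    return "".join(ID_TO_TOKEN[i] for i in ids)

print(encode("understanding"))          # "understand" + "ing" -> two IDs
print(decode(encode("understanding")))  # round-trips back to the original text
```

Note how “understanding” becomes just two tokens because “understand” is in the vocabulary: the greedy longest-match step is what keeps common words cheap while rare strings fragment into many pieces.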
Different providers use different tokenizers:
- OpenAI: tiktoken (cl100k/o200k series)
- Anthropic: custom tokenizer
- Google: SentencePiece-based
- Meta: Llama tokenizer
The same prompt can therefore consume dramatically different token counts across models: “Hello world!” can be anywhere from 3 to 11 tokens depending on the model.
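To see why counts diverge, compare two hypothetical segmentations of the same prompt (illustrative only; neither matches a real model’s vocabulary):

```python
# Two hypothetical ways a tokenizer might segment the same prompt.
splits = {
    "coarse word-level vocab": ["Hello", " world", "!"],  # 3 tokens
    "character-level fallback": list("Hello world!"),     # 12 tokens
}
for name, tokens in splits.items():
    print(f"{name}: {len(tokens)} tokens -> {tokens}")
```

The richer a vocabulary’s coverage of your text, the fewer tokens (and the less money and context budget) the same prompt consumes.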

Tokens are the real currency and the real constraint of modern LLMs. Understanding tokenization is no longer optional—it is a core skill for both engineering effectiveness and financial predictability when building or scaling LLM-powered products.

Brian Wang is a Futurist Thought Leader and a popular Science blogger with 1 million readers per month. His blog Nextbigfuture.com is ranked #1 Science News Blog. It covers many disruptive technology and trends including Space, Robotics, Artificial Intelligence, Medicine, Anti-aging Biotechnology, and Nanotechnology.
Known for identifying cutting edge technologies, he is currently a Co-Founder of a startup and fundraiser for high potential early-stage companies. He is the Head of Research for Allocations for deep technology investments and an Angel Investor at Space Angels.
A frequent speaker at corporations, he has been a TEDx speaker, a Singularity University speaker and guest at numerous interviews for radio and podcasts. He is open to public speaking and advising engagements.