Understanding Tokens in AI
When we talk to an AI like ChatGPT, it does not read “words” the way we do; it reads “tokens”.
Think of tokens as LEGO blocks for words. A token can be:
- a full word
- part of a word
- or even a small piece like “ing” or “tion”
This process of breaking text into small pieces is called tokenization.
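If you want to see tokenization in action, here is a minimal sketch using OpenAI’s tiktoken library (assuming it is installed with `pip install tiktoken`); the exact splits vary by tokenizer, so treat the output as illustrative:

```python
# Small tokenization demo using OpenAI's tiktoken library.
import tiktoken

# cl100k_base is the encoding used by GPT-4 and GPT-3.5-turbo.
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokenization breaks text into small pieces."
token_ids = enc.encode(text)

print(f"{len(text.split())} words -> {len(token_ids)} tokens")

# Show each token as the text fragment it represents.
for tid in token_ids:
    print(tid, repr(enc.decode([tid])))
```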
For GPT-4, an average token is roughly 3/4 of an English word, so 100 tokens come out to approximately 75 words.
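That ratio gives a handy back-of-the-envelope estimate; here is a tiny sketch of the arithmetic (a rough rule of thumb only, since the real count depends on the text and the tokenizer):

```python
# Rough estimate: ~4/3 tokens per English word for GPT-4-style tokenizers.
def estimate_tokens(word_count: int) -> int:
    return round(word_count * 4 / 3)

print(estimate_tokens(75))   # ~100 tokens
print(estimate_tokens(750))  # ~1000 tokens
```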
Why does AI do this?
- It captures meaning better (“cook” + “ing” instead of just “cooking”)
- It keeps the set of building blocks small, which makes the model faster and more efficient
- It can handle new words it has never seen before, by assembling them from pieces it already knows (see the sketch after this list)
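To see the “new words” point concretely, here is a hedged sketch (again assuming tiktoken is installed; the made-up word “unfriendable” is just an example, and the exact split depends on the tokenizer):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# A made-up word the tokenizer almost certainly does not store as one entry.
made_up_word = "unfriendable"

pieces = [enc.decode([tid]) for tid in enc.encode(made_up_word)]
print(pieces)  # e.g. something like ['un', 'friend', 'able'] -- split may differ
```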
So next time someone says “tokens”, just remember: AI reads text the way kids build castles—with small blocks that can create anything.