Understanding Tokens in AI

When we talk to an AI like ChatGPT, it does not read "words" the way we do; it reads "tokens".

Think of tokens as LEGO blocks for words. A token can be:

  • a full word
  • part of a word
  • or even a small piece like “ing” or “tion”

Breaking text into these small pieces is called tokenization.

For GPT-4, the average token is about three-quarters the length of a word, so 100 tokens come out to roughly 75 words.
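
To make this concrete, here is a minimal sketch using OpenAI's tiktoken library (installed separately with `pip install tiktoken`). The sample sentence is made up for illustration, and the exact pieces you get back depend on the tokenizer's vocabulary:

```python
import tiktoken

# Load the tokenizer GPT-4 uses (the "cl100k_base" encoding).
enc = tiktoken.encoding_for_model("gpt-4")

text = "Tokenization breaks text into small reusable pieces."
token_ids = enc.encode(text)

# Turn each token id back into the text fragment it represents.
pieces = [enc.decode([t]) for t in token_ids]
print(pieces)

# Rough check of the ~3/4 rule of thumb: tokens vs. whitespace-split words.
print(f"{len(token_ids)} tokens for {len(text.split())} words")
```

Running this prints the sentence carved into token-sized chunks, some of them whole words and some smaller fragments.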

Why does AI do this?

  • It captures meaning in reusable pieces (cook + ing instead of one opaque cooking)
  • It works efficiently with a small, fixed set of building blocks instead of a huge dictionary of whole words
  • It can handle words it has never seen before by assembling them from familiar pieces (see the sketch after this list)
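
The last point is easiest to see with an invented word. A minimal sketch along the same lines, again assuming tiktoken is installed; the made-up word and the exact pieces it splits into are purely illustrative:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

# An invented word that will not be in the vocabulary as a single token:
# the tokenizer assembles it from smaller pieces it already knows.
made_up_word = "flibbertigibbetization"
token_ids = enc.encode(made_up_word)

print([enc.decode([t]) for t in token_ids])
```

Because any text can be broken down this way, the model never hits a hard wall on a word it has not seen; it simply works with the smaller pieces.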

So next time someone says “tokens”, just remember: AI reads text the way kids build castles—with small blocks that can create anything.