What is a 'transformer' and how do they work?
In this post we explain what a 'transformer' is, and why a set of mathematical instructions running on ordinary computers has changed the world as we know it.
5 minute read
Insights
When people hear the word 'Transformer', they often imagine a large piece of hardware sitting in a data centre, humming away behind blinking lights. In reality, a transformer is not a physical machine at all. You cannot touch it. It is software, a set of mathematical instructions that run on standard computer chips. The AI breakthrough was not new hardware. It was a new way of processing language. The intelligence lives in the code, not the metal.
When you type a message into a system like ChatGPT, the model does not see words the way you do. Your text is first broken into small pieces called tokens. You can think of these as the basic building blocks of language. As computers only work with numbers, each token is converted into a numerical representation, often called a vector. This vector captures aspects of meaning, such as how the word is typically used and how it relates to other words.
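To make that concrete, here is a toy sketch in Python. The vocabulary, token IDs and vectors are invented for illustration; real tokenisers are learned from data and real vectors contain thousands of numbers.

```python
# A toy illustration of tokenisation and embedding.
# The vocabulary, token IDs and vectors are made up for this example;
# real models use learned tokenisers and much larger vectors.

# Step 1: break the text into tokens (here, simply words).
text = "change my booking"
tokens = text.split()                     # ['change', 'my', 'booking']

# Step 2: map each token to a numerical ID using a tiny, made-up vocabulary.
vocab = {"change": 0, "my": 1, "booking": 2}
token_ids = [vocab[t] for t in tokens]    # [0, 1, 2]

# Step 3: look up a vector for each ID. These numbers would normally be
# learned during training; here they are placeholders.
embeddings = {
    0: [0.12, -0.40, 0.33],
    1: [0.05,  0.22, -0.10],
    2: [0.15, -0.38, 0.30],
}
vectors = [embeddings[i] for i in token_ids]
print(vectors)
```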
The Transformer then looks at all of the tokens in the sentence at the same time. This is the key difference from older models. Instead of processing text step by step, it examines the full message and works out how each part relates to every other part. For example, in the sentence:
'I want to change my booking from Saturday to Sunday'
The model learns that 'change' is closely linked to 'booking', and that 'Saturday' and 'Sunday' form a meaningful pair. It also understands that the relationship between those days matters more than the filler words around them. You can think of it like a very fast, very consistent reader who highlights the most important parts of a sentence before responding.
This process is powered by something called attention. Attention allows the model to decide which words matter most in a given context. Not every word carries equal importance. Transformers apply attention repeatedly across many layers. Early layers tend to focus on structure, such as grammar and word order. Later layers begin to capture intent and meaning. Each layer refines the interpretation slightly.
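For readers who like to see the mechanics, here is a deliberately simplified sketch of the attention calculation using NumPy. The token vectors are invented, and real models add learned query, key and value projections plus many attention heads, but the core idea is the same: compare every token with every other token, then turn the scores into weights.

```python
import numpy as np

# A stripped-down sketch of attention for the phrase "change my booking".
# One vector per token (3 tokens, 4 numbers each); the values are invented.
x = np.array([
    [0.9, 0.1, 0.0, 0.2],   # change
    [0.1, 0.8, 0.1, 0.0],   # my
    [0.8, 0.2, 0.1, 0.3],   # booking
])

# Compare every token with every other token (dot products),
# then turn the scores into weights that sum to 1 (softmax).
scores = x @ x.T / np.sqrt(x.shape[1])
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# Each row shows how much one token "attends" to the others. With these
# made-up numbers, 'change' and 'booking' weight each other more heavily
# than they weight the filler word 'my'.
print(np.round(weights, 2))

# The output for each token is a weighted mix of all the token vectors.
output = weights @ x
```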
By the time the model is ready to respond, it is not guessing blindly. It is calculating probabilities. Based on everything it has seen so far, it chooses the most likely next token, then the next, and so on, until it forms a complete response that reads naturally to a human.
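As a rough sketch of that final step, here is how raw scores can be turned into probabilities and a next token chosen. The candidate words and scores are made up for illustration.

```python
import math

# Invented scores for three candidate next tokens.
scores = {"Sunday": 2.1, "Saturday": 0.3, "banana": -3.0}

# Softmax: convert raw scores into probabilities that sum to 1.
total = sum(math.exp(s) for s in scores.values())
probs = {word: math.exp(s) / total for word, s in scores.items()}
print(probs)   # 'Sunday' comes out far more likely than 'banana'

# Pick the most likely token. Real systems often sample from the
# distribution instead of always taking the top choice, which adds variety.
next_token = max(probs, key=probs.get)
print(next_token)   # 'Sunday'
```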
It is important to demystify what is happening here. There is no hidden 'thinking chip' or artificial brain. Everything the Transformer does is mathematics: matrix multiplication, probability distributions, and statistical optimisation.
This is why GPUs are commonly used. They are good at performing many calculations at once. The model is not thinking in the human sense. It is calculating at enormous speed. Understanding this helps remove unnecessary fear. Transformers are powerful, but they are still tools. They do exactly what the software design allows them to do, nothing more.
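To show what 'calculating at enormous speed' looks like in practice, here is an illustrative snippet. The sizes and numbers are arbitrary; the point is that a single matrix multiplication transforms every token at once, and GPUs excel at exactly this kind of operation.

```python
import numpy as np

# The workhorse operation inside a transformer is matrix multiplication.
# Here, 3 token vectors (rows) are transformed by one weight matrix in a
# single operation; the sizes and random numbers are purely illustrative.
tokens = np.random.rand(3, 4)      # 3 tokens, 4 numbers each
weights = np.random.rand(4, 4)     # one learned transformation (made up)

transformed = tokens @ weights     # every token is processed at once

# A real model repeats this across thousands of dimensions, dozens of
# layers and many attention heads, which is why chips that can do huge
# numbers of multiplications in parallel are used.
print(transformed.shape)           # (3, 4)
```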
Before Transformers, most chatbots fell into one of two categories. The first were rule-based systems. These relied on scripts. If a customer typed a specific phrase like “cancel booking”, the system returned a predefined response. Anything slightly unexpected caused the conversation to break down.
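A rule-based bot can be sketched in a few lines. The rules and replies below are invented, but the pattern is typical: exact phrases in, canned responses out, and a dead end for everything else.

```python
# A minimal sketch of an old rule-based chatbot with made-up rules.
rules = {
    "cancel booking": "Your booking has been cancelled.",
    "opening hours": "We are open 9am to 5pm, Monday to Friday.",
}

def reply(message: str) -> str:
    for phrase, response in rules.items():
        if phrase in message.lower():
            return response
    # Anything the script did not anticipate falls through to a dead end.
    return "Sorry, I didn't understand that."

print(reply("I'd like to cancel booking ABC123"))   # matches the script
print(reply("Can I move my booking to Sunday?"))    # breaks down
```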
The second used older AI models that processed language sequentially. These systems could remember some context, but they struggled with longer messages or multi-step requests. Important details were often lost. This is why older chatbots felt rigid and frustrating. They worked for simple questions, but failed as soon as a request became nuanced. Transformers changed this because they can:
- read the whole message at once, rather than word by word
- keep track of context across longer, multi-step requests
- cope with the many different ways people phrase the same thing
That is why modern AI assistants feel more fluid and human-like, even though they are still operating purely through mathematics.
Understanding that a Transformer is software, not a machine, helps cut through the hype. They are not replacing people. They are tools that apply rules and language understanding at scale. Because it is software, it is flexible. Transformers can run in the cloud, inside apps, or alongside tools you already use. They improve quickly because once the design existed, progress depended mainly on data and computing power.
Transformers are the reason AI feels different today. But they are not mysterious and they are not out of reach. They are a clever piece of engineering that businesses of all sizes can benefit from. Whether you are a global tech company or a small business using Breezy to manage bookings, you are tapping into the same underlying idea. The value comes not from owning the technology, but from using it thoughtfully to reduce friction, improve consistency, and serve customers better.

Breezy is used by businesses across the UK, Europe and America. Our mission is to ensure that all businesses, regardless of size, can take advantage of the AI revolution.
Free trial