What is a 'transformer' and how do they work?
In this post we explain what a 'transformer' is, and why a set of mathematical instructions running on ordinary computers has changed the world as we know it.
5 minute read
Insights
When people hear the word 'Transformer', they often imagine a large piece of hardware sitting in a data centre, humming away behind blinking lights. In reality, a transformer is not a physical machine at all. You cannot touch it. It is software, a set of mathematical instructions that run on standard computer chips. The AI breakthrough was not new hardware. It was a new way of processing language. The intelligence lives in the code, not the metal.
When you type a message into a system like ChatGPT, the model does not see words the way you do. Your text is first broken into small pieces called tokens. You can think of these as the basic building blocks of language. As computers only work with numbers, each token is converted into a numerical representation, often called a vector. This vector captures aspects of meaning, such as how the word is typically used and how it relates to other words.
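To make that concrete, here is a toy sketch in Python. The vocabulary, token IDs and vectors are invented for illustration; real tokenisers are learned from data and real vectors contain thousands of numbers.

```python
# A toy illustration of tokenisation and embedding.
# The vocabulary, token IDs and vectors are made up for this example;
# real models use learned tokenisers and much larger vectors.

# Step 1: break the text into tokens (here, simply words).
text = "change my booking"
tokens = text.split()                     # ['change', 'my', 'booking']

# Step 2: map each token to a numerical ID using a tiny, made-up vocabulary.
vocab = {"change": 0, "my": 1, "booking": 2}
token_ids = [vocab[t] for t in tokens]    # [0, 1, 2]

# Step 3: look up a vector for each ID. These numbers would normally be
# learned during training; here they are placeholders.
embeddings = {
    0: [0.12, -0.40, 0.33],
    1: [0.05,  0.22, -0.10],
    2: [0.15, -0.38, 0.30],
}
vectors = [embeddings[i] for i in token_ids]
print(vectors)
```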
The Transformer then looks at all of the tokens in the sentence at the same time. This is the key difference from older models. Instead of processing text step by step, it examines the full message and works out how each part relates to every other part. For example, in the sentence:
'I want to change my booking from Saturday to Sunday'
The model learns that 'change' is closely linked to 'booking', and that 'Saturday' and 'Sunday' form a meaningful pair. It also understands that the relationship between those days matters more than the filler words around them. You can think of it like a very fast, very consistent reader who highlights the most important parts of a sentence before responding.
This process is powered by something called attention. Attention allows the model to decide which words matter most in a given context. Not every word carries equal importance. Transformers apply attention repeatedly across many layers. Early layers tend to focus on structure, such as grammar and word order. Later layers begin to capture intent and meaning. Each layer refines the interpretation slightly.
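For readers who like to see the mechanics, here is a deliberately simplified sketch of the attention calculation using NumPy. The token vectors are invented, and real models add learned query, key and value projections plus many attention heads, but the core idea is the same: compare every token with every other token, then turn the scores into weights.

```python
import numpy as np

# A stripped-down sketch of attention for the phrase "change my booking".
# One vector per token (3 tokens, 4 numbers each); the values are invented.
x = np.array([
    [0.9, 0.1, 0.0, 0.2],   # change
    [0.1, 0.8, 0.1, 0.0],   # my
    [0.8, 0.2, 0.1, 0.3],   # booking
])

# Compare every token with every other token (dot products),
# then turn the scores into weights that sum to 1 (softmax).
scores = x @ x.T / np.sqrt(x.shape[1])
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)

# Each row shows how much one token "attends" to the others. With these
# made-up numbers, 'change' and 'booking' weight each other more heavily
# than they weight the filler word 'my'.
print(np.round(weights, 2))

# The output for each token is a weighted mix of all the token vectors.
output = weights @ x
```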
By the time the model is ready to respond, it is not guessing blindly. It is calculating probabilities. Based on everything it has seen so far, it chooses the most likely next token, then the next, and so on, until it forms a complete response that reads naturally to a human.
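As a rough sketch of that final step, here is how raw scores can be turned into probabilities and a next token chosen. The candidate words and scores are made up for illustration.

```python
import math

# Invented scores for three candidate next tokens.
scores = {"Sunday": 2.1, "Saturday": 0.3, "banana": -3.0}

# Softmax: convert raw scores into probabilities that sum to 1.
total = sum(math.exp(s) for s in scores.values())
probs = {word: math.exp(s) / total for word, s in scores.items()}
print(probs)   # 'Sunday' comes out far more likely than 'banana'

# Pick the most likely token. Real systems often sample from the
# distribution instead of always taking the top choice, which adds variety.
next_token = max(probs, key=probs.get)
print(next_token)   # 'Sunday'
```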
It is important to demystify what is happening here. There is no hidden 'thinking chip' or artificial brain. Everything the Transformer does is mathematics: matrix multiplication, probability distributions, and statistical optimisation.
This is why GPUs are commonly used. They are good at performing many calculations at once. The model is not thinking in the human sense. It is calculating at enormous speed. Understanding this helps remove unnecessary fear. Transformers are powerful, but they are still tools. They do exactly what the software design allows them to do, nothing more.
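To show what 'calculating at enormous speed' looks like in practice, here is an illustrative snippet. The sizes and numbers are arbitrary; the point is that a single matrix multiplication transforms every token at once, and GPUs excel at exactly this kind of operation.

```python
import numpy as np

# The workhorse operation inside a transformer is matrix multiplication.
# Here, 3 token vectors (rows) are transformed by one weight matrix in a
# single operation; the sizes and random numbers are purely illustrative.
tokens = np.random.rand(3, 4)      # 3 tokens, 4 numbers each
weights = np.random.rand(4, 4)     # one learned transformation (made up)

transformed = tokens @ weights     # every token is processed at once

# A real model repeats this across thousands of dimensions, dozens of
# layers and many attention heads, which is why chips that can do huge
# numbers of multiplications in parallel are used.
print(transformed.shape)           # (3, 4)
```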
Before Transformers, most chatbots fell into one of two categories. The first were rule-based systems. These relied on scripts. If a customer typed a specific phrase like “cancel booking”, the system returned a predefined response. Anything slightly unexpected caused the conversation to break down.
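A rule-based bot can be sketched in a few lines. The rules and replies below are invented, but the pattern is typical: exact phrases in, canned responses out, and a dead end for everything else.

```python
# A minimal sketch of an old rule-based chatbot with made-up rules.
rules = {
    "cancel booking": "Your booking has been cancelled.",
    "opening hours": "We are open 9am to 5pm, Monday to Friday.",
}

def reply(message: str) -> str:
    for phrase, response in rules.items():
        if phrase in message.lower():
            return response
    # Anything the script did not anticipate falls through to a dead end.
    return "Sorry, I didn't understand that."

print(reply("I'd like to cancel booking ABC123"))   # matches the script
print(reply("Can I move my booking to Sunday?"))    # breaks down
```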
The second used older AI models that processed language sequentially. These systems could remember some context, but they struggled with longer messages or multi-step requests. Important details were often lost. This is why older chatbots felt rigid and frustrating. They worked for simple questions, but failed as soon as a request became nuanced. Transformers changed this because they can:
- read the whole message at once, rather than word by word
- keep track of context across longer, multi-step requests
- cope with the many different ways people phrase the same thing
That is why modern AI assistants feel more fluid and human-like, even though they are still operating purely through mathematics.
Understanding that a Transformer is software, not a machine, helps cut through the hype. They are not replacing people. They are tools that apply rules and language understanding at scale. Because it is software, it is flexible. Transformers can run in the cloud, inside apps, or alongside tools you already use. They improve quickly because once the design existed, progress depended mainly on data and computing power.
Transformers are the reason AI feels different today. But they are not mysterious and they are not out of reach. They are a clever piece of engineering that businesses of all sizes can benefit from. Whether you are a global tech company or a small business using Breezy to manage bookings, you are tapping into the same underlying idea. The value comes not from owning the technology, but from using it thoughtfully to reduce friction, improve consistency, and serve customers better.

Breezy is used by businesses across the UK, Europe and America. Our mission is to ensure that all businesses, regardless of size, can take advantage of the AI revolution.
Free trial