Breezy blog

The discovery of the Transformer

May 2025

In 2017, a group of researchers at Google published a paper titled “Attention Is All You Need”. It laid the groundwork for a revolution in computing and the AI-driven world we see around us today.

In the world of AI, progress does not always arrive with noise. Sometimes a single technical idea quietly changes everything. That is what happened in 2017, when a group of researchers at Google published a paper with an unassuming title: “Attention Is All You Need”.

The paper did not describe a product. It did not announce a new feature. But inside it was the blueprint for the Transformer, the architecture that later made tools like ChatGPT, GPT-4, Claude and Gemini possible. Nearly every AI system reshaping how businesses operate today traces back to that moment.

What is a Transformer?

To understand why this mattered, it helps to think about how humans read. Imagine a customer writes:

“Hi, I want to change my booking from Saturday to Sunday. But only if there’s space.”

You do not process this word by word in isolation. You understand that “change” relates to “booking”, that “Saturday” and “Sunday” are connected, and that “only if” changes the meaning of the entire request. You hold the whole sentence in your head at once.

Older AI models did not work this way. They processed text like a conveyor belt, one word at a time, slowly passing information forward and often losing important context along the way. Transformers work differently. They look at the entire message at once and identify relationships between words, even when they are far apart. This gives them a much richer understanding of meaning.

The role of attention

The key idea behind Transformers is something called attention. Despite the name, it is a very practical concept. Attention simply means the model decides which parts of the input matter most. Not every word carries equal weight.

In the booking example above, “change”, “Sunday” and “only if there’s space” matter far more than “Hi”.

A useful analogy is customer service triage. When a request comes in, not every detail is urgent. Attention allows the model to focus on what actually drives the decision. This ability to prioritise meaning is what makes Transformer-based models feel more natural and more useful than earlier systems.
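To make this concrete, here is a toy sketch of the weighting step at the heart of attention. The relevance scores below are invented for illustration; a real model learns them from data. The only real mechanics shown are how raw scores are turned into attention weights (a softmax), so that the important words get most of the model’s focus:

```python
import math

def attention_weights(scores):
    """Turn raw relevance scores into weights that sum to 1 (a softmax)."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical relevance scores for words in the booking request.
# A higher score means the model judges that word more important.
words = ["Hi", "change", "booking", "Saturday", "Sunday", "only if"]
scores = [0.1, 2.0, 1.5, 1.0, 1.2, 1.8]

weights = attention_weights(scores)
for word, weight in zip(words, weights):
    print(f"{word:10s} {weight:.2f}")
```

Run this and you will see “change” and “only if” receive far more weight than “Hi”, which is exactly the prioritisation described above, just in miniature.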

Why was it a breakthrough?

Transformers changed AI for three important reasons.

  • They process language in parallel rather than step by step, which made model training dramatically faster
  • That efficiency allowed Transformers to grow into models with billions (now trillions) of parameters
  • They hold on to context across long passages, so they produce much more natural and useful answers

Without transformers, tools like ChatGPT, Bard, and Claude simply wouldn’t exist.

Why didn’t Google capitalise on it?

This is where the story becomes interesting from a business perspective. Google invented the Transformer, but it was OpenAI that turned it into a widely used product. This was not because Google failed to understand the technology. It was because Google was deeply tied to an existing business model.

Google’s core revenue came from search advertising. A system that gave direct answers instead of lists of links threatened that model. On top of that, large language models can sometimes be wrong. For a company operating under intense regulatory scrutiny, moving slowly was the rational choice.

OpenAI faced a different set of incentives. As a smaller player, it could take risks, release early and iterate in public. That speed allowed it to define a new interaction model before incumbents fully committed. This difference in incentives mattered more than technical capability.

Takeaway

The lesson here is not about Google versus OpenAI. It is about how change actually reaches the market. Breakthrough ideas often start in research labs. They then become tools that reshape everyday work far sooner than people expect. Large organisations tend to move carefully when a new idea threatens existing revenue. Smaller businesses have the opposite advantage: flexibility.

What felt “too technical” in 2017 became mainstream in just a few years. The same pattern is playing out again with AI agents, automated bookings and AI-driven customer service. For a small business, the opportunity is not to invent the next breakthrough. It is to recognise when one becomes practical and adopt it early.

Google discovered the Transformer but could not immediately act on it. You do not have that constraint. If a tool improves response time, reduces errors or removes friction from bookings, you can use it now. That willingness to experiment early is how smaller businesses stay competitive while larger players deliberate. If you are interested in a more in-depth breakdown of how Transformers actually work, we recommend this blog post by GeeksforGeeks.
