
The Little-Known Chip Powering the Very Future of AI

As more AI shifts from the training phase to the inference phase, TPUs are primed to steal the spotlight

Editor’s note: “The Little-Known Chip Powering the Very Future of AI” was previously published in June 2025 with the title, “How Google’s TPU Is Powering the Very Future of AI.” It has since been updated to include the most relevant information available.

For nearly three years now, Nvidia (NVDA) has seemingly had AI hardware locked up tighter than Fort Knox. 

The company’s GPUs (Graphics Processing Units) have been used to train every headline-grabbing AI model, from ChatGPT to Gemini. And as a result, Nvidia stock has gone supernova, up nearly 950% since November ‘22. 

For a while, Nvidia looked to hold the keys to the entire AI kingdom.

But what if I told you there’s another chip quietly emerging from the shadows – one that OpenAI, one of the world’s most important AI companies, is starting to use instead?

It’s one that most investors haven’t yet heard of, and that Nvidia doesn’t specialize in…

A chip that, thanks to a seismic shift from Nvidia-dominated GPU training to inference and deployment at scale, could soon become the hottest chip on the market…

A Tensor Processing Unit, or TPU.

A TPU is a custom-built chip that Google designed specifically for running AI models. 

Unlike Nvidia’s GPUs, which were originally built for rendering video game graphics and later repurposed for AI, TPUs were born with one job: execute computations at blistering speed and maximum efficiency.

In essence, you can think of GPUs as general-purpose race cars and TPUs as hyper-optimized rockets.

And while GPUs are best at training AI models, TPUs were made to dominate inference: the part where AI actually thinks, reasons, and responds to users in real time.
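To make that distinction concrete, here’s a minimal sketch of what TPU-friendly inference looks like in JAX, Google’s TPU-native framework. The one-layer model, its weights, and the sizes are hypothetical placeholders, not any real workload:

    # Minimal inference sketch in JAX (toy model; all values are hypothetical).
    import jax
    import jax.numpy as jnp

    # Stand-in "trained" weights for a tiny one-layer model.
    weights = jnp.ones((512, 512))

    @jax.jit  # XLA compiles this into one fused program for the chip's matrix units
    def infer(x):
        # Inference boils down to fast, repeated matrix math --
        # exactly the workload TPUs were designed around.
        return jax.nn.relu(x @ weights)

    print(jax.devices())                    # lists TPU cores on a TPU host
    print(infer(jnp.ones((1, 512))).shape)  # (1, 512)

The same code runs unchanged on a CPU or GPU; the point is that XLA compiles the whole computation down to the dense matrix multiplies a TPU is built to execute.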

That’s where AI models are going – and why TPUs could start to steal the show in a big way. 

TPUs Matter More Than Ever in the ‘Age of Inference’

In some ways, creating genius-level AI is a lot like raising a child.

You teach them everything they need to know, with books, flashcards, lectures, and thousands of examples. It takes time. It’s expensive. And for machines, it’s computationally brutal.

That’s training. Inference is what comes next: once schooled, the model can actually answer questions, solve problems, write custom pieces, or create unique visual art.

Historically, the AI race has been all about training – and Nvidia’s GPUs became the workhorses of that race.

But once the model is trained, inference is forever. The AI runs billions of times to serve billions of people. 

That’s where the money – and the demand for computational power – starts compounding.
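A rough back-of-the-envelope sketch shows how that compounding works. Every number below is a hypothetical assumption, chosen only to illustrate the shape of the economics, not to describe any real model:

    # Hypothetical: one-time training cost vs. compounding inference cost.
    training_cost = 100_000_000      # one-time training bill, dollars (assumed)
    cost_per_query = 0.002           # dollars per inference call (assumed)
    queries_per_day = 500_000_000    # daily queries at scale (assumed)

    for years in (1, 2, 3):
        inference_spend = cost_per_query * queries_per_day * 365 * years
        print(f"Year {years}: cumulative inference spend = ${inference_spend:,.0f}")

    # Even with these made-up inputs, inference spend passes the one-time
    # training bill within the first year and keeps growing every year after.

Training is paid once; inference is paid on every query, forever. That’s the demand curve TPUs are aimed at.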

DeepSeek’s Inference Shift: A Breakthrough Moment

Remember the DeepSeek saga that unfolded earlier this year?

Back before the trade war started (which seems like forever ago, I know), a Chinese AI lab named DeepSeek dropped a bomb on the industry.

It had trained a GPT-4-class model for just $6 million… a fraction of the $100 million-plus price tags seen in Western AI labs. But that wasn’t the headline. The real innovation was architectural: DeepSeek built its model to do less thinking upfront and more thinking on the fly, at inference time.

In other words, instead of baking every answer into a gigantic model during training, it designed this system to reason dynamically in real time. That changed everything.

Suddenly, inference wasn’t just about reading a playbook. It became the playmaker.

And in that world, you want chips that are lean, fast, and optimized for inference. You want TPUs.
