
AI Inferencing Is the Future. Are You Holding the Right Stocks?

The AI industry is undergoing a transformation right now – one that could define the stock market’s winners and losers for the rest of the year and beyond.

That is, the AI model-making process is undergoing a seismic shift from training to inferencing.

In short, it’s pivoting away from the gluttonous obsession with training giant omniscient models. Now it’s rushing headlong into a new era, where inferencing – when models make decisions based on new, unseen data – takes center stage.

This may sound nuanced and technical in nature – and to some degree, it is. (Don’t worry: we’ll get into these details more below.)

Nonetheless, it will transform how hundreds of billions of dollars are spent on new AI infrastructure – this year alone.

It means the bulk of that money will stop flowing into the hardware that powers AI training… and start flowing toward what accelerates model inferencing.

🚨 Translation for investors: This is the moment to pay attention, because a new class of AI stock winners is about to emerge…

AI Training Fueled the Boom’s First Phase, But That Era Is Ending

Let’s start with the basics here.

Essentially, AI training is exactly what it sounds like. It’s the process of teaching a model what to know. Think of it like putting a bunch of professionals in a room with every book ever written and having them memorize it all, in case someone, somewhere asks them a question about it someday.

And until this point in the AI boom, training has been the star of the show. Companies like OpenAI, Alphabet (GOOGL), and Meta (META) have spent hundreds of millions – sometimes billions – training large language models. 

Nvidia (NVDA) has made a killing selling its monster GPUs (A100, H100, Blackwell, etc.) to power this training frenzy, with nearly 90% of revenue now tied to the data-center segment and year-over-year growth often exceeding 90%.

And the company isn’t shy about what’s driving the boom – or where it’s headed next.

As CEO Jensen Huang touted recently:

“The world’s most powerful data centers are being built right now, and they’re being built for training AI models that will change everything.”

No argument there, Jensen. Training did change everything.

Yet, that was just the first phase of this new era…

AI Inferencing Takes the Lead: Faster, Smarter, Cheaper

After all that extensive training, these models put that knowledge into practice. 

Every time you ask ChatGPT a question… an autonomous Tesla makes a driving decision… or Meta’s AI filters your social media feed, that’s inferencing.

And thanks to breakthroughs like DeepSeek’s earlier this year, inferencing is stealing the spotlight. Fine-tuned, lightweight models can outperform larger general-purpose ones in practical applications – especially in responsiveness, context handling, and personalization – drastically reducing the cost of deploying useful AI.

Researchers are learning that you don’t need to create an AI model that knows everything – you just need an AI model that knows how to infer things to get to the right answer in real time.

As the old saying goes: you want to teach someone how to think, not what to think.

That’s the shift going on in model-making right now. 

AI doesn’t have to know all the answers – it just needs the skills to find them. 
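
If you’re curious what that shift looks like under the hood, here’s a minimal sketch in Python using PyTorch – the tiny model and random data are purely hypothetical stand-ins. Training repeatedly updates a model’s weights at great computational cost; inferencing is a single, cheap pass through the frozen model.

    import torch
    import torch.nn as nn

    # A tiny hypothetical model -- a stand-in for a large language model.
    model = nn.Linear(10, 1)

    # --- Training: expensive, repeated, updates the model's weights ---
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()
    for step in range(1000):                  # real models run millions of steps
        x = torch.randn(32, 10)               # a batch of (made-up) training data
        y = torch.randn(32, 1)
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        loss.backward()                       # computing gradients is the costly part
        optimizer.step()                      # nudge the weights

    # --- Inferencing: cheap, one pass, weights frozen ---
    model.eval()
    with torch.no_grad():                     # no gradients needed to answer a query
        answer = model(torch.randn(1, 10))    # one new, unseen input

The asymmetry is the whole story: training burns compute for months so that inferencing can answer each query in milliseconds.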

Indeed, as Nvidia’s Huang said:

“The future of AI isn’t just about building models. It’s about how fast, how cheap, and how smartly you can serve them at scale.”

OpenAI’s Sam Altman made a similar remark during a spring tech event:

“We’ve entered the phase where inference efficiency is the battleground. That’s where AI becomes universal.”

For investors, that means the big bucks are now shifting from model-building to model-serving.

Why AI Inferencing Will Reshape the Infrastructure Investment Landscape

This is where it gets interesting (and profitable). 

The difference between training and inferencing isn’t just technical. It activates totally different parts of the AI supply chain.

The training process requires super-powerful GPUs (like Nvidia’s H100 or Blackwell B200) and massive data centers with top-of-the-line cooling, power, and networking. Robust memory is also a must, so demand for high-bandwidth memory is off the charts. Suppliers like Micron (MU) and Samsung are battling to meet it amid ongoing shortages.

But inferencing uses different, more specialized chips designed for speed and efficiency (like Nvidia’s L4, AWS’s Inferentia, Qualcomm’s edge-AI silicon, and AMD’s MI300). It’s also typically deployed in lighter clusters with more edge devices (phones, cars, and cameras running AI locally) and is less memory-intensive. (In other words, it doesn’t have to juggle giant datasets – just the trained model and your input.) When it comes to inferencing, engineers prioritize low cost per query and low latency.
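
To see why “cost per query” is the metric that matters, here’s a back-of-the-envelope sketch in Python – every figure is hypothetical, for illustration only:

    # Back-of-the-envelope inference economics -- all numbers are hypothetical.
    gpu_cost_per_hour = 4.00        # assumed $/hour to run one serving GPU
    queries_per_second = 50         # assumed throughput of the deployed model

    queries_per_hour = queries_per_second * 3600
    cost_per_query = gpu_cost_per_hour / queries_per_hour
    print(f"${cost_per_query:.6f} per query")   # -> $0.000022 per query

Double the throughput or halve the hardware cost and the cost per query falls in kind – that’s exactly the race inference chipmakers are running.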

The economic implications here? Industry leaders will keep pouring money into AI infrastructure. But increasingly, that cash will flow to different parts of the supply chain as more models shift from training to large-scale inferencing.

Here’s where we see the action heading…
