Nvidia CEO Jensen Huang speaks during a press conference at The MGM during CES 2018 in Las Vegas on January 7, 2018.

Mandel Ngan | AFP | Getty Images

Software that can write paragraphs of text or draw pictures that look like a human created them has started a gold rush in the tech industry.

Companies like Microsoft and Google are scrambling to integrate cutting-edge AI into their search engines, as billion-dollar competitors like OpenAI and Stability AI race ahead and release their software to the public.

Powering many of these applications is a roughly $10,000 chip that has become one of the most critical tools in the artificial intelligence industry: the Nvidia A100.

The A100 has become the “workhorse” for artificial intelligence professionals at the moment, says Nathan Benaich, an investor who publishes a newsletter and report covering the AI industry, including a partial list of supercomputers using A100s. Nvidia takes 95% of the market for GPUs that can be used for machine learning, according to New Street Research.

The A100 is ideally suited for the kind of machine learning models that power tools like ChatGPT, Bing AI, or Stable Diffusion. It can perform many simple calculations simultaneously, which is important for training and using neural network models.

The technology behind the A100 was originally used to render sophisticated 3D graphics in games. It’s often called a graphics processing unit, or GPU, but these days Nvidia’s A100 is configured and targeted at machine learning tasks and runs in data centers, not inside glowing gaming PCs.

Large companies or startups that work with software like chatbots and image generators require hundreds or thousands of Nvidia chips and either buy them on their own or secure access to the computers from a cloud provider.

Hundreds of GPUs are required to train artificial intelligence models, like large language models. The chips need to be powerful enough to crunch terabytes of data quickly to recognize patterns. After that, GPUs like the A100 are also needed for “inference,” or using the model to generate text, make predictions, or identify objects inside photos.

This means AI companies need access to many A100s. Some entrepreneurs in the space even see the number of A100s they have access to as a sign of progress.

“A year ago we had 32 A100s,” Stability AI CEO Emad Mostaque wrote on Twitter in January. “Dream big and stack moar GPUs kid. Brrr.” Stability AI is the company that helped develop Stable Diffusion, an image generator that attracted attention last fall, and reportedly has a valuation of over $1 billion.

Now, Stability AI has access to over 5,400 A100 GPUs, according to one estimate from the State of AI report, which maps and tracks which companies and universities have the largest collections of A100 GPUs — though it doesn’t include cloud providers, which don’t publish their numbers publicly.

Nvidia gets on the AI bandwagon

Nvidia stands to benefit from the AI hype cycle. In Wednesday’s fiscal fourth-quarter earnings report, overall sales fell 21%, but investors still drove the stock up about 14% on Thursday, mainly because the company’s AI chip business — reported as data center — rose 11% to more than $3.6 billion in sales during the quarter, showing continued growth.

Nvidia stock is up 65% year-to-date in 2023, outperforming the S&P 500 and other semiconductor stocks.

Nvidia CEO Jensen Huang couldn’t stop talking about AI during a call with analysts on Wednesday, suggesting that the recent boom in artificial intelligence is at the center of the company’s strategy.

“The activity around the AI infrastructure that we built, and the activity around inferencing using Hopper and Ampere to inference large language models, has just gone through the roof in the last 60 days,” Huang said. “There’s no question that whatever we’re looking at this year going into the year has changed pretty dramatically as a result of the last 60, 90 days.”

Ampere is Nvidia’s codename for the A100 generation of chips. Hopper is the code name for the new generation, including the H100, which recently started shipping.

More computers are needed

Nvidia A100 processor


Compared with other kinds of software, like serving a web page, which uses processing power occasionally in bursts of microseconds, machine learning tasks can take up the whole computer’s processing power, sometimes for hours or days.

This means that companies that find themselves with a hit AI product often need to acquire more GPUs to handle peak traffic or improve their models.

These GPUs don’t come cheap. In addition to a single A100 on a card that can be inserted into an existing server, many data centers use a system that includes eight A100 GPUs working together.

This system, Nvidia’s DGX A100, has an MSRP of nearly $200,000, though it comes with the chips needed. On Wednesday, Nvidia said it would sell cloud access to DGX systems directly, which is likely to reduce the cost of entry for tinkerers and researchers.

It’s easy to see how the cost of the A100 can add up.

For example, an estimate by New Street Research found that the OpenAI-based ChatGPT model in Bing search could require 8 GPUs to deliver an answer to a query in less than a second.

At that rate, Microsoft would need over 20,000 8-GPU servers just to deploy the Bing model to everyone, suggesting Microsoft’s feature could cost $4 billion in infrastructure spending.

“If you’re at Microsoft, and you want to scale that, at the scale of Bing, it’s maybe $4 billion. If you want to scale at the scale of Google, which serves 8 or 9 billion queries every day, you actually need to spend $80 billion on DGXs,” said Antoine Chkaiban, a technology analyst at New Street Research. “The numbers we came up with are huge. But they’re simply a reflection of the fact that every single user using such a large language model requires a huge supercomputer while using it.”
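The back-of-the-envelope math behind those figures can be reproduced from the numbers in this article. This is only a sketch: it assumes each eight-GPU server costs roughly the ~$200,000 DGX A100 list price, while the analyst’s actual model likely includes networking, power, and operating costs.

```python
# Rough sanity check of the deployment estimates quoted above.
# Assumption: each 8-GPU server costs about as much as a DGX A100.

DGX_PRICE = 200_000  # approximate list price of one 8-GPU DGX A100 system

# Bing scale: more than 20,000 eight-GPU servers
bing_servers = 20_000
bing_cost = bing_servers * DGX_PRICE
print(f"Bing-scale estimate: ${bing_cost / 1e9:.0f} billion")  # $4 billion

# Google scale: the quoted $80 billion implies roughly 20x as many systems
google_systems = 80e9 / DGX_PRICE
print(f"Google-scale estimate: {google_systems:,.0f} DGX systems")  # 400,000
```

The arithmetic lines up with the article’s figures: 20,000 servers at $200,000 apiece is $4 billion, and $80 billion buys about 400,000 such systems.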

The latest version of Stable Diffusion, the image generator, was trained on 256 A100 GPUs, or 32 machines with 8 A100s each, according to information posted online by Stability AI, totaling 200,000 compute hours.

At the market price, training the model alone cost $600,000, Stability AI CEO Mostaque said on Twitter, suggesting in a tweet exchange that the price was unusually cheap compared with rivals. That doesn’t count the cost of “inference,” or deploying the model.
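Those published figures imply a couple of rates worth spelling out. A quick derivation, keeping in mind that the $600,000 is Mostaque’s rounded number, so the results are approximate:

```python
# Implied rates from Stability AI's published Stable Diffusion training
# figures quoted above: 256 A100s, 200,000 GPU-hours, roughly $600,000.

gpus = 256
gpu_hours = 200_000
training_cost = 600_000

# Implied market rate per A100-hour
rate = training_cost / gpu_hours
print(f"${rate:.2f} per A100-hour")  # $3.00

# Implied wall-clock time if all 256 GPUs ran continuously
days = gpu_hours / gpus / 24
print(f"about {days:.0f} days of training")  # about 33 days
```

In other words, the figures are consistent with a cloud rate of roughly $3 per A100-hour and about a month of continuous training on the 32-machine cluster.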

Huang, Nvidia’s CEO, said in an interview with CNBC’s Katie Tarasov that the company’s products are actually cheap for the amount of computation these types of models need.

“We took what would otherwise be a $1 billion data center running CPUs, and we shrunk it down to a $100 million data center,” Huang said. “Now, $100 million, when you put it in the cloud and it’s shared by 100 companies, is almost nothing.”

Huang said Nvidia’s GPUs allow startups to train models at a much lower cost than if they used a traditional computer processor.

“Now you can build something like a large language model, like a GPT, for something like $10, $20 million,” Huang said. “It’s really, really affordable.”

New competition

Nvidia isn’t the only company making GPUs for artificial intelligence. AMD and Intel have competing graphics processors, and big cloud companies like Google and Amazon are developing and deploying their own chips specially designed for AI workloads.

Still, “AI hardware remains heavily consolidated to NVIDIA,” according to the State of AI compute report. As of December, more than 21,000 open-source AI papers said they used Nvidia chips.

Most researchers included in the State of AI Compute Index used the V100, Nvidia’s chip that came out in 2017, but the A100 grew fast in 2022 to become the third-most-used Nvidia chip, just behind a consumer graphics chip that costs $1,500 or less and was originally intended for gaming.

The A100 also has the distinction of being one of only a few chips to have export controls placed on it due to national defense reasons. Last fall, Nvidia said in an SEC filing that the U.S. government imposed a licensing requirement that prevents exports of the A100 and H100 to China, Hong Kong and Russia.

“The USG indicated that the new license requirement will address the risk that the covered products could be used in, or diverted to, a ‘military end use’ or ‘military end user’ in China and Russia,” Nvidia said in its filing. Nvidia previously said it adapted some of its chips for the Chinese market to comply with U.S. export restrictions.

The toughest competition for the A100 may be its successor. The A100 was first introduced in 2020, an eternity ago in chip cycles. The H100, which was introduced in 2022, is starting to be produced in volume — indeed, Nvidia recorded more revenue from H100 chips in the quarter ended in January than the A100, it said Wednesday, even though the H100 is more expensive per unit.

The H100, Nvidia says, is the first of its data center GPUs to be optimized for transformers, an increasingly important technology that many of the latest and greatest AI applications use. Nvidia said Wednesday it wants to make AI training over 1 million percent faster. That could mean AI companies wouldn’t need as many Nvidia chips in the end.