Fri. Dec 2nd, 2022

With Moore’s Law at an end, traditional approaches to meeting the insatiable demand for increased computing performance will require disproportionate increases in cost and power.

At the same time, slowing the effects of climate change will require more efficient data centers, which already consume more than 200 terawatt-hours of energy annually, or about 2% of global energy consumption.

Published today, the new Green500 list of the world’s most efficient supercomputers showcases the energy efficiency of accelerated computing, which is already used in all of the top 30 systems on the list. Its impact on energy efficiency is staggering.

We estimate that TOP500 systems require more than 5 terawatt-hours of energy per year to operate, or $750 million worth of energy.

But that amount could be reduced by more than 80% to just $150 million, saving 4 terawatt-hours of energy, if these systems were as efficient as the 30 greenest systems on the TOP500 list.

Conversely, with the same power budget as today’s TOP500 systems and the efficiency of the top 30 systems, these supercomputers could deliver five times today’s performance.
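The arithmetic behind these estimates can be checked with a short sketch. It uses only the rounded figures quoted above (5 terawatt-hours, $750 million, an 80% reduction); the function names are illustrative, and the assumption that cost scales linearly with energy at a single average price is ours, not a stated part of the estimate.

```python
# Figures quoted in the article (rounded, "more than" lower bounds).
TOP500_ENERGY_TWH = 5.0   # estimated annual energy use
TOP500_COST_USD = 750e6   # estimated annual energy cost
REDUCTION = 0.80          # reduction at top-30 Green500 efficiency

KWH_PER_TWH = 1e9


def implied_price_per_kwh(energy_twh: float, cost_usd: float) -> float:
    """Average electricity price implied by an energy total and its cost."""
    return cost_usd / (energy_twh * KWH_PER_TWH)


def efficiency_scenario(energy_twh: float, cost_usd: float, reduction: float) -> dict:
    """Same work at lower energy, or more work at the same energy.

    Assumes cost is proportional to energy (one flat price), which is a
    simplification made for this sketch.
    """
    remaining = 1.0 - reduction
    return {
        "energy_saved_twh": energy_twh * reduction,
        "remaining_cost_usd": cost_usd * remaining,
        "performance_at_same_power": 1.0 / remaining,
    }


if __name__ == "__main__":
    price = implied_price_per_kwh(TOP500_ENERGY_TWH, TOP500_COST_USD)
    print(f"Implied price: ${price:.2f}/kWh")
    scenario = efficiency_scenario(TOP500_ENERGY_TWH, TOP500_COST_USD, REDUCTION)
    for key, value in scenario.items():
        print(f"{key}: {value:,.2f}")
```

With the quoted figures, this works out to about $0.15 per kilowatt-hour, 4 terawatt-hours saved, $150 million in remaining cost, and a fivefold performance multiplier at the same power, which is consistent with the numbers above.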

And the efficiency improvements seen with the latest Green500 systems are just the beginning. NVIDIA is committed to continually improving the energy efficiency of its processors, GPUs, software, and systems.

Hopper’s Green500 Debut

NVIDIA technologies are already used in 23 of the top 30 systems in the latest Green500 list.

Among the highlights: The Flatiron Institute in New York topped the Green500 list of the most efficient supercomputers with an air-cooled Lenovo ThinkSystem featuring NVIDIA Hopper H100 GPUs.

The supercomputer, called Henri, delivers 65 billion double-precision floating-point operations per second per watt (65 gigaflops per watt), according to Green500, and will be used to solve problems in computational astrophysics, biology, mathematics, neuroscience and quantum physics.

Based on the NVIDIA Hopper GPU architecture, the NVIDIA H100 Tensor Core GPU delivers up to 6x more AI performance and up to 3x more HPC performance than the previous-generation A100 GPU, and it is designed to run with incredible efficiency. Its second-generation Multi-Instance GPU technology allows the GPU to be partitioned into smaller compute units, greatly increasing the number of GPU clients available to data center users.

And the show floor at this year’s SC22 features new systems with the latest NVIDIA technologies from ASUS, Atos, Dell Technologies, GIGABYTE, Hewlett Packard Enterprise, Lenovo, QCT and Supermicro.

The fastest new computer on the TOP500 list, Leonardo, hosted and managed by the nonprofit consortium Cineca and equipped with almost 14,000 NVIDIA A100 GPUs, ranked fourth on that list and 13th among the most energy-efficient systems.

The latest TOP500 list features more NVIDIA technologies than any previous edition.

In total, NVIDIA technologies are used in 361 TOP500 systems, including 90% of new systems (see chart).

Next Generation Accelerated Data Center

NVIDIA is also developing new compute architectures to deliver even greater power efficiency and performance for the accelerated data center.

The Grace CPU and Grace Hopper Superchips, announced earlier this year, will deliver the next major power-efficiency boost for NVIDIA’s accelerated computing platform. The Grace CPU Superchip delivers twice the performance per watt of a traditional CPU, thanks to the incredible efficiency of the Grace CPU and low-power LPDDR5X memory.

Assuming a 1MW HPC data center with 20% of the power dedicated to the CPU partition and 80% to the accelerated partition using Grace and Grace Hopper, data centers can do 1.8x more work for the same power budget compared to a similarly partitioned x86-based data center.

DPUs Provide Additional Efficiency Gains

Along with Grace and Grace Hopper, NVIDIA networking technologies are accelerating the development of cloud supercomputing, just as increased use of simulations is driving demand for supercomputing services.

The NVIDIA Quantum-2 InfiniBand platform, powered by NVIDIA BlueField-3 DPUs, delivers the exceptional performance, wide availability, and robust security that cloud providers and supercomputing centers demand.

An effort described in a recent white paper demonstrated how DPUs can be used to offload and accelerate networking, security, storage and other infrastructure functions and control-plane applications, reducing server power consumption by up to 30%.

The energy savings grow as the load on the server increases, and can easily reach $5 million in energy costs for a large data center with 10,000 servers over the three-year life of the servers. Additional savings come from cooling, power delivery, rack space and server capital expenditures.
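To put the $5 million figure in per-server terms, here is a minimal sketch. The total, server count and lifetime come from the paragraph above. The electricity price of $0.15 per kilowatt-hour is an assumption, taken from the price implied by the article’s own TOP500 estimate ($750 million for 5 terawatt-hours), and the function names are illustrative only.

```python
HOURS_PER_YEAR = 8760  # 365 days; leap years ignored


def per_server_savings(total_savings_usd: float, servers: int, years: float) -> dict:
    """Split a fleet-wide energy saving into per-server lifetime and annual amounts."""
    lifetime = total_savings_usd / servers
    return {"lifetime_usd": lifetime, "annual_usd": lifetime / years}


def implied_average_power_saved_watts(annual_usd: float, price_per_kwh: float) -> float:
    """Average continuous power reduction that would produce the annual saving.

    price_per_kwh is an assumed flat electricity price, not a figure
    from the white paper.
    """
    kwh_per_year = annual_usd / price_per_kwh
    return kwh_per_year / HOURS_PER_YEAR * 1000.0


if __name__ == "__main__":
    savings = per_server_savings(5e6, servers=10_000, years=3)
    watts = implied_average_power_saved_watts(savings["annual_usd"], price_per_kwh=0.15)
    print(f"Per server over lifetime: ${savings['lifetime_usd']:.2f}")
    print(f"Per server per year:      ${savings['annual_usd']:.2f}")
    print(f"Implied average power saved: {watts:.0f} W (at assumed $0.15/kWh)")
```

Under these assumptions, the saving is $500 per server over three years, or about $167 per year, which corresponds to roughly 127 watts of average power avoided per server. A different electricity price would change the implied wattage proportionally, and the estimate excludes the cooling and power-delivery savings mentioned above.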

Accelerated computing with DPUs for networking, security, and storage is one of the next important steps to improve data center energy efficiency.

More With Less

Breakthroughs like these are arriving as the scientific method rapidly shifts toward an approach based on data analysis, artificial intelligence and physics-based modeling, making more efficient computers the key to the next generation of scientific discoveries.

By providing researchers with a multidisciplinary HPC platform optimized for this new approach and capable of delivering both performance and efficiency, NVIDIA empowers scientists to make important discoveries that will benefit us all.

Additional Resources