Keeping Your Cool In A Red-Hot AI Data Center Space

Virtual Interview


Back in the early days of commercial computing, when most of us were either just born or not yet born, the circuits comprising the main frame of the central processing units in IBM System/360 computers ran so hot that these machines had to be liquid cooled.

And perhaps the intervening six decades of data center computing, when most machines could be cooled with moving air, were the anomaly, and liquid cooling is returning to normal as compute densities rise.

This increase in compute density, and therefore thermal density, is not an accident, but is the direct result of tightly coupling latency-sensitive compute, memory, and networking into much more capacious systems. These days, a rack, a row, or a hall filled with hybrid CPU-GPU systems within a data center comprises what is essentially a single system, with one application spanning all of that machinery. The temperature in the data center is rising faster than its capacity is, an inevitable consequence of the slowing of Moore's Law gains in transistor speed and density with each new manufacturing process. That slowdown is a relatively recent phenomenon; those gains held for decades and were one of the enabling factors of air-cooled systems in the first place.

But when it comes to AI and HPC systems, those days are quickly coming to an end. And that means that systems and the data centers that house them need to be re-engineered for higher thermal densities.

To get a sense of what is happening in today's high-performance data centers, we sat down with Scott Mills, Senior Vice President of Product and Engineering at Digital Realty, and Scott Tease, General Manager of HPC and AI at Lenovo, both of whom are embracing the challenges of building higher performance, power-hungry systems and keeping them powered and cooled.

Digital Realty operates over 300 data centers across more than two dozen countries and more than 50 metropolitan areas for over 5,000 customers. And Lenovo, of course, is a fast-growing original equipment manufacturer of systems worldwide and also a major supplier of on-premises HPC/AI systems.

The challenges that IT managers face as they adopt AI technologies are complex, but Digital Realty and Lenovo have the experience to support enterprises through their digital transformation journeys. In this chat, we talk about the need to support 100 kilowatt (kW) racks, and possibly even denser iron in the future, as well as the engineering issues and the benefits of moving back to liquid cooling in the data center for such systems. The trick is that there is no one answer that suits all customers; the good news is that there are many different ways to co-design the systems and the data center cooling to provide the best environment in terms of efficiency and cost.

Host: Timothy Prickett Morgan

Speakers:

  • Scott Tease, General Manager of HPC and AI at Lenovo
  • Scott Mills, Senior Vice President of Product and Engineering at Digital Realty