Immersion cooling of servers is always fun to see, and it has evolved in the 20 years or so since I first saw it done with special 3M liquids costing $300 per gallon. In 2019, every enterprise trade show has a few servers on display that use this kind of cooling, despite the different infrastructure it requires in a data center. In order to simplify adoption, TMGcore have developed fully self-contained and physically dense server containers. Not only that, but ‘OTTO’ is supposed to be better for the environment too.

The traditional picture we all have of a data center is racks upon racks of servers, perhaps arranged into hot aisles and cold aisles for airflow, with a big HVAC system, a ton of noise, and a ton of networking and power cables everywhere. Over the last few years, we have seen more and more efforts to bring the power efficiency of these data centers to more reasonable numbers, and one of those methods is two-phase immersion cooling: rather than using air, you put the whole server or rack into a liquid with a low boiling point, and use the phase transition along with convection as your heat removal system.


A GIGABYTE Demo at CES 2017

Managing the infrastructure needs for two-phase immersion cooling is different to managing a traditional data center. There are the liquids, the heat exchangers, the power, the maintenance, and the fact that not a lot of people are used to having big expensive hardware dipped in what looks like water. This is why an immersion demonstration at a trade show usually draws a crowd: even though it appears year after year, there are plenty of people who haven’t seen it. TMGcore’s answer to most of these issues is to remove the infrastructure and maintenance requirements completely.


60 kW Unit in 16 sq ft

The OTTO is a self-contained, automated, two-phase liquid immersed data center unit. All a data center needs to add is a connection to its power, network, and water lines. The family of products from TMGcore, built with partners, is designed so that once the hardware is installed, it doesn’t need adjusting by the person buying it. Units come in different sizes, and customers can scale their needs simply by ordering more units. Hardware hot-swapping is done by the internal system, either locally or remotely, and energy is reused by the heat exchangers. The typical ‘PUE’ metric that describes the power efficiency of a data center (total facility power divided by the power delivered to the IT hardware) is claimed to be only 1.028, compared to 1.05/1.06 for some of the most efficient air-cooled data centers. This means that for every megawatt delivered to the HPC compute hardware, TMGcore claims that their OTTO systems only draw 1.028 megawatts in total.
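To put the PUE figures above into perspective, here is a minimal Python sketch of the arithmetic. It assumes the standard definition of PUE (total facility power divided by IT power) and a hypothetical 1 MW compute load picked purely for illustration. The function names are made up for this example, and the 1.028 figure is TMGcore's claim rather than a measured result.

```python
def total_facility_kw(it_load_kw: float, pue: float) -> float:
    """Total facility power for a given IT load, using PUE = total / IT."""
    return it_load_kw * pue


def overhead_kw(it_load_kw: float, pue: float) -> float:
    """Power spent on everything that is not IT hardware (cooling, power delivery, ...)."""
    return total_facility_kw(it_load_kw, pue) - it_load_kw


if __name__ == "__main__":
    it_load_kw = 1000.0  # hypothetical 1 MW of compute, chosen for illustration

    # PUE values as quoted in the article; the labels are ours.
    quoted_pue = [
        ("OTTO (TMGcore claim)", 1.028),
        ("efficient air-cooled, low end", 1.05),
        ("efficient air-cooled, high end", 1.06),
    ]

    for label, pue in quoted_pue:
        total = total_facility_kw(it_load_kw, pue)
        extra = overhead_kw(it_load_kw, pue)
        print(f"{label}: {total:.0f} kW total, {extra:.0f} kW overhead")
```

On these numbers, the non-compute overhead for a 1 MW load would be roughly 28 kW for OTTO against 50 to 60 kW for the air-cooled examples, so about half.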


Six units with stacking

Another claim from TMGcore is compute density, at up to 3.75 kW per square foot. This means that the three main sizes of OTTO, at 60 kW, 120 kW, and 600 kW, come in self-contained footprints of 16 sq ft, 32 sq ft, and 160 sq ft respectively. The goal here is to provide compute capacity where space is limited. The units can also be stacked where required, or retained in portable containers where facilities exist. Customers with specific requirements can request custom builds.
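The density claim is easy to sanity-check against the quoted unit sizes. The short Python sketch below does only that: the power and footprint pairs are the ones given above, while the variable and function names are invented for illustration.

```python
CLAIMED_DENSITY_KW_PER_SQFT = 3.75  # "up to" figure quoted by TMGcore

# Unit power (kW) -> footprint (sq ft), as listed in the article.
OTTO_UNITS = {60: 16, 120: 32, 600: 160}


def density_kw_per_sqft(power_kw: float, footprint_sqft: float) -> float:
    """Compute power density of a unit in kW per square foot."""
    return power_kw / footprint_sqft


if __name__ == "__main__":
    for power_kw, footprint_sqft in OTTO_UNITS.items():
        density = density_kw_per_sqft(power_kw, footprint_sqft)
        print(f"{power_kw} kW in {footprint_sqft} sq ft: {density} kW/sq ft")
```

All three sizes work out to the same 3.75 kW per square foot, so the larger units appear to scale the footprint linearly rather than gaining extra density.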

Each unit is fitted with TMGcore’s own blade infrastructure, aptly named an ‘OTTOblade’. An example of one blade that the company provides is a dual socket Intel Xeon with dual 100G Ethernet, 512 GB of DRAM, eight SSDs, and 16 V100 GPUs, drawing 6 kW. Ten of these can go into one of the 60 kW units, affording 160 V100 GPUs in 16 sq ft.
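The blade arithmetic above can be written out as a small Python sketch. It assumes the whole 60 kW budget is available to blades, which matches the ten-blade figure quoted but is otherwise our simplification. The `Blade` class and function names are hypothetical and not anything TMGcore publishes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blade:
    """Example OTTOblade configuration, using the figures quoted in the article."""
    power_kw: int
    gpus: int


def blades_per_unit(unit_power_kw: int, blade: Blade) -> int:
    """Whole blades that fit in a unit's power budget (assumes all of it goes to blades)."""
    return unit_power_kw // blade.power_kw


def gpus_per_unit(unit_power_kw: int, blade: Blade) -> int:
    """Total GPUs in a unit filled with identical blades."""
    return blades_per_unit(unit_power_kw, blade) * blade.gpus


if __name__ == "__main__":
    v100_blade = Blade(power_kw=6, gpus=16)
    unit_power_kw, unit_footprint_sqft = 60, 16

    blades = blades_per_unit(unit_power_kw, v100_blade)
    gpus = gpus_per_unit(unit_power_kw, v100_blade)
    print(f"{blades} blades, {gpus} GPUs, {gpus / unit_footprint_sqft} GPUs per sq ft")
```

For the 60 kW unit this gives ten blades and 160 GPUs, or ten V100s per square foot of floor space.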

Obviously one of the key criticisms of self-contained, sealed, automated hardware is that it’s a pain when hardware fails and needs changing. One of the ideas behind two-phase cooling is that the temperature of the hardware can be closely monitored to extend its lifespan. For other out-of-the-box failures, some can be managed by the automated systems, while others will require engineers on site. The idea is that because these units are a lot easier to manage, operational expenses will be significantly reduced regardless.

TMGcore is working with partners for initial deployments, and we’re hoping to see one in action this week at Supercomputing. I have an open offer to visit the R&D facility next time I’m in Dallas.

Source: TMGcore


  • prime2515103 - Tuesday, November 19, 2019

    When is the desktop version coming out?
  • The True Morbus - Tuesday, November 19, 2019

    If only they had gone AMD, they wouldn't need ridiculous cooling like this, and would be faster :P
    The server market turns slooooowly.
  • Santoval - Tuesday, November 19, 2019

    I wonder if they will also provide an AMD Epyc version. If Intel is one of the partners they helped in the design I would assume they will not.
  • Santoval - Tuesday, November 19, 2019

    edit : "...*that* helped in the design..."
  • valinor89 - Tuesday, November 19, 2019

    "energy is reused by the heat exchangers"
    What does this mean? Does it mean they just use the natural movement of the liquid from convection and avoid some pumps, or do they actively harvest energy using some sort of Peltier device or such?
  • saratoga4 - Tuesday, November 19, 2019

    Typo for "removed by the heat exchangers" I think.
  • surt - Tuesday, November 19, 2019

    Could be, but power plants do a lot of work to generate hot gas to drive turbines, so recovering some power doesn't seem out of the question in a fully contained system like this.
  • FunBunny2 - Tuesday, November 19, 2019

    I don't recall when IBM released its first air-cooled full mainframe (3X0 class machines), but for the first 40 or 50 years of mainframe computing, liquid cooling was all there was. Whether it was total immersion, in some definition of 'total', I don't know. Back To The Future.
  • Lord of the Bored - Wednesday, November 20, 2019

    The first immersion-cooled system I know of was the Cray-2. Water inside pipes is a different and more mundane solution than sticking electrical components into an inert fluid.
  • PeachNCream - Tuesday, November 19, 2019

    Data center power consumption is significantly impactful given the sheer number of racked systems people operate globally. It's great to see attempts to handle waste heat more efficiently, but the core problem is that modern civilization is broadly compelled to process vast (and ever-growing) amounts of information for a variety of reasons. Certainly, we can now handle individual chunks of data more efficiently than we could when people kept filing cabinets crammed to the brim with papers, but the problem is that as we've gotten more efficient, we have been storing and processing far more data to offset the advantages our technological systems provide without fully considering each processing or storage requirement and whether or not it really is necessary to begin with. I'm afraid the net gains from immersion cooling will be wiped out by dumping more power into more processing we would have otherwise not done at all, which will continue to result in a net loss of quality of life for civilization as a whole for obvious reasons anyone can reach on their own.
