Thermal Design Power Trend
As of 2024, single-phase direct-to-chip (D2C) cooling dominates the high-end GPU thermal management market. However, as thermal design power (TDP) rises, two-phase D2C cooling will be required, and it is expected to arrive in large volume no earlier than 2026 or 2027. IDTechEx has interviewed a large number of players across the data center value chain, from chip makers and cold plate suppliers to system integrators. Despite differing opinions on the exact timeline, the consensus is that single-phase D2C starts to struggle at a TDP of around 1500 W, and that 2000 W might be its limit. According to IDTechEx's analysis of the historic trend of GPU thermal design power, the take-off of two-phase direct-to-chip will happen soon. IDTechEx also projects the future trend of GPU TDP, based on the historic trend and the roadmaps of leading chip suppliers interviewed by IDTechEx, such as Nvidia. More details are included in IDTechEx's report, "Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities".
D2C Cooling Challenges: Single and Two-Phase
With the potential adoption of two-phase liquid cooling, IDTechEx foresees both advantages and barriers, technically and commercially. Single-phase direct-to-chip (D2C) cooling is a relatively simple and widely adopted solution. It uses a liquid coolant, typically a water-glycol mixture, to absorb heat from the chips via convection without undergoing a phase change. However, it faces significant technical challenges, such as potential coolant leakage, which poses risks to IT equipment, and the mechanical stress caused by high flow rates. Cooling a chip with a TDP of 1000 W requires approximately 1.5 L/min of coolant, which is fairly significant. The high flow rate can also lead to erosion-corrosion and requires quick disconnects with larger diameters, which quickly adds to the total cost.
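The 1.5 L/min figure can be sanity-checked with a simple sensible-heat balance, Q = m_dot x cp x dT. The sketch below is illustrative only: the coolant density and specific heat are assumed round values for a water-glycol mixture and are not taken from IDTechEx, and the function name is hypothetical.

```python
def coolant_temperature_rise(heat_w, flow_l_per_min,
                             density_kg_per_l=1.03,            # ASSUMED water-glycol density
                             specific_heat_j_per_kg_k=3800.0):  # ASSUMED specific heat
    """Return the coolant temperature rise (K) across a single-phase cold plate.

    Uses the sensible-heat balance Q = m_dot * cp * dT and assumes all chip
    heat is absorbed by the coolant with no phase change.
    """
    # Convert volumetric flow (L/min) to mass flow (kg/s).
    mass_flow_kg_per_s = flow_l_per_min / 60.0 * density_kg_per_l
    return heat_w / (mass_flow_kg_per_s * specific_heat_j_per_kg_k)


# Figures from the text: a 1000 W chip cooled with about 1.5 L/min.
rise = coolant_temperature_rise(heat_w=1000.0, flow_l_per_min=1.5)
print(f"Coolant temperature rise: {rise:.1f} K")
```

Under these assumed properties the coolant warms by roughly 10 K across the cold plate. Because the temperature rise scales inversely with flow, holding the same rise at a higher TDP requires a proportionally higher flow rate.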
The complexity of plumbing in data centers, especially in tight spaces, adds to the maintenance burden. Additionally, the high capital expenditure (CAPEX) required for installation, particularly when retrofitting older data centers, makes cold plate cooling a costly option upfront (e.g., $200-$400 for a cold plate system including quick disconnects (QDs), the fluid distribution manifold inside servers, hoses, etc.), even though it is more energy efficient over the long run and therefore saves costs.
On the other hand, two-phase D2C cooling offers higher efficiency by using the phase change of the coolant, which allows for better heat dissipation and lower cooling costs per watt. It also reduces mechanical stress because it operates at lower flow rates than single-phase systems. For instance, a two-phase cold plate needs a flow rate of around 0.3 L/min to cool a chip with a TDP of 1000 W. However, two-phase systems come with their own challenges. The use of fluorinated liquids can create environmental hazards if these fluids escape and form aerosols, raising concerns about safety and global warming potential. Additionally, these systems are expensive to implement, with higher CAPEX for cold plate setups and additional fluid recycling and disposal costs. Despite its efficiency, these environmental and commercial hurdles make two-phase cooling a more complex choice. With careful design, some of the challenges can be mitigated, and IDTechEx's "Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities" report also quantifies the CAPEX of single and two-phase cooling technologies with costs per component.
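The much lower two-phase flow rate follows from the coolant absorbing heat as latent heat of vaporization rather than as a temperature rise. The sketch below estimates what fraction of a 0.3 L/min flow would need to evaporate to carry away 1000 W. It is a rough illustration only: the density and latent heat are assumed placeholder values, not the properties of any specific fluid and not IDTechEx data, the function name is hypothetical, and the balance ignores any sensible heating of the liquid before it boils.

```python
def required_vapor_fraction(heat_w, flow_l_per_min,
                            density_kg_per_l=1.3,           # ASSUMED liquid density
                            latent_heat_j_per_kg=180_000.0):  # ASSUMED latent heat
    """Return the mass fraction of coolant that must evaporate to absorb heat_w.

    Uses the latent-heat balance Q = m_dot * h_fg * x, where x is the vapor
    fraction at the cold plate outlet. A result above 1.0 would mean the flow
    is too low to remove the heat by evaporation alone.
    """
    # Convert volumetric flow (L/min) to mass flow (kg/s).
    mass_flow_kg_per_s = flow_l_per_min / 60.0 * density_kg_per_l
    return heat_w / (mass_flow_kg_per_s * latent_heat_j_per_kg)


# Figures from the text: a 1000 W chip cooled with about 0.3 L/min.
fraction = required_vapor_fraction(heat_w=1000.0, flow_l_per_min=0.3)
print(f"Vapor fraction needed at outlet: {fraction:.2f}")
```

With these assumed properties most, but not all, of the liquid would have to evaporate, which is consistent with a flow rate several times lower than the single-phase figure. The result is sensitive to the fluid chosen, so real properties should be substituted before drawing conclusions.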
In summary, while single-phase cooling is simpler and more established, it carries higher maintenance and technical risks. Two-phase cooling is more efficient but faces environmental concerns and higher initial costs, making it a less straightforward solution despite its advantages. However, given the upcoming thermal design power trend, IDTechEx believes that two-phase cold plates have potential, especially considering that they are easier to retrofit into existing data centers than immersion cooling. In "Thermal Management for Data Centers 2025-2035: Technologies, Markets, and Opportunities", IDTechEx has also listed the technical and commercial barriers of single and two-phase immersion, along with the roadmap and timeline for different cooling technologies based on primary and secondary research.