Why High-Fidelity Simulation Matters for Immersion Cooling in the AI Era

  • Apr 23
High-Fidelity Modelling of Immersion Cooling

Immersion cooling is attracting more attention, but it is not a solved problem. As accelerator power moves into the 700 W to 900 W range and rack densities continue to rise, thermal behaviour inside a dielectric tank becomes harder to predict and less forgiving in operation. Uptime Institute’s 2025 cooling survey points in the same direction: liquid cooling adoption is still gradual, but rising rack density and server heat load are now the main drivers behind it.

This simulation shows why that matters.

The case is a multiphase dielectric immersion tank operating under high-load server conditions, representing one of the more demanding thermal environments in current data-centre design. In that regime, performance is not set by bulk temperature alone. It is set by the interaction of boiling, vapour transport, conduction paths, buoyancy, free-surface motion and local flow distribution across densely packed hardware.

The governing physics include two-phase nucleate boiling and vapour transport; conjugate heat transfer across the die, thermal interface material (TIM), PCB and enclosure paths; buoyancy-driven turbulence and thermal stratification; volume-of-fluid (VOF) free-surface dynamics; inter-board flow maldistribution; temperature-dependent fluid rheology; and surface wettability with contact-angle effects.

A tank can look acceptable in average thermal terms and still fail where it matters: local hotspots, vapour trapping, uneven board cooling or unstable transient behaviour. That is why immersion systems cannot be judged reliably through intuition or low-order approximations alone.


Inside the tank

Under high load, immersion performance is governed by interacting mechanisms rather than any single cooling metric. Boiling onset influences vapour generation. Vapour transport affects local heat removal. Buoyancy reshapes recirculation patterns. Free-surface behaviour changes pressure and flow response inside the tank. Together, these effects determine whether cooling remains uniform and stable across the hardware.
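To give a sense of scale for the link between heat load and vapour generation, the sketch below applies a steady energy balance: vapour mass rate = boiling fraction × power / latent heat. It is a rough upper-bound estimate, not part of the simulation described here. The function name, the latent heat of 100 kJ/kg and the vapour density of 10 kg/m³ are illustrative assumptions rather than properties of any specific fluid; data-sheet values should be substituted for a real case.

```python
def vapour_generation(power_w, h_fg_j_per_kg, rho_vapour_kg_m3, boiling_fraction=1.0):
    """Estimate vapour production from a steady energy balance.

    boiling_fraction is the share of the heat load removed as latent heat;
    1.0 gives an upper bound (no sensible heating of the liquid).
    Returns (mass rate in kg/s, volume rate in m^3/s).
    """
    if h_fg_j_per_kg <= 0 or rho_vapour_kg_m3 <= 0:
        raise ValueError("latent heat and vapour density must be positive")
    if not 0.0 <= boiling_fraction <= 1.0:
        raise ValueError("boiling_fraction must lie between 0 and 1")
    mass_rate = boiling_fraction * power_w / h_fg_j_per_kg
    volume_rate = mass_rate / rho_vapour_kg_m3
    return mass_rate, volume_rate


# Assumed, illustrative fluid properties (not taken from any data sheet).
H_FG = 100e3        # latent heat of vaporisation, J/kg
RHO_VAPOUR = 10.0   # saturated vapour density, kg/m^3

for power in (700.0, 900.0):
    m_dot, v_dot = vapour_generation(power, H_FG, RHO_VAPOUR)
    print(f"{power:.0f} W -> {m_dot * 1e3:.1f} g/s, {v_dot * 1e3:.2f} L/s of vapour")
```

With these assumed properties, a single 900 W device boiling at full load would produce roughly 9 g/s, or about 0.9 L/s, of vapour. That volume has to leave the board gap and reach the condenser, which is why vapour transport and residence time shape local heat removal.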

This is where apparently small design choices stop being small. A modest change in spacing, fluid properties or surface condition can change circulation structure, vapour residence time and hotspot formation in ways that are difficult to anticipate without high-fidelity modelling.


Design variables that govern thermal stability

Immersion cooling performance is governed by a tightly coupled set of variables: fluid properties, tank geometry, server spacing, inlet flow rate, coolant distribution unit (CDU) approach temperature, surface wettability, board orientation, component layout and load profile.

Small changes in any one of these can shift boiling onset, vapour removal, flow maldistribution and local temperature margin across the tank. The engineering question is not whether a tank works at nominal load, but whether it remains stable at peak and fault conditions.
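One simple way to probe how such variables move the result is a one-at-a-time perturbation study. The sketch below applies it to a deliberately crude single-phase bulk energy balance, outlet temperature = supply temperature + heat load / (mass flow × specific heat). The model, the function names and every number (50 kW load, 2 kg/s flow, 1100 J/(kg·K) specific heat, 40 °C supply) are hypothetical placeholders chosen for illustration. A bulk balance ignores boiling and local effects, so it shows the method and not the tank.

```python
def tank_outlet_temp_c(heat_load_w, flow_kg_s, cp_j_per_kgk, supply_temp_c):
    """Bulk single-phase energy balance (toy model, assumed for illustration)."""
    return supply_temp_c + heat_load_w / (flow_kg_s * cp_j_per_kgk)


def one_at_a_time(model, nominal, deltas):
    """Perturb each input by -delta and +delta while holding the others at nominal.

    Returns the nominal output and, per input, the pair
    (change for -delta, change for +delta) in output units.
    """
    base = model(**nominal)
    effects = {}
    for name, delta in deltas.items():
        down = model(**{**nominal, name: nominal[name] - delta})
        up = model(**{**nominal, name: nominal[name] + delta})
        effects[name] = (down - base, up - base)
    return base, effects


# Hypothetical operating point and perturbation sizes.
nominal = {"heat_load_w": 50e3, "flow_kg_s": 2.0,
           "cp_j_per_kgk": 1100.0, "supply_temp_c": 40.0}
deltas = {"heat_load_w": 5e3, "flow_kg_s": 0.2, "supply_temp_c": 2.0}

base, effects = one_at_a_time(tank_outlet_temp_c, nominal, deltas)
print(f"nominal outlet: {base:.1f} C")
# Rank inputs by the largest outlet change they cause.
for name, (down, up) in sorted(effects.items(),
                               key=lambda kv: -max(abs(kv[1][0]), abs(kv[1][1]))):
    print(f"{name:>14}: {down:+.2f} K / {up:+.2f} K")
```

Even in this toy model the response to flow is asymmetric: losing 10% of flow raises the outlet temperature more than gaining 10% lowers it. That asymmetry is one reason to check peak and fault conditions as well as the nominal point.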

That question is becoming more important as device power rises. Intel’s Gaudi 3 technical material points to total device power up to 900 W, which is a useful indication of the thermal envelope now being designed around in advanced AI systems.


Where simplified models stop helping

This is where reduced-order assumptions start to lose value. They remain useful for early screening, but once boiling, buoyancy, conjugate heat transfer and free-surface motion are all active in the same enclosure, they do not fully resolve the interaction effects that determine real thermal margin.
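For context, a typical early-screening model of this kind is a lumped series resistance network from die to fluid. The sketch below is a minimal example under stated assumptions: the class and function names are hypothetical, and the resistances, heat transfer coefficient, wetted area and 50 °C fluid temperature are placeholder values, not measurements or results from the case described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThermalPath:
    """Series path from die to bulk fluid (all values are assumed placeholders)."""
    r_die_k_per_w: float   # conduction resistance inside the package
    r_tim_k_per_w: float   # thermal interface material resistance
    h_w_per_m2k: float     # effective surface heat transfer coefficient
    area_m2: float         # wetted heat transfer area

    def total_resistance(self):
        # Convection resistance is 1 / (h * A); series resistances add.
        return self.r_die_k_per_w + self.r_tim_k_per_w + 1.0 / (self.h_w_per_m2k * self.area_m2)


def junction_temperature(path, power_w, fluid_temp_c):
    """Steady junction temperature: fluid temperature plus power times resistance."""
    return fluid_temp_c + power_w * path.total_resistance()


path = ThermalPath(r_die_k_per_w=0.01, r_tim_k_per_w=0.02,
                   h_w_per_m2k=10_000.0, area_m2=0.005)
for power in (700.0, 900.0):
    t_j = junction_temperature(path, power, fluid_temp_c=50.0)
    print(f"{power:.0f} W -> junction at about {t_j:.0f} C")
```

A model like this returns one temperature per device from a single effective heat transfer coefficient. It has no way to represent vapour trapped between boards, board-to-board flow differences or stratification in the tank, all of which change that coefficient locally.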

For immersion systems, that is often where the design risk sits: vapour accumulation near critical components, uneven flow distribution across boards, thermal layering at tank level, or sensitivity to fluid drift and transient load excursions. What matters is not only average cooling performance, but how stable the system remains when the operating envelope shifts.

That operating envelope is moving quickly. The IEA projects that electricity demand from data centres will more than double by 2030 to around 945 TWh, with AI as the main driver. As compute density rises, thermal management becomes part of the core scaling constraint rather than a secondary utilities issue.


Mansim’s approach

At Mansim, immersion cooling is treated as a single coupled system rather than a sequence of isolated checks.

Our workflow combines GPU-accelerated solvers on HPC infrastructure, machine-learning-driven geometry optimisation, and triply periodic minimal surface (TPMS) lattice cold plate development within one simulation framework, from fluid selection through to rack-level thermal prediction. Reduced-order models and physics-informed neural networks are then used to compress that fidelity into real-time digital twins for control, monitoring and operational decision-making.

That matters because the important decisions in this space are expensive and difficult to reverse. If you are developing dielectric fluids, qualifying immersion hardware, or designing next-generation AI cooling infrastructure, nominal temperature plots are not enough. What is needed is evidence of how the system behaves under representative load, where the real margins sit, and which variables control failure risk.


The questions that matter

The point of high-fidelity simulation is not to produce a more detailed image of boiling flow. It is to answer design questions before they become operating problems.

Where do vapour pockets form under sustained peak load? Which boards are thermally disadvantaged by flow maldistribution? How sensitive is the tank to CDU drift, fluid variation or transient excursions? Which geometry changes improve stability rather than simply reducing average temperature?

Those are the questions that decide whether an immersion system is robust enough to deploy and scalable enough to build on.

Immersion cooling will play an important role in AI infrastructure, but it remains a multiphysics design problem with limited tolerance for assumption-led decisions. The Open Compute Project’s immersion work is still focused on requirements, qualification and standardisation, which is another sign that this field is still being formalised rather than settled. The organisations that move fastest will be the ones using simulation not as illustration, but as engineering evidence.
