Thermal Analysis of Power Semiconductor Devices

Category: Electromagnetic Field Analysis › Power Electronics | Integrated 2026-04-11
Power semiconductor junction temperature distribution and thermal resistance network FEM analysis

Theory and Physics

Overview — Why FEM is Needed

🧑‍🎓

Professor, what is FEM used for in the thermal design of power devices? The datasheet lists thermal resistance, so can't we just calculate with that?

🎓

Good question. The thermal resistance $R_{\theta,jc}$ in the datasheet is a value under ideal conditions where "the entire chip surface heats uniformly and the backside is cooled uniformly." However, reality is not that simple.

The thermal limit of a SiC MOSFET is specified in terms of the junction temperature $T_j$. Exceeding the rated 175°C shortens the lifespan exponentially. The problem is the voids (air gaps) in the die attach layer inside the package: the soldering process during manufacturing inevitably leaves microscopic voids. These voids raise the thermal resistance locally, which a 1D equivalent circuit cannot capture, so 3D analysis using FEM becomes essential.

🧑‍🎓

Do they really have that much impact? Specifically, how much does the temperature rise?

🎓

For Infineon's CoolSiC 1200V module, cases have been reported in which a void rate of only 5% in the die attach layer raises the junction temperature locally by about 15°C. If the design had only a 20°C margin, that 5% void rate would consume most of it and could lead to a reliability failure.

Therefore, thermal design for power semiconductors must guarantee the "worst-case point temperature," not the "average temperature," and to do that, the 3D temperature distribution must be solved using FEM.

🧑‍🎓

15°C from just 5% voids... You'd never notice that just by looking at the datasheet.

Thermal Resistance Network $R_{\theta,ja}$

🧑‍🎓

First, could you explain the basics? What is the structure of the thermal resistance listed in the datasheet?

🎓

The steady-state temperature of a power device is expressed by the series sum of thermal resistances from the junction (chip junction surface) to the ambient environment. This is the so-called junction-to-ambient thermal resistance $R_{\theta,ja}$:

$$ T_j = T_a + P_{\text{loss}} \cdot R_{\theta,ja} $$

$$ R_{\theta,ja} = R_{\theta,jc} + R_{\theta,cs} + R_{\theta,sa} $$

Physical Meaning of Each Term
  • $T_j$ (Junction Temperature): The highest temperature point on the semiconductor chip. Rated 150–175°C for Si-IGBT, 175–200°C for SiC-MOSFET. This temperature governs lifespan.
  • $T_a$ (Ambient Temperature): Air temperature at the cooling system inlet. Assumed worst-case 105°C for automotive, 40–55°C for industrial.
  • $P_{\text{loss}}$ (Power Loss): Sum of conduction loss $P_{\text{cond}} = I^2 R_{\text{DS(on)}}$ and switching loss $P_{\text{sw}} = (E_{\text{on}}+E_{\text{off}}) f_{\text{sw}}$ (one turn-on and one turn-off event per switching period).
  • $R_{\theta,jc}$ (Junction-to-Case): From chip to package backside. Includes die attach layer (solder/sintered silver), DBC/AMB substrate. Typically 0.1–1.0 K/W.
  • $R_{\theta,cs}$ (Case-to-Sink): Thermal resistance of TIM (Thermal Interface Material). 0.1–0.5 K/W for grease, 0.5–2.0 K/W for thermal pads.
  • $R_{\theta,sa}$ (Sink-to-Ambient): Performance of heatsink + cooling system. 1–10 K/W for natural convection, 0.1–1 K/W for forced air, 0.01–0.1 K/W for water cooling.
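As a quick check of the series model, the sketch below evaluates $T_j = T_a + P_{\text{loss}} \cdot R_{\theta,ja}$. The function name is hypothetical and the numbers are illustrative assumptions picked from within the typical ranges listed above, not datasheet values.

```python
def junction_temperature(t_ambient_c, p_loss_w, r_jc, r_cs, r_sa):
    """Steady-state T_j [degC] from the series thermal resistance chain [K/W]."""
    r_ja = r_jc + r_cs + r_sa          # R_theta,ja = R_jc + R_cs + R_sa
    return t_ambient_c + p_loss_w * r_ja


# Assumed example: forced-air cooled module, values chosen for illustration only.
t_j = junction_temperature(t_ambient_c=40.0, p_loss_w=100.0,
                           r_jc=0.3, r_cs=0.2, r_sa=0.5)
print(f"T_j = {t_j:.1f} degC")
```

With these assumed values the sink-to-ambient term accounts for half of the total rise, which is why the cooling method often dominates the steady-state budget.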
🧑‍🎓

So it's an image of thermal resistance building up layer by layer from the chip to the heatsink. But if that's all, isn't 1D calculation sufficient?

🎓

That would be fine for steady-state, uniform heating. But in practice, FEM is needed for three reasons:

  • Void Effects: Local defects in the die attach layer prevent heat spreading, causing temperature hot spots on the chip.
  • Thermal Interference: In multi-chip modules (e.g., 6-in-1 packages), heat from adjacent chips interferes.
  • Transient Response: The junction temperature swings with PWM switching (tens of kHz) and with load changes, and the repeated swings cause thermal fatigue.

Cauer Model and Foster Model

🧑‍🎓

How do we handle transient temperature fluctuations? The steady-state thermal resistance alone isn't enough, right?

🎓

That's where the transient thermal impedance $Z_{\theta}(t)$ comes in. Datasheets always include a curve for $Z_{\theta,jc}(t)$. Approximating this with an RC equivalent circuit leads to the Cauer and Foster models.

$$ Z_{\theta}(t) = \sum_{i=1}^{n} R_i \left(1 - e^{-t/\tau_i}\right), \quad \tau_i = R_i \cdot C_i $$
🎓
| Item | Foster Model | Cauer Model |
|---|---|---|
| Circuit Configuration | Series connection of parallel RC elements | Ladder network: series resistances with a capacitance from each node to thermal ground |
| Parameter Acquisition | Curve fitting of $Z_{\theta}(t)$ (easy) | Derivation from physical layer structure or conversion from Foster |
| Intermediate Node Temperature | No physical meaning | Corresponds to interface temperature of each layer |
| Boundary Condition Change | Not possible (requires re-fitting) | Possible (accommodates heatsink changes, etc.) |
| Integration with FEM | Difficult | Easy (corresponds to layer structure) |
| Circuit Simulator | Widely used in SPICE/PLECS | Preferred by physical model advocates |
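The Foster expression for $Z_{\theta}(t)$ is simple enough to evaluate directly. The following is a minimal sketch; the three $(R_i, \tau_i)$ pairs are hypothetical placeholders, since real values come from fitting a datasheet $Z_{\theta,jc}(t)$ curve.

```python
import math


def foster_zth(t, r_list, tau_list):
    """Transient thermal impedance Z_th(t) [K/W] of a Foster network."""
    return sum(r * (1.0 - math.exp(-t / tau)) for r, tau in zip(r_list, tau_list))


# Hypothetical 3-term fit: R_i in K/W, tau_i in s (illustration only).
R = [0.05, 0.15, 0.10]
TAU = [1e-3, 1e-2, 1e-1]

for t in (1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0):
    print(f"t = {t:8.4f} s   Z_th = {foster_zth(t, R, TAU):.4f} K/W")
```

For long pulses $Z_{\theta}(t)$ approaches $\sum R_i$, the steady-state thermal resistance, while for short pulses it is far smaller because the heat capacities have not yet charged.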
🧑‍🎓

So Foster is easy because it's curve fitting, but becomes unusable if the heatsink is changed. Cauer corresponds to the physical structure, so it works well with FEM, right?

🎓

Exactly. In practice, it's common to use Foster for system simulation (PLECS, Simulink) and Cauer-like physical models for detailed package design FEM (Ansys, COMSOL). A common workflow is to extract Foster parameters from FEM results and pass them to circuit simulators.

Governing Equation — 3D Unsteady Heat Conduction

🧑‍🎓

Please explain the fundamental equation solved by FEM.

🎓

What is solved in thermal analysis of power devices is the 3D unsteady heat conduction equation (Fourier's equation):

$$ \rho c_p \frac{\partial T}{\partial t} = \nabla \cdot (k \nabla T) + Q_v $$
Physical Meaning of Each Term
  • $\rho c_p \frac{\partial T}{\partial t}$ (Heat Storage Term): Material's ability to store heat. $\rho$ is density [kg/m³], $c_p$ is specific heat [J/(kg·K)]. A SiC chip ($\rho = 3210$ kg/m³, $c_p = 690$ J/(kg·K)) has a small volume and therefore a small heat capacity, so its thermal response is fast.
  • $\nabla \cdot (k \nabla T)$ (Heat Conduction Term): Heat diffusion based on Fourier's law. The thermal conductivity of SiC, $k = 370$ W/(m·K), is about 2.5 times that of Si ($k = 148$ W/(m·K)). The die attach solder ($k \approx 50$ W/(m·K)) is the bottleneck.
  • $Q_v$ (Volumetric Heat Generation Density): Joule heat within the chip. Concentrated in the MOSFET active region, typically on the order of $10^7$ to $10^9$ W/m³. For the same loss, a smaller chip area means a higher heat flux per unit area.
Boundary Conditions
  • Type 1 (Dirichlet): $T = T_0$ (Fixed heatsink base temperature. Cooling water temperature for water cooling.)
  • Type 3 (Newtonian Cooling): $-k \frac{\partial T}{\partial n} = h(T - T_{\infty})$ (Air-cooled surface. Natural convection $h = 5\text{–}25$ W/(m²·K), forced convection $h = 50\text{–}500$ W/(m²·K).)
  • Type 4 (Radiation): $-k \frac{\partial T}{\partial n} = \varepsilon \sigma (T^4 - T_{\text{sur}}^4)$ (Important in high-temperature environments. Package surface $\varepsilon \approx 0.9$.)
  • Interface Thermal Resistance: $q = \frac{\Delta T}{R_{\text{contact}}}$ (TIM layer or die attach interface. Represented by thin-layer elements or thermal contact in FEM.)
Temperature Dependence of Material Properties
| Material | $k$ [W/(m·K)] | $\rho c_p$ [MJ/(m³·K)] | Notes on Temperature Dependence |
|---|---|---|---|
| SiC (4H) | 370 @25°C → 200 @175°C | 2.21 | Thermal conductivity decreases significantly with temperature rise (beware of positive feedback) |
| Si | 148 @25°C → 80 @175°C | 1.66 | Same as above |
| Cu (Base Plate) | 398 @25°C → 385 @175°C | 3.44 | Temperature dependence is small |
| Al₂O₃ (DBC Substrate) | 35 | 3.08 | Temperature dependence is almost negligible |
| AlN (AMB Substrate) | 170 | 2.37 | High thermal conductivity. For high power. |
| Si₃N₄ (AMB Substrate) | 90 | 2.33 | High strength. Excellent power cycle resistance. |
| Sn-Ag-Cu Solder | 58 | 1.65 | Die attach layer. Effective value decreases with voids. |
| Sintered Ag | 250 | 2.48 | Next-generation die attach. High reliability. |
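The positive feedback noted for SiC can be illustrated with a small fixed-point iteration. This is only a sketch under stated assumptions: $k(T)$ is interpolated linearly between the two SiC table values (370 W/(m·K) at 25°C, 200 W/(m·K) at 175°C), heat flow is 1D through the chip thickness, and the chip size, loss, and backside temperature are hypothetical.

```python
def k_sic(temp_c):
    """SiC conductivity [W/(m*K)], linear between the two table points (assumption)."""
    k25, k175 = 370.0, 200.0
    temp_c = min(max(temp_c, 25.0), 175.0)   # clamp: no extrapolation outside the data
    return k25 + (k175 - k25) * (temp_c - 25.0) / 150.0


def chip_temperature_drop(p_w, thickness_m, area_m2, t_bottom_c, tol=1e-6, max_iter=50):
    """Fixed-point iteration for the 1D temperature drop across the chip [K]."""
    dt = 0.0
    for _ in range(max_iter):
        t_mean = t_bottom_c + 0.5 * dt       # evaluate k at the mean chip temperature
        dt_new = p_w * thickness_m / (k_sic(t_mean) * area_m2)
        if abs(dt_new - dt) < tol:
            return dt_new
        dt = dt_new
    return dt


# Hypothetical case: 200 W through a 5 mm x 5 mm, 100 um thick chip, backside at 150 degC.
P, THICK, AREA = 200.0, 100e-6, 25e-6
dt_const = P * THICK / (370.0 * AREA)        # using the 25 degC conductivity
dt_iter = chip_temperature_drop(P, THICK, AREA, t_bottom_c=150.0)
print(f"drop with k(25 degC): {dt_const:.2f} K, with k(T): {dt_iter:.2f} K")
```

Using the room-temperature conductivity understates the drop across a hot chip, which is the reason FEM runs for power devices usually take $k(T)$ as a temperature-dependent property.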

Power Cycle Lifetime Prediction $N_f$

🧑‍🎓

Once you know the temperature, how do you connect it to lifespan?

🎓

The lifespan of a power semiconductor is determined by both the "amplitude" and "maximum value" of temperature. Repeated temperature swings cause wire bond lift-off and solder cracks to progress. This is predicted by the power cycle lifetime formula:

$$ N_f = A \cdot \Delta T_j^{\;-n} \cdot \exp\!\left(\frac{E_a}{k_B \cdot T_{j,\max}}\right) $$
Terms and Typical Parameter Values
  • $N_f$ (Number of Cycles to Failure): Power cycle lifetime. For wire bond lift-off, $10^4$ to $10^7$ cycles is typical.
  • $\Delta T_j$ (Temperature Amplitude): Junction temperature fluctuation range per cycle [K]. $\Delta T_j = 50$ K yields about $10^6$ cycles, $\Delta T_j = 100$ K reduces it to about $10^4$ cycles.
  • $n$ (Temperature Exponent): Coffin-Manson exponent. $n \approx 4$ to $5$ for Al wire bonds, $n \approx 3$ to $4$ for Cu wire bonds.
  • $E_a$ (Activation Energy): Arrhenius term. $E_a \approx 0.8$ eV for solder fatigue, $E_a \approx 0.06$ to $0.08$ eV for wire bonds.
  • $T_{j,\max}$ (Maximum Junction Temperature): Absolute temperature [K]. Higher temperatures accelerate degradation.
  • $k_B$ (Boltzmann Constant): $8.617 \times 10^{-5}$ eV/K.
Correspondence with LESIT Test Data

Simplified formula, with the Arrhenius term omitted, based on the industry-standard LESIT (Leistungselektronik, Systemtechnik und Informationstechnologie) test data:

$$ N_f = A \cdot \Delta T_j^{\;-n} $$
| Package / Bonding Technology | $A$ | $n$ | Source |
|---|---|---|---|
| IGBT / Al Wire / Solder | $3.025 \times 10^{14}$ | 4.416 | LESIT (ECPE) |
| IGBT / Cu Wire / Sintered Ag | $9.3 \times 10^{14}$ | 5.3 | Infineon (Guideline) |
| SiC MOSFET / Cu Clip | - | - | Not enough data yet |
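The simplified law $N_f = A \cdot \Delta T_j^{-n}$ is easy to tabulate. The sketch below uses the $A$ and $n$ values quoted for the IGBT / Al wire / solder row; the function name and the chosen $\Delta T_j$ points are illustrative assumptions.

```python
def cycles_to_failure(delta_tj_k, a=3.025e14, n=4.416):
    """Simplified Coffin-Manson law N_f = A * dTj^(-n); defaults from the Al-wire row."""
    return a * delta_tj_k ** (-n)


for d_tj in (30.0, 50.0, 80.0, 100.0):
    print(f"dTj = {d_tj:5.1f} K  ->  N_f = {cycles_to_failure(d_tj):.3e} cycles")

# Lifetime ratio when the temperature swing doubles; A cancels, leaving 2**n.
ratio = cycles_to_failure(50.0) / cycles_to_failure(100.0)
print(f"lifetime ratio for a doubled swing: {ratio:.1f}")
```

Because the prefactor $A$ cancels in the ratio, the sensitivity to $\Delta T_j$ depends only on the exponent $n$.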
🧑‍🎓

Wait, so if the temperature amplitude $\Delta T_j$ doubles, the lifespan shrinks by a factor of $2^{4.4} \approx 21$!?

🎓

Exactly. Therefore, thermal design for power devices is not just about "lowering the average temperature"; "reducing the temperature amplitude" directly impacts lifespan. Specific countermeasures include increasing the heat capacity of the heatsink to smooth temperature fluctuations, and increasing the switching frequency to reduce $\Delta T_j$ per pulse, although a higher switching frequency also raises switching loss and therefore the average temperature.

To accurately determine this $\Delta T_j$, RC equivalent circuits alone may lack sufficient precision. Especially for multi-chip modules or inverters with mixed switching conditions, the best practice is to obtain the temperature time waveform from FEM transient analysis, perform cycle counting using the rainflow method, and then input it into the lifetime formula.
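The last step of that workflow can be sketched as follows. The example assumes rainflow counting has already been done and that damage adds linearly across bins (the Palmgren-Miner rule, an assumption the text does not name). The counted bins are hypothetical, and the lifetime law is the simplified $N_f = A \cdot \Delta T_j^{-n}$ with the Al-wire parameters quoted earlier.

```python
def cycles_to_failure(delta_tj_k, a=3.025e14, n=4.416):
    """Simplified lifetime law N_f = A * dTj^(-n) (Al wire / solder parameters)."""
    return a * delta_tj_k ** (-n)


def mission_damage(counted_cycles):
    """Linear damage sum over rainflow bins: sum(count_i / N_f(dTj_i))."""
    return sum(count / cycles_to_failure(d_tj) for d_tj, count in counted_cycles)


# Hypothetical rainflow result for one mission: (delta_Tj [K], cycles per mission).
bins = [(20.0, 5000), (40.0, 300), (80.0, 10)]

damage = mission_damage(bins)
missions_to_failure = 1.0 / damage           # failure predicted when damage reaches 1
print(f"damage per mission = {damage:.3e}")
print(f"missions to failure = {missions_to_failure:.0f}")
```

In this made-up profile the ten 80 K swings contribute damage of the same order as the five thousand 20 K swings, which shows why the rare large cycles must not be lost in the counting step.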

Coffee Break Side Story

IGBT Thermal Runaway — The Terror of Positive Feedback: "Temperature Rises, Current Increases"

IGBTs have a troublesome characteristic. As temperature rises, losses increase (at rated current, $V_{\text{CE(sat)}}$, the collector-emitter saturation voltage, rises with temperature), which raises the temperature further: a positive feedback loop. Parallel operation adds a second hazard in operating regions where $V_{\text{CE(sat)}}$ falls with temperature, such as low collector currents or punch-through devices: if characteristics vary among parallel-connected IGBTs, the chip carrying more current gets hotter and draws even more current. The countermeasure is to monitor $T_j$ continuously in simulation during design and to build an accurate thermal resistance model. The datasheet thermal resistance is a steady-state (DC) value; under actual switching operation the transient thermal impedance $Z_{\theta}(t)$ must be used, because applying the DC value to the average loss underestimates the instantaneous peak temperature. SiC MOSFETs, by contrast, have a positive temperature coefficient of $R_{\text{DS(on)}}$ (temperature rise increases $R_{\text{DS(on)}}$ → current decreases), a major advantage because it naturally balances current sharing in parallel connections.

Numerical Methods and Implementation

FEM Discretization and Element Selection

🧑‍🎓

When solving the 3D heat conduction equation with FEM, how exactly is it discretized?

🎓

It's converted to weak form and discretized using the Galerkin method. Approximating the temperature field $T$ with shape functions $N_i$ gives:

$$ T^h(\mathbf{x}, t) = \sum_{i=1}^{n} N_i(\mathbf{x}) \, T_i(t) $$

$$ [C]\dot{\{T\}} + [K_{\theta}]\{T\} = \{Q\} $$
🎓

Here, $[C]$ is the heat capacity matrix, $[K_{\theta}]$ is the thermal conductivity matrix, and $\{Q\}$ is the heat load vector. It has the same form as $[K]\{u\}=\{F\}$ in structural analysis, but since temperature varies with time, the term $[C]\dot{\{T\}}$ is added.
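To make the assembly of $[K_{\theta}]$ concrete, here is a minimal 1D sketch using 2-node linear elements through a chip / solder / copper stack. It is an illustration under strong assumptions: heat flows only through the thickness (no spreading), the steady-state case is solved so $[C]$ drops out, conductivities are the room-temperature values quoted earlier, and the thicknesses, area, loss, and base temperature are hypothetical. Because the problem is 1D, the result can be checked against the series sum of $t/(kA)$.

```python
def assemble_1d(layers, area_m2, elems_per_layer=3):
    """Assemble the global conductivity matrix for a 1D layer stack.

    layers: list of (thickness [m], k [W/(m*K)]) ordered from chip top to base.
    """
    n_nodes = len(layers) * elems_per_layer + 1
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    node = 0
    for thickness, k in layers:
        g = k * area_m2 / (thickness / elems_per_layer)   # element conductance k*A/L
        for _ in range(elems_per_layer):
            # 2-node linear element: K_e = g * [[1, -1], [-1, 1]]
            K[node][node] += g
            K[node][node + 1] -= g
            K[node + 1][node] -= g
            K[node + 1][node + 1] += g
            node += 1
    return K


def solve_steady(K, p_top_w, t_bottom_c):
    """Solve [K]{T} = {Q}: heat input at node 0, fixed temperature at the last node."""
    n = len(K)
    A = [row[:] for row in K]
    b = [0.0] * n
    b[0] = p_top_w
    A[n - 1] = [0.0] * n                     # Dirichlet row: T_last = t_bottom_c
    A[n - 1][n - 1] = 1.0
    b[n - 1] = t_bottom_c
    for i in range(n):                       # Gaussian elimination without pivoting,
        for j in range(i + 1, n):            # adequate for this small tridiagonal system
            f = A[j][i] / A[i][i]
            if f != 0.0:
                for c in range(i, n):
                    A[j][c] -= f * A[i][c]
                b[j] -= f * b[i]
    T = [0.0] * n
    for i in range(n - 1, -1, -1):           # back substitution
        s = sum(A[i][c] * T[c] for c in range(i + 1, n))
        T[i] = (b[i] - s) / A[i][i]
    return T


# Hypothetical stack: SiC chip, Sn-Ag-Cu solder, Cu layer; 5 mm x 5 mm footprint.
layers = [(100e-6, 370.0), (50e-6, 58.0), (300e-6, 398.0)]
area = 25e-6
T = solve_steady(assemble_1d(layers, area), p_top_w=100.0, t_bottom_c=80.0)
r_series = sum(t / (k * area) for t, k in layers)
print(f"FEM top temperature: {T[0]:.3f} degC, series estimate: {80.0 + 100.0 * r_series:.3f} degC")
```

The agreement holds only because nothing varies in-plane here; voids and heat spreading are exactly the effects that break it and require the 3D model.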

Element selection for thermal analysis of power devices is important:

Element TypeRecommended UseNotes
Hexahedral 2nd order (20 nodes)Layer structure of chip/substrateHighest accuracy. Sweep mesh to divide in layer direction.
Tetrahedral 2nd order (10 nodes)Complex shapes like heatsink finsMesh generation is easy but number of elements increases.
Shell elements (thin plate)Copper foil of DBC substrate (e.g., 0.3mm thick)Effective for thin layers where solid elements would explode in count.
Thermal contactTIM layer, die attach interfaceRepresents interface thermal resistance without elements. Can simulate voids.
🧑‍🎓

The chip is only about 0.1mm, but the heatsink is tens of mm. The scales are completely different. Mesh design seems tough...

🎓

That's precisely the difficulty of thermal analysis for power devices. The chip is 100 μm thick while the heatsink is around 50 mm, a 500x scale difference. The trick is to use a sweep mesh with 3–5 divisions in the layer direction and refine the in-plane mesh only in areas with steep temperature gradients. The die attach layer directly under the chip needs at least 3 layers in the thickness direction, and areas around voids require an in-plane mesh size of 0.1 mm or less.

Transient Analysis Time Integration

🧑‍🎓

How is time discretization handled? With switching at 10kHz or more, the time step seems challenging.

🎓

That's the key practical point. To track temperature fluctuations over one switching cycle (if $f_{\text{sw}} = 10$ kHz, then $100\,\mu$s), a time step of $\Delta t \leq 10\,\mu$s is needed. But if you track the entire mission profile lasting minutes to hours needed for power cycle lifetime evaluation with such tiny steps, the computational load explodes.
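To put a rough number on that, assume (purely for illustration) a one-hour mission profile resolved with the $10\,\mu$s step:

$$ N_{\text{steps}} = \frac{t_{\text{mission}}}{\Delta t} = \frac{3600\ \text{s}}{10 \times 10^{-6}\ \text{s}} = 3.6 \times 10^{8} $$

With an implicit scheme, each of those steps requires a solve of the full 3D FEM system.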

In practice, a two-stage approach is used:

  1. Switching Cycle Scale: Analyze temperature oscillation over a few cycles using FEM or an RC equivalent circuit to obtain $\Delta T_j$ per pulse.
  2. Mission Profile Scale: Input the average loss waveform into an equivalent circuit model (Foster/Cauer) to obtain temperature fluctuations on the minute-to-hour scale.

For time integration, Backward Euler method (1st order implicit) or Crank-Nicolson method (2nd order implicit) is generally used:

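$$ \left(\frac{[C]}{\Delta t} + \theta [K_{\theta}]\right) \{T\}_{n+1} = \left(\frac{[C]}{\Delta t} - (1-\theta) [K_{\theta}]\right) \{T\}_n + \theta \{Q\}_{n+1} + (1-\theta) \{Q\}_n $$

Here the scalar weight $\theta = 1$ gives Backward Euler and $\theta = 1/2$ gives Crank-Nicolson; it is unrelated to the subscript in $[K_{\theta}]$.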
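The difference between the two schemes shows up even on a single thermal node. The sketch below applies the full θ-scheme, including the $-(1-\theta)K\,T_n$ term on the right-hand side, to the scalar equation $C\,\dot{T} + K\,T = Q$ with constant $Q$, and compares both choices of θ against the analytic step response. The values of $C$, $K$, $Q$ and the step size are hypothetical.

```python
import math


def theta_step(t_n, dt, c, k, q, theta):
    """One theta-scheme step for C*dT/dt + K*T = Q (Q constant in time)."""
    lhs = c / dt + theta * k
    rhs = (c / dt - (1.0 - theta) * k) * t_n + q
    return rhs / lhs


def simulate(theta, dt, t_end, c=0.5, k=2.0, q=100.0, t0=0.0):
    """Integrate from t0 (temperature rise) up to t_end and return the final value."""
    temp = t0
    for _ in range(round(t_end / dt)):
        temp = theta_step(temp, dt, c, k, q, theta)
    return temp


# Hypothetical single node: C [J/K], K [W/K] (= 1/R_th), Q [W].
C, K, Q = 0.5, 2.0, 100.0
tau = C / K                                   # thermal time constant [s]
t_end, dt = 0.5, 0.05
exact = Q / K * (1.0 - math.exp(-t_end / tau))

err_be = abs(simulate(1.0, dt, t_end) - exact)   # Backward Euler, theta = 1
err_cn = abs(simulate(0.5, dt, t_end) - exact)   # Crank-Nicolson, theta = 1/2
print(f"Backward Euler error:  {err_be:.4f} K")
print(f"Crank-Nicolson error:  {err_cn:.4f} K")
```

Crank-Nicolson is more accurate at the same step size, but Backward Euler damps oscillations more strongly, which is why it is often preferred right after abrupt load steps.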
Written by NovaSolver Contributors
Anonymous Engineers & AI