What is FEM used for in power device thermal design?

It is used to design systems that maintain the chip temperature (junction temperature Tj) of SiC MOSFETs and IGBTs below their rated limits. FEM analysis is used to evaluate three-dimensional thermal distributions that cannot be assessed with equivalent circuit models, such as localized increases in thermal resistance caused by voids in the die attach layer.

What is the difference between Cauer and Foster models?

The Cauer model is an RC ladder (series thermal resistances with a capacitance from each node to thermal ground) that corresponds to the physical layer structure (chip → die attach → substrate → heat sink), so the temperature at each node has a physical meaning. The Foster model is a series chain of parallel RC cells obtained by curve-fitting the thermal impedance curve, so its intermediate nodes have no physical meaning.

How is the power cycle lifetime Nf predicted?

It is predicted using the Coffin-Manson empirical formula: Nf = A·ΔTj^(-n). Here, ΔTj is the junction temperature amplitude, and the coefficients A and n vary depending on the package and Al/Cu wire bonds. LESIT test data is the industry standard for this.

To what extent do die attach voids affect performance?

In the case of Infineon's CoolSiC 1200V module, a 5% void ratio in the die attach layer causes the junction temperature to rise by approximately 15°C. This localized effect cannot be evaluated without a 3D analysis using FEM.

Thermal Analysis of Power Semiconductor Devices

Category: Electromagnetic Field Analysis › Power Electronics | Integrated 2026-04-11

Power semiconductor junction temperature distribution and thermal resistance network FEM analysis


Theory and Physics

Overview — Why FEM is Needed

🧑‍🎓

Professor, what is FEM used for in the thermal design of power devices? The datasheet lists thermal resistance, so can't we just calculate with that?

🎓

Good question. The thermal resistance $R_{\theta,jc}$ in the datasheet is a value under ideal conditions where "the entire chip surface heats uniformly and the backside is cooled uniformly." However, reality is not that simple.

The chip temperature of a SiC MOSFET is managed by the junction temperature $T_j$. Exceeding the rated 175°C causes an exponential decrease in lifespan. The problem here is the voids (air gaps) in the die attach layer inside the package. During the manufacturing soldering process, microscopic voids inevitably remain. These voids locally increase thermal resistance, but this cannot be evaluated with a 1D equivalent circuit. 3D analysis using FEM becomes essential.

🧑‍🎓

Do they really have that much impact? Specifically, how much does the temperature rise?

🎓

In Infineon's CoolSiC 1200V module, there are reported cases where with only a 5% void rate in the die attach layer, the junction temperature locally rises by about 15°C. If the design had only a 20°C margin, a 5% void rate could potentially cause reliability failure.

Therefore, thermal design for power semiconductors must guarantee the "worst-case point temperature," not the "average temperature," and to do that, the 3D temperature distribution must be solved using FEM.

🧑‍🎓

15°C from just 5% voids... You'd never notice that just by looking at the datasheet.

Thermal Resistance Network $R_{\theta,ja}$

🧑‍🎓

First, could you explain the basics? What is the structure of the thermal resistance listed in the datasheet?

🎓

The steady-state temperature of a power device is expressed by the series sum of thermal resistances from the junction (chip junction surface) to the ambient environment. This is the so-called junction-to-ambient thermal resistance $R_{\theta,ja}$:

$$ T_j = T_a + P_{\text{loss}} \cdot R_{\theta,ja} $$

$$ R_{\theta,ja} = R_{\theta,jc} + R_{\theta,cs} + R_{\theta,sa} $$

Physical Meaning of Each Term

$T_j$ (Junction Temperature): The highest temperature point on the semiconductor chip. Rated 150–175°C for Si-IGBT, 175–200°C for SiC-MOSFET. This temperature governs lifespan.
$T_a$ (Ambient Temperature): Air temperature at the cooling system inlet. Assumed worst-case 105°C for automotive, 40–55°C for industrial.
$P_{\text{loss}}$ (Power Loss): Sum of conduction loss $P_{\text{cond}} = I^2 R_{\text{DS(on)}}$ and switching loss $P_{\text{sw}} = \frac{1}{2}(E_{\text{on}}+E_{\text{off}}) f_{\text{sw}}$.
$R_{\theta,jc}$ (Junction-to-Case): From chip to package backside. Includes die attach layer (solder/sintered silver), DBC/AMB substrate. Typically 0.1–1.0 K/W.
$R_{\theta,cs}$ (Case-to-Sink): Thermal resistance of TIM (Thermal Interface Material). 0.1–0.5 K/W for grease, 0.5–2.0 K/W for thermal pads.
$R_{\theta,sa}$ (Sink-to-Ambient): Performance of heatsink + cooling system. 1–10 K/W for natural convection, 0.1–1 K/W for forced air, 0.01–0.1 K/W for water cooling.
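As a quick check of the series formula above, the steady-state junction temperature and the loss budget can be computed directly. This is a minimal sketch; the function names and the numeric values are illustrative assumptions picked from the ranges in the list, not datasheet data.

```python
def junction_temperature(t_ambient_c, p_loss_w, r_jc, r_cs, r_sa):
    """Steady-state junction temperature [°C] from the series chain R_jc + R_cs + R_sa [K/W]."""
    r_ja = r_jc + r_cs + r_sa
    return t_ambient_c + p_loss_w * r_ja


def max_allowed_loss(tj_limit_c, t_ambient_c, r_jc, r_cs, r_sa):
    """Largest power loss [W] that keeps T_j at or below tj_limit_c."""
    return (tj_limit_c - t_ambient_c) / (r_jc + r_cs + r_sa)


# Illustrative values (assumptions): module, grease TIM, forced-air heatsink
R_JC, R_CS, R_SA = 0.3, 0.2, 0.5   # K/W
tj = junction_temperature(t_ambient_c=40.0, p_loss_w=100.0, r_jc=R_JC, r_cs=R_CS, r_sa=R_SA)
p_max = max_allowed_loss(tj_limit_c=175.0, t_ambient_c=40.0, r_jc=R_JC, r_cs=R_CS, r_sa=R_SA)
print(f"T_j = {tj:.1f} °C, loss budget for 175 °C = {p_max:.0f} W")
```

Note that this gives a single number for the whole chip; it says nothing about where on the chip the temperature peaks.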

🧑‍🎓

So it's an image of thermal resistance building up layer by layer from the chip to the heatsink. But if that's all, isn't 1D calculation sufficient?

🎓

That would be fine for steady-state, uniform heating. But in practice, FEM is needed for three reasons:

Void Effects: Local defects in the die attach layer prevent heat spreading, causing temperature hot spots on the chip.
Thermal Interference: In multi-chip modules (e.g., 6-in-1 packages), heat from adjacent chips interferes.
Transient Response: Temperature oscillates at tens of kHz due to PWM switching, causing fatigue.

Cauer Model and Foster Model

🧑‍🎓

How do we handle transient temperature fluctuations? The steady-state thermal resistance alone isn't enough, right?

🎓

That's where the transient thermal impedance $Z_{\theta}(t)$ comes in. Datasheets typically include a curve for $Z_{\theta,jc}(t)$. Approximating this curve with an RC equivalent circuit leads to the Cauer and Foster models. In the Foster form, it is written as a sum of exponential terms:

$$ Z_{\theta}(t) = \sum_{i=1}^{n} R_i \left(1 - e^{-t/\tau_i}\right), \quad \tau_i = R_i \cdot C_i $$
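The sum above is straightforward to evaluate numerically. The following sketch assumes a three-term Foster network; the $R_i$ and $\tau_i$ values are hypothetical and would normally come from fitting the datasheet curve.

```python
import math


def foster_zth(t, r_list, tau_list):
    """Transient thermal impedance Z_th(t) [K/W] of a Foster network."""
    return sum(r * (1.0 - math.exp(-t / tau)) for r, tau in zip(r_list, tau_list))


# Hypothetical fit parameters (assumptions for illustration only)
R_FOSTER = [0.05, 0.15, 0.10]     # K/W
TAU_FOSTER = [1e-3, 1e-2, 1e-1]   # s

for t in (1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0):
    print(f"t = {t:8.4f} s  Z_th = {foster_zth(t, R_FOSTER, TAU_FOSTER):.4f} K/W")
```

For long pulses the value approaches the sum of the $R_i$, which is the steady-state thermal resistance; for short pulses it is much smaller.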

🎓

Item	Foster Model	Cauer Model
Circuit Configuration	Series chain of parallel RC cells	RC ladder: series resistors with a capacitor from each node to thermal ground
Parameter Acquisition	Curve fitting of $Z_{\theta}(t)$ (easy)	Derivation from physical layer structure or conversion from Foster
Intermediate Node Temperature	No physical meaning	Corresponds to interface temperature of each layer
Boundary Condition Change	Not possible (requires re-fitting)	Possible (accommodates heatsink changes, etc.)
Integration with FEM	Difficult	Easy (corresponds to layer structure)
Circuit Simulator	Widely used in SPICE/PLECS	Preferred by physical model advocates

🧑‍🎓

So Foster is easy because it's curve fitting, but becomes unusable if the heatsink is changed. Cauer corresponds to the physical structure, so it works well with FEM, right?

🎓

Exactly. In practice, it's common to use Foster for system simulation (PLECS, Simulink) and Cauer-like physical models for detailed package design FEM (Ansys, COMSOL). A common workflow is to extract Foster parameters from FEM results and pass them to circuit simulators.
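To illustrate why Cauer nodes carry physical meaning, here is a minimal sketch of a three-node Cauer ladder integrated with explicit Euler steps. The layer values, the step size, and the function name `simulate_cauer` are assumptions for illustration; a production model would use an implicit integrator or a circuit simulator.

```python
def simulate_cauer(r, c, p_loss, t_ambient, dt, n_steps):
    """Integrate a Cauer ladder with explicit Euler.

    r[i] [K/W] connects node i to node i+1 (the last one to ambient),
    c[i] [J/K] is the capacitance from node i to thermal ground.
    Returns the node temperatures after n_steps.
    """
    n = len(r)
    temps = [t_ambient] * n
    for _ in range(n_steps):
        new = temps[:]
        for i in range(n):
            t_next = temps[i + 1] if i + 1 < n else t_ambient
            q_out = (temps[i] - t_next) / r[i]
            q_in = p_loss if i == 0 else (temps[i - 1] - temps[i]) / r[i - 1]
            new[i] = temps[i] + dt * (q_in - q_out) / c[i]
        temps = new
    return temps


# Hypothetical ladder: chip, substrate, heatsink (assumed values)
R_LAYERS = [0.1, 0.2, 0.3]     # K/W
C_LAYERS = [0.01, 0.1, 1.0]    # J/K
# dt is kept well below the smallest R*C product so explicit Euler stays stable
final = simulate_cauer(R_LAYERS, C_LAYERS, p_loss=100.0, t_ambient=40.0,
                       dt=1e-4, n_steps=100_000)
print([round(t, 2) for t in final])
```

Each entry of `final` is the temperature at a layer interface, so swapping the heatsink only means changing the last entry of `R_LAYERS` and `C_LAYERS`, which is the flexibility the table attributes to the Cauer model.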

Governing Equation — 3D Unsteady Heat Conduction

🧑‍🎓

Please explain the fundamental equation solved by FEM.

🎓

What is solved in thermal analysis of power devices is the 3D unsteady heat conduction equation (Fourier's equation):

$$ \rho c_p \frac{\partial T}{\partial t} = \nabla \cdot (k \nabla T) + Q_v $$

Physical Meaning of Each Term

$\rho c_p \frac{\partial T}{\partial t}$ (Heat Storage Term): Material's ability to store heat. $\rho$ is density [kg/m³], $c_p$ is specific heat [J/(kg·K)]. SiC chip ($\rho = 3210$ kg/m³, $c_p = 690$ J/(kg·K)) is small, so response is fast.
$\nabla \cdot (k \nabla T)$ (Heat Conduction Term): Heat diffusion based on Fourier's law. Thermal conductivity of SiC $k = 370$ W/(m·K) is about 2.5 times that of Si ($k = 148$ W/(m·K)). Solder for die attach ($k \approx 50$ W/(m·K)) is the bottleneck.
$Q_v$ (Volumetric Heat Generation Density): Joule heat within the chip. Concentrated in the MOSFET active region, typically on the order of $10^7 \sim 10^9$ W/m³. Smaller chip area leads to higher heat generation density per unit area.

Boundary Conditions

Type 1 (Dirichlet): $T = T_0$ (Fixed heatsink base temperature. Cooling water temperature for water cooling.)
Type 3 (Newtonian Cooling): $-k \frac{\partial T}{\partial n} = h(T - T_{\infty})$ (Air-cooled surface. Natural convection $h = 5\text{–}25$ W/(m²·K), forced convection $h = 50\text{–}500$ W/(m²·K).)
Type 4 (Radiation): $-k \frac{\partial T}{\partial n} = \varepsilon \sigma (T^4 - T_{\text{sur}}^4)$ (Important in high-temperature environments. Package surface $\varepsilon \approx 0.9$.)
Interface Thermal Resistance: $q = \frac{\Delta T}{R_{\text{contact}}}$ (TIM layer or die attach interface. Represented by thin-layer elements or thermal contact in FEM.)

Temperature Dependence of Material Properties

Material	$k$ [W/(m·K)]	$\rho c_p$ [MJ/(m³·K)]	Notes on Temperature Dependence
SiC (4H)	370 @25°C → 200 @175°C	2.21	Thermal conductivity decreases significantly with temperature rise (beware of positive feedback)
Si	148 @25°C → 80 @175°C	1.66	Same as above
Cu (Base Plate)	398 @25°C → 385 @175°C	3.44	Temperature dependence is small
Al₂O₃ (DBC Substrate)	35	3.08	Temperature dependence is almost negligible
AlN (AMB Substrate)	170	2.37	High thermal conductivity. For high power.
Si₃N₄ (AMB Substrate)	90	2.33	High strength. Excellent power cycle resistance.
Sn-Ag-Cu Solder	58	1.65	Die attach layer. Effective value decreases with voids.
Sintered Ag	250	2.48	Next-generation die attach. High reliability.
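The temperature dependence in the table can be fed into a simple 1D estimate. The sketch below interpolates $k$ linearly between the two tabulated temperatures (the linear law is an assumption; real material curves are nonlinear) and compares the conduction resistance of a chip layer at 25°C and 175°C. The chip area and function names are hypothetical.

```python
# (k at 25 °C, k at 175 °C) in W/(m·K), taken from the table above
K_TABLE = {"SiC": (370.0, 200.0), "Si": (148.0, 80.0), "Cu": (398.0, 385.0)}


def conductivity(material, temp_c):
    """Linearly interpolated k(T); clamped outside 25-175 °C (assumed model)."""
    k25, k175 = K_TABLE[material]
    frac = (temp_c - 25.0) / (175.0 - 25.0)
    frac = min(max(frac, 0.0), 1.0)
    return k25 + frac * (k175 - k25)


def layer_resistance(thickness_m, k, area_m2):
    """1D conduction resistance t / (k * A) in K/W, ignoring heat spreading."""
    return thickness_m / (k * area_m2)


AREA = 5e-3 * 5e-3      # assumed 5 mm x 5 mm chip
THICKNESS = 100e-6      # 100 µm chip
r_cold = layer_resistance(THICKNESS, conductivity("SiC", 25.0), AREA)
r_hot = layer_resistance(THICKNESS, conductivity("SiC", 175.0), AREA)
print(f"SiC chip layer: {r_cold * 1e3:.1f} mK/W at 25 °C, {r_hot * 1e3:.1f} mK/W at 175 °C")
```

Because the chip resistance grows as the chip heats up, a nonlinear FEM solve that updates $k(T)$ each iteration gives a higher peak temperature than a constant-property run.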

Power Cycle Lifetime Prediction $N_f$

🧑‍🎓

Once you know the temperature, how do you connect it to lifespan?

🎓

The lifespan of a power semiconductor is determined by both the "amplitude" and "maximum value" of temperature. Repeated temperature swings cause wire bond lift-off and solder cracks to progress. This is predicted by the power cycle lifetime formula:

$$ N_f = A \cdot \Delta T_j^{\;-n} \cdot \exp\!\left(\frac{E_a}{k_B \cdot T_{j,\max}}\right) $$

Terms and Typical Parameter Values

$N_f$ (Number of Cycles to Failure): Power cycle lifetime. For wire bond lift-off, $10^4 \sim 10^7$ cycles is typical.
$\Delta T_j$ (Temperature Amplitude): Junction temperature swing per cycle [K]. If $\Delta T_j = 50$ K yields about $10^6$ cycles, then with $n \approx 4.4$ doubling it to $\Delta T_j = 100$ K cuts the lifetime by a factor of about 20, to roughly $5 \times 10^4$ cycles.
$n$ (Temperature Exponent): Coffin-Manson exponent. $n \approx 4 \sim 5$ for Al wire bonds, $n \approx 3 \sim 4$ for Cu wire bonds.
$E_a$ (Activation Energy): Arrhenius term. $E_a \approx 0.8$ eV for solder fatigue, $E_a \approx 0.06 \sim 0.08$ eV for wire bonds.
$T_{j,\max}$ (Maximum Junction Temperature): Absolute temperature [K]. Higher temperatures accelerate degradation.
$k_B$ (Boltzmann Constant): $8.617 \times 10^{-5}$ eV/K.
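The relative weight of the two factors in the lifetime formula is easiest to see as ratios, since the coefficient $A$ cancels. The sketch below uses wire-bond values from the list above ($n = 4.5$, $E_a = 0.07$ eV); the chosen operating points and the placeholder $A = 1$ are assumptions.

```python
import math

K_B_EV = 8.617e-5   # Boltzmann constant [eV/K]


def cycles_to_failure(a, delta_tj, n, e_a_ev, tj_max_k):
    """N_f = A * dTj^(-n) * exp(E_a / (k_B * Tj_max)); Tj_max in kelvin."""
    return a * delta_tj ** (-n) * math.exp(e_a_ev / (K_B_EV * tj_max_k))


A_PLACEHOLDER = 1.0   # cancels in the ratios below
N_EXP, E_A = 4.5, 0.07

# Effect of doubling the temperature swing at fixed Tj_max = 150 °C
ratio_swing = (cycles_to_failure(A_PLACEHOLDER, 50.0, N_EXP, E_A, 423.15)
               / cycles_to_failure(A_PLACEHOLDER, 100.0, N_EXP, E_A, 423.15))
# Effect of raising Tj_max from 150 °C to 175 °C at fixed swing
ratio_tmax = (cycles_to_failure(A_PLACEHOLDER, 50.0, N_EXP, E_A, 423.15)
              / cycles_to_failure(A_PLACEHOLDER, 50.0, N_EXP, E_A, 448.15))
print(f"lifetime ratio, swing 50 K vs 100 K: {ratio_swing:.1f}")
print(f"lifetime ratio, Tj_max 150 °C vs 175 °C: {ratio_tmax:.2f}")
```

With these wire-bond parameters the swing term dominates by far; with a larger $E_a$ the peak temperature matters more.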

Correspondence with LESIT Test Data

Simplified formula, without the Arrhenius term, based on the widely used LESIT test data (LESIT: Leistungselektronik, Systemtechnik und Informationstechnologie, a Swiss research programme):

$$ N_f = A \cdot \Delta T_j^{\;-n} $$

Package / Bonding Technology	$A$	$n$	Source
IGBT / Al Wire / Solder	$3.025 \times 10^{14}$	4.416	LESIT (ECPE)
IGBT / Cu Wire / Sintered Ag	$9.3 \times 10^{14}$	5.3	Infineon (Guideline)
SiC MOSFET / Cu Clip	−	−	Not enough data yet

🧑‍🎓

Wait, so if the temperature amplitude $\Delta T_j$ doubles, the lifespan drops by a factor of $2^{4.4} \approx 21$!?

🎓

Exactly. Therefore, thermal design for power devices is not just about "lowering the average temperature"; "reducing the temperature amplitude" directly impacts lifespan. Specific countermeasures include increasing the heat capacity of the heatsink to smooth temperature fluctuations, and increasing the switching frequency to reduce $\Delta T_j$ per pulse.

To accurately determine this $\Delta T_j$, RC equivalent circuits alone may lack sufficient precision. Especially for multi-chip modules or inverters with mixed switching conditions, the best practice is to obtain the temperature time waveform from FEM transient analysis, perform cycle counting using the rainflow method, and then input it into the lifetime formula.
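The counting-then-lifetime workflow described above can be sketched as follows. The rainflow routine follows the common three-point rule; the damage sum uses linear accumulation (Miner's rule), which is an assumption not stated in the text. $A$ and $n$ are taken from the Al wire row of the table, and the temperature profile is hypothetical.

```python
def reversals(series):
    """Reduce a waveform to its turning points (end points are kept)."""
    pts = [series[0]]
    for x in series[1:]:
        if x == pts[-1]:
            continue
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (x - pts[-1]) > 0:
            pts[-1] = x          # same direction: extend the current ramp
        else:
            pts.append(x)
    return pts


def rainflow(series):
    """Return (range, count) pairs; count is 1.0 (full) or 0.5 (half cycle)."""
    cycles, stack = [], []
    for point in reversals(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])
            y = abs(stack[-2] - stack[-3])
            if x < y:
                break
            if len(stack) == 3:          # range y contains the start point
                cycles.append((y, 0.5))
                stack.pop(0)
            else:                        # closed inner cycle
                cycles.append((y, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    cycles.extend((abs(b - a), 0.5) for a, b in zip(stack, stack[1:]))
    return cycles


A_COEF, N_EXP = 3.025e14, 4.416   # Al wire / solder row of the table above


def miner_damage(tj_series):
    """Accumulated damage sum(count / N_f(dTj)); failure is assumed at 1.0."""
    return sum(count / (A_COEF * d_tj ** (-N_EXP))
               for d_tj, count in rainflow(tj_series) if d_tj > 0)


profile = [60, 110, 70, 120, 60, 100, 60]   # hypothetical T_j samples [°C]
damage = miner_damage(profile)
print(f"damage per profile = {damage:.3e}, repetitions to failure ≈ {1 / damage:.3e}")
```

In practice `profile` would be the junction temperature waveform exported from the FEM transient run or from the RC model.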

Coffee Break Side Story

IGBT Thermal Runaway — The Terror of Positive Feedback: "Temperature Rises, Current Increases"

IGBTs have a troublesome characteristic. As temperature rises, losses rise as well, which raises the temperature further: a positive feedback loop. The risk is greatest where $V_{\text{CE(sat)}}$ (collector-emitter saturation voltage) has a negative temperature coefficient, as in some IGBT types and at low collector currents. There, if characteristics vary among parallel-connected IGBTs, the chip carrying more current gets hotter, its $V_{\text{CE(sat)}}$ drops, and it draws even more current. The countermeasure is to design while constantly monitoring $T_j$ in simulation and to construct the thermal resistance model accurately. The datasheet thermal resistance is a DC value, but in actual switching operation the pulse thermal resistance $Z_{\theta}(t)$ must be used; otherwise, the instantaneous peak temperature is underestimated. On the other hand, SiC MOSFETs have a positive temperature coefficient (temperature rise increases $R_{\text{DS(on)}}$ → current decreases), which is a major advantage as it naturally balances current sharing in parallel connections.

Numerical Methods and Implementation

FEM Discretization and Element Selection

🧑‍🎓

When solving the 3D heat conduction equation with FEM, how exactly is it discretized?

🎓

It's converted to weak form and discretized using the Galerkin method. Approximating the temperature field $T$ with shape functions $N_i$ gives:

$$ T^h(\mathbf{x}, t) = \sum_{i=1}^{n} N_i(\mathbf{x}) \, T_i(t) $$

$$ [C]\dot{\{T\}} + [K_{\theta}]\{T\} = \{Q\} $$

🎓

Here, $[C]$ is the heat capacity matrix, $[K_{\theta}]$ is the thermal conductivity matrix, and $\{Q\}$ is the heat load vector. It has the same form as $[K]\{u\}=\{F\}$ in structural analysis, but since temperature varies with time, the term $[C]\dot{\{T\}}$ is added.
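To make the matrices concrete, here is a minimal 1D sketch that assembles $[K_{\theta}]$ and $[C]$ for a layered stack with linear two-node elements. The 1D reduction, the solder thickness, the chip area, and the function name `assemble_1d` are assumptions for illustration; a real model is 3D and built by the FEM package.

```python
def assemble_1d(layers, area):
    """Assemble conductivity matrix K [W/K] and capacity matrix C [J/K].

    layers: list of (thickness_m, k, rho_cp, n_elements), ordered top to bottom.
    Uses linear 2-node elements with a consistent capacity matrix.
    """
    n_nodes = sum(layer[3] for layer in layers) + 1
    K = [[0.0] * n_nodes for _ in range(n_nodes)]
    C = [[0.0] * n_nodes for _ in range(n_nodes)]
    node = 0
    for thickness, k, rho_cp, n_el in layers:
        h = thickness / n_el
        ke = k * area / h              # element matrix: ke * [[1, -1], [-1, 1]]
        ce = rho_cp * area * h / 6.0   # element matrix: ce * [[2, 1], [1, 2]]
        for _ in range(n_el):
            i, j = node, node + 1
            K[i][i] += ke
            K[j][j] += ke
            K[i][j] -= ke
            K[j][i] -= ke
            C[i][i] += 2.0 * ce
            C[j][j] += 2.0 * ce
            C[i][j] += ce
            C[j][i] += ce
            node += 1
    return K, C


AREA = 25e-6   # assumed 5 mm x 5 mm footprint [m²]
LAYERS = [
    (100e-6, 370.0, 2.21e6, 3),   # SiC chip, properties from the material table
    (50e-6, 58.0, 1.65e6, 3),     # Sn-Ag-Cu die attach, thickness assumed
]
K, C = assemble_1d(LAYERS, AREA)
print(f"{len(K)} nodes, K[0][0] = {K[0][0]:.1f} W/K")
```

Before boundary conditions are applied, each row of `K` sums to zero (a uniform temperature field conducts no heat), and the entries of `C` add up to the total heat capacity of the stack.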

Element selection for thermal analysis of power devices is important:

Element Type	Recommended Use	Notes
Hexahedral 2nd order (20 nodes)	Layer structure of chip/substrate	Highest accuracy. Sweep mesh to divide in layer direction.
Tetrahedral 2nd order (10 nodes)	Complex shapes like heatsink fins	Mesh generation is easy but number of elements increases.
Shell elements (thin plate)	Copper foil of DBC substrate (e.g., 0.3mm thick)	Effective for thin layers where solid elements would explode in count.
Thermal contact	TIM layer, die attach interface	Represents interface thermal resistance without elements. Can simulate voids.

🧑‍🎓

The chip is only about 0.1mm, but the heatsink is tens of mm. The scales are completely different. Mesh design seems tough...

🎓

That's precisely the difficulty of thermal analysis for power devices. Chip thickness is 100μm vs. heatsink at 50mm. There's a 500x scale difference. The trick is to use a sweep mesh with 3–5 divisions in the layer direction and refine the in-plane mesh only in areas with steep temperature gradients. The die attach layer directly under the chip needs at least 3 layers in the thickness direction, and areas around voids require in-plane mesh of 0.1mm or less.

Transient Analysis Time Integration

🧑‍🎓

How is time discretization handled? With switching at 10kHz or more, the time step seems challenging.

🎓

That's the key practical point. To track temperature fluctuations over one switching cycle (if $f_{\text{sw}} = 10$ kHz, then $100\,\mu$s), a time step of $\Delta t \leq 10\,\mu$s is needed. But if you track the entire mission profile lasting minutes to hours needed for power cycle lifetime evaluation with such tiny steps, the computational load explodes.

In practice, a two-stage approach is used:

Switching Cycle Scale: Analyze temperature oscillation over a few cycles using FEM or an RC equivalent circuit to obtain $\Delta T_j$ per pulse.
Mission Profile Scale: Input the average loss waveform into an equivalent circuit model (Foster/Cauer) to obtain temperature fluctuations on the minute-to-hour scale.

For time integration, the Backward Euler method (1st-order implicit, $\theta = 1$) or the Crank-Nicolson method (2nd-order implicit, $\theta = 1/2$) is generally used. Both are special cases of the $\theta$-method:

$$ \left(\frac{[C]}{\Delta t} + \theta [K_{\theta}]\right) \{T\}_{n+1} = \left(\frac{[C]}{\Delta t} - (1-\theta) [K_{\theta}]\right) \{T\}_n + \{Q\}_{n+\theta} $$
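A small sketch of this time stepping on a two-node lumped model follows. It uses the general $\theta$-form in which the right-hand side is $([C]/\Delta t - (1-\theta)[K_{\theta}])\{T\}_n + \{Q\}$, which reduces to Backward Euler for $\theta = 1$ and to Crank-Nicolson for $\theta = 1/2$. The model values, the constant load, and the helper names are assumptions; the ambient temperature enters through the load vector.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def theta_step(C, K, T, Q, dt, theta):
    """One theta-method step with a load vector Q that is constant over the step."""
    n = len(T)
    lhs = [[C[i][j] / dt + theta * K[i][j] for j in range(n)] for i in range(n)]
    rhs = [sum((C[i][j] / dt - (1.0 - theta) * K[i][j]) * T[j] for j in range(n)) + Q[i]
           for i in range(n)]
    return solve(lhs, rhs)


# Hypothetical model: node 0 = chip, node 1 = case, case tied to ambient
R01, R1A, T_AMB, P_LOSS = 0.1, 0.2, 40.0, 100.0
C_MAT = [[0.01, 0.0], [0.0, 0.5]]                        # J/K
K_MAT = [[1 / R01, -1 / R01], [-1 / R01, 1 / R01 + 1 / R1A]]  # W/K
Q_VEC = [P_LOSS, T_AMB / R1A]                            # W

results = {}
for theta in (1.0, 0.5):
    temps = [T_AMB, T_AMB]
    for _ in range(3000):                                # 3 s with dt = 1 ms
        temps = theta_step(C_MAT, K_MAT, temps, Q_VEC, dt=1e-3, theta=theta)
    results[theta] = temps
    print(f"theta = {theta}: T_chip = {temps[0]:.3f} °C, T_case = {temps[1]:.3f} °C")
```

Both schemes settle at the same steady state; they differ in how accurately they follow the transient for a given $\Delta t$.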