Posted on 11 January 2019

Reliability Engineering in Power Electronics

Reliability in power electronics is the extent to which a system can be expected to perform its intended function within the required tolerances. In other words, reliability is the system's ability to operate correctly under specified conditions over a given time period.

The failure rate (FR) used in the following is defined as the ratio of the number of faulty elements to the total number of elements.

A useful way of quantifying reliability is by considering the following values:

  • Mean time to failure (MTTF)
  • Mean time to repair (MTTR)
  • Mean time between failures (MTBF), i.e. the sum of the mean time to failure and the mean time to repair (MTBF = MTTF + MTTR)
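As a small illustration of how these quantities relate, the following Python sketch computes FR and MTBF as defined above. It also computes the steady-state availability MTTF / MTBF, which is not mentioned above and is included only as a commonly used derived figure. The function names and the example numbers are hypothetical.

```python
def failure_rate(faulty: int, total: int) -> float:
    """FR as defined above: faulty elements / total elements."""
    if total <= 0:
        raise ValueError("total must be positive")
    return faulty / total


def mtbf(mttf_hours: float, mttr_hours: float) -> float:
    """MTBF = MTTF + MTTR, as given in the list above."""
    return mttf_hours + mttr_hours


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: fraction of time the system is operating."""
    return mttf_hours / mtbf(mttf_hours, mttr_hours)


# Hypothetical example values, not measured data.
print(failure_rate(3, 1_000))        # 0.003
print(mtbf(9_990.0, 10.0))           # 10000.0
print(availability(9_990.0, 10.0))   # 0.999
```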

Calculating Failure Rates (FR) and Mean Time to Failure (MTTF)

If FR and MTTF are known for a given temperature T1 (from measurements taken, for example), the FR and MTTF for another temperature, say T2, can be calculated using the Arrhenius equation. However, this can only be done under the condition that the activation energy of the failure is known. The Arrhenius equation is

\frac{FR_1}{FR_2} = \frac{MTTF_2}{MTTF_1} = \exp\left[\frac{E_A}{K}\left(\frac{1}{T_2} - \frac{1}{T_1}\right)\right]

where K = Boltzmann constant = 8.62 × 10⁻⁵ eV/K = 1.38 × 10⁻²³ J/K; T1, T2 = temperatures in kelvin; EA = activation energy
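A minimal Python sketch of this calculation follows. It assumes the activation energy is given in eV and the temperatures in kelvin. The function names, the 0.7 eV activation energy and the 1000 h MTTF are hypothetical values chosen only for illustration.

```python
import math

K_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def fr_ratio(ea_ev: float, t1_k: float, t2_k: float) -> float:
    """Return FR_1 / FR_2 = MTTF_2 / MTTF_1 from the Arrhenius equation."""
    if t1_k <= 0 or t2_k <= 0:
        raise ValueError("temperatures must be positive and in kelvin")
    return math.exp(ea_ev / K_EV_PER_K * (1.0 / t2_k - 1.0 / t1_k))


def scale_mttf(mttf1: float, ea_ev: float, t1_k: float, t2_k: float) -> float:
    """MTTF at temperature T2, given the MTTF measured at T1."""
    return mttf1 * fr_ratio(ea_ev, t1_k, t2_k)


# Hypothetical: MTTF of 1000 h measured at 150 °C, E_A assumed to be 0.7 eV.
t_test_k = 150.0 + 273.15
t_use_k = 100.0 + 273.15
print(f"MTTF at 100 °C: {scale_mttf(1000.0, 0.7, t_test_k, t_use_k):.0f} h")
```

Because T2 is lower than T1 in this example, the exponent is positive and the MTTF at T2 comes out longer than the measured one.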

Arrhenius Plot

If the failure rate (FR) or MTTF for a certain chip temperature is known, the failure rate for other temperatures can be determined using the Arrhenius plot.

Figure 1. Arrhenius plot for different activation energies

However, the Arrhenius plot is only applicable if the failure mechanism is governed by a single activation energy.
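When a single activation energy governs the failure, ln(FR) plotted against 1/T is a straight line with slope -E_A/K. The sketch below uses that relationship in reverse to estimate E_A from two (temperature, failure rate) points. The data are synthetic, generated with an assumed E_A of 0.7 eV purely to check the round trip, and all names are hypothetical.

```python
import math

K_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def activation_energy_ev(t1_k: float, fr1: float, t2_k: float, fr2: float) -> float:
    """E_A from two points on the line ln(FR) = const - E_A / (K * T)."""
    if t1_k == t2_k:
        raise ValueError("two different temperatures are required")
    slope = (math.log(fr2) - math.log(fr1)) / (1.0 / t2_k - 1.0 / t1_k)
    return -K_EV_PER_K * slope


def synthetic_fr(t_k: float, ea_ev: float = 0.7, fr0: float = 1.0e6) -> float:
    """Synthetic failure rate following a single-E_A Arrhenius law."""
    return fr0 * math.exp(-ea_ev / (K_EV_PER_K * t_k))


ea = activation_energy_ev(398.15, synthetic_fr(398.15), 423.15, synthetic_fr(423.15))
print(f"estimated E_A: {ea:.3f} eV")  # recovers the assumed 0.700 eV
```

If measured points from more than two temperatures do not fall on one straight line, that is a sign that more than one mechanism is involved and the method does not apply.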

Elevated temperatures are typically used to accelerate the test, since test times would be too long under real operating conditions.

Bathtub Curve

The failure rate typically follows the bathtub curve over the life of a system. The probability of failure is initially high, but decreases during the early failure period. After a long period of consistently low failure rates, the failure rate rises again near the end of the system's life due to wear (aging).

Figure 2. Bathtub Curve
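One common way to reproduce this shape numerically, which is an assumption here and not taken from the figure, is to add three hazard-rate terms: a Weibull term with shape below 1 for early failures, a constant term for random failures, and a Weibull term with shape above 1 for wear-out. All parameters in the sketch are illustrative, not fitted values.

```python
def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale)**(shape - 1)."""
    if t <= 0:
        raise ValueError("t must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def bathtub_hazard(t_hours: float) -> float:
    """Sum of early-failure, random-failure and wear-out hazard rates."""
    early = weibull_hazard(t_hours, shape=0.5, scale=2_000.0)    # decreasing
    random_failures = 1.0e-5                                     # constant
    wear = weibull_hazard(t_hours, shape=5.0, scale=100_000.0)   # increasing
    return early + random_failures + wear


for hours in (10.0, 1_000.0, 20_000.0, 120_000.0):
    print(f"{hours:>9.0f} h  {bathtub_hazard(hours):.2e} failures per hour")
```

With these parameters the hazard rate falls from the first to the third sample time and rises again at the fourth, which is the bathtub shape described above.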

Reliability Testing

The minimum requirements of a reliability test are standardized in order to aid the comparison of different products.

A common method of testing reliability is the vibration test, which simulates prolonged use. Vibration tests are performed in the frequency range of 10 to 1000 Hz, normally with an acceleration of about 5 g. Depending on the weight of the sample, tests can also be carried out at higher accelerations of up to 20 g. The test looks for vulnerabilities in the mechanical design such as:

  • Contact reliability of spring contacts
  • Strength of solder on moving parts
  • Cracks in the castings or design parts
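To give a feeling for what these test levels mean mechanically, the sketch below converts an acceleration amplitude into a peak displacement. It assumes a purely sinusoidal excitation, which the text above does not specify, and uses the relation x = a / (2πf)². The function name is hypothetical.

```python
import math

G0 = 9.80665  # standard gravity in m/s^2


def displacement_amplitude_mm(accel_g: float, freq_hz: float) -> float:
    """Peak displacement of a sinusoidal vibration: x = a / (2*pi*f)**2."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    omega = 2.0 * math.pi * freq_hz
    return accel_g * G0 / omega**2 * 1000.0  # metres to millimetres


for freq in (10.0, 100.0, 1000.0):
    print(f"5 g at {freq:>6.0f} Hz: {displacement_amplitude_mm(5.0, freq):.4f} mm")
```

At constant acceleration the displacement falls with the square of the frequency, so the low end of the range produces movements in the millimetre range while the high end produces only micrometres.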

Another type of test is thermal cycling, in which the entire component is passively heated and cooled from the outside in a two-chamber system.

Another common test is the active power cycling test, in which the device's rated current is repeatedly passed through the device. The rated current is determined using the catalogue values VCEsat and Rth. In tests with short cycle times, the wire bonds and the chip solder carry the greatest burden, while the heatsink barely experiences any temperature fluctuations.

Figure 3. Power cycling tests with short cycle times
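The link between test current, VCEsat and Rth can be sketched with a simple steady-state estimate. This is a simplification and an assumption of this example: it considers conduction losses only (P = VCEsat × I), treats VCEsat as constant, and assumes the junction reaches thermal steady state in each on-phase. The function names and catalogue values are hypothetical.

```python
def junction_temperature_rise_k(vce_sat_v: float, current_a: float,
                                rth_k_per_w: float) -> float:
    """Steady-state junction temperature rise: dT = V_CEsat * I * R_th."""
    return vce_sat_v * current_a * rth_k_per_w


def current_for_rise_a(delta_t_k: float, vce_sat_v: float,
                       rth_k_per_w: float) -> float:
    """Current needed to reach a target temperature rise under the same model."""
    return delta_t_k / (vce_sat_v * rth_k_per_w)


# Hypothetical catalogue values: V_CEsat = 2.0 V, R_th = 0.4 K/W.
print(junction_temperature_rise_k(2.0, 100.0, 0.4))  # 80.0 K at 100 A
print(current_for_rise_a(80.0, 2.0, 0.4))
```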

Temperature gradients cause cyclic bending of the substrate and base plate during temperature and power cycling tests, which strains the solder between the DCB and the base plate. This is because the base plate and the insulating substrate have different coefficients of thermal expansion.

Effects of Temperature and Power Cycling on Solder

The differences in the coefficients of thermal expansion (CTE) of the materials used lead to varying mechanical stresses within the solder. The solder is weakened in the process, which eventually leads to delamination (separation). This begins at the points where the stress is greatest, namely at the corners of the soldered areas. Because of the long cycle times, delamination is caused more often by temperature changes than by active power cycling, in which there is only a small increase in temperature on the large solder surface between the base plate and the DCB.

Figure 4. Effects of temperature cycles on solder
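A first-order estimate helps show why delamination starts at the corners. The sketch below uses the simple shear-strain approximation γ ≈ d · Δα · ΔT / h, where d is the distance from the centre of the soldered area and h is the solder thickness. This model is an assumption of the example: it ignores bending and plastic flow, and the geometry and CTE figures are rough illustrative values, not data from the article.

```python
import math


def shear_strain(distance_mm: float, cte_a_ppm: float, cte_b_ppm: float,
                 delta_t_k: float, solder_um: float) -> float:
    """First-order solder shear strain: d * |alpha_a - alpha_b| * dT / h."""
    mismatch_per_k = abs(cte_a_ppm - cte_b_ppm) * 1e-6
    return (distance_mm * 1e-3) * mismatch_per_k * delta_t_k / (solder_um * 1e-6)


# Hypothetical 30 mm x 40 mm solder area, 100 um thick, temperature change 80 K.
# Illustrative CTEs: about 17 ppm/K (copper) against about 7 ppm/K (Al2O3).
half_width, half_length = 15.0, 20.0
edge = shear_strain(half_width, 17.0, 7.0, 80.0, 100.0)
corner = shear_strain(math.hypot(half_width, half_length), 17.0, 7.0, 80.0, 100.0)
print(f"edge midpoint: {edge:.3f}, corner: {corner:.3f}")
```

The corner is the point farthest from the centre, so it sees the largest strain in this model. With matched coefficients the mismatch term, and with it the estimated strain, goes to zero.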

When the thermal shock resistance of soldered pin connections is compared with that of spring contacts, the pin connection fails after 1000 cycles due to interfacial delamination, whereas the spring contact remains intact even after 2000 cycles.

Figure 5. Effects of thermal shock on pin and spring contacts

Failure Mechanisms Due to Temperature Cycles

Solder fatigue can be demonstrated by bending a soldered base plate over a mandrel: the DCB separates from the base plate because the solder has fatigued. Solder fatigue can be avoided by matching the coefficients of thermal expansion. For example, if an AlSiC base plate is well matched to AlN-AMB ceramics, solder fatigue can be avoided altogether.

Thermal cycles also deform the wire bonds by around 5 to 50 µm. Thermal cycling affects the bond wire heel close to the actual bond and leads to breakage. The higher the loop, the less sensitive the bond is to temperature cycles.

Figure 6. Bond wire heel crack due to thermal cycling

Cracks in the foot of the bond wire, the part of the wire that is bonded to the chip, can also result from temperature cycling. Once developed, these cracks can grow and ultimately lead to wire lift-off. This behavior is significantly influenced by the current density in the chip metallization under the bond foot.

Bond wire lift-off can be triggered by physical changes to the chip pad metallization. These structural changes to the aluminium surface are caused by temperature cycles and are sometimes visible to the naked eye. Aluminium crystals grow and, since there is no space left on the chip, spread to the top surface.

Major considerations with regard to reliability under thermal cycling include:

  • The materials used and their coefficients of thermal expansion
  • Type of connection (soldering, low-temperature joining (NTV), pressure contact) and the materials used to make the connection (solder, wire quality, etc.)
  • Quality of the connection
  • Magnitude of the temperature difference (ΔT)
  • Length of the cycle, heating and cooling speed
  • Mean temperature of the test sample

The smaller the temperature difference, the higher the resistance of the structure to active and passive thermal cycling. The mean temperature also plays a major role: the higher it is, the fewer cycles the device withstands over its lifetime.

Figure 7. Power cycling lifetime as a function of junction temperature difference ΔTj
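The article does not state a lifetime model. A widely used empirical form that captures both dependencies is a Coffin-Manson law with an Arrhenius term for the mean temperature, N_f = A · ΔT^(-n) · exp(E_a / (K · T_m)). The sketch below uses it only to compare two operating points, so the constant A cancels. The exponent and activation energy are placeholder assumptions, not fitted parameters, and the function name is hypothetical.

```python
import math

K_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def relative_lifetime(dt_a_k: float, tm_a_k: float, dt_b_k: float, tm_b_k: float,
                      exponent: float = 5.0, ea_ev: float = 0.6) -> float:
    """N_f(a) / N_f(b) for N_f = A * dT**(-exponent) * exp(E_a / (K * T_m))."""
    swing_term = (dt_a_k / dt_b_k) ** (-exponent)
    mean_term = math.exp(ea_ev / K_EV_PER_K * (1.0 / tm_a_k - 1.0 / tm_b_k))
    return swing_term * mean_term


# Halving the temperature swing at the same mean temperature (placeholder model).
print(relative_lifetime(40.0, 360.0, 80.0, 360.0))
# Same swing, mean temperature raised from 360 K to 400 K.
print(relative_lifetime(60.0, 400.0, 60.0, 360.0))
```

Under these assumed parameters the first ratio is well above 1 and the second is below 1, matching the two trends described above.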

Power Cycling Capability for Modules With and Without Base Plate

The solder between the DCB and the base plate is especially prone to fatigue because of its relatively large area. Thermal resistance increases as a result of temperature cycles, which can lead to solder fatigue between the DCB and the chip and to lift-off of wire bonds. This accelerated failure caused by the large-area solder joint does not occur in modules without a base plate.

The reliability of bond wires depends not only on the temperature differences but also on the mean temperature during testing.

Figure 8. Comparison of maximum junction temperature evolution for different control strategies

Figure 8 shows the evolution of the maximum junction temperature versus the number of cycles during the power cycling test for the different control strategies. The curves also show the number of cycles to failure. For constant timing, the end of life was reached after 32,073 cycles, when the junction temperature approached 360°C and the emitter metallization melted and failed. For constant base temperature swing, the final failure was observed after 47,485 cycles, when the maximum junction temperature exceeded 340°C. Again, the emitter metallization failed. For constant power losses, a lifetime of 69,423 cycles was determined. In this case, the maximum junction temperature never exceeded 178°C and the failure was caused by the lift-off of all wire bonds, while the emitter metallization remained intact.[1]
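To make the spread between the strategies easier to see, the short sketch below expresses the cycle counts quoted above relative to the constant-timing result. The numbers are taken from the text; the variable names are hypothetical.

```python
cycles_to_failure = {
    "constant timing": 32_073,
    "constant base temperature swing": 47_485,
    "constant power losses": 69_423,
}

baseline = cycles_to_failure["constant timing"]
relative = {name: cycles / baseline for name, cycles in cycles_to_failure.items()}

for name, cycles in cycles_to_failure.items():
    print(f"{name:<32} {cycles:>6} cycles  x{relative[name]:.2f}")
```

The same device type therefore shows a lifetime that differs by a factor of about 2.2 depending only on how the test is controlled.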

References:

[1] Schuler, Scheuermann: Impact of test control strategy on power cycling lifetime, PCIM 2010

 

For more information, please read:

Causes of Failure of Power Semiconductor Devices (PSDs)

Failure Mechanisms During Power Cycling

Device Failure due to Electrical and Thermal Conditions

Served Out Power Semiconductor Devices

Device Failure due to Incorrect Mounting

 
