Conclusion

Although the fundamental issue is clear, in that some users are experiencing burnout of their Ryzen 7000X3D processors, the issue isn't limited to Zen 4-based SKUs with 3D V-Cache. The problem could potentially lie at several doors: the silicon itself, the motherboard's implementation of SoC voltage, and, in some cases, uncontrolled current rampaging through the chip and socket. Because this is a highly destructive issue, one that kills not only the processor but in some cases takes the motherboard socket with it, AMD and motherboard vendors are having a tumultuous time diagnosing the cause and implementing a solid fix.

While we were writing this article, Gamers Nexus posted a new video in which they sent the failed CPU to an external laboratory to investigate the issue. The 'failure analysis report,' as GN is calling it, relies on an unnamed external lab to perform a variety of tests including, but not limited to, C-mode scanning acoustic microscopy, X-ray analysis, a 3D CT scan, high-magnification microscopy, and scanning electron microscopy of a cross-section (Xsec).

The biggest takeaway from Gamers Nexus's most recent external lab-based analysis is that multiple manufacturing defects could have caused the issue. The lab in question couldn't identify the issues specifically, and much of the analysis at this point is based on assumptions; assumption isn't a scientific basis for establishing anything, only for forming opinions.

In our testing, we set out not to replicate the burnout issue but to understand what AMD's AGESA updates are doing to variables such as current, voltage, and power. We focused on the SoC, as that's what AMD has primarily concentrated on capping to try to alleviate the issues.

Taking an overall view of the peak current we recorded with our AMD Ryzen 9 7950X3D on ASRock's X670E Taichi motherboard, we can see that everything is well within control; this can be taken one of two ways. The first is that ASRock has done things correctly by setting 1.30 V on the SoC by default when memory overclocks are applied via AMD's EXPO and XMP profiles. The second is that other vendors haven't been getting things as right, especially given reports of ASUS boards on pre-update AGESA firmware having OCP (overcurrent protection) failures, resulting in too much current going through the chip.

Focusing primarily on amperage, we can see that the initial Ryzen 7000X3D firmware, AGESA 1.0.0.5c, spiked higher than all the other firmware in terms of SoC current. This is the case both at default memory settings and with AMD EXPO applied to our G.Skill Trident Z5 Neo DDR5-6000 memory kit. The higher peaks in SoC current didn't seem troublesome in themselves, but lower current is always advantageous in reducing overall power and heat and, in this case, in not overloading and frying CPUs.

In our testing, the latest (at the time of writing) AGESA 1.0.0.7 (BETA) firmware had the lowest peak SoC current at default settings and the lowest amperage with EXPO applied. By setting 1.25 V instead of ASRock's 1.30 V default on the SoC voltage, we managed to lower the peak SoC amperage.

Turning to the average current through the SoC rails, AMD's AGESA 1.0.0.5c, as expected, has the highest average. AMD AGESA 1.0.0.7 (BETA), the newest at the time of writing, has the lowest average current levels. We reduced the average current by around 7% by setting 1.25 V on the SoC voltage instead of relying on ASRock's 'one size fits all' approach. Interestingly, despite having the highest peak SoC current, AGESA 1.0.0.6 has a very marginally lower average current through the SoC, by 0.01 A.
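The peak and average figures discussed above are simple statistics over logged SoC current samples. As a minimal sketch of that arithmetic (not the tooling used for this article), assume the samples for each configuration are available as a list of amperes. The function names and the sample values below are hypothetical placeholders, not our measurements.

```python
from statistics import mean


def summarize_current(samples_a):
    """Return (peak, average) for a list of SoC current samples in amperes."""
    if not samples_a:
        raise ValueError("need at least one current sample")
    return max(samples_a), mean(samples_a)


def percent_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0


# Placeholder samples only (hypothetical values, NOT measured data).
default_soc_130v = [10.0, 12.0, 11.0, 15.0]
manual_soc_125v = [9.0, 11.0, 10.0, 14.0]

peak_a, avg_a = summarize_current(default_soc_130v)
peak_b, avg_b = summarize_current(manual_soc_125v)
print(f"peak: {peak_a:.2f} A -> {peak_b:.2f} A")
print(f"average reduced by {percent_reduction(avg_a, avg_b):.1f}%")
```

Keeping peak and average separate matters here, because a firmware can show a high momentary spike while still averaging lower over the whole run.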

Final Words: Speculations on Ryzen 7000 Burnout Issue, But Nothing Conclusive

The biggest problem is that AMD's Ryzen 7000 series processors (mainly X3D) are burning up inside the socket, frying themselves and sometimes destroying the AM5 socket too. This is a big issue that AMD and its partners still need to address 'properly.' It's not that they aren't working tirelessly to rectify the problem; AMD has released three AGESA firmware updates in just over a month. AMD's most significant strategy to fix the issue has been to curtail SoC voltage, which Gamers Nexus has found to be at the root of the problem. It's not the only problem, but rampant, unchecked SoC current is a notable cause of the destruction.

Perhaps one of the other core problems is all of the speculation. We aren't interested in speculating because, as useful as it sometimes is, assumptions can run wild. Even when Gamers Nexus sent one of their dead Ryzen 7000X3D CPUs to an external lab, the unnamed lab didn't come up with anything particularly conclusive. A lot of speculation and analysis has already been done, and more is likely to come until the root cause is identified, but the buck stops with AMD and its partners.


Image Credit: Speedrookie/Reddit

From our testing, we can clearly state that we didn't experience any issues with the ASRock X670E Taichi, nor did we find any cause for concern. If anything, one trend stands out, and we make this claim based on our own testing: AMD's AGESA 1.0.0.6 looks rushed, which invites scrutiny, although it's certainly not without benefit to users. It prevents them from accidentally applying too much SoC voltage to the chip; on AGESA 1.0.0.5c, ASRock's X670E Taichi allowed us to set 2.50 V.

With the second BIOS fix through AGESA 1.0.0.7 (BETA), we observed more reserved SoC current and peak SoC power, along with more conservative average values. This is a step in the right direction in terms of lowering the likelihood that SoC voltage and current are going to kill the CPU. While AMD is rolling out its AGESA firmware, it's important to note that these revisions are listed as BETA, which gives AMD room to improve before releasing a comprehensively tested and tweaked firmware designed to alleviate all of the issues above.

Exposure to higher voltage and heat can energize the atoms and molecules of a dielectric material and trigger chemical reactions that break down its structure, leading to dielectric degradation. Common mechanisms of degradation include thermal oxidation and electrical breakdown, which respectively create defects and conductivity in the material. The end result is the loss of insulation properties, increased leakage currents, and eventual material failure.

In the case of the Ryzen 7000/X3D series, the large current and heat accelerate dielectric degradation, not only weakening the integrity of the silicon and its internals but effectively damaging them beyond the point of no return. This is why it's important to operate at lower voltages, which in turn lower current, total power, and temperatures. Overshooting so far on a chip with a fragile 3D-packaged die attached through vias isn't likely to turn out well, at least not from a theoretical standpoint.
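To put a rough number on how voltage and current compound, electrical power is the product of the two (P = V × I). The sketch below is a back-of-the-envelope estimate only. It plugs in the 1.30 V default, the 1.25 V manual setting, and the roughly 7% lower average SoC current quoted earlier in this article, and it assumes average power can be approximated from average current, which real, bursty loads won't follow exactly. The function names are our own illustrative choices.

```python
def soc_power_w(voltage_v, current_a):
    """Electrical power in watts: P = V * I."""
    return voltage_v * current_a


def relative_power_change(v_old, v_new, current_ratio):
    """Fractional power change when voltage goes v_old -> v_new and
    current scales by current_ratio (new current / old current)."""
    return (v_new / v_old) * current_ratio - 1.0


# 1.30 V -> 1.25 V with average current roughly 7% lower (ratio 0.93).
# Assumption: average power is approximated as voltage * average current.
change = relative_power_change(1.30, 1.25, 0.93)
print(f"estimated SoC power change: {change * 100:.1f}%")
```

Under those assumptions the estimate comes out at a little over 10% less SoC power, a larger drop than either the voltage or the current reduction alone.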

It's expected that AMD will soon roll out a new fully-fledged AGESA firmware to mitigate these issues, which, according to Gamers Nexus, are likely a result of failing to implement proper fail-safes in overcurrent protection (OCP), in some cases letting current run rampant through the CPU. Whether this is down to motherboard vendors such as ASUS, GIGABYTE, and ASRock, or is something under AMD's umbrella, is speculation at this point.
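For readers unfamiliar with the idea, an overcurrent fail-safe conceptually watches the current and cuts power once it stays above a limit for too long. The toy sketch below illustrates only that concept. Real OCP lives in voltage regulator hardware and firmware, and the function name, limit, and sample values here are hypothetical, not taken from AMD or any motherboard vendor.

```python
def ocp_should_trip(samples_a, limit_a, max_consecutive):
    """Return True if current exceeds limit_a for more than
    max_consecutive consecutive samples (a toy trip condition)."""
    over = 0
    for current in samples_a:
        if current > limit_a:
            over += 1
            if over > max_consecutive:
                return True
        else:
            over = 0  # a sample back under the limit resets the counter
    return False


# Hypothetical readings in amperes with a hypothetical 40 A limit.
sustained = [10, 50, 50, 50, 10]   # three over-limit samples in a row
transient = [10, 50, 10, 50, 10]   # isolated spikes only

print(ocp_should_trip(sustained, limit_a=40, max_consecutive=2))
print(ocp_should_trip(transient, limit_a=40, max_consecutive=2))
```

Tolerating a couple of over-limit samples is a design choice in this sketch: it ignores brief transients while still reacting to a sustained overload. A fail-safe that never trips, or trips too late, is the kind of failure described above.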

Our testing shows that the latest AGESA 1.0.0.7 (BETA) firmware is undoubtedly better overall than the initial firmware. However, the news that AMD openSIL is set to replace AGESA firmware in 2026 is another variable entirely. The key takeaway is that, at least on the ASRock X670E Taichi, things are working as they should be with AGESA 1.0.0.7 (BETA), and we look forward to a full (non-BETA) release of AMD's latest AGESA in the coming weeks.

39 Comments

  • dullard - Tuesday, May 16, 2023 - link

    I'm confused by the voltages on the first page. This article repeatedly mentions 0.5 V, when I think it intends to say 0.05 V. For example, 1.35 V is 0.05 V over the new 1.30 V.
  • Ryan Smith - Tuesday, May 16, 2023 - link

    You are correct. Thanks!
  • Threska - Tuesday, May 16, 2023 - link

    Seems there should have been physical safeties built into the CPU so at worst the chip would shut down and set a fault flag indicating what the problem was. Annoying, but cheaper than replacing burnt up hardware, and ruined reputations.
  • Samus - Wednesday, May 17, 2023 - link

    This is AMD we're talking about here, not Intel. AMD chips going back to the Athlon XP have always lacked fail safes when compared to the competition.
  • Netmsm - Thursday, May 25, 2023 - link

    what a judgment!
  • cheshirster - Tuesday, May 16, 2023 - link

    "Gamers Nexus Deep-Dive - The Ryzen 7000 CORE Fundamental Issues"
    If you're not new to semiconductors, this video contains absolutely nothing other than some nice die shots.
    There are zero specific details on the matter of burned 7000 CPU's.
    It's basically: "Chips can die from numerous reasons and also we have nice shot of small AMD logo".
  • techjunkie123 - Tuesday, May 16, 2023 - link

    The most annoying thing about these GN videos is that they act as if they really know a lot, when in fact they probably know less than the average anandtech reviewer. Not just this video, but other ones too.
  • TheinsanegamerN - Wednesday, May 17, 2023 - link

    Which one are you referring to, the one that left 2 years ago or the one that left 5 years ago?
  • alpha754293 - Wednesday, May 17, 2023 - link

    This is so stupid.

    What's your beef with GN?

    GN EXPLICITLY STATES that they are learning and that they're NOT experts in failure analysis.

    This is why they, also EXPLICITLY state, they sent it out to an external lab, so that someone who IS a failure analysis expert, can actually do the detailed, technical, failure analysis.

    You seem to have a chip on your shoulder when it comes to GN which has caused your panties to get all bunched up into a wad.
  • Skeptical123 - Thursday, May 18, 2023 - link

    lol it looks like you got that backwards
