The Mystery of the Missing Performance

As in the past, we saw some very odd system behavior in testing and suspected that Cool & Quiet had something to do with it. To test this theory, we pulled out our Photoshop CS3 benchmark and ran it once with Cool & Quiet off and once with it on. The results were staggeringly different.

Our goal was to test both with C&Q disabled and enabled. It so happened that our first run through the benchmarks was with the power saving feature disabled. Our numbers looked much better than in previous tests and it seemed like everything made sense once again.

When we enabled C&Q the second time, however, the issue seemed to have disappeared (as has randomly happened in the past as well). We did install AMD's Power Meter in order to verify that C&Q was working (and it was), and it is possible that installing this software somehow fixed the issue. But since the issue has randomly come and gone in the past, we really can't suggest this as a surefire fix either.

In trying to reproduce the problem again, we uninstalled the power meter, rebooted, disabled CnQ, and then re-enabled it. None of this brought back the poor performance we saw, but in another odd twist CnQ didn't really provide any power advantage either. Since we didn't measure power while the problem was apparent, it's entirely possible that the power savings of CnQ are also affected by whatever the underlying cause is.

In fact, if both performance and power savings were negatively affected by whatever is happening, we would not be surprised. AMD has informed us that our power numbers don't show as much of a savings as they would expect from CnQ (interestingly enough, Johan saw similar behavior in his latest piece). We've asked AMD to help us track down the issue, but their power guy is currently on vacation so it will be a little while.

Because the Photoshop test took so much less time without CnQ, we actually wanted to measure power usage over the test and compare energy used (watts * seconds gives joules). We fully expected the non-C&Q mode to be so much more efficient in completing the test quickly that it would use less total energy to perform the operation. Unfortunately, we were unable to verify this theory.
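
For anyone who wants to try the comparison themselves, here is a minimal Python sketch of the arithmetic. It assumes you have wall power readings taken at a fixed interval for the duration of each run. The function name and every number in it are hypothetical placeholders for illustration, not measurements from our testing.

```python
def energy_joules(power_samples_w, interval_s):
    """Approximate energy by summing power samples (watts) times the
    fixed sampling interval (seconds). Watts * seconds = joules."""
    return sum(power_samples_w) * interval_s


# Hypothetical placeholder readings, one sample per second.
# These are NOT measured values; they only illustrate the comparison.
cnq_off_run = [150.0] * 60    # higher draw, but the run finishes sooner
cnq_on_run = [120.0] * 100    # lower draw, but the run takes longer

off_energy = energy_joules(cnq_off_run, 1.0)
on_energy = energy_joules(cnq_on_run, 1.0)

print(f"CnQ off: {off_energy:.0f} J")
print(f"CnQ on:  {on_energy:.0f} J")
```

With made-up numbers like these, the run that draws more power but finishes sooner uses less total energy. Whether real hardware behaves that way is exactly what we were unable to verify.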

One thing is certain: something is not working as it should.

We do have a couple of theories, but nothing confirmed or that even makes a lot of sense yet. Still, why not share our thoughts and musings and see what comes of it? It worked fairly well to help us find the instruction latency of the GT200, right?

One of the first things we thought was that Phenom took longer to come out of its low power state than it should, but AMD said there's no reason why the X2 should be able to do this any faster.

Our minds then wandered over to what we saw when we looked at the AMD Power Meter. Since Windows Vista takes it upon itself to move threads between cores in fairly stupid ways, during the Photoshop test we saw what looked like threads bouncing around between cores or cycling through them in rapid succession. Whatever was actually being done, the result was that one processor would ramp up to full power (1GHz up to 2GHz) and then drop back down as the next CPU came up to speed.

We talked about how it's possible that threads moving between these different cores, needing to wake the next one up rather than running on an already at speed core, could possibly impact performance. As the Phenom is the only CPU architecture we currently have access to with individual PLLs per core (Intel's CPUs must run all cores at the same frequency), the CnQ issues could be related to that.

There has to be a factor that is AMD specific that causes this problem -- and not only that but Phenom specific because we've never seen this problem on other AMD parts.

Or have we?

AMD GPUs last year exhibited quite an interesting issue with their power management features that was clearly evident in specific locations while playing Crysis. The culprit was the dynamic clocking of the GPU based on the graphics load. Because the hardware was able to switch quickly between modes, and due to the way Crytek's engine works, AMD GPUs were constantly speeding up and slowing down in situations where they should have been at full speed the entire time.

It is entirely possible that the CPU issue is of a similar nature. Perhaps the hardware that controls the clock speed is slowing down and then speeding up each core when it should just keep the core at full speed for a short time longer. The solution to the GPU issue was to increase the amount of time the GPU had to have lowered activity before the processor was clocked down. This meant that an increase in activity would result in an instant speed bump while the GPU had to be relatively lightly used for a longer period of time (still less than a second if I recall correctly) before it was clocked back down.
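
The idea behind that GPU fix can be sketched in a few lines of Python. To be clear, this is a toy model under our own assumptions, not AMD's actual clock control logic: the function names, the load threshold, and the tick-based delay are all hypothetical. The clock goes high the moment load appears and only drops after load has stayed low for a set number of consecutive ticks.

```python
def simulate_clock(load_trace, down_delay, threshold=0.5):
    """Toy clock governor (hypothetical, not AMD's implementation).

    Ramps up immediately when load >= threshold; clocks down only after
    `down_delay` consecutive ticks below the threshold.
    """
    state = "low"
    idle_ticks = 0
    states = []
    for load in load_trace:
        if load >= threshold:
            state = "high"      # instant speed bump on activity
            idle_ticks = 0
        else:
            idle_ticks += 1
            if idle_ticks >= down_delay:
                state = "low"   # clock down only after sustained low load
        states.append(state)
    return states


def count_transitions(states):
    """Count how many times the clock state changes across the trace."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


# Hypothetical bursty workload: load alternates every tick.
bursty = [1.0, 0.1] * 10

print(count_transitions(simulate_clock(bursty, down_delay=1)))
print(count_transitions(simulate_clock(bursty, down_delay=3)))
```

With a delay of one tick the model flips state on nearly every tick of this bursty trace, while a delay of three ticks keeps it at full speed throughout. That is the kind of thrashing we suspect, without proof, may be happening per core on Phenom.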

Yes, it's the same company, but the similarities do go a bit deeper. We really don't know what the heart of the matter is, but this kind of problem certainly is not without precedent. We will have to wait for AMD to help us understand better what is happening and if there is anything that could be done about it. We do hope you've enjoyed our best guesses though, and please feel free to let us know if you've got any other plausible explanations we didn't address.

36 Comments

  • Sylvanas - Tuesday, July 1, 2008 - link

    Wheres the 9950BE overclocking results? It is an unlocked CPU so what about Overclocking the NB? What performance difference does that bring? I doubt people that buy IGP's are going to overclocking much anyway since they are usually silent HTPC rigs...
  • Gary Key - Tuesday, July 1, 2008 - link

    The 9950BE overclocking results are coming in a different article. Unfortunately, our 790FX boards (they have been beat on for six months) were not exactly up to speed and we thought it would be better to not show anything instead of a 2.8GHz clock that obviously is not representative of the processor at this point.

    Also, most of our previous results were run on the 780G, a chipset that when tuned correctly and on a good board will outclock the 790FX with a discrete graphics card by the way. Jetway just released a fairly comprehensive BIOS for their new 780G we ended up using after the others started failing. We just received BIOS updates for the 780a boards and have a new 790FX/SB750 arriving shortly for a CF/SLI update on AMD (gaming is not that bad by the way on the Phenom for the mid-range market).

    Increasing the NB core (IMC) clock (in Phenom it runs async from the Core Speed unlike Athlon which is Sync) drops latencies (especially L3) and increases memory performance/throughput, which in turn improves system performance. The Phenom starts to come to life when you hit a 2.6GHz core speed with a NB core clock at 2200MHz+. Depending on the application and CPU, increasing NB core speeds (getting up to 2200MHz+) can result in performance differences from 3%~12% in most cases.

    Almost as important is increasing HT speed for further optimizing the pipeline links (CPU/Memory/PCIe,etc). Our 9950BE follow up will have an overclocking guide along with optimization details.
  • Sylvanas - Wednesday, July 2, 2008 - link

    Excellent, thanks for the info Gary- I look forward to the follow up 9950BE overclocking article. If there is some info on the SB750 aswell that's even better :)
  • DigitalFreak - Tuesday, July 1, 2008 - link

    AMD post X2 = ROFLMAO

    The C&Q thing is probably another respin waiting to happen. What a bunch of boobs.
  • acejj26 - Tuesday, July 1, 2008 - link

    what's a seccond?
    why didn't you include the 9950 in the first page of benchmarks?
    is the 9960 a new processor from AMD?

    i've come to expect these errors from other staff writers, but not you Anand.
  • skiboysteve - Tuesday, July 1, 2008 - link

    why are you using 780G to overclock and check stability on the same article you say how someone else wrote an article about how that is a bad idea because of power...

    you even say at the bottom of your overclocking page, a mere footnote, that you got higher clocks on a different platform
  • js01 - Tuesday, July 1, 2008 - link

    I think they scale much better then that hothardware got the 9950be to 3.1ghz barely even trying and the 9350e to 2.7ghz.
    http://www.hothardware.com/Articles/AMD_Phenom_X4_...
  • Gary Key - Tuesday, July 1, 2008 - link

    It depends on the board and CPU actually. We have a retail 9850BE that will do 3.3, but three others struggle to make it to 2.8. Until we see some consistency in the retail parts, we would rather play it safe with the comments. A separate overclocking article is on its way though with the new lineup. :)
  • woofermazing - Tuesday, July 1, 2008 - link

    Odd that you guys couldn't get any OC out of the 9950. Results from other sites have been pretty impressive using the stock cooler. 3.6ghz is the highest I've seen so far.
  • Clauzii - Wednesday, July 2, 2008 - link

    I second that!

    PS: And why does the comment page keep looking like pre-95 internet :O (I'm on FF3)