x264 HD Video Encoding Performance

Graysky's x264 HD test uses the publicly available x264 encoder (an open source implementation of H.264 encoding) to encode a 4Mbps 720p MPEG-2 source. The focus here is on quality rather than speed, so the benchmark uses a 2-pass encode and reports the average frame rate in each pass.

I measured power and performance in the second pass of the benchmark since that’s where the more CPU-intensive work gets done.

First, we look at performance:

x264 HD Encode Benchmark - 720p MPEG-2 to x264

x264 HD Encode Benchmark - 720p MPEG-2 to x264

The Q9550S takes the cake with the lowest average power during the x264 encode, even slightly lower than the Q9400.

x264 HD Encode Benchmark - 720p MPEG-2 to x264

Peak power is also lowest on the Q9550S; it draws 11W less than the Core i7-920. Keep in mind, though, that the i7-920 is once again about 40% faster than the Q9550S.

Let’s look at the total energy consumed by the system during the benchmark:

x264 HD Encode Benchmark - 720p MPEG-2 to x264

Once again, the two Core i7 platforms offer better energy efficiency than the new Q9550S. In fact, the Q9550S offers about the same energy efficiency as the rest of the Core 2 Quad lineup and AMD’s new Phenom II X4 940. If you want energy efficiency, you actually want a Core i7.
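Why a faster chip that draws more power can still come out ahead on energy is easy to see with a little arithmetic. The sketch below is a minimal illustration, not the measurement method used in this review: the 150 W and 100 s baseline figures are hypothetical, and it assumes the 11W gap applies to average power (the review quotes it for peak power) and that "40% faster" means the job finishes in 1/1.4 of the time.

```python
def energy_joules(avg_power_watts: float, seconds: float) -> float:
    """Energy in joules = average power (W) x elapsed time (s)."""
    return avg_power_watts * seconds


# Hypothetical baseline, for illustration only (not measurements from this review).
slow_power_w = 150.0
slow_time_s = 100.0

# Assumption: the faster system draws 11 W more on average and finishes
# the same job 40% faster, i.e. in 1/1.4 of the time.
fast_power_w = slow_power_w + 11.0
fast_time_s = slow_time_s / 1.4

slow_energy_j = energy_joules(slow_power_w, slow_time_s)
fast_energy_j = energy_joules(fast_power_w, fast_time_s)

print(f"slower system: {slow_energy_j:.0f} J")
print(f"faster system: {fast_energy_j:.0f} J")
```

Under these assumed numbers the faster system uses roughly 11,500 J against 15,000 J for the slower one, because the shorter run time more than offsets the higher draw.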


62 Comments


  • JPForums - Wednesday, January 28, 2009 - link

    In general when people say average, they are talking about the Arithmetic mean.
    Arithmetic mean = 1/n*(X1 + ... + Xn)
    or in English, the sum of a list of numbers divided by the number of items in the list.

    In your case this would mean summing the list of power measurements taken at one-second intervals and dividing by the number of measurements, which would be the integer number of seconds. You could then calculate joules by multiplying that average by the total time.

    The only way the sum of the power measurements and the figure calculated by multiplying the average power by the time would differ is if the number of measurements used for the sum and for the average are different. In this scenario, the calculation made with more data points would be more accurate (think integration).

    So the question becomes: How did you calculate your average? It appears that your average has more data points given that the total test time is measured to 1/10 seconds and your summation was only one second intervals.

    That said, the difference between the summation and the result calculated from the average should be small, as you stated. The Q9550S results, for instance, only differ by 113 (3244-3131) joules and the Core i7-920 results differ by a mere 86 (2818-2732) joules. Even the Phenom X4 9950 only has a delta of 69 (5474-5405) joules. However, the Phenom II 940 has a delta of 898 (4697-3799) joules.

    This massive difference leads me to suspect that either the average power, the total time, or the total energy for this processor was reported incorrectly. If we assume the average power and the maximum power are the same, then the delta shrinks to 207 (4697-4490) joules. Alternately, if we assume that the Phenom II 940 is the same speed as the Phenom 9950, the delta shrinks to 220 (4697-4477) joules. Both of these assumptions seem unreasonable to me, and neither gets the delta as small as it should be. So I ask, now that I've presented a reasonable case, please recheck your total energy numbers as Ryun suggested.
  • harijan - Tuesday, January 27, 2009 - link

    It still doesn't make sense. How can it use 4700 joules yet average 157 watts over 24 seconds? Or have a max of 188 Watts?

    4697 Joules / 24.2 seconds = 194 Watts average
  • Anand Lal Shimpi - Wednesday, January 28, 2009 - link

    Woops, you're completely right :) The issue wasn't with the power measurement but with the performance. The performance data for the run that I measured power under was incorrect. A re-run fixes that problem. The Q9450 was also impacted slightly.

    It's worth mentioning that the performance and power data are taken at two different times. First the performance data, then the power data. The performance during the power run is close but not always identical to the performance during the performance run. There's going to be some variation depending on the test.

    Take care,
    Anand
  • GourdFreeMan - Wednesday, January 28, 2009 - link

    I have to agree. There is something wrong with Anand's methodology. Also, look at his specious reasoning for the difference in processor ranking between his "average" power and the energy consumed in the Fallout 3 section, where the tests are run for the same time interval. He is measuring total system power, so the improved idle efficiency of the Nehalems should already be incorporated in those numbers. Average power draw is by definition the total energy consumed divided by the time interval over which it is consumed. Either taking instantaneous measurements and treating them as averages for each second, or simple human error, could be responsible for the discrepancy.
  • JPForums - Wednesday, January 28, 2009 - link

    I wouldn't call it suspicious, just a flaw in the procedure. If you have a sine wave and a cosine wave at the same frequency, amplitude, and offset measured once per period, one will look much larger than the other even though they average out to be exactly the same. Likewise, if you have two computers drawing the same average power, but you happen to record one during mostly high fluctuations and the other during mostly low fluctuations, you'll get two very different results.
    You need more samples to get accurate results. The best method would be to record a power graph using the smallest sampling period possible. Then integrate to get the area under the power curve; with time in seconds, that area is the energy in joules. Divide the energy by the elapsed time to get the average power.
  • GourdFreeMan - Wednesday, January 28, 2009 - link

    JPForums, I appreciate your efforts to elucidate my remarks to Anand, but I must comment on two things. First, the word I used was "specious" not "suspicious". There is a difference, just as there is a difference between "average power draw" and "an average of periodically sampled power draws". These two are only guaranteed to coincide if the samples themselves are average powers or in the limit as the period they are sampled over approaches 0. (The latter remark is directed at your definition of average in the first paragraph of your other post).
  • Ryun - Tuesday, January 27, 2009 - link

    I was expecting a much bigger delta compared to the 95W quads in wattage.
  • Ryun - Tuesday, January 27, 2009 - link

    Meant to end with, "Thanks for the review."

    So, thanks. =)
  • harijan - Tuesday, January 27, 2009 - link

    no idle power usage numbers?
  • michael2k - Tuesday, January 27, 2009 - link

    Unfortunately 65W is still too hot for a 17" MacBook Pro.
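The averaging dispute in the comments above comes down to two relations: energy is the sum of power samples times the sampling interval, and average power is energy divided by elapsed time. The sketch below is a rough illustration under stated assumptions, not the procedure used for the review. The 4697 J and 24.2 s figures are the ones quoted in the thread; the oscillating 150 W load, its 30 W swing, and the helper names are hypothetical.

```python
import math


def energy_from_samples(samples_w, interval_s):
    """Approximate energy (J) by treating each sample as the power for one interval."""
    return sum(samples_w) * interval_s


def average_power(energy_j, duration_s):
    """Average power (W) is total energy divided by elapsed time."""
    return energy_j / duration_s


# Consistency check with figures quoted in the comments: 4697 J over 24.2 s
# implies an average well above the 157 W that was reported.
implied_avg_w = average_power(4697.0, 24.2)


def power_at(t, phase):
    """Hypothetical load: 150 W with a 30 W swing, repeating once per second."""
    return 150.0 + 30.0 * math.cos(2 * math.pi * t + phase)


duration_s = 24
# One sample per second, landing on every peak (phase 0) or every trough (phase pi).
coarse_peak = [power_at(t, 0.0) for t in range(duration_s)]
coarse_trough = [power_at(t, math.pi) for t in range(duration_s)]
# The same waveform sampled 1000 times per second.
fine = [power_at(i / 1000, 0.0) for i in range(duration_s * 1000)]

avg_peak = average_power(energy_from_samples(coarse_peak, 1.0), duration_s)
avg_trough = average_power(energy_from_samples(coarse_trough, 1.0), duration_s)
avg_fine = average_power(energy_from_samples(fine, 0.001), duration_s)

print(f"implied average from quoted figures: {implied_avg_w:.1f} W")
print(f"1 Hz sampling on peaks:   {avg_peak:.1f} W")
print(f"1 Hz sampling on troughs: {avg_trough:.1f} W")
print(f"1 kHz sampling:           {avg_fine:.1f} W")
```

With this made-up waveform, once-per-second sampling reports about 180 W or 120 W depending only on when the samples land, while the finer sampling recovers the true 150 W average. Real loads are not this regular, so the example only shows the direction of the error, not its likely size.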
