Quick Sync Image Quality & Performance

Intel obviously focused on increasing GPU performance with Ivy Bridge, but a side effect of that increased GPU performance is more compute available for Quick Sync. As you may recall, Sandy Bridge's secret weapon was an on-die hardware video transcode engine (Quick Sync), designed to keep Intel's CPUs competitive when faced with the onslaught of GPU computing applications. At the time, video transcode seemed to be the most likely candidate for significant GPU acceleration so the move made sense. Plus it doesn't hurt that video transcoding is an extremely popular activity to do with one's PC these days.

The power of Quick Sync was how it leveraged fixed function decode (and some encode) hardware with the on-die GPU's EU array. The combination of the two resulted in some pretty incredible performance gains, not only over traditional software-based transcoding but also over the fastest GPU-based solutions.

Intel put to rest any concerns about image quality when Quick Sync launched, and thankfully the situation hasn't changed today with Ivy Bridge. In fact, you get a bit more flexibility than you had a year ago.

Intel's latest drivers now allow for a selectable tradeoff between image quality and performance when transcoding using Quick Sync. The option is exposed in Media Espresso and ultimately corresponds to an increase in average bitrate. To test image quality and performance, I took the last Harry Potter Blu-ray, stripped it of its DRM and used Media Espresso to make it playable on an iPad 2 (1024 x 768 preset).

In the case of our Harry Potter transcode, selecting the Better Quality option increased average bitrate from 3.86Mbps to 5.83Mbps. The resulting file size for the entire movie increased from 3.78GB to 5.71GB. Both options produced a good quality transcode; picking one over the other really depends on how much time (and space) you have as well as the screen size of the device you'll be watching it on. For most phone/tablet use I'd say the faster performing option is ideal.
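As a quick sanity check on those numbers, here is a minimal Python sketch that compares the bitrate increase with the file size increase and works out the runtime the two figures imply. The constant and function names are made up for illustration, and the conversion assumes 1GB = 1000MB and ignores container overhead; neither assumption comes from Media Espresso itself.

```python
# Figures quoted above for the iPad 2 (1024 x 768) preset.
FAST_MBPS, BETTER_MBPS = 3.86, 5.83   # average bitrate, megabits per second
FAST_GB, BETTER_GB = 3.78, 5.71       # resulting file size, gigabytes


def implied_runtime_minutes(size_gb: float, bitrate_mbps: float) -> float:
    """Runtime implied by a file size and an average bitrate.

    Assumption: 1 GB = 1000 MB and 1 byte = 8 bits; container overhead is ignored.
    """
    megabits = size_gb * 1000 * 8
    return megabits / bitrate_mbps / 60


bitrate_ratio = BETTER_MBPS / FAST_MBPS
size_ratio = BETTER_GB / FAST_GB

print(f"bitrate ratio: {bitrate_ratio:.2f}x, file size ratio: {size_ratio:.2f}x")
print(f"implied runtime (faster option): {implied_runtime_minutes(FAST_GB, FAST_MBPS):.1f} min")
print(f"implied runtime (better option): {implied_runtime_minutes(BETTER_GB, BETTER_MBPS):.1f} min")
```

Both ratios come out at roughly 1.5x and both settings imply the same runtime, which is what you would expect if the Better Quality option simply spends more bits per second on the same movie.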

[Image quality comparison gallery: Intel Core i7 3770K (x86); Intel Quick Sync (SNB); Intel Quick Sync (IVB); Intel Quick Sync, Better (IVB); NVIDIA GeForce GTX 680; AMD Radeon HD 7970]

While AMD has yet to enable VCE in any publicly available software, NVIDIA's hardware encoder built into Kepler is alive and well. CyberLink Media Espresso 6.5 will take advantage of the 680's NVENC engine, which is why we standardized on it here for these tests. Once again, Quick Sync's transcoding abilities are limited to applications like Media Espresso or ArcSoft's Media Converter; there's still no support in open source applications like Handbrake.

Compared to the output from Quick Sync, NVENC appears to produce a softer image. However, if you compare the NVENC output to what we got from the software/x86 path you'll see that the two are quite similar. It seems that Quick Sync, at least in this case, is sharpening/adding more noise beyond what you'd normally expect. I'm not sure I'd call it bad, but I need to do some more testing before I know whether or not it's a good thing.

The good news is that NVENC doesn't pose any of the horrible image quality issues that NVIDIA's CUDA transcoding path gave us last year. For getting videos onto your phone, tablet or game console I'd say the output of either of these options, NVENC or Quick Sync, is good enough.

Unfortunately AMD's solution hasn't improved. The washed out images we saw last year, particularly in dark scenes prior to a significant change in brightness, are back again. While NVENC delivers acceptable image quality, AMD does not.

The performance story is unfortunately not much different from last year either. The chart below is average frame rate over the entire encode process.

CyberLink Media Espresso 6.5—Harry Potter 8 Transcode

Just as we saw with Sandy Bridge, Quick Sync continues to be an incredible way to get video content onto devices other than your PC. One thing I wanted to make sure of was that Media Espresso wasn't somehow holding x86 performance back to make the GPU accelerated transcodes seem much better than they actually are. I asked our resident video expert, Ganesh, to clone Media Espresso's settings in a Handbrake profile. We took the profile and performed the same transcode; the result is listed above as the Core i7 3770K (Handbrake). You will notice that the Handbrake x86/x264 path is definitely faster than CyberLink's software path, by over 50%. However, even using Handbrake as a reference, Quick Sync transcodes over 2x faster.

In the tests below I took the same source and varied the output quality with some custom profiles. I targeted 1080p, 720p and 480p at decreasing average bitrates to illustrate the relationship between compression demands and performance:

[Charts: CyberLink Media Espresso 6.5, Harry Potter 8 Transcode, one chart each for the 1080p, 720p and 480p output targets]

Unfortunately NVENC performance does not scale like Quick Sync. When asked to preserve a good amount of data, both NVENC and Quick Sync perform similarly in our 1080p/13Mbps test. However, ask for more aggressive compression ratios at lower resolution/bitrate targets and the Intel solution quickly distances itself from NVIDIA. One theory is that NVIDIA's entropy encode block could be the limiting factor here.

Ivy Bridge's improved Quick Sync appears to be aided both by an improved decoder and the HD 4000's faster/larger EU array. The graph below helps illustrate:

If we rely on software decoding but use Intel's hardware encode engine, Ivy Bridge is 18% faster than Sandy Bridge in this test (1080p 13Mbps output from BD source, same as above). If we turn on both hardware decode and encode, the advantage grows to 29%. The extra 11 percentage points in this case come from the faster decode engine on Ivy Bridge, so both halves of the pipeline contribute to the gain.
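The two figures above can be related in more than one way, so here is a minimal Python sketch that works through two simple readings: an additive split in percentage points, and a multiplicative one that asks how much the Ivy Bridge lead widens once hardware decode is enabled. The names are hypothetical and both readings are assumptions for illustration rather than how the results were measured.

```python
# Ivy Bridge throughput relative to Sandy Bridge (= 1.00), from the figures above.
ENCODE_ONLY = 1.18     # software decode + hardware encode
DECODE_ENCODE = 1.29   # hardware decode + hardware encode


def lead_widening(encode_only: float, both: float) -> tuple:
    """Return (extra percentage points, multiplicative factor) gained
    when hardware decode is switched on in addition to hardware encode."""
    extra_points = (both - encode_only) * 100
    factor = both / encode_only
    return extra_points, factor


extra_points, factor = lead_widening(ENCODE_ONLY, DECODE_ENCODE)
print(f"additive reading: +{extra_points:.0f} percentage points")
print(f"multiplicative reading: lead widens by a factor of {factor:.3f}")
```

Either way, the sketch only rearranges the two published percentages; it says nothing about where the time is actually spent inside the decode and encode hardware.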

173 Comments

  • Alexo - Wednesday, April 25, 2012

    It will be in Canada once Bill C-11 passes in a couple of months.
  • p05esto - Monday, April 23, 2012

    It would be neat to see older CPUs in these benchmarks. It's always a pet peeve of mine that these reviews only compare new CPUs against the previous generation and not 2-3 generations back.

    Most people do NOT upgrade with every single CPU release; most people upgrade their rigs every 2-3 years, I'm guessing. For example, I'm running a Core i7 930 and it's very fast already. I want to upgrade to Ivy and will either way, but I'd love to see how much faster I can expect Ivy to be compared to the ol' 930/920, which tons of people have.

    In my opinion, going back 2-3 generations is the ideal thing that people want to compare to. No one will upgrade from Sandy Bridge (unless rich and a little stupid), but a lot of people will upgrade from the original 920 era, which is a few years old now.

    Just food for thought.
  • Tchamber - Monday, April 23, 2012

    I agree, I have an X58 CPU too, and there was no SB CPU worth upgrading to.
  • Anand Lal Shimpi - Monday, April 23, 2012

    I agree with you and typically try to do just that, but time was an issue this round. I was on the road for much of the past month and had to cut out a number of things I wanted to do for this launch.

    Thankfully, we have bench - with the 3770K included: www.anandtech.com/bench. Feel free to compare away :)

    Take care,
    Anand
  • AmdInside - Monday, April 23, 2012

    Wish you guys would have included BF3 numbers for discrete GPU benchmarks. It is a game that is CPU heavy in multiplayer maps with large amounts of people.
  • fic2 - Monday, April 23, 2012

    "One problem Intel does currently struggle with is game developers specifically targeting Intel graphics and treating the GPU as a lower class citizen."

    Well, as long as Intel treats their igp as the bastard red-headed step child then I am sure that developers will too.

    If they would actually put the HD3000/4000 into the main stream parts developers might pay attention to it. If I was a game developer why would I pay attention to the HD2000/2500 which isn't really capable of playing crap and is the mainstream Intel IGP? If I was a game developer I would know that anyone buying a 'K' series part is also going to be buying a discrete video card also.
  • JarredWalton - Monday, April 23, 2012

    Intel's IGP performance has improved by about 500% since the days of GMA 4500. Is that not enough of an improvement for you? By comparison, Llano is only about 300% faster than the HD 4200 IGP. What's more, Haswell is set to go from 16 EUs in IVB GT2 to 40 EUs in GT3. Along with other architectural tweaks, I expect Haswell's GT3 IGP to be about three times as fast as Ivy Bridge. You'll notice that in the gaming tests, 3X HD 4000 is going to put discrete GPUs in a tough situation.
  • fic2 - Monday, April 23, 2012

    Yes, but the majority of users will not have an HD3000/4000 since they will have an OEM built computer. Conversely, gamers will more than likely have an HD3000/4000 included with the 'K' series. BUT, these same gamers will more than likely also have a discrete video card and never use the HD3000/4000.

    Again, if I was a game developer why would I put resources into optimizing for an igp that gamers aren't going to use?

    I give props to Intel for the huge jump in improvement in the 'K' series igp - it went from really crappy to just sort of crappy.
    If Intel would stop doing the stupid igp segmentation and include the HD3000/HD4000 in ALL of their *Bridge cpus then game developers might see there is a big market there to optimize for. Until Intel stops shooting themselves in the marketing foot then game developers won't pay any attention to their igp. But, based on IB it looks like Haswell will probably do the same brain damaged thing and include the "good" graphics into cpus that less than 10% of the people buy and less than 10% of that 10% don't use a discrete graphics card.

    Oh, and your 500%/300% improvement is pretty crappy since HD 4200 was way faster than GMA 4500 to begin with so in absolute terms the 4200->Llano made a bigger jump than 4500->3000:
    i.e.
    4500 starts out at 2. 500% improvement would put it to 10 for an absolute improvement of 8.
    4200 starts out at 6. 300% improvement would put it at 18 for an absolute improvement of 12.
    So, AMD is still pulling away from Intel on the igp front. And AMD doesn't play igp segmentation game so their whole market has pretty good igp.
  • JarredWalton - Monday, April 23, 2012

    It's an estimate, and it's pretty clear that AMD did not make the bigger jump. They were much faster than GMA 4500, but not the 3x improvement you suggest. In fact, I tested this several years back: http://www.anandtech.com/show/2818/8

    Even if we count the "failed to run" games as a 0 on Intel, AMD's HD 4200 was only 2.4x faster, and if we only look at games where the drivers didn't fail to work, they were more like 2X faster. So here's the detailed history that you're forgetting:

    1) HD 4200 was much faster than GMA 4500 -- call it twice as fast. Intel = 1, AMD = 2.

    2) Arrandale's HD Graphics really closed the gap with HD 4200 (which AMD continued to ship for far too long). Arrandale's "pathetic" HD Graphics were actually just 10% behind HD 4200, give or take. Intel = 1.9, AMD = 2 (http://www.anandtech.com/show/3773)

    3) Sandy Bridge more than doubled IGP performance on average compared to Arrandale, 130% faster by my tests (http://www.anandtech.com/show/4084/5). Meanwhile, AMD finally came out with a new IGP to replace the horribly outdated HD 4200 with Llano (http://www.anandtech.com/show/4444/11). The A8 GPU ended up being on average 50% faster than HD 3000. Intel = 2.5, AMD = 3.8.

    4) Ivy Bridge comes out and improves by 50% on average over HD 3000 (http://www.anandtech.com/show/5772/6). Intel = 3.8, AMD = 3.8

    So by those figures, what we've actually seen is that since GMA 4500MHD and HD 4200, Intel has improved their integrated graphics performance 280% and AMD has improved their performance by around 90%. So my initial estimates were off (badly, apparently). If we bring Trinity into the equation and it gets 50% more performance, then yes AMD is still ahead: Intel 3.8, AMD 5.7. That will give Intel a 280% improvement over three years and AMD a similar 280% improvement.

    Of course, if we look at the CPU side, Intel CPU multithreaded performance (just looking at Cinebench 10 SMP score) has gone up 340% from the Core 2 P8600 to the i7-3720QM. AMD's performance in the same test has gone up 80%. For single-threaded performance, Intel has gone up 115% and AMD has improved about 5-10%. So for all the talk of Intel IGP being bad, at least in terms of relative performance Intel has kept pace or even surpassed AMD. For CPU performance on the other hand, AMD has only improved marginally since the days of Athlon X2.

    Your discussion of Intel's market segmentation is apparently missing the whole point of running a business. You do it to make a profit. Core i3 exists because not everyone is willing to pay Core i5 prices, and Core i5 exists because even fewer people are willing to pay Core i7 prices. The people that buy Core i3 and are willing to compromise on performance are happy, the people that buy i5 are happy, and the people that buy i7 are happy...and they all give money to Intel.

    If you look at the mobile side of the equation, your arguments become even less meaningful. Intel put HD 3000 into all of the Core i3/i5/i7 mobile parts because that's where IGP performance is the most important. They're doing the exact same thing on the mobile side. People who care about graphics performance on desktops are already going to buy a dGPU, but you can't just add a dGPU to a notebook if you want more performance.

    And finally, "AMD doesn't play IGP segmentation" is just completely false. Take off your blinders. A8 APUs have 400 cores clocked at 444MHz. A6 APUs have 320 cores clocked at 400MHz, and A4 APUs have 240 cores clocked at 444MHz. AMD is every bit as bad as Intel when it comes to market segmentation by IGP performance!
  • fic2 - Monday, April 23, 2012

    I guess you are correct about AMD - I haven't really paid much attention to them since, as you said, they can't keep up on the cpu side.

    But, TH lists the 6410 (A4 igp) as being 3 levels above the HD3000 in their Graphics Hierarchy Chart. They also have the HD2000 2 levels below the HD3000. So, Intel's mainstream igp is 5 levels below AMD's lowest igp.

    That is why game developers treat Intel's igp as a lower class citizen.

    The quote that I was addressing (as stated in my first post) is:
    "One problem Intel does currently struggle with is game developers specifically targeting Intel graphics and treating the GPU as a lower class citizen."

    The article acts like it is a total mystery why game developers don't give the Intel igp any respect. As I have repeatedly said in my comments - until Intel starts putting the HD3000/HD4000 into their mainstream parts and not just the 'K' series game developers know that Intel igp is a lower class citizen. And, yes, I know that you can get a xxx5 variant w/HD3000 if you look around enough, but I doubt any OEM is using them and they didn't appear until 6+ months after the launch. It is just easier to slap a 5-6 year old discrete video card into a computer.
    Game developers can't target the HD3000/HD4000 since those are the minority for SB/IB chips. They would have to target the HD2000/HD2500. Since they don't the conclusion is that it isn't worth putting the resources into such a low end graphics solution.
