Original Link: http://www.anandtech.com/show/5773/intels-ivy-bridge-an-htpc-perspective



The ability to cram more and more transistors into a die has made it possible to put both the CPU and GPU on the same silicon. Intel's GPUs have traditionally catered to entry-level consumers, and have often been deemed good enough for basic HTPC use. AMD introduced their own CPU + GPU combination in the Llano series last year. While AMD does have a better GPU architecture in-house, they could not integrate their best possible GPU for fear of cannibalizing their mid-range GPU sales. The result was that Llano, while being pretty decent for HTPC use, didn't excite us enough to recommend it wholeheartedly.

Today, Intel is taking on AMD's Llano with a revamped integrated GPU. We have traditionally not been kind to Intel in our HTPC reviews because of the lack of proper drivers and open source software support. Things took a turn for the better with Sandy Bridge. One of Intel's engineers took it upon himself to bring reliable hardware decoding support on Intel platforms with the QuickSync decoder.

As a tech journalist in the HTPC space, I spend quite a bit of time on forums such as Doom9 and AVSForum where end-users and developers interact with each other. The proactive nature of the QuickSync developer in interacting with the end-users was something sorely lacking from Intel's side previously. We have seen various driver issues getting quashed over the last few releases, thanks to the new avenue of communication between Intel and the consumers.

With Ivy Bridge, we are getting a brand new GPU with more capabilities. Given the recent driver development history, even advanced HTPC users could be pardoned for thinking that Ivy Bridge would make a discrete HTPC GPU redundant. Video post processing quality is subjective, but that shouldn't prevent us from presenting pictorial results for readers to judge. One of the most talked about issues with the Intel GPU for HTPC purposes is the lack of proper 23.976 Hz display refresh rate support. Does this get solved in Ivy Bridge?

In this review, we present our experience with Ivy Bridge as a HTPC platform using a Core i7-3770K (with Intel HD Graphics 4000). In the first section, we tabulate our testbed setup and detail the tweaks made in the course of our testing. A description of our software setup and configuration is also provided. Following this, we have the results from the HQV 2.0 benchmark and some pictorial evidence of the capabilities of the GPU drivers. A small section devoted to the custom refresh rates is followed by some decoding and rendering benchmarks. No HTPC solution is completely tested without looking at the network streaming capabilities (Adobe Flash and Microsoft Silverlight performance). In the final section, we cover miscellaneous aspects such as power consumption and then proceed to the final verdict.



Intel provided us with a Core i7-3770K processor and Asus was kind enough to supply the HTPC friendly P8H77-M Pro motherboard for our test drive. Purists might balk at the idea of an overclockable 77W TDP processor being used in tests intended to analyze the HTPC capabilities. However, the Core i7-3770K comes with Intel HD Graphics 4000, the highest end GPU in the Ivy Bridge lineup. Using this as the review platform gives readers an understanding of the maximum HTPC capabilities of the Ivy Bridge lineup.

The table below presents the hardware components of our Ivy Bridge HTPC testbed:

Ivy Bridge HTPC Testbed Setup
Processor Intel Core i7-3770K - 3.50 GHz (Turbo to 3.9 GHz)
Intel HD Graphics 4000 - 650 MHz (Max. Dynamic Frequency of 1150 MHz)
Motherboard Asus P8H77-M Pro uATX
OS Drive Seagate Barracuda XT 2 TB
Memory G.SKILL ECO Series 4GB (2 x 2GB) SDRAM DDR3 1333 (PC3 10666) F3-10666CL7D-4GBECO CAS 9-9-9-24
G.SKILL Ripjaws Z Series 16GB (2 x 8GB) SDRAM DDR3 1600 (PC3 12800) F3-12800CL10Q2-64GBZL CAS 10-10-10-30
Optical Drives ASUS 8X Blu-ray Drive Model BC-08B1ST
Case Antec VERIS Fusion Remote Max
Power Supply Antec TruePower New TP-550 550W
Operating System Windows 7 Ultimate x64 SP1
Display / AVR Acer H243H / Pioneer Elite VSX-32 + Sony Bravia KDL46EX720

The Asus P8H77-M PRO makes for a nice HTPC / general purpose board for consumers not interested in overclocking their CPU. It also has two PCI-E x16 slots (one operating in x16 with PCI-E 3.0, and the other in x4 with PCI-E 2.0) and two PCI-E x1 slots for those interested in adding gaming cards or TV tuners / video capture cards.

Readers might wonder about the two different flavours of DRAM being used in the testbed. Note that at any given time, only one of the flavours was in use.

As readers will see in a later section, memory bandwidth and latency can play a very important role in video post processing performance. To that end, we ran our decode / post processing tests with three distinct configurations. The ECO modules were run at DDR3 1333 (9-9-9-24) and also at DDR3 1600 (9-9-9-24). The Ripjaws Z modules were overclocked to DDR3 1800 (12-12-12-32). The ability to overclock the G.Skill DRAM modules was quite useful in gaining insight into the effect of memory bandwidth and latency on video post processing using the integrated GPU.

The software setup for the Ivy Bridge HTPC testbed involved the following:

Ivy Bridge HTPC Testbed Software Setup
Blu-ray Playback Software CyberLink PowerDVD 12
Media Player MPC-HC v1.6.1.4235
Splitter / Decoder LAV Filters 0.50.1
Renderers EVR-CP (integrated in MPC-HC v1.6.1.4235)
madVR v0.82.5

The madVR renderer settings were fixed as below for testing purposes:

  1. Decoding features disabled
  2. Deinterlacing set to:
    • automatically activated when needed (activate when in doubt)
    • automatic source type detection (i.e., "disable automatic source type detection" was left unchecked)
    • only look at pixels in the frame center
    • be performed in a separate thread
  3. Scaling algorithms were set as below:
    • Chroma upscaling set to default (SoftCubic with softness of 100)
    • Luma upscaling set to default (Lanczos with 4 taps)
    • Luma downscaling set to default (Lanczos with 4 taps)
  4. Rendering parameters were set as below:
    • Start of playback was delayed till the render queue filled up
    • A separate device was used for presentation, and D3D11 was used
    • CPU and GPU queue sizes were set to 32 and 24 respectively
    • Under windowed mode, the number of backbuffers was set to 8, and the GPU was set to be flushed after intermediate render steps as well as the last render step. In addition, the GPU was set to wait (sleep) after the last render step.

Exclusive mode settings were not applicable to our testbed, because we found the full screen exclusive mode to generally perform worse than the full screen windowed mode. Also, none of the options to trade quality for performance were checked.
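For readers who want to reproduce the setup, the settings enumerated above can be summarized as a structured configuration. This is a descriptive sketch for this article; the keys are our own labels, not madVR's internal setting names.

```python
# Summary of the madVR 0.82.5 settings used for testing.
# Keys are descriptive labels, not madVR's internal setting names.
madvr_config = {
    "decoding_features": "disabled",
    "deinterlacing": {
        "auto_activate": "when needed (activate when in doubt)",
        "auto_source_type_detection": True,
        "analysis_region": "frame center only",
        "separate_thread": True,
    },
    "scaling": {
        "chroma_upscaling": "SoftCubic (softness 100)",  # default
        "luma_upscaling": "Lanczos, 4 taps",             # default
        "luma_downscaling": "Lanczos, 4 taps",           # default
    },
    "rendering": {
        "delay_playback_until_queue_full": True,
        "separate_device_for_presentation": True,
        "use_d3d11": True,
        "cpu_queue_size": 32,
        "gpu_queue_size": 24,
        "windowed_backbuffers": 8,
        "flush_after_intermediate_steps": True,
        "flush_after_last_step": True,
        "sleep_after_last_step": True,
        "fullscreen_exclusive_mode": False,  # performed worse than windowed here
        "trade_quality_for_performance": False,
    },
}
```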



HTPC enthusiasts are often concerned about the quality of pictures output by the system. While this is a very subjective metric, we have been taking as much of an objective approach as possible. We have been using the HQV 2.0 benchmark in our HTPC reviews to identify the GPUs' video post processing capabilities. The HQV benchmarking procedure has been heavily promoted by AMD, and Intel also seems to be putting its weight behind it.

The control panel for the Ivy Bridge GPU has a number of interesting video post processing control knobs which earlier drivers lacked. The most interesting of these is the ability to perform noise reduction on a per-channel basis, i.e, only for luma or for both luma and chroma. More options are always good for consumers, and the interface makes it simple enough to leave the decision making to the drivers or the application. An explicit skin tone correction option is also available.

HQV scores need to be taken with a grain of salt. In particular, one must check the tests where the GPU lost points. If those tests don't reflect the reader's usage scenario, the handicap can probably be ignored. So it is essential that the scores for each test be compared, rather than just the total value.

The HQV 2.0 test suite consists of 39 different streams divided into 4 different classes. For the Ivy Bridge HTPC, we used Cyberlink PowerDVD 12 with TrueTheater disabled and hardware acceleration enabled for playing back the HQV streams. The playback device was assigned a score for each stream, depending on how well it was played back. Each test was repeated multiple times to ensure that the correct score was assigned. The scoring details are available in the testing guide from HQV.

Blu-rays are usually mastered very carefully. Any video post processing (other than deinterlacing) which needs to be done is handled before the content is burned in. In this context, we don't think it is a great idea to run the HQV benchmark videos off the disc. Instead, we play the streams after copying them over to the hard disk. How does the score compare to what was obtained by Sandy Bridge and Llano at launch?

In the table below, we indicate the maximum score possible for each test, and how much each GPU was able to get. The HD3000 is from the Core i5-2520M with the Intel 15.22.2.64.2372 drivers. The AMD 6550D was tested with Catalyst 11.6, driver version 8.862 RC1, and the HD4000 with driver version 8.15.10.2696.

 
HQV 2.0 Benchmark
Test Class Chapter Tests Max. Score Intel HD3000 AMD 6550D (Local file) Intel HD4000
Video Conversion Video Resolution Dial 5 5 4 5
Dial with Static Pattern 5 5 5 5
Gray Bars 5 5 5 5
Violin 5 5 5 5
Film Resolution Stadium 2:2 5 5 5 5
Stadium 3:2 5 5 5 5
Overlay On Film Horizontal Text Scroll 5 3 5 3
Vertical Text Scroll 5 5 5 5
Cadence Response Time Transition to 3:2 Lock 5 5 5 5
Transition to 2:2 Lock 5 5 5 5
Multi-Cadence 2:2:2:4 24 FPS DVCam Video 5 5 5 5
2:3:3:2 24 FPS DVCam Video 5 5 5 5
3:2:3:2:2 24 FPS Vari-Speed 5 5 5 5
5:5 12 FPS Animation 5 5 5 5
6:4 12 FPS Animation 5 5 5 5
8:7 8 FPS Animation 5 5 5 5
Color Upsampling Errors Interlace Chroma Problem (ICP) 5 2 2 5
Chroma Upsampling Error (CUE) 5 2 2 5
Noise and Artifact Reduction Random Noise SailBoat 5 5 5 5
Flower 5 5 5 5
Sunrise 5 5 5 5
Harbour Night 5 5 5 5
Compression Artifacts Scrolling Text 5 3 3 5
Roller Coaster 5 3 3 5
Ferris Wheel 5 3 3 5
Bridge Traffic 5 3 3 5
Upscaled Compression Artifacts Text Pattern 5 3 3 3
Roller Coaster 5 3 3 3
Ferris Wheel 5 3 3 3
Bridge Traffic 5 3 3 3
Image Scaling and Enhancements Scaling and Filtering Luminance Frequency Bands 5 5 5 5
Chrominance Frequency Bands 5 5 5 5
Vanishing Text 5 5 5 5
Resolution Enhancement Brook, Mountain, Flower, Hair, Wood 15 15 15 15
Video Conversion Contrast Enhancement Theme Park 5 5 5 5
Driftwood 5 5 5 5
Beach at Dusk 5 2 5 5
White and Black Cats 5 5 5 5
Skin Tone Correction Skin Tones 10 0 7 7
             
    Total Score 210 173 184 197

A look at the above table reveals that Intel has caught up with the competition in terms of HQV scores. In fact, they have comfortably surpassed what the Llano got at launch time. Many of the driver problems plaguing AMD's GPUs hadn't been fixed when we looked at the AMD 7750 a couple of months back, so it is likely that the Llano's scores have not budged much from what we have above. In fact, the score of 197 ties with what we obtained for the 6570 during our discrete HTPC GPU shootout.
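Since we argue above that per-test scores matter more than the total, here is a small sketch of how one might compare two GPUs test by test. The scores are a subset of rows from the table above.

```python
# Per-test comparison of HQV 2.0 scores (subset of the table above).
hd3000 = {"Horizontal Text Scroll": 3, "Interlace Chroma Problem": 2,
          "Chroma Upsampling Error": 2, "Scrolling Text": 3, "Skin Tones": 0}
hd4000 = {"Horizontal Text Scroll": 3, "Interlace Chroma Problem": 5,
          "Chroma Upsampling Error": 5, "Scrolling Text": 5, "Skin Tones": 7}

# A total hides where the points were won or lost; per-test deltas don't.
deltas = {test: hd4000[test] - hd3000[test] for test in hd3000}
for test, delta in deltas.items():
    if delta:
        print(f"{test}: {hd3000[test]} -> {hd4000[test]} ({delta:+d})")
```

If one never watches interlaced DVD content, for example, the chroma upsampling improvements here may matter less than the unchanged text scroll score.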



We briefly looked at the various knobs available in the graphics control panel in the previous section. In this section, we will take a look at some of those knobs in action. In our piece on discrete HTPC GPUs, we explained the basics of cadence detection and why it is necessary. We also included a gallery with screenshots of various GPUs playing back the Spears & Munsil Wedge Pattern. The 2:3:2:3 cadence is undoubtedly the most common pattern. The pictures below show the effect of the film mode detection knob on the wedge pattern clip. If you refer back to the gallery, you will find that all GPUs other than the GT 430 had trouble properly identifying the cadence and performing deinterlacing appropriately. The Intel HD Graphics 4000 has no trouble with this clip.

 

Deinterlacing works as expected in PowerDVD and also EVR-CP / madVR (which implement DXVA2 deinterlacing).

For reference, a screenshot of the non-deinterlaced version can be found as the penultimate picture here.

The quality of chroma upsampling differs from GPU to GPU, and even within the same GPU, it depends on the driver version. It is generally accepted that madVR provides one of the best upsampling algorithm implementations for rendering purposes. In fact, the end-user has the ability to opt for an upsampling algorithm of his choice. We took the HQV clip for testing chroma upsampling, and played it in both PowerDVD as well as MPC-HC with madVR. The two screenshots below show the magnified view of a particular area in the clip. The madVR quality is visibly better, but the PowerDVD version is no slouch either. There is almost no colour bleeding or any other artifacts similar to what we saw in the AMD 7750 review. The full screenshots are available here (madVR) and here (PowerDVD).

PowerDVD Chroma Upsampling

madVR Chroma Upsampling (Default Algorithm)

One of the interesting aspects of the noise reduction knob is the fact that we have separate controls for luma / luma and chroma. The gallery below has the feature in action.

Adaptive contrast enhancement works as advertised, enabling HD 4000 to comfortably score the maximum possible points in that section of the HQV benchmark.



One of the main reasons for HTPC purists to overlook the Intel integrated GPUs was the lack of a proper 23.976 Hz refresh rate. Up until Clarkdale, Intel GPUs refreshed the display at 24 Hz even when set to 23 Hz. When Sandy Bridge was launched, it was discovered that the 23 Hz setting could be activated and made to function as intended if UAC was disabled. With v2372 drivers, the disabling of UAC became unnecessary. While this didn't result in perfect 23.976 Hz (locked around 23.972 Hz), it was definitely much better than the earlier scenario.

How does Ivy Bridge fare? The short story is that the behaviour on the P8H77-M Pro board is very similar to Sandy Bridge. As the screenshot below shows, the refresh rate is quite stable around 23.973 Hz. This still falls short of what the AMD and NVIDIA GPU cards manage.

The good news is that Intel is claiming that this issue is fully resolved in the latest production BIOS on their motherboard. This means that BIOS updates to the current boards from other manufacturers should also get the fix. Hopefully, we should be able to independently test and confirm this soon.

It is not only the 23 Hz setting which is off the mark by a small amount. Other refresh rates also suffer similar problems (with videos played back at that frame rate dropping a frame every 5 minutes or so). The gallery below shows some of the other refresh rates that we tested.
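A quick back-of-the-envelope calculation shows why a refresh rate that is off by only a few thousandths of a hertz still forces a dropped (or repeated) frame every few minutes:

```python
# How often is a frame dropped/repeated when the display refresh rate
# doesn't exactly match the content frame rate?

def drop_interval_seconds(content_fps, display_hz):
    """One full frame of slippage accumulates every 1/|delta| seconds."""
    return 1.0 / abs(content_fps - display_hz)

film = 24000 / 1001          # 23.976... fps (NTSC film rate)

# Ivy Bridge on our board locks around 23.973 Hz instead of 23.976 Hz
interval = drop_interval_seconds(film, 23.973)
print(f"A frame is dropped roughly every {interval / 60:.1f} minutes")
```

For the measured 23.973 Hz lock, this works out to a dropped frame roughly every five and a half minutes, consistent with what we observed.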

Another aspect we found irritating in Intel's GPU control panel is the custom resolution section. Intel is very reliant on EDID and doesn't allow the user to input any frequency not supported by the display. Our testbed display (a Sony Bravia KDL46EX720) doesn't indicate PAL compatibility in its EDID information. I was able to play back PAL videos with matched refresh rates using the Vision 3D (NVIDIA GT 425M) as well as the AMD 7750, but Intel's control panel wouldn't allow me to set 50 Hz as the display refresh rate. An EDID override might help, but we can't help complaining that Intel's control panel is not as user friendly as NVIDIA's, despite having a custom resolutions section.



Advanced HTPC users adopt specialized renderers such as madVR which provide better quality when rendering videos. Unlike the standard EVR-CP (Enhanced Video Renderer-Custom Presenter) which doesn't stress the GPU much, renderers like madVR are very GPU-intensive. This has often been the sole reason for many HTPC users to opt for NVIDIA or AMD cards in their HTPCs. Traditionally, Intel GPUs have lacked the performance necessary for madVR to function properly (particularly with high definition streams). We did some experiments to check whether Ivy Bridge managed some improvements.

Using our testbed with the 4 GB of DRAM running at DDR3-1333 9-9-9-24, we took one clip each of 1080i60 H.264, 1080i60 VC-1, 1080i60 MPEG-2, 576i50 H.264, 480i60 MPEG-2 and 1080p60 H.264. We tabulated the CPU and GPU usage using various combinations of decoders and renderers. It is quite obvious that using madVR tends to drive up the CPU usage compared to pure DXVA mode (with the EVR-CP renderer). This is because the CPU needs to copy back the data to the system memory for madVR to execute the GPU algorithms. A single star against the GPU usage indicates between 5 and 10 dropped frames over a 3 minute duration. Double stars indicate that the number of dropped frames was high and that the dropping of frames was clearly visible to the naked eye.

DDR3-1333 [ 9-9-9-24 ]

                        madVR 0.82.5                             EVR-CP 1.6.1.4235
                QuickSync     DXVA2         DXVA2            DXVA2        QuickSync
                Decoder       Copy-Back     (SW Fallback)                 Decoder
                CPU   GPU     CPU   GPU     CPU   GPU        CPU   GPU    CPU   GPU
480i60 MPEG-2     3    74       3    74       4    74          5    28      5    28
576i50 H.264      3    59       3    58       4    58          5    25      5    27
1080i60 H.264    14    86**    11    86**    14    81*         6    42      8    48
1080i60 VC-1     13    84**    13    80*     13    80*        13    47      8    47
1080i60 MPEG-2   12    82**    12    80**     9    78**        5    44      9    48
1080p60 H.264    18    97*     20    97**    18    96**        5    44     12    50

With DDR3-1333, it is evident that 1080i60 streams just can't get processed through madVR without becoming unwatchable. Memory bandwidth constraints are quite problematic for madVR. So, we decided to overclock the memory a bit, and got the G.Skill ECO RAM running at DDR3-1600 without affecting the latency. Of course, we made sure that the system was stable running Prime95 for a couple of hours before proceeding with the testing. With the new memory configuration, we see that the GPU usage improved considerably, and we were able to get madVR to render even 1080p60 videos without dropping frames.

DDR3-1600 [ 9-9-9-24 ]

                        madVR 0.82.5                             EVR-CP 1.6.1.4235
                QuickSync     DXVA2         DXVA2            DXVA2        QuickSync
                Decoder       Copy-Back     (SW Fallback)                 Decoder
                CPU   GPU     CPU   GPU     CPU   GPU        CPU   GPU    CPU   GPU
480i60 MPEG-2     2    76       2    76       2    73          5    27      5    27
576i50 H.264      2    57       2    57       3    57          5    25      5    24
1080i60 H.264     7    77      11    74      12    74          6    40      9    40
1080i60 VC-1      7    76      11    75      12    79         12    40      8    40
1080i60 MPEG-2    6    74       6    74*      8    75*         5    39      9    40
1080p60 H.264    13    82      14    84      14    80          6    41     10    42

However, the 5 - 10 dropped frames in the 1080i60 MPEG-2 clip continued to bother me. I tried to overclock G.Skill's DDR3-1600 rated DRAM, but was unable to reach DDR3-1800 without sacrificing latency. With a working configuration of DDR3-1800 12-12-12-32, I repeated the tests, but found that the figures didn't improve.

DDR3-1800 [ 12-12-12-32 ]

                        madVR 0.82.5                             EVR-CP 1.6.1.4235
                QuickSync     DXVA2         DXVA2            DXVA2        QuickSync
                Decoder       Copy-Back     (SW Fallback)                 Decoder
                CPU   GPU     CPU   GPU     CPU   GPU        CPU   GPU    CPU   GPU
480i60 MPEG-2     2    75       2    75       2    72          5    27      5    27
576i50 H.264      2    57       2    57       3    57          5    25      5    24
1080i60 H.264     7    74      11    73      12    74          6    39      9    40
1080i60 VC-1      7    74      11    74      12    77         12    39      8    40
1080i60 MPEG-2    6    74       6    74*      8    74*         5    39      9    40
1080p60 H.264    12    84      14    84      14    80          6    41     10    42

My inference is that low memory latency is as important as high bandwidth for madVR to function effectively. I am positive that with a judicious choice of DRAM, it is possible to get madVR functioning flawlessly on the Ivy Bridge platform. Of course, more testing needs to be done with other algorithms, but the outlook is quite positive.
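To put the memory argument in perspective, here is a rough sketch of the raw frame traffic involved. The per-frame pass multiplier is our assumption for illustration, not a measured figure; note also that the iGPU shares this bandwidth with the CPU, and sustained achievable bandwidth is well below the theoretical peak.

```python
# Back-of-the-envelope: memory traffic for 1080p60 copy-back plus madVR
# render passes. All multipliers are illustrative assumptions.

def frame_bytes(width, height, bytes_per_pixel):
    return width * height * bytes_per_pixel

# Decoded NV12 frame (4:2:0, 12 bits per pixel -> 1.5 bytes/pixel)
nv12 = frame_bytes(1920, 1080, 1.5)          # ~3.1 MB per frame

# DXVA2 copy-back: GPU -> system RAM, then back up for madVR's GPU passes
copyback_per_sec = nv12 * 60 * 2             # read + write, 60 fps

# Assume madVR touches each fully rendered RGBA frame several times
# (scaling, dithering, queue copies) -- the multiplier of 6 is a guess.
rgba = frame_bytes(1920, 1080, 4)
render_per_sec = rgba * 60 * 6

total_gb_s = (copyback_per_sec + render_per_sec) / 1e9

# Dual-channel DDR3-1333: 1333 MT/s * 8 bytes * 2 channels (theoretical peak)
ddr3_1333_gb_s = 1333e6 * 8 * 2 / 1e9

print(f"Estimated video traffic: {total_gb_s:.1f} GB/s")
print(f"DDR3-1333 theoretical peak: {ddr3_1333_gb_s:.1f} GB/s")
```

Even under these rough assumptions, video processing consumes a meaningful slice of the memory subsystem, which is why both the DDR3-1600 bump and lower latency make a visible difference in the tables above.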

In all this talk about madVR, let us not forget the efficient QuickSync / native DXVA2 decoders in combination with EVR-CP. With low CPU usage and moderate GPU usage, these combinations deliver satisfactory results for the general HTPC crowd.



The last time we looked at Flash acceleration in the Intel drivers, we came away disappointed. Have things changed this time around? Intel seems to have taken extra care about this aspect and even supplied us with Flash player builds confirmed and tested to have full GPU acceleration and rendering capabilities. We took the Flash plugin for a test drive using our standard YouTube clip. This time around, we also added a 720p Hulu Plus clip. In the case of YouTube, there is visual confirmation of accelerated decoding and rendering. For Hulu Plus, we need to infer it from the GPU usage in GPU-Z. Hulu Plus streaming seems to be slightly more demanding on the CPU compared to YouTube.

Netflix streaming, on the other hand, uses Microsoft's Silverlight technology. Unlike Flash, hardware acceleration for the video decode process is not controlled by the user. It is up to the server-side code to attempt GPU acceleration. Thankfully, Netflix does try to take advantage of the GPU's capabilities.

This is evident from the A/V stats recorded while streaming a Netflix HD video at the maximum possible bitrate of 3.7 Mbps. The high GPU usage in GPU-Z also points to hardware acceleration being utilized.



Before proceeding to the business end of the review, let us take a look at some power consumption numbers. The G.Skill ECO RAM was set to DDR3 1600 during the measurements. We measured the average power drawn at the wall under different conditions. In the table below, the Blu-ray movie from the optical disk was played using CyberLink PowerDVD 12. The Prime95 + Furmark benchmark was run for 1 hour before any measurements were taken. The MKVs were played back from a NAS attached to the network. The testbed itself was connected to a GbE switch (as was the NAS). In all cases, a wireless keyboard and mouse were connected to the testbed.

Ivy Bridge HTPC Power Consumption
Idle 37.7 W
Prime95 + Furmark (Full loading) 127.1 W
Blu-ray from optical drive 57.6 W
1080p24 MKV Playback (MPC-HC + QuickSync + EVR-CP) 47.1 W
1080p24 MKV Playback (MPC-HC + QuickSync + madVR) 49.8 W
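For readers curious about running costs, the measured figures translate into annual energy use as follows. The duty cycle below is an assumption for illustration, not part of our testing.

```python
# Rough annual energy use for the HTPC under an assumed duty cycle.
# The 4 hours/day figures are assumptions, not from our measurements.

IDLE_W = 37.7        # measured idle draw at the wall
PLAYBACK_W = 47.1    # measured 1080p24 MKV playback (QuickSync + EVR-CP)

hours_playback = 4 * 365   # assume 4 h/day of playback
hours_idle = 4 * 365       # assume the machine idles another 4 h/day

kwh = (PLAYBACK_W * hours_playback + IDLE_W * hours_idle) / 1000
print(f"~{kwh:.0f} kWh per year under these assumptions")
```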

The Ivy Bridge platform ticks all the checkboxes for the average HTPC user. Setting up MPC-HC with LAV Filters was a walk in the park. With good and stable support for DXVA2 APIs in the drivers, even software like XBMC can take advantage of the GPU's capabilities. The QuickSync decoder and DXVA decoder are equally efficient, and essential video processing steps such as cadence detection and deinterlacing work beautifully.

For advanced users, the GPU is capable of supporting madVR for most usage scenarios even with slow memory in the system. With fast, low-latency DRAM, it is even possible that madVR can be used as a renderer for the most complicated streams. More investigation needs to be carried out to check the GPU's performance under different madVR algorithms, but the initial results appear very promising.

Does this signify the end of the road for the discrete HTPC GPU? Unfortunately, that is not the case. The Ivy Bridge platform is indeed a HTPC dream come true, but it is not future proof. While Intel will end up pleasing a large HTPC audience with Ivy Bridge, there are still a number of areas which Intel seems to have overlooked:

  • Despite the rising popularity of 10-bit H.264 encodes, the GPU doesn't seem to support decoding them in hardware. That said, software decoding of 1080p 10-bit H.264 is not complex enough to overwhelm the i7-3770K (but, that may not be true for the lower end CPUs).
  • The video industry is pushing 4K and it makes more sense to a lot of people compared to the 3D push. 4K will see a much faster rate of adoption compared to 3D, but Ivy Bridge seems to have missed the boat here. AMD's Southern Islands as well as NVIDIA's Kepler GPUs support 4K output over HDMI, but none of the current motherboards for Ivy Bridge CPUs support 4K over HDMI.
  • It is not clear whether the Ivy Bridge GPU supports decode of 4K H.264 clips. With the current drivers and LAV Filter implementation, 4K clips were decoded in software mode. This could easily be fixed through a driver / software update. In any case, without the ability to drive a 4K display, the capability would be of limited use.

Discrete HTPC GPUs are necessary only if one has plans to upgrade to 4K in the near term. Otherwise, the Ivy Bridge platform has everything that a HTPC user would ever need.
