Original Link: http://www.anandtech.com/show/7007/intels-haswell-an-htpc-perspective
Intel's Haswell - An HTPC Perspective: Media Playback, 4K and QuickSync Evaluated
by Ganesh T S on June 2, 2013 8:15 PM EST
Over the last two years, the launch of every major desktop CPU family from both AMD and Intel has been accompanied by a dedicated HTPC-oriented article. This coverage has been complementary to Anand's extensive analysis from a general computing perspective. Haswell will be no different. The advancements made from Llano to Trinity and from Sandy Bridge to Ivy Bridge had rendered entry level platforms good enough for casual / mainstream HTPC users. Advanced users still require discrete GPUs for using some video renderers and obtaining accurate display refresh rates. Each vendor has their own quirks when it comes to driver features and stability. This has made it difficult to declare any one solution as the perfect HTPC platform. Intel has hyped up improved GPU performance in the lead up to Haswell.
Has Intel improved the GPU performance and video-centric features enough to make discrete GPUs redundant for HTPCs? More importantly, how much of an improvement do we have over the HD4000 in Ivy Bridge? This question will be looked at from multiple angles in the course of this review. We will determine whether the shortcomings of Ivy Bridge (rendering benchmarks and refresh rate support, primarily) have been addressed. Also of importance are the HTPC configuration options, stability and power efficiency.
In this review, we present our experience with low-power desktop Haswell as an HTPC platform. We have listened to feedback from our earlier HTPC reviews and made efforts to source a low power CPU suitable for HTPC duties. In earlier HTPC reviews put out at launch time, we used the highest end CPU sampled by Intel / AMD. This time around, thanks to ASRock, we managed to get hold of an Intel Core i7-4765T CPU along with their mini-ITX motherboard, the Z87E-ITX.
In the first section, we tabulate our testbed setup and detail the tweaks made in the course of our testing. A description of our software setup and configuration is also provided. Following this, we cover the video post processing options provided by the Intel drivers. A small section devoted to the custom refresh rates is followed by some decoding and rendering benchmarks. No HTPC solution is completely tested without looking at the network streaming capabilities with respect to some of the popular OTT (over-the-top) services. 4K is the next major upgrade stop for the casual HTPC user. Haswell does have 4K display support and we will have a dedicated section to see how well it works. We are finally at a point where GPU encoders have become stable and popular enough for mainstream open source projects to utilize. A section is devoted to Handbrake's integration of QuickSync capabilities. In the final section, we cover miscellaneous aspects such as power consumption and then proceed to the final verdict.
Testbed and Software Setup
Instead of going for the usual high end CPU (77W / 95W TDPs), we have opted for the Core i7-4765T for today's review. This is a 35W TDP CPU with four cores / eight threads, expected to retail with an MSRP of $303. Intel has a number of GPU configurations doing the rounds at Haswell launch. The i7-4765T sports the HD 4600 GPU, which is the best GPU available in an LGA 1150 configuration (the Iris Pro 5200 GPUs are reserved for BGA configurations and unavailable to system builders).
The table below presents the hardware components of our Haswell HTPC testbed:
|Haswell HTPC Testbed Setup|
|Processor||Intel Core i7-4765T - 2.00 GHz (Turbo to 3.0 GHz)|
|Intel HD Graphics HD4600 - Up to 1200 MHz|
|Motherboard||ASRock Z87E-ITX mITX|
|OS Drive||Seagate 600 SSD ST240HM000 240GB|
|Memory||G.SKILL Ares Series 8GB (2 x 4GB) SDRAM DDR3 2133 (PC3 17000) F3-2133C9Q-16GAB CAS 9-11-10-28 2N|
|Optical Drive||ASUS 8X Blu-ray Drive Model BC-08B1ST|
|Case||Antec Skeleton ATX Open Air Case|
|Power Supply||Antec VP-450 450W ATX|
|Operating System||Windows 8 Professional x64|
|Displays / AVRs||Onkyo TX-SR606 + Acer H243H|
|Pioneer Elite VSX-32 + Sony Bravia KDL46EX720|
|Seiki Digital SE50UY04|
The ASRock Z87E-ITX board comes with a Broadcom-based 802.11ac 2T2R solution. Connected to a Buffalo WZR-D1800H 802.11ac router, I was able to consistently obtain 173 Mbps of practical throughput. Streaming Blu-ray ISOs over Wi-Fi from a NAS worked without issues. The board was very simple to get up and running and given its form factor and the CPU currently installed, I hope to migrate it to a passive HTPC build soon.
The Haswell platform officially supports DDR3-1600. Towards this, we obtained a 16 GB DDR3-2133 Ares kit from G.Skill for our testbed. The Ares kit supports XMP 1.2 and the ASRock Z87E-ITX had it running at 2133 MHz flawlessly on first boot. However, we made sure to run the memory at the suggested 1600 MHz in order to obtain results consistent with what an average system builder (non-overclocker) would obtain. The Ares kit makes it possible to study HTPC behaviour from a memory bandwidth perspective, but we will not cover that aspect in this launch piece.
The software setup for the Haswell HTPC testbed involved the following:
|Haswell HTPC Testbed Software Setup|
|Intel Graphics Driver||126.96.36.19907 (Version on ASRock Motherboard DVD)|
|Blu-ray Playback Software||CyberLink PowerDVD 13|
|Media Player||MPC-HC v188.8.131.5214|
|Splitter / Decoder||LAV Filters 0.57|
|Renderers||EVR / EVR-CP (integrated in MPC-HC v184.108.40.20614)|
The madVR renderer settings were fixed as below for testing purposes:
- Decoding features disabled
Deinterlacing set to:
- automatically activated when needed (activate when in doubt)
- automatic source type detection (i.e., 'disable automatic source type detection' is left unchecked)
- only look at pixels in the frame center
Scaling algorithms were set as below:
- Chroma upscaling set to SoftCubic with softness of 100
- Luma upscaling set to Lanczos with 4 taps with anti-ringing filter left deactivated and scale in linear light left unchecked / DXVA2
- Luma downscaling set to Lanczos with 4 taps with anti-ringing filter left deactivated and scale in linear light left unchecked / DXVA2
Rendering parameters were set as below:
- Automatic fullscreen exclusive mode was used
- CPU and GPU queue sizes were set to 32 and 24 respectively
- Under exclusive mode settings, the seek bar was enabled, switch to exclusive mode from windowed mode was delayed by 3 seconds and 16 frames were configured to be presented in advance. The GPU flushing modes were set to default
- Smooth motion was left disabled
- The 'trade quality for performance' settings were left at default (i.e., linear light was left disabled for smooth motion frame blending and custom pixel shader results were stored in 16-bit buffers instead of 32-bit)
Unlike our Ivy Bridge setup, we found the windowed mode to be generally bad in terms of performance compared to exclusive mode.
MPC-HC and LAV Filters settings were altered from the defaults as below for testing purposes:
- DirectShow Video Output was configured as EVR / EVR-CP / madVR under Options > Playback > Output
- All internal source and transform filters were disabled under Options > Internal Filters
- Under Options > External Filters, LAV Splitter, LAV Audio Decoder and LAV Video Decoder were added as Preferred filters
- LAV Audio Decoder was set to bitstream all applicable formats
LAV Video Decoder settings were altered from the defaults as below:
- Hardware Acceleration was set to DXVA2 Native / QuickSync / None depending on the aspect being tested. UHD (4K) was enabled in all the cases
- Deinterlacing mode was set to 'Aggressive'
Video Post Processing and HTPC Configuration Options
Our HTPC reviews over the last few years have used the HQV 2.0 benchmark to estimate and compare video post processing quality of the GPUs. We are at a stage where almost all GPUs end up scoring around 200, leaving very little differentiation. Put bluntly, the HQV 2.0 benchmark is dated, and presenting scores from it delivers no practical value to readers. That said, the tests themselves are relevant, but, instead of the HQV 2.0 Blu-ray, we used clips from Spears & Munsil's HD Benchmark (2nd Edition).
Intel has been paying particular attention to video post processing (courtesy of the pressure exerted by AMD's high scores in the HQV benchmark during the Sandy Bridge era). Haswell manages to clear the common deinterlacing, chroma upsampling and cadence detection tests without issues, as shown in the gallery below.
The disappointment comes in the form of the revamped Intel Graphics Control Panel. While the changes in appearance can be excused as migrating to be friendly with the Windows 8 touchscreen devices, the distribution of the various configuration options makes no sense at all. For example, it is only fair for users to expect the 'inverse telecine' option to be present under the Video category. However, it makes its appearance under the advanced display settings. Input range (Full / Limited for 0 - 255 / 16 - 235) is under advanced video settings, but the YCbCr / RGB setting is under the Display settings. It would make sense to have both settings under one category as users usually modify both when trying to calibrate and ensure that their setup is working optimally.
As I found out when trying to calibrate using Spears & Munsil's HD Benchmark, the mixture of settings in the control panel makes it very difficult to calibrate the correct output color space (amongst other things). For example, there is no way to choose YCbCr 4:2:2 / YCbCr 4:4:4 / Limited RGB / Full RGB. This is just one of the missing features in the configuration utility. I hope Intel's engineers try to calibrate a few displays by driving them using an Intel GPU and using the HD Benchmark 2nd Edition calibration disk (just to understand how badly the layout of the control panel is designed).
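To make the Full / Limited distinction concrete, here is a minimal Python sketch of the usual 8-bit level mapping between 0 - 255 and 16 - 235. The function names are ours and purely illustrative. This is not how the Intel driver implements the conversion, only the arithmetic that the setting implies.

```python
def full_to_limited(value: int) -> int:
    """Map an 8-bit full-range level (0-255) to limited range (16-235)."""
    if not 0 <= value <= 255:
        raise ValueError("expected an 8-bit level between 0 and 255")
    # 219 is the number of steps between limited black (16) and white (235).
    return 16 + round(value * 219 / 255)


def limited_to_full(value: int) -> int:
    """Map a limited-range level to full range, clipping anything outside 16-235."""
    scaled = round((value - 16) * 255 / 219)
    # Levels below 16 or above 235 cannot be represented after expansion.
    return max(0, min(255, scaled))


if __name__ == "__main__":
    for level in (0, 128, 255):
        print(level, "->", full_to_limited(level))
```

If the source and the display disagree about which range is in use, one of these conversions is applied when it should not be (or skipped when it should be applied), which shows up as crushed blacks or a washed-out picture. That is why the range and color format settings are usually adjusted together during calibration.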
Andrew at Missing Remote also brings out the fact that clipping issues still exist. In addition, the current control panel completely removes the ability to create custom resolutions (in any case, the previous feature was also not very user friendly compared to NVIDIA's solution). The drivers and UI / UX still need work, but Intel hasn't been as responsive as we would like (partly due to the fact that casual HTPC users don't really care about these issues).
Refresh Rate Handling - 23.976 Hz Works!
Readers following our HTPC reviews know by now that Intel's 23 Hz issue was left unresolved in Ivy Bridge. It is definitely better than the Clarkdale days, as users no longer get 24 Hz when setting the display refresh rate to 23 Hz (23.976 Hz intended). However, the accuracy is not enough to prevent a frame drop every 4 minutes or so (the 23 Hz setting results in a display refresh rate of 23.972 Hz in Ivy Bridge). One of the first things I checked after building the Haswell HTPC was the 23 Hz setting. The good news is that the display refresh rate accuracy is excellent.
Even better news is that the set of display refresh rates obtained with the Haswell system is more accurate than anything I had obtained before with AMD or NVIDIA cards. The gallery below presents some of the other refresh rates that we tested out. madVR reports frame drops / repeats only once every 6 hours or more in the quiescent state.
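The arithmetic behind these intervals is simple: the display and the video drift apart by one full frame every 1 / |difference in rates| seconds. A minimal sketch, assuming that the nominal 23.976 Hz rate is exactly 24000/1001 Hz and using a hypothetical helper name:

```python
def seconds_between_glitches(video_fps: float, display_hz: float) -> float:
    """Seconds until display and video drift apart by one full frame."""
    drift = abs(display_hz - video_fps)
    if drift == 0:
        return float("inf")  # perfectly matched rates never drop or repeat
    return 1.0 / drift


# Assumption: "23.976 Hz" film-rate content is exactly 24000/1001 fps.
film_fps = 24000 / 1001

# Ivy Bridge delivered 23.972 Hz for the 23 Hz setting.
ivy_bridge = seconds_between_glitches(film_fps, 23.972)
print(f"Ivy Bridge: one dropped frame every {ivy_bridge / 60:.1f} minutes")

# Largest refresh rate error that still leaves six hours between glitches.
max_error_hz = 1.0 / (6 * 3600)
print(f"Six hours between glitches needs an error below {max_error_hz:.6f} Hz")
```

The first figure lines up with the "every 4 minutes or so" observed on Ivy Bridge. The second shows that a drop or repeat only once in six hours implies a refresh rate within a few hundred-thousandths of a hertz of the video rate.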
Unfortunately, Intel still doesn't provide a way to easily configure custom resolutions. In fact, the latest driver release seems to have removed that option from the control panel completely. (Update: A reader pointed out that the feature is still available as CustomModeApp.exe in the drivers folder, but long time users still miss access to it from the main control panel.) I know for a fact that my Sony display (KDL46EX720) does support 25 Hz and 50 Hz refresh rates, but Intel doesn't allow those to be configured. We are willing to cut Intel some slack this time around because they have finally resolved a bug that was reported way back in 2008.
Decoding and Rendering Benchmarks
Our decoding and rendering benchmarks consist of standardized test clips (varying codecs, resolutions and frame rates) being played back through MPC-HC. GPU usage is tracked through GPU-Z logs and power consumption at the wall is also reported. The former provides hints on whether frame drops could occur, while the latter is an indicator of the efficiency of the platform for the most common HTPC task - video playback.
Enhanced Video Renderer (EVR) / Enhanced Video Renderer - Custom Presenter (EVR-CP)
The Enhanced Video Renderer is the default renderer made available by Windows 8. It is a lean renderer in terms of usage of system resources since most of the aspects are offloaded to the GPU drivers directly. EVR is mostly used in conjunction with native DXVA2 decoding. The GPU is not taxed much by the EVR despite hardware decoding also taking place. Deinterlacing and other post processing aspects were left at the default settings in the Intel HD Graphics Control Panel (and these are applicable when EVR is chosen as the renderer). EVR-CP is the default renderer used by MPC-HC. It is usually used in conjunction with MPC-HC's video decoders, some of which are DXVA-enabled. However, for our tests, we used the DXVA2 mode provided by the LAV Video Decoder. In addition to DXVA2 Native, we also used the QuickSync decoder developed by Eric Gur (an Intel applications engineer) and made available to the open source community. It makes use of the specialized decoder blocks available as part of the QuickSync engine in the GPU.
Power consumption shows a tremendous decrease across all streams. Admittedly, the passive Ivy Bridge HTPC uses a 55W TDP Core i3-3225, but, as we will see later, the power consumption at full load for the Haswell build is very close to that of the Core i3-3225 build despite the lower TDP of the Core i7-4765T.
In general, using the QuickSync decoder results in higher power consumption because the decoded frames are copied back to the DRAM before being sent to the renderer. Using native DXVA decoding, the frames are directly passed to the renderer without the copy-back step. The odd man out in the power numbers is the interlaced VC-1 clip, where QuickSync decoding is more efficient compared to 'native DXVA2'. This is because there is currently no support in the open source native DXVA2 decoders for interlaced VC-1 on Intel GPUs, and hence, it is done in software. On the other hand, the QuickSync decoder is able to handle it with the VC-1 bitstream decoder in the GPU.
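A back-of-the-envelope sketch gives a feel for the size of that copy-back traffic. It assumes 8-bit 4:2:0 frames in NV12 layout (1.5 bytes per pixel) and ignores surface padding and alignment, so treat the numbers as rough lower bounds for one direction of the copy. The helper names are hypothetical.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit 4:2:0 frame: full-size luma plus half-size chroma."""
    return width * height * 3 // 2


def copy_back_mb_per_s(width: int, height: int, fps: float) -> float:
    """Decoded-frame traffic in MB/s (1 MB = 10^6 bytes) for a given stream."""
    return nv12_frame_bytes(width, height) * fps / 1e6


if __name__ == "__main__":
    streams = [
        ("1080p24", 1920, 1080, 24),
        ("1080p60", 1920, 1080, 60),
        ("2160p30", 3840, 2160, 30),
    ]
    for name, width, height, fps in streams:
        rate = copy_back_mb_per_s(width, height, fps)
        print(f"{name}: {rate:.1f} MB/s of decoded frames")
```

These rates are small next to the bandwidth of dual-channel DDR3-1600, but the extra memory traffic is work that the native DXVA2 path avoids, which is consistent with the higher power draw seen with the QuickSync decoder.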
The GPU utilization numbers follow a similar track to the power consumption numbers. EVR is very lean on the GPU, as discussed earlier. The utilization numbers provide proof of the same. QuickSync appears to stress the GPU more, possibly because of the copy-back step for the decoded frames.
Videophiles often prefer madVR as their renderer because of the choice of scaling algorithms available as well as myriad other features. In our recent Ivy Bridge HTPC review, we found that with DDR3-1600 DRAM, it was straightforward to get madVR working with the default scaling algorithms for all material at 1080p60 or lower. In the meantime, Mathias Rauen (developer of madVR) has developed more features. In order to alleviate the ringing artifacts introduced by the Lanczos algorithm, an option to enable an anti-ringing filter was introduced. A more intensive scaling algorithm (Jinc) was also added. Unfortunately, enabling either the anti-ringing filter with Lanczos or choosing any variant of Jinc resulted in a lot of dropped frames. Haswell's HD4600 is simply not powerful enough for these madVR features.
It is not possible to use native DXVA2 decoding with madVR because the decoded frames are not made available to an external renderer directly. (Update: It is possible to use DXVA2 Native with madVR since v0.85. Future HTPC articles will carry updated benchmarks.) To work around this issue, LAV Video Decoder offers three options. The first is software decoding. The other two are QuickSync and DXVA2 Copy-Back. With either of these, the decoded frames are brought back to the system memory for madVR to take over. One of the interesting features to be integrated into the recent madVR releases is the option to perform DXVA scaling. This is particularly interesting for HTPCs running Intel GPUs because the Intel HD Graphics engine uses dedicated hardware to implement support for the DXVA scaling API calls. AMD and NVIDIA apparently implement those calls using pixel shaders. In order to obtain a frame of reference, we repeated our benchmark process using DXVA2 scaling for both luma and chroma instead of the default settings.
One of the interesting aspects to note here is the fact that the power consumption numbers show a much larger shift towards the lower end when using DXVA2 scaling. This points to more power efficient updates in the GPU video post processing logic.
DXVA scaling results in much lower GPU usage for SD material in particular with a corresponding decrease in average power consumption too. Users with Intel GPUs can continue to enjoy other madVR features while giving up on the choice of a wide variety of scaling algorithms.
Network Streaming Performance - Netflix and YouTube
The move from Windows 7 to Windows 8 as our platform of choice for HTPCs has made Silverlight unnecessary. The Netflix app on Windows 8 supports high definition streams (up to a bit rate of 3.85 Mbps for all ISPs, more if the ISP is Super HD enabled) as well as 5.1-channel Dolby Digital Plus audio on selected titles.
It is not immediately evident whether GPU acceleration is available or not from the OSD messages. However, GPU-Z reported an average GPU utilization of 12% throughout the time that the Netflix app was playing back video. The average power consumption is around 28 W.
Unlike Silverlight, Adobe Flash continues to maintain some relevance right now. YouTube continues to use Adobe Flash to serve FLV (at SD resolutions) and MP4 (at both SD and HD resolutions) streams. YouTube's debug OSD indicates whether hardware acceleration is being used or not.
Windows 8 has plenty of YouTube apps. We chose the Megatube YouTube Player / Downloader which allows for stream selection. For our power measurement experiments, we chose the 1080p MP4 stream.
We can't be sure whether hardware acceleration is being used with the app, as there is no debug OSD. However, a look at the power consumption numbers reveals that both approaches consume less than 30 W on average. The difference in the caching of the stream is also visible in this graph, with the Flash approach preferring to download data in bursts while the app prefers to download the whole stream as quickly as possible. Streaming was done over Wi-Fi.
Comparing these numbers with what was obtained using the i3-3225 in a passive build shows that the Haswell build manages to be more efficient even when active cooling (with one big Antec Skeleton chassis fan and a CPU fan) is employed.
On the image quality front, Haswell doesn't seem to change anything here vs. Ivy Bridge. Performance was acceptable before, and it continues to be so here. The big difference is really the additional power savings.
4K for the Masses
After our experience with Trinity and Ivy Bridge builds for HTPC purposes, we had reached the conclusion that a discrete GPU was necessary only if advanced rendering algorithms (using madVR's resource intensive scaling algorithms) or 4K support was necessary. In fact, the 4K media player supplied by Sony along with their $25K 84" 4K TV was a Dell XPS desktop PC with an AMD graphics card's HDMI output providing the 4K signal to the TV. Ivy Bridge obtained 4K display support last October, but not over the HDMI port (which is the only way to get 4K content on supported TVs).
The good news is that Haswell's 4K over HDMI works well, in a limited sort of way. In our first experiment, we connected our build to a Sony XBR-84X900 84" 4K LED TV. The full set of supported 4K resolutions (4096x2160 @ 23 Hz and 24 Hz, as well as 3840x2160 @ 23 Hz, 24 Hz, 25 Hz, 29 Hz and 30 Hz) was driven without issues.
4K H.264 decode using DXVA2 Native and QuickSync modes in LAV Video Decoder works without issues (this works well in Ivy Bridge too, just that Ivy Bridge didn't have the ability to output 4K over HDMI or any other single video link). Using madVR with 4K is out of the question (even with DXVA2 scaling), but EVR and EVR-CP both work without dropping any frames.
Now, for the bad news: If you are hoping to drive the ~$1300 Seiki Digital SE50UY04 50" 4K TV (the cheapest 4K TV in the market right now), I would suggest some caution. Our build tried to drive a 3840x2160 @ 30 Hz resolution to the Seiki TV on boot, but the HDMI link never got locked (the display would keep flickering on and off). The frequency of locking was inversely proportional to the HDMI cable length. The NVIDIA GT 640s that we tested in the same setup with the same cables and TV managed to drive the 4K Quad FHD resolutions without problems. We were able to recreate the situation with multiple Seiki units.
At this juncture, we are not sure whether this is an issue with the ASRock Z87E-ITX board in particular or a problem for all Haswell boards. Intel suggested that the HDMI level shifter used by ASRock might not be up to the mark for 4K output, but that doesn't explain why the output to the Sony 84" TV worked without issues. In short, if you have a Seiki 4K TV and want to use a PC to drive that, we would suggest using an NVIDIA GT 640 or greater / AMD 7750 or greater for now. We will update this section as and when we reach closure on the issue with ASRock / Intel.
QuickSync Gets Open Source Support, Regresses in Quality
I have traditionally avoided touching upon QuickSync in any of my HTPC reviews. The main reason behind this was the fact that support only existed in commercial software such as MediaEspresso, and even that functionality was spotty at best. Limited source file type support as well as limited configuration options rendered these unusable for the power users. While full x264 acceleration using QuickSync is out of the question, the developers of HandBrake have come forward with support for QuickSync in their transcoding application.
The feature is still in beta (for example, only H.264 files are allowed as input right now, and cropping isn't working properly), but we took it out for a test drive. We took a m2ts file from a Blu-ray and compressed it with a target bitrate of 10 Mbps using x264 single pass (everything at default) as well as QuickSync. The time taken for compression as well as the average power consumption during the course of the process are tabulated below. Numbers are also provided for the same process using our passive Ivy Bridge HTPC (which has the HD4000 GPU).
|H.264 Transcoding Performance|
|Transcoding Configuration||Engine||Power (W)||FPS|
|1080p @ 36.2 Mbps to 1080p @ 10 Mbps||QuickSync on HD4600||41.81 W||90.41|
|1080p @ 36.2 Mbps to 1080p @ 10 Mbps||x264 on Core i7-4765T||67.93 W||51.66|
|1080p @ 36.2 Mbps to 1080p @ 10 Mbps||QuickSync on HD4000||50.32 W||127.64|
|1080p @ 36.2 Mbps to 1080p @ 10 Mbps||x264 on Core i3-3225||53.63 W||25.99|
|1080p @ 36.2 Mbps to 720p @ 7 Mbps||QuickSync on HD4600||44.02 W||166.91|
|1080p @ 36.2 Mbps to 720p @ 7 Mbps||x264 on Core i7-4765T||65.37 W||32.88|
|1080p @ 36.2 Mbps to 720p @ 7 Mbps||QuickSync on HD4000||59.67 W||206.65|
|1080p @ 36.2 Mbps to 720p @ 7 Mbps||x264 on Core i3-3225||53.85 W||16.31|
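One way to read the table is to divide the wall power by the frame rate, giving the energy spent per encoded frame. The sketch below does that with the numbers above. The metric and the names are our own illustration, and because power was measured at the wall, it covers the whole platform rather than the encoder alone.

```python
# (configuration, engine, average wall power in W, frames per second)
RESULTS = [
    ("1080p to 1080p @ 10 Mbps", "QuickSync on HD4600", 41.81, 90.41),
    ("1080p to 1080p @ 10 Mbps", "x264 on Core i7-4765T", 67.93, 51.66),
    ("1080p to 1080p @ 10 Mbps", "QuickSync on HD4000", 50.32, 127.64),
    ("1080p to 1080p @ 10 Mbps", "x264 on Core i3-3225", 53.63, 25.99),
    ("1080p to 720p @ 7 Mbps", "QuickSync on HD4600", 44.02, 166.91),
    ("1080p to 720p @ 7 Mbps", "x264 on Core i7-4765T", 65.37, 32.88),
    ("1080p to 720p @ 7 Mbps", "QuickSync on HD4000", 59.67, 206.65),
    ("1080p to 720p @ 7 Mbps", "x264 on Core i3-3225", 53.85, 16.31),
]


def joules_per_frame(watts: float, fps: float) -> float:
    """Energy per encoded frame: watts are joules per second, so divide by fps."""
    return watts / fps


energy = {
    (config, engine): joules_per_frame(watts, fps)
    for config, engine, watts, fps in RESULTS
}

for (config, engine), joules in energy.items():
    print(f"{config:26} {engine:22} {joules:.2f} J/frame")
```

Keep in mind that the two testbeds differ in more than the CPU (passive versus actively cooled builds, for instance), so the per-frame figures compare complete systems, not just the QuickSync blocks.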
Fast and power-efficient transcoding is not the only requirement in the market. Video output quality is also very important. Encoder companies may present whitepapers with cherry-picked frame captures to show their efforts in good light. For all it is worth, the company's selected frame might be an I-frame, while the competitor's samples might be P or B-frames. PSNR is also presented as a metric indicating better quality. However, this is very unfair because encoders might be particularly tuned for PSNR but look bad when compared against the results of encoders tuned for, say, structural similarity (SSIM).
QuickSync is usually pretty fast, but the choice of bitrates in Handbrake seems to force it into one of the new modes in Haswell, which has actually regressed in both performance and image quality. This explains why the FPS on the HD4000 is much higher than on the HD4600. However, Haswell remains very power efficient. Anand mentioned the image quality degradation in QuickSync on Haswell in passing in yesterday's review, and I was able to replicate it. Given below are 10 consecutive raw frames from the various encoders. Take a look and judge for yourself on the basis of how the encoders handle movement and whether there are any image artifacts in the encoder results.
In our opinion, the QuickSync results on HD4600 appear to be worse than what is obtained on the HD4000. With Haswell, Intel introduced seven levels of quality/performance settings that application developers can choose from. According to Intel, even the lowest quality Haswell QSV settings should be better than what we had with Ivy Bridge. In practice, this simply isn't the case. There's a widespread regression in image quality ranging from appreciably worse to equal at best with Haswell compared to Ivy Bridge. I'm not sure what's going on here but QuickSync remains one of the biggest missed opportunities for Intel over the past few years. The fact that it has taken this long to get Handbrake support going is a shame. Now that we have it, the fact that Intel seems to have broken image quality is the icing on a really terrible cake.
For users looking for the best quality transcodes, software based x264 can deliver better output with tweaked options and two-pass encodes (such flexibilities are just not available with the QuickSync encoder). The big attraction of QuickSync remains low CPU utilization (< 10% in many cases) while you transcode. The image quality produced by Haswell's seemingly broken QSV implementation is still good enough for use on smartphones and tablets; it's just a step in the wrong direction.
Before proceeding to the business end of the review, let us take a look at some power consumption numbers. The G.Skill RAM was set to DDR3 1600 during the measurements. We measured the average power drawn at the wall under different conditions. In the table below, the Blu-ray movie from the optical disk was played using CyberLink PowerDVD 13. The ISOs were mounted using Windows 8's in-built mounting tool. Prime95 v27.9 and Furmark v1.10.6 were used for stress testing. Blu-ray ISO ripping was done using AnyDVD HD v7.2. The Prime95 + Furmark benchmark was run for 1 hour before any measurements were taken. Power consumption numbers for local file playback using various renderer / decode combinations have already been covered in a previous section. The testbed was connected to a Wi-Fi network (and the GbE port was left unconnected) throughout the evaluation. In all cases, a wireless keyboard and mouse were connected to the testbed.
|Haswell HTPC Testbed Power Consumption|
|Prime95 v27.9 + Furmark 1.10.6 (Full loading of both CPU and GPU)||85.68 W|
|Prime95 v27.9 (Full loading of CPU only)||73.79 W|
|1080p24 H.264 Blu-ray Playback from ODD||34.5 W|
|1080p24 VC-1 Blu-ray Playback from ODD||33.21 W|
|1080i60 VC-1 Blu-ray Playback from ODD||34.37 W|
|1080p24 VC-1 Blu-ray ISO Streaming from NAS||30.91 W|
|1080p24 H.264 MVC Blu-ray ISO Streaming from NAS||32.67 W|
|Blu-ray Rip to ISO from ODD||36.41 W|
The following screenshots give an idea of how the integrated GPU and the CPU share the thermal headroom. In the first case, we have full CPU loading and no load on the GPU.
The CPU package power is around 47 W, with the IA cores alone consuming around 37 W. The second screenshot shows the transition from purely full CPU loading to full CPU and GPU loading. The CPU package power rises from 47 W to around 54 W. The GPU is consuming around 18 W, while the IA cores go down to around 27 W.
The Haswell platform ticks all the checkboxes for the mainstream HTPC user. It fixes some nagging bugs left behind in Ivy Bridge. Setting up MPC-HC with LAV Filters was a walk in the park. With good and stable support for DXVA2 APIs in the drivers, even software like XBMC can take advantage of the GPU's capabilities. Essential video processing steps such as chroma upsampling, cadence detection and deinterlacing work beautifully. For advanced users, the GPU is capable of supporting madVR for most usage scenarios even with DDR3-1600 memory in the system.
Admittedly, there doesn't seem to be much improvement in madVR capabilities over the HD4000 in Ivy Bridge. The madVR developer has also added more complicated algorithms to the mix and made further refinements to existing ones (such as the anti-ringing filter). The improvements in the Intel GPU capabilities haven't kept up with the requirements of these updates. That said, madVR with DXVA2 scaling works well and looks good, satisfying some of the HTPC users who have moved to it from the default renderers. We could certainly complain about some missing driver features and the lack of hardware decode capabilities for 10b H.264 streams. HEVC (H.265) decode acceleration is absent too. However, let us be reasonable and accept the fact that despite anime's adoption of 10b H.264 in a big way, it is yet to gain mass-market appeal. HEVC was standardized pretty recently, and Haswell's GPU would have long been past the design stage by that time. In Intel's defense, neither NVIDIA nor AMD supports these two features.
Talking of display refresh rate support, Intel has finally fixed the 23.976 Hz bug which has been plaguing Intel-based HTPCs since 2008. This is going to make HTPC enthusiasts really happy. The fact that Intel manages the best match for the required refresh rate compared to AMD and NVIDIA cards is just icing on the cake. The 4K H.264 decode and output support from Haswell seems very promising for the 4K ecosystem. It also strengthens H.264's relevance for some time to come in the 4K arena.
The biggest disappointment with Haswell in the media department is the regression in QuickSync video transcode quality. The salt in the wound is really Intel's pre-launch claims of significant increases in QS video quality. Ivy Bridge definitely produces better quality QSV accelerated video transcodes. Combine that with a lack of significant progress on the software support side until recently (hooray for Handbrake, boo for no substantial OS X deployment) and you'd almost get the impression that Intel was trying its best to ruin one of the most promising features of its Core microprocessors. Haswell doesn't ruin QuickSync; the technology is still a great way of getting your content quickly transcoded for use on mobile devices. However, in its current implementation, Haswell does absolutely nothing to further QuickSync - in fact, it's definitely a step in the wrong direction.
The low power consumption of the Haswell system makes it ideal for HTPC builds, and we are very bullish on the NUC as well as the capabilities of completely passive builds as HTPC platforms. Our overall conclusion is that Haswell takes discrete GPUs out of the equation for a vast majority of HTPC users. The few who care about advanced madVR scaling algorithms (such as Jinc and the anti-ringing filters for Lanczos) may need to fork out for a discrete GPU, but even those will probably be of the higher end variety rather than the entry level GT 640s and AMD 7750s that we have been suggesting so far.