HTPC Decoding and Rendering Benchmarks: madVR

In the preceding section, we looked at EVR and EVR-CP. Videophiles often prefer madVR as their renderer because of its choice of scaling algorithms and myriad other features. In our original Ivy Bridge HTPC review, I was very satisfied with the HD 4000 and madVR except for a few corner cases involving high frame rate material that required both luma and chroma scaling (such as 720p60 material). One of the issues in our initial testbed was that we were using DDR3-1333 DRAM. The system under consideration here uses DDR3-1600, which is more than enough to get madVR working with the default scaling algorithm settings for all video material at 1080p60 or below. Readers interested in seeing madVR in action on the HD 4000 should definitely check out Andrew's excellent piece at Missing Remote comparing the HD 2500 and HD 4000 for madVR.

It is not possible to use native DXVA2 decoding with madVR because the decoded frames are not made directly available to an external renderer. To work around this issue, LAV Video Decoder offers three options. The first is software decoding.

LAV Video Decoder (Software Fallback) + madVR
Stream            GPU Usage (%)    Power Consumption (W)
480i60 MPEG-2     70.84            48.19
576i50 H.264      72.80            50.41
720p60 H.264      75.88            58.23
1080i60 H.264     61.51            59.05
1080i60 MPEG-2    61.22            55.09
1080i60 VC-1      62.22            59.85
1080p60 H.264     73.65            60.91

The other two options are QuickSync and DXVA2 Copy-Back. In either case, the decoded frames are copied back to system memory for madVR to take over. Compared to software decoding, power consumption improves for most streams, particularly 720p60 and 1080p60 (by about 3.3 W and 4.3 W, respectively).

LAV Video Decoder (QuickSync) + madVR
Stream            GPU Usage (%)    Power Consumption (W)
480i60 MPEG-2     71.37            47.72
576i50 H.264      71.28            49.83
720p60 H.264      75.76            54.92
1080i60 H.264     62.50            56.15
1080i60 MPEG-2    62.02            55.81
1080i60 VC-1      61.86            55.94
1080p60 H.264     66.31            56.58
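
As a rough cross-check, the per-stream difference in power draw between the software and QuickSync runs can be computed directly from the two tables above. The sketch below is illustrative only: the dictionary and function names are our own, and the power figures are assumed to be in watts.

```python
# Figures transcribed from the two tables above: (GPU usage %, power).
# Assumption: power consumption is reported in watts.
SOFTWARE = {
    "480i60 MPEG-2": (70.84, 48.19),
    "576i50 H.264": (72.80, 50.41),
    "720p60 H.264": (75.88, 58.23),
    "1080i60 H.264": (61.51, 59.05),
    "1080i60 MPEG-2": (61.22, 55.09),
    "1080i60 VC-1": (62.22, 59.85),
    "1080p60 H.264": (73.65, 60.91),
}
QUICKSYNC = {
    "480i60 MPEG-2": (71.37, 47.72),
    "576i50 H.264": (71.28, 49.83),
    "720p60 H.264": (75.76, 54.92),
    "1080i60 H.264": (62.50, 56.15),
    "1080i60 MPEG-2": (62.02, 55.81),
    "1080i60 VC-1": (61.86, 55.94),
    "1080p60 H.264": (66.31, 56.58),
}


def power_savings(baseline, candidate):
    """Return {stream: baseline power minus candidate power}, rounded to 2 places.

    A positive value means the candidate configuration draws less power.
    """
    return {
        stream: round(baseline[stream][1] - candidate[stream][1], 2)
        for stream in baseline
    }


savings = power_savings(SOFTWARE, QUICKSYNC)

# Print the streams from largest to smallest saving.
for stream, delta in sorted(savings.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{stream:<16} {delta:+.2f} W")
```

Sorting by saving puts the 1080p60 stream at the top of the list, while the 1080i60 MPEG-2 stream comes out marginally negative, i.e. it drew slightly more power with QuickSync in this run.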

One of the interesting features integrated into recent madVR releases is the option to perform DXVA scaling. This is particularly interesting for HTPCs running Intel GPUs because the Intel HD Graphics engine uses dedicated hardware to service the DXVA scaling API calls. AMD and NVIDIA apparently implement those calls using pixel shaders. To obtain a frame of reference, we repeated our benchmark process using DXVA scaling for both luma and chroma instead of the default settings.

LAV Video Decoder (QuickSync) + madVR (DXVA Scaling)
Stream            GPU Usage (%)    Power Consumption (W)
480i60 MPEG-2     50.33            43.54
576i50 H.264      52.39            44.33
720p60 H.264      57.34            48.82
1080i60 H.264     62.63            55.52
1080i60 MPEG-2    62.34            55.21
1080i60 VC-1      62.06            55.51
1080p60 H.264     65.56            55.33

DXVA scaling results in much lower GPU usage for the SD and 720p60 streams (a drop of roughly 18 to 21 percentage points), with a corresponding decrease of about 4 to 6 W in average power consumption; the 1080i60 and 1080p60 streams are essentially unchanged. Users with Intel GPUs can continue to enjoy madVR's other features while giving up the wide choice of scaling algorithms.
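
To see which streams actually benefit from DXVA scaling, the QuickSync results with default scaling can be compared against the DXVA scaling results. The following is a minimal sketch using the figures from the two QuickSync tables; the names and the 5-point threshold are arbitrary choices of ours for illustration, and power is again assumed to be in watts.

```python
# (GPU usage %, power) per stream, transcribed from the two QuickSync tables.
# Assumption: power consumption is reported in watts.
DEFAULT_SCALING = {
    "480i60 MPEG-2": (71.37, 47.72),
    "576i50 H.264": (71.28, 49.83),
    "720p60 H.264": (75.76, 54.92),
    "1080i60 H.264": (62.50, 56.15),
    "1080i60 MPEG-2": (62.02, 55.81),
    "1080i60 VC-1": (61.86, 55.94),
    "1080p60 H.264": (66.31, 56.58),
}
DXVA_SCALING = {
    "480i60 MPEG-2": (50.33, 43.54),
    "576i50 H.264": (52.39, 44.33),
    "720p60 H.264": (57.34, 48.82),
    "1080i60 H.264": (62.63, 55.52),
    "1080i60 MPEG-2": (62.34, 55.21),
    "1080i60 VC-1": (62.06, 55.51),
    "1080p60 H.264": (65.56, 55.33),
}

# Hypothetical cut-off: a GPU usage drop of at least this many
# percentage points counts as a clear benefit.
GPU_DROP_THRESHOLD = 5.0


def scaling_deltas(before, after):
    """Return {stream: (GPU usage drop in points, power drop)}, rounded to 2 places."""
    return {
        stream: (
            round(before[stream][0] - after[stream][0], 2),
            round(before[stream][1] - after[stream][1], 2),
        )
        for stream in before
    }


changes = scaling_deltas(DEFAULT_SCALING, DXVA_SCALING)
benefiting = [s for s, (gpu_drop, _) in changes.items() if gpu_drop >= GPU_DROP_THRESHOLD]

for stream, (gpu_drop, power_drop) in changes.items():
    print(f"{stream:<16} GPU {gpu_drop:+6.2f} pts   power {power_drop:+5.2f} W")
print("Clear benefit:", ", ".join(benefiting))
```

With these figures, only the 480i60, 576i50 and 720p60 streams clear the threshold; the 1080i60 and 1080p60 streams move by less than one percentage point in either direction.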

138 Comments

  • HighTech4US - Sunday, January 20, 2013

    Agreed. I see no other complete platform that is better than (or even equal to) Windows 7 Media Center for a 4-OTA-tuner DVR with unlimited storage (limited only by disk size) and a free EPG.

    And by tricking out 7MC with MediaBrowser, MediaControl and SHARK007 Codecs, I have a complete on-demand system that can play any type of media.

    I use MediaCenterMaster to get program meta information, backdrops and thumbnails for MediaBrowser.

    I also use MakeMKV to rip my DVDs and VideoReDo TVSuite H.264 to edit recorded TV shows and convert them to H.264 MKVs.

    Oh and 7MC can show your digital pictures as a slide show on your big screen with background music.

    I also love the screen saver where it shows random pictures from your picture library then zooms to one (or more) from a folder. When I first got this enabled the wife spent 45 minutes just watching the screen saver.
  • powerarmour - Monday, January 21, 2013

    Agreed, WMC is the only EPG-based tuner app that can correctly use Freeview HD DVB-T2 tuners in the UK; there are no other usable HTPC alternatives.
  • psuedonymous - Sunday, January 20, 2013

    Question: why was the obsolete 2-pass method used instead of the faster (and more common) CRF? Was the encoding benchmark intended as an artificial CPU-stressing benchmark rather than a 'real world' encoding benchmark?
  • ganeshts - Sunday, January 20, 2013

    Hmm.. that is what Graysky's benchmark does, and it keeps the setting consistent across different systems when you want to see how much better or worse your system is, when compared to someone else's.

    FWIW, pass 1 stresses the memory subsystem, while pass 2 stresses the CPU.
  • ganeshts - Sunday, January 20, 2013

    Thanks for the info. I was looking at the FAQ hosted by TechARP here: http://www.techarp.com/showarticle.aspx?artno=442&...

    Also, look at Ian's test with various memory speeds here using the same processor (last section on this page):

    http://www.anandtech.com/show/6372/memory-performa...

    There is definitely an impact on pass 1 performance using different memory speeds and the impact is more than on pass 2.
  • Iketh - Sunday, January 20, 2013

    Why is Prime95 v25.9 used? That is grossly outdated. The latest official 27.7 is needed to tax Ivy Bridge with AVX instructions. All those temps and watts you got will increase significantly. Please revise your Prime95. An oversight like this is unacceptable.

    Not to mention the latest Intel compilers have been implementing AVX instructions for like 6+ months now even if the programmer didn't specifically write for it. AND Handbrake has been using AVX in about that same timeframe and is only increasing.....
  • ganeshts - Sunday, January 20, 2013

    I will definitely do some experiments with the new Prime95 and report back.
  • ganeshts - Monday, January 21, 2013

    I repeated the CPU loading with the latest Prime95 (v27.7):

    http://i.imgur.com/lK0zqjR.png

    The readings didn't go up significantly, but, yes, there is an increase. The power consumption at the wall increased from 58.25 to 62.56 W.

    Thanks for bringing this to our attention, and we will make sure future reviews use the updated Prime95.
  • ganeshts - Monday, January 21, 2013

    Oh, but with full GPU and CPU loading (using the latest Furmark, 1.10.3), the power at the wall is only 89.77 W (compared to 88.75 W earlier). The ~40 W / ~15 W TDP distribution between the CPU and the GPU still remains the same.

    http://i.imgur.com/soCGAyk.jpg

    I don't expect the steady state temperatures to be that different because the power increase at the wall is only 1 W.
  • ganeshts - Sunday, January 20, 2013

    Yes, the scaling algorithms affect the performance a lot.

    That is why I mentioned that we used the default settings: Bicubic with sharpness 75 for chroma (no anti-ringing filter), and Lanczos 3-tap for image upscaling / Catmull-Rom for image downscaling (no anti-ringing filter or linear light scaling).

    We will look at other scaling algorithms and their performance on the HD 4000 / GT 640 / AMD 7750 in the third part of the HTPC series.

    Also, a note that if you are using HD 4000 (or any other Intel HD Graphics), I would strongly suggest looking at DXVA Scaling. Users might be surprised at the quality delivered without taxing the GPU too much.
