Original Link: http://www.anandtech.com/show/2685



Last year, NVIDIA introduced its CUDA development package. It existed as a standalone download for a while before eventually being rolled into the driver itself. Today, AMD is following suit, rolling its own GPU computing package, called ATI Stream, into the Catalyst 8.12 driver. While the package has been available for a while now, AMD is really starting to push the idea of using its hardware as a GPU computing platform. In both market penetration and branding, AMD is way behind NVIDIA and CUDA.

NVIDIA has been pushing forward in the HPC market very well with CUDA, and PhysX on the desktop uses CUDA to implement hardware accelerated physics. While AMD is a bit behind, we don't see them as hugely lagging either. NVIDIA has some good groundwork laid, but the market for GPU computing is still incredibly untapped. Both AMD and NVIDIA are in a good position to take advantage of GPU computing efforts when OpenCL and DirectX 11 come along, and we see the current efforts by both camps to sell GPUs on the strength of stream computing as very preliminary.

Software like Adobe's Photoshop performs GPU acceleration using OpenGL. We are at a place where, if a commercial application would significantly benefit from GPU acceleration, software companies still want to develop it once and just have it work. Targeting either NVIDIA through CUDA or AMD through Brook+ is too much of a headache for most software vendors. Having an API (or better yet, a choice of APIs) targeted at hardware-agnostic GPU computing will kick off the real revolution that both AMD and NVIDIA want their solutions to provide.

But ATI Stream isn't the only thing in Catalyst 8.12 that piqued our interest. To show off the inclusion of the new package, AMD built in a free video transcoder that actually makes use of GPU acceleration. It is somewhat limited (as is the Badaboom package that runs on NVIDIA hardware), but free is always a nice price. We will compare what you get with the Avivo Video Converter to Badaboom, but we must remind our readers that these are distinct applications that approach the problem of video encoding in different ways, so a direct comparison isn't as telling as if we could run the same code on both hardware platforms. It's sort of like comparing Unreal Tournament 3 performance on AMD hardware to Enemy Territory: Quake Wars performance on NVIDIA hardware. We've got to look at what is being done and the quality of the output, and we must consider the fact that completely different approaches are likely to have been used by each development team.

The final bit of goodness in the 8.12 driver is a set of performance tweaks in some apps along with fixes for performance and CrossFire in Far Cry 2. There were quite a number of unresolved issues we had to run down, especially on our Core i7 systems. We'll let you know what issues remain in the following pages as well. This driver release has been highly anticipated not only by reviewers, but by consumers awaiting the merging of older hotfixes into a WHQL driver. We have high hopes, and we'll see if AMD has delivered something that meets our expectations.

But first, let's go a little deeper into ATI Stream and CUDA and take a look at the competing video transcoding offerings now available.

ATI Stream vs. NVIDIA CUDA

ATI and NVIDIA both know that ubiquitous GPU computing is the future for all data parallel tasks. Computer graphics is one of the most heavily parallel tasks around. It also happens to be a problem that easily lends itself to parallelization because of how independent the parallel tasks can be. These two factors, plus demand for high quality graphics, are what made the consumer GPU industry explode in its dozen years of existence. While the GPGPU (general purpose use of GPUs) philosophy has been around for a while, using the GPU for general data parallel tasks has been slow to catch on in the mainstream. This is due not only to the fact that there were no specialized tools available (developers needed to shoehorn algorithms into OpenGL or DirectX and "draw" triangles to solve problems), but also to the fact that GPU architecture did not lend itself to the implementation of many useful data parallel algorithms.

Hardware designed for graphics, until very recently, has meant floating point only acceleration of completely independent data points with heavy restrictions on reading and writing data. Only a small subset of problems really maps to that kind of architecture. With DirectX 9 we saw only inklings of real programmability, but DirectX 10 class hardware has really brought the tools developers need to bear. We now have hardware that can handle not only floating point but integer and bitwise operations as well. This combines nicely with the fact that local and global data stores have been added for sharing data, and there is better support for non-sequential reads and writes. With DirectX 11, the package will be fairly feature complete, adding real support for data structures beyond simple arrays, optional double precision support, and a whole host of other minor improvements that really add up for GPU computing (which is no surprise, because DX11 will also feature a general purpose Compute Shader that doesn't need to tie work to triangles, vertices, fragments or pixels).

Both ATI and NVIDIA have been working on GPU computing for quite some time, though NVIDIA has really been pushing it lately. Even before either company got involved, pioneers in GPU computing were using graphics languages to solve their problems. Out of this, at Stanford, grew a project called Brook. Brook was able to take specially written code that looked fairly similar to standard C programming and source-to-source compile it into code that would implement the appropriate graphics API calls and shaders to run the program. ATI took an early interest in this project and sort of latched onto it (perhaps because it showed better performance on ATI hardware than NVIDIA hardware at the time). After releasing CTM (the low level ISA spec for their graphics hardware) and later CAL (their abstracted pseudo instruction set that can span hardware generations), Brook was modified to compile directly to code targeted at ATI GPUs. This modification was called Brook+, and it is the major vehicle AMD uses for GPU computing today.

With the Catalyst 8.12 release, AMD is now including the necessary software to build and run GPU computing applications with Brook+ and CAL in the driver itself. This software is bundled up in a package AMD likes to call the ATI Stream SDK and has been available as a separate download for a while now. NVIDIA also did this with CUDA, first offering it as a separate download and later integrating it into their driver.

With both Brook+ and CUDA there are limitations in what can be done, from both a language and a hardware target standpoint. At this point, the documentation for CUDA is more practical, giving better guidance on how to organize things, while the ATI Stream documentation is much lower level and arguably more complete than NVIDIA's. The long and short of it is that you'll be more likely to get up and running quickly with CUDA, but with ATI Stream there is the information to really understand what is going on if you want to go in and tweak the low level code generated by Brook+ (or if you just want to program at an assembly level).

As far as language extensions in general go, I prefer the CUDA approach to data parallel computing, in spite of the fact that I still have my qualms about both. The major drawback to either is still the fact that they are locked into a specific hardware target. Good data parallel programming is hard, and there's no reason to make it more difficult than it needs to be by forcing developers to write their code twice, in two different languages and very likely in two completely different ways, to take advantage of both architectures. It's ridiculous.

Both NVIDIA and AMD like to get on their high horse when talking about their GPU computing efforts. We have AMD talking about openness and standards, and NVIDIA talking about its investment in CUDA and the already apparent adoption and market penetration in high performance computing. The problem is that both approaches are lacking, and both companies are fully capable of writing compilers to take the current Brook or CUDA C language extensions and target them at their own architecture. Both companies will eventually support OpenCL when it hits, and the DirectX 11 Compute Shader as well. But in the meantime they just aren't interested in working together. That may or may not make sense from a business standpoint, but it certainly isn't the best path for the consumer or the industry.

Meanwhile NVIDIA, and now AMD, want to push their proprietary GPU computing technologies as a reason end users should want their hardware. At best, Brook+ and CUDA as language technologies are stopgap, short-term solutions. Both will fail or fall out of use as standards replace them. Developers know this and will simply not adopt a technology that doesn't provide the return on investment they need, and nine times out of ten, in the consumer space, it just won't make sense to develop a solution for only one IHV's GPUs, or to develop the same application twice using two different languages and techniques.

Where proprietary solutions for GPU computing do make sense is where bleeding edge performance is absolutely necessary: the HPC market. Large companies with tons of cash for research will save more money than they put in when developing for the types of scientific computing that really benefit from the GPU today. No matter how long it takes a development team to port a solution to CUDA or Brook+, if the application sees anything like the order of magnitude speedups we are used to in this space, the project will have more than made up for the investment in no time at all. Realized compute per dollar goes up at a similar rate to application speedup. GPU computing just makes sense here, even with proprietary solutions that only target one hardware platform.

In the consumer space, the real advantage that CUDA has over ATI Stream is PhysX. But this is barely even a real advantage at this point, as PhysX suffers from the same fundamental problem that CUDA does: it only targets one hardware vendor's products. While there are a handful of PhysX titles out there, they aren't that compelling at this point either. We will have to start seeing some real innovation in physics with PhysX before it becomes a selling point. The closest we've got so far is the upcoming Mirror's Edge for the PC, but we must reserve judgment on that one because we haven't had the opportunity to play it yet.

And now we've got AMD's first real effort with the Avivo Video Converter finally using the GPU to do something (it did not in its original incarnation). This competes with the only real consumer level application available for CUDA: Badaboom. Now that there are video converters available on both sides of the aisle, we have the opportunity to compare something that really still doesn't matter that much: we get to see the relative performance of two applications written by different teams with different goals, targeted at different hardware, for different markets. Great. Let's get started.



ATI Catalyst 8.12 Changes and Bug Fixes

As far as the 8.12 drivers themselves go, we have seen a few bug fixes. Far Cry 2 now supports CrossFire without requiring 4xAA to be enabled for it to work (and the rest of the FC2 hotfix has been incorporated as well). The stability and performance issues we noticed on Nehalem systems have been improved. Game tests make sense and behave more or less the way we expect them to.

Until this driver, using ATI graphics hardware in a Core i7 system was unstable and buggy, especially on non-Intel X58 boards. The worst problems were with CrossFire, but we had single card issues as well. When we began testing on Nehalem, we wanted to use a Radeon HD 4870 1GB in our launch article. In trying to make it work, Anand lost almost a week on tests that had to just be thrown away because of the stability and performance issues with AMD hardware in the i7 system. We had to switch over to NVIDIA hardware to get the launch article done. Normally we don't go into detail about all the trouble we have when testing prerelease products, but even after launch we continued to have the same issues. Initially 8.10 was the problem, then the Far Cry 2 hotfix didn't really fix much. It almost seemed like 8.11 made things worse, and the hotfixes following 8.11 didn't really help either.

For us, Catalyst 8.12 was the make or break driver for recommending ATI hardware on Core i7 systems. We had decided that unless most (if not all) of our outstanding issues were resolved we would recommend that anyone who wanted the latest Intel hardware stay very far away from AMD video cards. Fortunately for AMD, this latest release resolves enough of our issues that we are comfortable recommending that those who want AMD hardware in their Core i7 systems go ahead and give it a shot (note from Anand: I'm still having some issues in my media encoder tests with ATI hardware in my i7 test bed).

There has been a change in the layout of the Catalyst driver as well. It's really more of a minor tweak, actually. In the 3D menu on the left side of the Advanced view, the last option on the list ("More Settings") used to contain miscellaneous options for toggling z depth, texture compression, and triple buffering with OpenGL. All of these options have been removed except for OpenGL triple buffering, which has been rolled into the "All Settings" menu option (it's at the very bottom).

We haven't yet completed a full performance analysis using Catalyst 8.12, but we expect to see practical gains similar to what we saw with NVIDIA's 180 series driver release: mostly modest gains with maybe some corner cases that may or may not be relevant to gamers getting a bigger boost. We are working on gathering data for upcoming articles using Catalyst 8.12 and the latest NVIDIA beta driver 180.84. We haven't run into anything that used to work being broken this time around, but the night is young, as they say. We are hopeful that at least the game tests we are looking at won't present us with any problems.

Using these drivers as a starting point on our Core i7 system should allow us to finally do more with our testing. We are looking forward to finally having a stable platform on which to test both CrossFire and SLI. We are also anxious to get comparisons of graphics hardware using the latest games up as well. This should all be much easier now that we have these drivers in our hands.



The Avivo Video Converter

Alongside the new Catalyst 8.12 drivers is AMD's Avivo Video Converter (AVC) - it's supposed to be like Badaboom, but with a five-finger-discount. The Avivo Video Converter will use any Radeon HD 4800 or 4600 series GPU to offload some of the calculations needed for video transcoding, resulting in faster overall performance.

The AVC is a separate 22MB download from the 8.12 drivers, available at AMD's website on the same page you use to download the latest Catalyst release.

Currently, the new video converter only works with Radeon HD 4000 series hardware, and video encoding is only accelerated on 46xx and 48xx series parts. Further, only H.264 and MPEG-2 output is currently hardware accelerated; WMV, DivX, and the rest are not. Even though it is a little limited, we are interested in comparing the converter to Badaboom, which only outputs H.264 anyway, so this is not a big loss thus far. But it's definitely something to keep in mind when playing with the application.

The Avivo Video Converter is contained within the Catalyst Control Center's Basic view. This means that if you've opted for the Advanced view, you need to go up to the top left tab (Views) and select Basic. This will pull up the screen you see here.

Hitting Go brings us to the next step: selecting a file for transcoding. It is unfortunately not clear what types of inputs work with Avivo; we tried some DivX, H.264, and WMV files that either were not recognized as valid video files or just didn't produce a working output video. We ended up sticking with VOB files from DVDs as test sources because they just worked.

After selecting your source file, it's time to pick which format the file will be transcoded to. Many of the usual options are there, but some of the options will resize your video while others won't. It isn't clear what resolution an option targets until after the video has been produced, and there is no way to change it. The quality slider appears to just change bitrate, but for most formats the range of bitrates is fairly restrictive. We'd like to see higher bitrate output options in the future.

Pressing the Next button after format and quality are selected brings up this progress bar window showing both elapsed time and estimated total time. This is very informative, but the window disappears when encoding is complete without logging or displaying the total time (which, while not necessary, would be a nice bit of polish).

After this step, we are left with a window that tells us file size and bitrate for the video. We've got some options to either open the folder the new video is in, play the file, finish, or start over. Both finish and start over bring us back to the beginning (the first interface image above), and if there's any other difference between the two options we have yet to discover it.

And there you have it. AMD went with a simple interface that gives people some solid options for doing simple video transcoding with relative ease. Now let's take a look at how well the program actually works.



And now, the rest of the story

To say that the output quality is bad would be entirely too generous. Rather than just take our word for it, here are some screenshots of the video. The actual video is worse as the corruption and anomalies change over time as well, but this is enough to show the problems. 

This is the 64-bit version of Avivo encoding WMV files:


Our highest quality output video was S02E01 of MacGyver, which lacked any deinterlacing.

 


Bad Boys delivered the second best output, with massive corruption occurring frequently.

 


The Empire Strikes Back delivered completely unwatchable video that is thoroughly corrupted and can even crash a player.

 

The 32-bit version has fewer problems. Codec choice didn't seem to fix corruption on the 64-bit version, but under the 32-bit version we were able to get cleaner output encoding to WMV (but not iPod (H.264) which remained corrupted).


Here's the Empire shot from the video encoded to WMV under Vista 32. Looks better.

 

MacGyver is unchanged in quality, with interlacing still a major problem, but at least some of the corruption issues are fixed for those running 32-bit Vista.

Our goal was to test the case where an end user wants to encode DVDs for use on an iPod, as this allows us to more easily compare the software to Badaboom. The quality comparison comes out well in favor of NVIDIA on either the 64-bit or 32-bit version of Avivo when encoding to iPod video, and this makes any performance comparison much more difficult. We can't honestly compare the two software packages directly because of the major difference in the quality of the output they generate.

To further illustrate the point, I uploaded a segment to Viddler, but the further conversion to Flash video masked many of the problems. Some of the corruption is still there, but the choppiness and other motion issues are gone, as is the fact that every second or two the original throws in a frame that looks like this:

 


It ain't easy being green.

 

Anyway, here's the video clip. Yes, it also encoded with no sound, which is apparently another feature of the Avivo Video Converter with some video formats.

 

We also tried other sources to transcode from with Avivo, but our attempts to use H.264, DivX, or WMV sources all failed in different ways. Some produced video with no sound, some only sound and no video, and some just gave us an error saying the file wasn't a valid movie and wouldn't even try to transcode it.

And since when is video transcoding not a deterministic process? While working with our Star Wars clip, we noticed that our output files weren't all the same size. Our first thought was that the transcoder operated differently on different video cards. But after extended testing, we discovered that we don't always get the same output even when nothing changes. Not only do two different cards not output the same data, but a single card isn't guaranteed to give you the same result from run to run.
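The simplest way to verify this kind of non-determinism yourself is to hash repeated outputs rather than eyeball file sizes. A quick sketch of the approach (the function name and workflow are ours, not part of any AMD tooling):

```python
import hashlib

def file_digest(path):
    """Return the SHA-256 digest of a file, read in 1MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Transcode the same source twice with identical settings, then compare:
# identical digests mean deterministic output. With Avivo, repeated runs
# produced different digests (and even different file sizes).
```

Differing file sizes already prove non-determinism; hashing just catches the subtler case where sizes happen to match but the bits don't.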

This, on top of all the other issues, makes any performance comparison moot. But we ran the numbers anyway. And here's what we got:

 

The Empire Strikes Back Transcode Time

 

This is a roughly 18 minute section of The Empire Strikes Back that took between 38 and 43 seconds to encode; we recorded the first run, and subsequent runs also came in between 38 and 43 seconds on all the cards. Regardless of which video card we plug in, performance is the same (and this despite the output not even being the same). This honestly seems to indicate that not much is happening on the GPU. Every variable changes across this hardware; nothing is the same in shader width or clock speed from top to bottom.

We used GPU-Z to test utilization and saw that every 4 seconds or so during the encode process there would be a blip of about 15% utilization on the GPU before it immediately fell back down to zero. This means that AMD is using the GPU for very, very little, and most of the performance of the encoder is coming from the Core i7 processor we've got in there. To test our theory, we underclocked the processor to 2.6GHz to see what happened to performance. Encode times rose from between 38 and 43 seconds per run to between 56 and 60 seconds per run. This indicates that the performance of Avivo is heavily CPU bound.

We can see from looking at the task manager during the transcode that about three threads are pretty hard on the CPU. No matter what CPU speed we were running, one core was always pegged. Two other cores sat at over 50% utilization on the 3.2GHz Core i7, and overall utilization hovered between 30 and 40 percent for the most part. When we decreased the clock speed, all three threads ended up pegging their assigned cores at different times, which could explain why performance improved by much more than the clock speed difference when moving back up (the change from 2.6GHz to 3.2GHz is just over 23%).
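The arithmetic behind that conclusion can be sketched out. The encode times below are the midpoints of the ranges we measured, so treat the exact figures as approximate:

```python
# Sanity check: does Avivo's encode speed track CPU clock speed?
clock_fast = 3.2  # GHz, Core i7 965 at stock
clock_slow = 2.6  # GHz, underclocked
time_fast = (38 + 43) / 2  # seconds per run at 3.2GHz (midpoint of range)
time_slow = (56 + 60) / 2  # seconds per run at 2.6GHz (midpoint of range)

clock_ratio = clock_fast / clock_slow  # ~1.23x more clock
speedup = time_slow / time_fast        # ~1.43x faster encodes

print(f"clock increase: {clock_ratio:.2f}x, observed speedup: {speedup:.2f}x")
```

A CPU-bound workload should speed up at least in step with clock; here the speedup is even larger than the clock increase, consistent with threads that stop pegging their cores once the CPU is fast enough.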

So, we've got a transcoder that doesn't produce watchable output in many cases, doesn't produce consistent output, and doesn't vary in performance with GPU hardware (meaning it doesn't use the GPU very much at all). It doesn't handle incorrect interlacing flags gracefully and there is no option to force deinterlacing. But that's minor compared to the rest of the problems we are seeing here. 



Elemental's Badaboom 1.0: The Redemption

Remember Elemental? It’s the company that put out Badaboom, the world’s first GPU accelerated H.264 video transcoder built using CUDA. NVIDIA was particularly excited about Badaboom as it finally gave NVIDIA a consumer-level CUDA application to point to when making the argument of why its GPUs were better than both ATI’s GPUs and Intel’s CPUs alone.

Unfortunately, the beta release of Badaboom needed some work. It didn’t do anything well, at all. After that original Badaboom review, I met with Sam Blackman, Elemental’s CEO, and we went through the list of things that needed to be fixed.

I should give credit to Mr. Blackman; normally, whenever we post any sort of critical review of a product, the company is fiercely upset with us. I argued with Intel PR for years over our Pentium 4 reviews, AMD felt our review of the Radeon HD 3870 was unfair, and if we don’t mention PhysX as a feature advantage of its GPUs then NVIDIA gets a little emotional. As harsh as the original review was, however, Sam wasn’t irrationally upset. I believe his exact words were “that was harsh,” and then it was straight to “what can we do to make it better?”

It’s Sam’s attitude that was reflected so greatly in what became Badaboom 1.0.

The changes were sweeping. Gone is the Pro version, which is welcome given that the Pro version was anything but. Elemental is instead focusing solely on the consumer version and will be rolling features into it over time.

The initial consumer release was only supposed to support up to 480p output files, while the new 1.0 release can do up to 720p (the old “pro” version supported up to 1080p). The 1.1 release, due out in the next few days, will add 1080p support. And while originally slated for Badaboom Pro, AVCHD and HDV input formats are now both part of the $29.99 consumer version.

All in all, killing off the pro version and folding mostly everything into the consumer version made a lot of sense.

There are still some pretty serious limitations: 1) there’s no official support for Blu-ray movies, 2) no official support for DivX, 3) the highest H.264 profile supported is still baseline (although Elemental plans on adding Main support in 1.1 and High profile support in the future).

Elemental did add support for Dolby Digital audio input, although DTS is still being worked on. The only audio output format supported is still AAC-LC.

The total sum of all of this is that Elemental’s first version of Badaboom now has a focus, a very specific one, but it gives us a target to shoot for. This isn’t an application that you’re going to use to backup your Blu-ray collection, it’s not even very useful for backing up your DVDs, but what it can do very well is transcode your DVDs for use on a portable media player like an iPhone or iPod.

Funky Issues? Resolved

The biggest problem with the previous version of Badaboom was that it couldn’t do anything right. I tried transcoding Blu-rays, DivX files, chimpanzees, DVDs, and each input file had some sort of quirk associated with it. Even taking a simple DVD, which Badaboom was supposed to support flawlessly, and transcoding it sometimes left me with an unusable output file of the wrong frame rate.

Focusing Badaboom’s attention, Elemental now made one thing work very well: DVDs. Point Badaboom at an unencrypted VIDEO_TS folder or a DVD disc/image and it will now perfectly rip the DVD to the appropriate resolution.

I should mention that DRM is rearing its ugly head here once more as Badaboom won’t automatically convert an encrypted DVD. Thankfully Slysoft’s AnyDVD simply running in the background is enough for Badaboom to transcode any DVD. If you haven’t used AnyDVD, I highly recommend it - it’s a great way of getting rid of encryption on both DVDs and Blu-ray discs.

Elemental also fixed the weird image quality issues; the output no longer gets scaled out of its correct aspect ratio when downscaled. Hooray.

Badaboom: Quad-Core Desired

Badaboom obviously does very well with a fast GPU, but the CPU requirements are also reasonably high. Keeping the GeForce GTX 280 fed actually ate up 50% of the CPU power of our Core 2 Quad Q9450 in our tests; it seems that Badaboom won’t scale beyond two cores.

The problem is that Elemental and NVIDIA argue that using the GPU to transcode video frees up your CPU for other tasks in the meantime. The reality is that this is only true if you’ve got four cores; otherwise, your dual-core CPU is just as pegged as it would be during a CPU-based video transcode. The difference is that the transcode goes a lot faster.

While NVIDIA wants you to spend less money on the CPU and put the savings towards a faster GPU, the correct approach continues to be buying a decent CPU and a decent GPU, even with GPU accelerated video encoding. If you’re going to be doing a lot of video encoding, a quad-core CPU is still a good idea regardless of whether you’re doing your encoding on the GPU or not.



Image Quality

Here’s where Elemental gets off easy. Since Badaboom is best used as an application for getting DVDs onto your iPhone or other low res format, image quality isn’t as big of a deal as it would be if you were viewing these things on a TV.

Compared to the x264 codec, Badaboom’s output seems just fine:


Elemental's Badaboom 1.0


x264

Again, Badaboom avoids the more difficult image quality comparisons by not being useful for high quality conversions.

AMD shows up to this gunfight with a knife, as Avivo’s image quality isn’t acceptable. While the Avivo Video Converter is free, it’s not useful.

Performance

Once again, I looked at the performance of Badaboom vs. transcoding on a CPU using Handbrake 0.93 (which uses the x264 codec). This time around we have Intel’s Core i7 965, running at 3.2GHz. The comparison stacks up pretty much as it did before:

Empire Strikes Back (1GB Chunk)

The issue is that the Core i7 isn’t running with all 8 threads maxed; instead, Handbrake appears to be utilizing only 30 - 40% of the available execution resources, which amounts to less than all four physical cores.
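As a back-of-the-envelope check (assuming utilization is spread evenly across the eight logical threads):

```python
# Rough estimate of how much of the Core i7 Handbrake actually keeps busy.
logical_threads = 8  # Core i7 965: 4 physical cores, 2 threads each
utilization_low, utilization_high = 0.30, 0.40  # observed overall CPU load

busy_low = utilization_low * logical_threads    # equivalent busy threads, low end
busy_high = utilization_high * logical_threads  # equivalent busy threads, high end

print(f"Handbrake keeps roughly {busy_low:.1f} to {busy_high:.1f} "
      f"logical threads busy -- less than the four physical cores")
```

That is 2.4 to 3.2 logical threads' worth of work, so even ignoring Hyper-Threading the CPU has physical cores to spare.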

I suspect with better CPU utilization we could have a scenario where the Core i7 was able to perhaps match the performance of the GeForce GTX 280. The only problem then becomes the cost difference.

The Avivo Video Converter does complete our conversion task in around half the time of the GeForce GTX 280 running Badaboom; however, the output file is unusable, so the performance advantage is meaningless in our opinion. If AMD could fix things, however...



Badaboom 1.1 Preview

Elemental was nice enough to get us a preview build of Badaboom 1.1, due out in the next week or so.

The 1.1 release is pretty significant, arguably just as significant as 1.0 given the list of improvements:

- Additional input formats, including DivX, Xvid, VC-1, and MKV (the file formats supported in 1.0 still work).

- H.264 Main profile output.
- 1920x1080 (1080p) output.
- New output profiles: Blackberry Bold, YouTube, and Zune.
- Multi-GPU capability: Badaboom now lets you select which CUDA-enabled GPU to run the transcode on in the Advanced options. You can open multiple Badaboom instances and run different video files simultaneously on different GPUs, each seeing performance similar to a single instance of Badaboom today. For example, two NVIDIA GPUs can transcode two movies in the time it would take Badaboom 1.0 to transcode one. For this initial version, there are a couple of caveats:

1. SLI must be disabled (if it is enabled).
2. Each GPU needs to be connected to a display.
3. Each GPU must have a desktop enabled on it.

I’m not particularly interested in the multi-GPU support that 1.1 offers given the caveats, but the rest of the feature list is excellent. With DivX support you can now take your old DivX shows and movies and re-encode them using Badaboom to save space. While trying to transcode a DivX file in the early version of Badaboom failed miserably, it worked just fine in the 1.1 preview.


oh wow, real settings

Main profile with CABAC support is also enabled, making Badaboom 1.1 closer to a real alternative for high quality rips of DVDs and Blu-rays. The 1.1 beta isn’t ready for prime time, but I wanted to see what it could do, so I grabbed my Casino Royale Blu-ray and ripped it to the hard drive (resulting in a 46GB ISO). The Blu-ray file structure is pretty straightforward: in the \BDMV\STREAM directory you’ll find a bunch of m2ts files, and in this case the 34GB 00000.m2ts file is the main 1080p movie.


Please don't throw me in jail

Since I had just run the BD through AnyDVD HD, it no longer had any encryption, so Badaboom was fine with transcoding it. To my surprise, it just worked. Using the Custom Media Center profile I was able to select a video bitrate of up to 25Mbps; I stuck with 11Mbps, which should be enough for most sub-50” HDTVs at normal viewing distances. The maximum audio bitrate is 320kbps, which is what I selected for the transcode. The resulting file was an 11.4GB .mp4 file that took 2 hours and 2 minutes to transcode at an average of 28.3 fps (note that this included the credits as well) on a GeForce GTX 280.


The Blu-ray original was mastered at 24 fps, so with Badaboom on a GTX 280 we can get greater than real-time transcoding.
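A quick back-of-the-envelope check of those numbers, assuming the transcode rate held steady across the whole feature: 28.3 fps against a 24 fps source works out to roughly 1.18x real time, and 11 Mbps video plus 320 kbps audio over a roughly 144-minute movie lands right around the reported file size (if that 11.4GB figure is in binary gigabytes):

```python
# Back-of-the-envelope check of the transcode numbers above.
SOURCE_FPS = 24.0        # Blu-ray master frame rate
TRANSCODE_FPS = 28.3     # measured average on a GeForce GTX 280

speedup = TRANSCODE_FPS / SOURCE_FPS  # ~1.18x faster than real time

# 2h02m of transcoding at 28.3 fps covers the whole feature:
frames = TRANSCODE_FPS * (2 * 3600 + 2 * 60)  # ~207,000 frames
runtime_min = frames / SOURCE_FPS / 60        # ~144 minutes of 24 fps video

# Expected output size at 11 Mbps video + 320 kbps audio:
size_gib = (11e6 + 320e3) * frames / SOURCE_FPS / 8 / 2**30  # ~11.4 GiB
```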

Delivering as promised, Elemental also enabled 1080p output with Badaboom 1.1. I suspect, however, that most people using it to back up their Blu-ray collection will choose to go down to 720p, since you can basically cut file sizes in half and maintain good enough quality for most HD displays.

The biggest limitation I see is that the output file is relatively useless on an HTPC. While Badaboom can provide a quick and easy way to rip a Blu-ray to a smaller, more backup-friendly format, you do lose the ability to preserve the DD/DTS audio tracks. Forget about lossless 7.1 support; I’m just talking about maintaining the basic DD/DTS 5.1 that your HTPC/receiver are already set up to play.

Elemental still dodges the bullet of having to be a real Blu-ray backup solution by not addressing the audio side of the equation, but it’s clear where Badaboom is headed. I’m still looking into how well it fares on the image quality side compared to x264, but for now it looks like Badaboom 1.1 has potential in this department.

The real competition will be Intel’s Core i7, which, thanks to its incredible encoding performance, can actually do quite well in the HD transcode department. NVIDIA has the advantage of its GPUs being much cheaper (you don’t have to buy a whole new platform), but at least at this point Intel has the quality advantage, given that the best audio/video transcoding tools on the market are still x86-only and without a CUDA counterpart.



Final Words

We're happy to see AMD including ATI Stream in their latest driver release. It's great that both NVIDIA and AMD are doing what they can to advance GPU computing right now. We still won't see any truly major strides made in consumer level applications until we have OpenCL and DirectX 11 to bring hardware agnostic general purpose data parallel programming to the masses, but getting tools (even proprietary ones) out there and in the hands of developers will definitely help.

We feel similarly about the marketability of ATI Stream as we do about CUDA. GPU computing is still only a niche and there aren't enough applications out there that really bring the kind of value to the consumer that we want and expect. The decision about which graphics card you are going to pick up shouldn't come down to ATI Stream and CUDA unless you are really into one of the applications out there that runs on one or both of these technologies. For the average gamer, we definitely recommend making your purchasing decisions on how hardware performs in the games you want to play.

All that said, we are very disappointed with AMD's Avivo video converter as a vehicle to show off ATI Stream. It is a poor application that provides little to no value in exchange for the immense frustration end users will have when trying to transcode video. It is not worth the time it takes to download or the space it takes up on your hard drive.

In the course of evaluating Avivo, our second look at Badaboom showed us a much better product than the one we previewed, one that adequately fills a niche and provides good support for quickly getting video onto an iPod or iPhone.

Badaboom 1.1 shows Elemental's commitment to the cause. Normally when I'm promised that things will get better and that features will be added, they don't. Or if they do, they take a long time. It is now less than four months since we first previewed Badaboom, and with version 1.1, much of what we asked for has been included. There's still a long way to go, and Elemental still has the difficult task of matching the quality of established encoders like x264 and MainConcept, but these past two revisions of Badaboom prove one thing: Elemental is serious and willing to listen to feedback.

No matter how you slice it though, Elemental has a much better product than AMD is offering with the Avivo video converter.

The 8.12 drivers in general do offer some fixes for problems we've had since October. Many of our readers noticed the string of somewhat negative jabs we took at Catalyst over the past few months. We'll spare everyone a redux, but just because this driver is more stable, feature complete, and includes some important outstanding hotfixes doesn't mean the problems AMD has with their approach to driver development have been solved. The train wreck that has been the last few months of Catalyst has happened before and it will happen again as long as AMD puts too many resources into pushing drivers out every month and not enough into making sure those drivers are of high enough quality.
