CausticTwo, the Long Term, and Preliminary Thoughts

Looking toward the future, Caustic Graphics will bring out the CausticTwo next year. The major difference with this hardware will be the replacement of the FPGAs with ASICs (application-specific integrated circuits: silicon chips like a CPU or a GPU). This will enable an estimated additional 14x performance improvement, as ASICs can run much faster than FPGAs. We could also see more RAM on board. This would bring the projected performance to over 200x the speed of current CPU-based raytracing.

Of course, next year CPUs will be faster, but based on that kind of projection we are still looking at about two orders of magnitude more performance than CPU based algorithms. This means that instead of seconds per frame, we can start talking about frames per second. Unless we want even more photorealistic images. That will still take a very long time.
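To put the "seconds per frame to frames per second" point in numbers, here is a minimal sketch. It assumes the projected 200x figure applies uniformly to a whole frame, which is an idealization. The function name and the sample CPU render times are hypothetical, chosen only for illustration, and are not Caustic figures.

```python
def accelerated_fps(cpu_seconds_per_frame: float, speedup: float = 200.0) -> float:
    """Frames per second after applying a flat speedup to a CPU render time.

    Assumption: the speedup applies uniformly to the whole frame.
    """
    if cpu_seconds_per_frame <= 0 or speedup <= 0:
        raise ValueError("render time and speedup must be positive")
    return speedup / cpu_seconds_per_frame


# Hypothetical CPU render times, in seconds per frame.
for cpu_time in (1.0, 10.0, 600.0):
    fps = accelerated_fps(cpu_time)
    print(f"{cpu_time:7.1f} s/frame on CPU -> {fps:7.2f} frames/s at 200x")
```

Under these assumptions, a frame that takes 10 seconds on a CPU would run at 20 frames per second, while a 10 minute frame would still take about 3 seconds, which is the "even more photorealistic images" case.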

The CausticTwo will also be available to end users. Hopefully by this time raytracing plugins for 3D Studio Max, Maya, and all the other content creation tools that some prosumers and students dabble in will be hardware accelerated on Caustic Graphics hardware. And maybe at this point we'll start to see some realtime raytracing engine demos. Maybe.

Planting their flag firmly in film, video and advanced visualization markets makes the most sense and holds the most potential for long term viability. Jumping completely into games won't be the best way to go at this point -- it needs to be either a gradual adoption or they need to get their hardware into future game consoles. Pushing PC gaming before console adoption will likely prove just as difficult for Caustic as it did for Ageia, and might not be the best use of resources. Especially if they can carve out a niche in the higher end space.

But they do have their eye on games at some point and are already talking about game consoles. While hardware, service and support for render farms, large scale visualization and those who need the hyper-realism that raytracing can offer has the potential to create a sustainable business, conceiving a piece of hardware that becomes nearly required for gaming (like the GPU) would be the holy grail in this case. It's not likely, but you can bet it's at the back of their mind. Staying focused on more modest goals is definitely a better way to stay in business though.

But they could go another direction. They could try to get themselves acquired by a 3rd party like Ageia did. Of course, NVIDIA killed Ageia's hardware business, and it would be nice if Caustic's hardware technology survived any acquisition. But that is oftentimes how these things go. We'll simply have to wait and see.

There is another factor looming on the horizon as well. As we mentioned earlier, raytracing is very branch heavy, memory dependent and compute heavy. It's a beast of an algorithm that seems to always have a bottleneck no matter what it is running on. Though it will still be a while before we have hardware, Larrabee might just as well be a solution to the raytracing problem. The Larrabee architecture tries to blend some of the CPU and GPU approaches to processing, and the hybrid may enable a platform that competes with Caustic when it hits the scene. Memory organization and size are probably still going to favor Caustic, but we've continually heard rumblings that raytracing on Larrabee will be where it's at. It will certainly be interesting to compare the two approaches when they both arrive.

Beyond Larrabee, the long term plan for many-core CPUs could include application specific processors. We will see combined CPUs and GPUs in the near future, and maybe we'll see dedicated raytracing units integrated as one or more of the many cores on a CPU down the road. The really long term picture is a bit more fuzzy, but Caustic has short term potential in the markets that need all the power they can get.

For now, we don't have hardware and we don't have developer feedback either. Caustic is going to get us a copy of their SDK so we can play around with it a bit and evaluate it. But as for knowing how applicable or useful Caustic Graphics hardware will be in the real world, we just don't have the information we need yet.

Here's to hoping for the best.


48 Comments


  • jido - Thursday, April 30, 2009 - link

    With fast branching and parallel computing, as well as a good amount of RAM, there has to be other applications for this card. Could you do encryption/decryption or AI maybe?
  • lemonadesoda - Thursday, April 30, 2009 - link

    Wouldn't the "clearspeed e710" PCI board outperform this? The hardware is there. This company should focus on the software libraries.

    The CATS™ 700 1U rack module contains 12 of these and delivers over 1 teraflops. With the right software you probably could do near-real time raytracing (seconds rather than minutes or hours per scene). Farm the CATS 700 and yes you could do real time raytracing on a one or two second "lag". That is, use 20 of these things, all rendering separate frames, one frame apart. It will take you a few seconds to build the first scene, but then you will have the rest ready to show real-time.

    How to manage this? Well that's the software issue these guys should be working on... NOT reinventing hardware that is already out there.

    AFTER they get it working on CATS then perhaps they might consider developing their own optimised hardware. But that should come SECOND not FIRST.

    Back to MBA skool boyz.
  • kyleb2112 - Saturday, April 25, 2009 - link

    If this can appreciably speed up raytracing, it'll find a market in the same demo that buys workstation graphics cards. But the way cpu cores are multiplying, they'll have to hurry up. Nothing loves multicores more than 3D rendering, and once we've got 32-core boxes this tech may be obsolete.
  • Draven31 - Thursday, April 23, 2009 - link

    ART PURE/Renderdrive anyone? It's been done, and the last couple times it has been tried, it was cheaper to buy six render nodes that ended up being the same speed, within six months. If it's gonna be a year before they are even shipping cards to consumers, I doubt they are going to get *anywhere*
  • 7Enigma - Wednesday, April 22, 2009 - link

    ...and no one has a clue what that will be. Come on people, this article was little more than a puff press piece; interesting to read and make geeks giddy, but no actual substance. To be honest, this should be a blog post. Yes the description of the differences between ray-tracing and rasterization is nice and all, but there is no meat to this product at the time being.

    So no, it's not 20X faster, or 100X faster, or 2% faster, it doesn't yet exist and until independent testing has been done, I won't believe a word I read.
  • simtex - Wednesday, April 22, 2009 - link

    To all the people who claim this is a bad idea and that Intel will just copy it and make it run on their CPU: the CPU has other tasks too. If ray-tracing is to be used in games, I would be rather annoyed if I couldn't get physics and AI calculations because my CPU had to do all the rendering work. So a card to offload some of the computations would surely be a nice addition. Of course, Intel could then cooperate with NVIDIA and offload physics and AI calculations to the graphics card, but that doesn't seem very likely atm.

    Also, Caustic doesn't claim that this is a production board; in fact they state that this is a prototype, and that their final product will use ASICs, not FPGAs. Furthermore, Caustic's design does in fact consider the bandwidth requirements for ray-tracing. Actually they claim that their algorithms are specially designed to cope with the limited bandwidth, and that this is their major achievement. Personally I think they use some sort of ray-bundling; although this has also been implemented in software ray-tracers, they must have invented some new tricks to make it even better.

    Another great aspect of ray-tracing is that the frame rate is more dependent on the number of pixels you wish to render than on the number of triangles in the scene, in contrast to rasterization.
  • ssj4Gogeta - Wednesday, April 22, 2009 - link

    Well I don't know much about all this, but I saw their video. The co-founder says that ray-tracing isn't a compute problem anymore, and that they looked at it in a different way. So, I'm wondering, if they're using a new algorithm or something that's making all the difference, can't Intel use their general purpose Larrabee to simulate that?
  • simtex - Thursday, April 23, 2009 - link

    Depends on whether the algorithm is publicly available. As I understand it, Caustic claims that it's their own algorithm, and it's probably patented.
  • slusallek - Wednesday, April 22, 2009 - link

    It is strange to see that someone proposes a hardware architecture that by design is bandwidth limited. They have separated ray tracing at the point where the bandwidth is highest -- which does not seem like a really smart move.

    By doing the ray traversal and intersection on their card and the shading on the GPU they must constantly transfer ray data between the two: transfer the hit point of each ray/pixel to the GPU (point, normal, texture coordinates, shader ID, ...) and then transfer any rays newly generated by a shader back to their chip (origin, direction, min/max_dir, ...). Each of those transfers is easily between 30 to 60 bytes per ray.

    So for a HD screen this is easily 100MB for just a single ray generation and only one sample per pixel. Given a PCI 1.0 4x bandwidth of 1GB/s this gives a theoretical maximum of just 5 fps -- and we have not done any work yet. No AA, no shadow rays, reflections, they all generate multiples of this in bandwidth. Even with PCI 3.0 and 16x lanes this will be a huge bandwidth issue.

    Let me also just point out that one of the first RTRT papers on visualizing car headlights (http://graphics.cs.uni-sb.de/Publications/2002/200...) already achieved up to 10 fps at the same video resolution. Note that the headlight used up to 25 rays per pixel and lots of complex multiple reflection and refraction. It ran on a cluster of 16 nodes with dual CPU single-core Athlon 1800s providing a total of 32 cores.

    Compare this with the latest machine used by Caustic, likely with dual quad-core CPUs giving something like 16 cores (with hyperthreading), each one easily providing a multiple of the FLOPS of the old Athlon. So it seems their performance is not even reaching what was already done 7 years ago.

    So in summary, I am not really impressed by what Caustics claims. Their hardware architecture is severely limited by design and their software results are way behind what has been done many years ago.
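
The bandwidth estimate in the comment above can be reproduced with a short calculation. This is a minimal sketch under stated assumptions: "HD screen" is taken to mean 1920x1080, the per-ray transfer is taken as 48 bytes (a value inside the 30 to 60 byte range the comment quotes), and the link is the 1GB/s figure the comment gives. The function and parameter names are hypothetical. One reading that lands near the quoted 5 fps counts two transfers per ray (hit data out to the GPU, a new ray back to the card) over the same 1GB/s.

```python
def max_fps(width: int, height: int, bytes_per_ray: int,
            link_bytes_per_second: float, transfers_per_ray: int = 1) -> float:
    """Upper bound on frame rate when the bus link is the only limit.

    Assumes one primary ray per pixel and no compute time at all.
    """
    bytes_per_frame = width * height * bytes_per_ray * transfers_per_ray
    return link_bytes_per_second / bytes_per_frame


# Assumed values: 1920x1080, 48 bytes per transfer, 1 GB/s link.
WIDTH, HEIGHT, BYTES_PER_RAY, LINK = 1920, 1080, 48, 1e9

frame_mb = WIDTH * HEIGHT * BYTES_PER_RAY / 1e6
one_way = max_fps(WIDTH, HEIGHT, BYTES_PER_RAY, LINK)
round_trip = max_fps(WIDTH, HEIGHT, BYTES_PER_RAY, LINK, transfers_per_ray=2)

print(f"data per frame, one direction: {frame_mb:.1f} MB")
print(f"fps bound, one transfer per ray: {one_way:.2f}")
print(f"fps bound, two transfers per ray: {round_trip:.2f}")
```

With these inputs the per-frame data comes to just under 100MB in one direction, and the bound is about 10 fps for a single transfer per ray or about 5 fps for the round trip. Secondary rays, shadow rays and antialiasing samples would multiply the traffic and lower the bound further.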
  • nubie - Tuesday, April 21, 2009 - link

    I for one am looking forward to a time when games no longer have ugly polygons.

    Even in recent games cylindrical objects are pictured as comprised of as few as 8 sides.

    If you can send the raw math into the equation for the rays to hit it will make the whole thing much more powerful.


    I would love to see this and Larrabee succeed, progress and competition are always good for the consumer.
