Other Multi-Core Benefits

The benefits of multi-core architectures are not limited to the games themselves. In fact, for many companies the benefits to in-house developers can be far more important than the improvements made to the gaming world. Where actual games have to worry about targeting many different levels of computer systems, the companies that create games often work from a smaller selection of high-end systems. We talked earlier about how content creation is typically a task that can more readily take advantage of multithreading, and the majority of workstation-level applications are designed to leverage multiprocessor systems. At the high end of the desktop computing segment, the line between desktop computers and workstations has begun to blur, especially with the proliferation of dual and now quad core chips. Where in the past one of the major benefits of a workstation was often having dual CPU sockets, you can now get up to four CPU cores without moving beyond a desktop motherboard. There are still reasons to move to a workstation -- for example, the additional memory capacity that is usually available -- but if all you need is more CPU power, there's a good chance you can get by with a dual or quad core desktop instead.

Not surprisingly, Valve tends to have computers that are closer to the top-end desktop configurations currently available. We asked what sort of hardware most of their developers were running, and they said a lot of them are using Core 2 Duo systems. However, they have been holding off on upgrading many systems while they waited for Core 2 Quad to become available. They were able to do some initial testing using quad core systems, and for their work the benefits were so tremendous that it made sense to hold off until Intel launched the new processors. (For those of you who are wondering, on the graphics side of the equation, Valve has tried to stay split about 50-50 between ATI and NVIDIA hardware.)

One of the major tools that Valve uses internally is a service called VMPI (Valve Message Passing Interface). This utility allows Valve to make optimal use of all the hardware present in their offices, somewhat like a distributed computing project, by sending work units to other systems on the network running the VMPI service. Certain aspects of content creation can take a long time, for example the actual compilation (i.e. visibility and lighting calculations) of one of their maps. Anyone who has ever created levels for just about any first-person shooter can attest to the amount of time this process takes. It still takes a lot of effort to design a map in the first place, but in level design there's an iterative cycle of designing, compiling, testing, and then going back to the drawing board that plays a large role in perfecting a map. The problem is, once you're down to cleaning up the last few issues, you may only need to spend a couple minutes tweaking a level, and then you need to recompile and test it inside the game engine. If you are running on a single processor system -- even one of the fastest single processor systems -- it can take quite a while to recompile a map.
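
VMPI itself is proprietary and its API isn't documented here, but the master/worker shape it uses -- split the map into independent work units, farm them out, collect the results -- is easy to sketch. The following is a purely illustrative Python sketch with hypothetical function names, using a local process pool as a stand-in for machines on the office network:

```python
from concurrent.futures import ProcessPoolExecutor

def compile_chunk(chunk_id):
    """Stand-in for one unit of lighting/visibility work.

    In the real pipeline this would be an expensive radiosity or
    visibility pass over one section of the map.
    """
    return chunk_id, sum(i * i for i in range(10_000))

def compile_map(num_chunks, workers=4):
    """Split the map into independent work units and distribute them.

    A process pool plays the role of the office network here; VMPI
    sends comparable work units to idle machines instead of local cores.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Each (chunk_id, result) pair comes back as workers finish.
        return dict(pool.map(compile_chunk, range(num_chunks)))

if __name__ == "__main__":
    lit = compile_map(num_chunks=16)
    print(f"compiled {len(lit)} chunks")
```

The key property that makes this pattern work is that the work units are independent: no chunk needs another chunk's output, so throughput scales with however many idle machines (or cores) happen to be available.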

The VMPI service was created to allow Valve to leverage all of the latent computational power that was present in their offices. If a computer is sitting idle, which is often the case for programmers who are staring at lines of code, why not do something useful with the CPU time? Valve joked about how VMPI has become something of a virus around the offices, getting replicated onto all of the systems. (Yes, it can be shut off, for those that are wondering.) Jokes aside, creating a distributed, multithreaded utility to speed up map compilation times has certainly helped the level creators. Valve's internal VRAD testing can be seen below, and we will have the ability to run this same task on individual systems as a benchmark.


Running as a single thread on a Core 2 processor, a 2.67 GHz QX6700 is already 36% faster than a 3.2 GHz Pentium 4. Enabling multithreading makes the Kentsfield processor nearly 5 times as fast. Looking at distributing the work throughout the Valve offices, 32 old Pentium 4 systems are only ~3 times faster than a single Kentsfield system (!), and more importantly 32 Kentsfield systems are still going to be 5 times faster than the 32 P4 systems. In terms of real productivity, the time it takes Valve's level designers to compile a map can now be reduced to about half a minute, where a couple of years back it might have been closer to 30 minutes. The level designers no longer have to waste time waiting for the computers to prepare their level for testing; 30 seconds isn't even enough time to run to the bathroom and come back! Over the course of a project, Valve states that they should end up saving "thousands of hours" of time. When you consider how much most employees are paid, the costs associated with upgrading to a quad core processor could easily be recouped in a year or less. Mod authors with higher-end systems will certainly appreciate the performance boost as well. We will take a closer look at performance scaling of the VRAD map compilation benchmark on a variety of platforms in a moment.
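
As a quick sanity check on the figures quoted above (treating them as given, since the underlying chart data isn't reproduced here), the single-thread and multithreaded gains compound roughly as follows:

```python
# Normalize the 3.2 GHz Pentium 4 to 1 unit of compile time.
p4_time = 1.0
qx6700_1t = p4_time / 1.36   # QX6700 single-thread: 36% faster
qx6700_4t = qx6700_1t / 4.9  # "nearly 5 times as fast" with all four cores

# Combined speedup of one Kentsfield over one Pentium 4
speedup = p4_time / qx6700_4t
print(f"one quad core vs one P4: {speedup:.1f}x")  # 1.36 * 4.9 ≈ 6.7x

# The quoted productivity gain: ~30 minutes down to ~30 seconds
farm_speedup = (30 * 60) / 30
print(f"map compile, then vs now: {farm_speedup:.0f}x")  # 60x
```

The interesting point is that the two numbers multiply: the per-core IPC improvement and the extra cores are independent gains, which is why a single Kentsfield ends up closing so much of the gap on a 32-machine Pentium 4 farm.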

55 Comments

  • Nighteye2 - Wednesday, November 8, 2006 - link

    Ok, so that's how Valve will implement multi-threading. But what about other companies, like Epic? How does the latest Unreal Engine multi-thread?
  • Justin Case - Wednesday, November 8, 2006 - link

    Why aren't any high-end AMD CPUs tested? You're testing 2GHz AMD CPUs against 2.6+ GHz Intel CPUs. Doesn't Anandtech have access to faster AMD chips? I know the point of the article is to compare single- and multi-core CPUs, but it seems a bit odd that all the Intel CPUs are top-of-the-line while all AMD CPUs are low end.
  • JarredWalton - Wednesday, November 8, 2006 - link

    AnandTech? Yes. Jarred? Not right now. I have a 5000+ AM2, but you can see that performance scaling doesn't change the situation. 1MB AMD chips do perform better than 512K versions, almost equaling a full CPU bin - 2.2GHz Opteron on 939 was nearly equal to the 2.4GHz 3800+ (both OC'ed). A 2.8 GHz FX-62 still isn't going to equal any of the upper Core 2 Duo chips.
  • archcommus - Tuesday, November 7, 2006 - link

    It must be a really great feeling for Valve knowing they have the capacity and capability to deliver this new engine to EVERY customer and player of their games as soon as it's ready. What a massive and ugly patch that would be for virtually any other developer.

    Don't really see how you could hate on Steam nowadays considering things like that. It's really powerful and works really well.
  • Zanfib - Tuesday, November 7, 2006 - link

    While I design software (so not so much programming as GUI design and whatnot), I can remember my University courses dealing with threading, and all the pain threading can bring.

    I predicted (though I'm sure many could say this and I have no public proof) that Valve would be one of the first to do such work; they are a very forward-thinking company with large resources (like Google--if they want to work on ANYthing, they can), a great deal of experience, and (as noted in the article) the content delivery system to support it all.

    Great article about a great subject, goes a long way to putting to rest some of the fears myself and others have about just how well multi-core chips will be used (with the exception of Cell, but after reading a lot about Cell's hardware I think it will always be an insanely difficult chip to code for).
  • Bonesdad - Tuesday, November 7, 2006 - link

    mmmmmmmmm, chicken and mashed potatoes....
  • Aquila76 - Tuesday, November 7, 2006 - link

    Jarred, I wanted to thank you for explaining in terms simple enough for my extremely non-technical wife to understand why I just bought a dual-core CPU! That was a great progression on it as well, going through the various multi-threading techniques. I am saving that for future reference.
  • archcommus - Tuesday, November 7, 2006 - link

    Another excellent article, I am extremely pleased with the depth your articles provide, and somehow, every time I come up with questions while reading, you always seem to answer exactly what I was thinking! It's great to see you can write on a technical level but still think like a common reader so you know how to appeal to them.

    With regards to Valve, well, I knew they were the best since Half-Life 1 and it still appears to be so. I remember back in the days when we weren't even sure if Half-Life 2 was being developed. Fast forward a few years and Valve is once again revolutionizing the industry. I'm glad HL2 was so popular as to give them the monetary resources to do this kind of development.

    Right now I'm still sitting on a single core system with XP Pro and have lots of questions bustling in my head. What will be the sweet spot for Episode 2? Will a quad core really offer substantially better features than a dual core, or a dual core over a single core? Will Episode 2 be fully DX10, and will we need DX10 compliant hardware and Vista by its release? Will the rollout of the multithreaded Source engine affect the performance I already see in HL2 and Episode 1? Will Valve actually end up distributing different versions of the game based on your hardware? I thought that would not be necessary due to the fact that their engine is specifically designed to work for ANY number of cores, so that takes care of that automatically. Will having one core versus four make big graphical differences or only differences in AI and physics?

    Like you said yourself, more questions than answers at this point!
  • archcommus - Tuesday, November 7, 2006 - link

    One last question I forgot to put in. Say it was somehow possible to build a 10 or 15 GHz single core CPU with reasonable heat output. Would this be better than the multi-core direction we are moving towards today? In other words, are we only moving to multi-core because we CAN'T increase clock speeds further, or is this the preferred direction even if we could?
  • saratoga - Tuesday, November 7, 2006 - link

    You got it.

    A higher clock speed processor would be better, assuming performance scaled well enough anyway. Parallel hardware is less general than serial hardware at increasing performance, because it requires parallelism to be present in the workload. If the work is highly serial, then adding parallelism to the hardware does nothing at all. Conversely, even if the workload is highly parallel, doubling serial performance still doubles performance. Doubling the width of a unit could double the performance of that unit for certain workloads, while doing nothing at all for others. In general, if you can accelerate the entire system equally, doubling serial performance will always double program speed, regardless of the program.

    That's the theory anyway. Practice says you can only make certain parts faster. So you might get away with doubling clock speed, but probably not halving memory latency, so your serial performance doesn't scale like you'd hope. Not to mention increasing serial performance is extremely expensive compared to parallel performance. But if it were possible, no one would ever bother with parallelism. It's a huge pain in the ass from a software perspective, and it's becoming big now mostly because we're starting to run out of tricks to increase serial performance.
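
The tradeoff described in that last exchange is Amdahl's law: the fraction of a workload that parallelizes caps the speedup extra cores can deliver, while a faster serial clock (if one could be built) would speed everything equally. A minimal illustration:

```python
def amdahl(parallel_fraction, n_cores):
    """Speedup when only part of the work parallelizes (Amdahl's law)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

# A workload that is 90% parallel tops out well below the core count:
print(f"{amdahl(0.9, 4):.2f}x on 4 cores")       # ~3.08x
print(f"{amdahl(0.9, 1000):.2f}x on 1000 cores") # ~9.91x -- the serial 10% dominates
# A hypothetical 2x clock increase, by contrast, doubles throughput
# regardless of how parallel the workload is.
```

This is exactly why map compilation scales so well across Valve's office network (the lighting work is almost entirely parallel) while many game-logic workloads do not.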
