Multi-core support in Games?

Both Quake 4 and Call of Duty 2 now have SMP support, supposedly offering performance improvements on dual core and/or Hyper Threading enabled processors. 

For Call of Duty 2, you simply install the new patch and off you go; SMP support is enabled.  To verify, we ran our CoD 2 benchmark and kept a log of the total processor utilization over time.  Below is a shot of perfmon with a fresh install of CoD2 (sans SMP patch):

Note how the total CPU utilization for our dual-core testbed hovers right around 50%, with the maximum being just under 52% (the remaining 2% can be attributed to driver and other overhead that can eat up extra CPU cycles). 

Now, let's look at CoD2 CPU utilization with the SMP patch installed:

While the average CPU utilization only goes up by around 9%, the maximum CPU utilization increases tremendously, now up to 83%, showing us that the second core is being used. 
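To make the reasoning behind these totals explicit: on a dual-core system, the total CPU utilization that perfmon reports is roughly the average of the two cores, so one saturated core with an idle partner reads as about 50%. The short Python sketch below illustrates this. The per-core readings are hypothetical values chosen only to reproduce the totals quoted above; they are not measurements from our testbed.

```python
def total_utilization(per_core_percent):
    """Average per-core busy percentages into one system-wide figure."""
    return sum(per_core_percent) / len(per_core_percent)


# Hypothetical per-core readings (assumptions for illustration only).
one_core_busy = [100.0, 0.0]   # single-threaded game: one core saturated
smp_peak = [100.0, 66.0]       # SMP patch: second core doing real work

print(total_utilization(one_core_busy))  # 50.0
print(total_utilization(smp_peak))       # 83.0
```

Any total meaningfully above 50% on a dual-core machine therefore implies that the second core is doing work.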

We looked at performance at 1024x768. The higher the resolution, the smaller the impact of a faster CPU; conversely, the lower the resolution, the greater the impact, as the game becomes less GPU limited. 

To ensure a fair comparison, we tested using the SMP patch and simply disabled SMP manually by setting the r_smp_backend variable to "0".  We confirmed that SMP support was actually disabled by running perfmon and measuring CPU utilization. 

Call of Duty 2                                SMP Disabled   SMP Enabled
AMD Athlon 64 FX-57 (2.8GHz)                  80.6           N/A
AMD Athlon 64 X2 4800+ (2.4GHz)               79.8           70.3
AMD Athlon 64 X2 3800+ (2.0GHz)               78.7           68.1
Intel Pentium Extreme Edition 955 (3.46GHz)   79.8           68.4
Intel Pentium Extreme Edition 840 (3.2GHz)    78.1           68
Intel Pentium D 820 (2.8GHz)                  75.6           67.1

Surprisingly enough, we saw fairly large performance drops in CoD2 with SMP enabled on both AMD and Intel platforms.  This is unfortunate, but given that Quake 3's SMP support was withdrawn, it is less than shocking. We do expect things to get better as time goes on. 
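To put a number on those drops, the relative change can be worked out directly from the Call of Duty 2 table above. The following Python sketch does that arithmetic; the shortened CPU names and the `percent_change` helper are just illustrative labels, and the scores are copied from the table.

```python
# Scores copied from the Call of Duty 2 table (SMP disabled, SMP enabled).
cod2_results = {
    "Athlon 64 X2 4800+": (79.8, 70.3),
    "Athlon 64 X2 3800+": (78.7, 68.1),
    "Pentium EE 955": (79.8, 68.4),
    "Pentium EE 840": (78.1, 68.0),
    "Pentium D 820": (75.6, 67.1),
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


for cpu, (smp_off, smp_on) in cod2_results.items():
    print(f"{cpu}: {percent_change(smp_off, smp_on):+.1f}%")
```

By this measure, every dual-core chip in the table loses roughly 11% to 14% with SMP enabled.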

Quake 4 was a different story; with r_useSMP enabled, we saw some extremely large performance gains with the move to dual core:

Quake 4                                       SMP Disabled   SMP Enabled
AMD Athlon 64 FX-57 (2.8GHz)                  115.4          N/A
AMD Athlon 64 X2 4800+ (2.4GHz)               114.9          147.4
AMD Athlon 64 X2 3800+ (2.0GHz)               100.9          143.2
Intel Pentium Extreme Edition 955 (3.46GHz)   98.9           142.3
Intel Pentium Extreme Edition 840 (3.2GHz)    89.0           133.6
Intel Pentium D 820 (2.8GHz)                  80.6           125.5

Either the SMP patch spawns only two threads, or the instruction mix of Quake 4 with the patch does not mix well with Intel's Pentium EE 955.  Enabling Hyper Threading on the dual-core platform didn't do anything at all for performance. 
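The dual-core gains in the Quake 4 table can also be expressed as speedup ratios, which makes the chips easier to compare with one another. The Python sketch below computes them; the shortened CPU names and the `speedup` helper are illustrative labels, and the scores are copied from the table.

```python
# Scores copied from the Quake 4 table (SMP disabled, SMP enabled).
quake4_results = {
    "Athlon 64 X2 4800+": (114.9, 147.4),
    "Athlon 64 X2 3800+": (100.9, 143.2),
    "Pentium EE 955": (98.9, 142.3),
    "Pentium EE 840": (89.0, 133.6),
    "Pentium D 820": (80.6, 125.5),
}


def speedup(smp_off, smp_on):
    """Ratio of the SMP-enabled score to the SMP-disabled score."""
    return smp_on / smp_off


for cpu, (smp_off, smp_on) in quake4_results.items():
    print(f"{cpu}: {speedup(smp_off, smp_on):.2f}x")
```

The ratios run from about 1.28x on the X2 4800+ to about 1.56x on the Pentium D 820, so the chips with the lowest SMP-disabled scores gain the most here.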

While we're only looking at two games, this is a start for multithreaded game development.  You can expect to see a lot of examples where dual-core does absolutely nothing for gaming, but as time goes on, the situation will change. 

84 Comments

  • JarredWalton - Friday, December 30, 2005

    See above post. The 3800+ OC article has the BF2 benchmarks/tools in it.
  • bob4432 - Friday, December 30, 2005

    thanks, i had just found that. excellent tool ;). what is the difference between average fps and actual fps?
  • Spacecomber - Friday, December 30, 2005

    If you need more direction on how to go about creating and running a timedemo in BF2, take a look at this article over at overclockers.com.au (http://www.overclockers.com.au/article.php?id=3841...).

    The timedemo records the time it takes for each frame to be rendered over the course of the demo being run. It sums these times and divides by the number of frames to come up with an average. You end up with just one number standing in for a rather large collection of data. Some sites, such as hardocp, try to show more than just an average, usually by presenting a graph of the framerates over the length of the timedemo. This can be helpful, because when you are trying to evaluate how well a particular hardware setup will work with your favorite game, you really are looking to see whether it will maintain playable minimum framerates at the resolution and graphics settings that you want to use. An average alone only gives you a rough idea about this, though it does give you a quick and dirty way to compare different video cards in the same game setting.

    If you create and run a Battlefield 2 timedemo and look at the complete results, you'll see how very wide the range of framerates is. For example, running the timedemo, I have gotten an average of 50 fps, but the range is from 2 to 105 fps, with a standard deviation of 12.3. Graphing out the individual frame rates will let you see how often the frame rates drop below 20 fps, for example, which many would consider too low for online gaming.

    Here is a graph of a BF2 timedemo (http://www.sequoyahcomputer.com/Analysis/BF2memory...). It's for the data that gave me an average of 50 fps that I mentioned previously. Although 50 fps sounds like an ok average, looking at the graph, you can see that many might consider these settings on this hardware to be barely playable.

    Space
  • bob4432 - Saturday, December 31, 2005

    thanks, what program did you use to graph the data?
  • Spacecomber - Saturday, December 31, 2005

    The full results of the time demo are saved in a csv file, timedemo_framerates.csv, which can be opened with a spreadsheet program. I used the spreadsheet program in OpenOffice to view the data and eliminate the framerates that are erroneously recorded before the actual gameplay demo has begun (they are easy to recognize, since they are at the beginning of the data and unnaturally high), and I also used the spreadsheet program to graph the data.

    Space
  • JarredWalton - Friday, December 30, 2005

    I believe Anand is using the same benchmark that I linked in my Overclocking article (http://www.anandtech.com/cpuchipsets/showdoc.aspx?...). He's probably running the 1.12 version now, which would account for the slightly lower scores than what I got with the 1.03 version and demo files. BF2 is VERY GPU limited, so even at 1024x768 you will start to hit FPS limits on high-end systems. You can see in the above page how FPS scaled with CPU speed on an X2 3800+ chip, and I only improved average frame rates by 18% with a 35% overclock at 1024x768. That dropped to 8% at 1280x1024 and less than 4% at 1600x1200 and above.
  • danidentity - Friday, December 30, 2005

    Has there been any official word on whether or not 975X will support Conroe?
  • coldpower27 - Friday, December 30, 2005

    A 975X Rev 2.0 is probably needed. However, the i965 chipset series will for sure, as they are rumored to be launched simultaneously.
  • Shintai - Friday, December 30, 2005

    You gonna need i965 I bet for sure, especially if Conroe gonna use a 1333MHz bus.

    However, Merom should fit in Yonah Socket (Conroe mobile part)
  • Beenthere - Friday, December 30, 2005

    Every hardware site that has tested the power consumption and operating temps of Presler knows full well this is a 65 nano FLAME THROWER almost making the P4 FLAME THROWER look good by comparison. "Normal" operating temps of 80 C are OUTRAGEOUS as is equal or higher power consumption than the FLAME THROWING P4 series. And as the benches show -this is a Hail Mary approach by Intel to baffle the naive with B.S. No one with a clue would touch this inferior CPU design. And to add insult to injury, after the Paper Launch -- when they are actually available for purchase in Feb. or later, the asking price is $999. Yeah, I'll run right out and buy a truckload of Preslers to use for space heaters in my house...
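
The timedemo discussion in the comments above describes how an average frame rate is derived from per-frame render times, and why the minimum, the spread, and the share of slow frames matter as well. A minimal Python sketch of that calculation follows. The `summarize` function, its 20 fps default threshold (the figure mentioned in the comment), and the sample frame times are all illustrative assumptions; this is not the BF2 tool's own code or data.

```python
import statistics


def summarize(frame_times_s, threshold_fps=20.0):
    """Summarize a list of per-frame render times given in seconds."""
    fps = [1.0 / t for t in frame_times_s]
    return {
        # Frames divided by total time, i.e. the single "average" number.
        "average_fps": len(frame_times_s) / sum(frame_times_s),
        "min_fps": min(fps),
        "max_fps": max(fps),
        # Population standard deviation of the per-frame rates.
        "stdev_fps": statistics.pstdev(fps),
        # Fraction of frames rendered slower than the threshold.
        "share_below_threshold": sum(f < threshold_fps for f in fps) / len(fps),
    }


# Made-up frame times, for illustration only.
sample = [0.010, 0.020, 0.020, 0.100]
for name, value in summarize(sample).items():
    print(f"{name}: {value:.2f}")
```

Even in this tiny sample, the average of about 27 fps hides a frame that ran at only 10 fps, which is the point the comment makes about averages.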
