Rendering and HPC Benchmark Session Using Our Best Servers

Author: Johan De Gelas

by Johan De Gelas on September 30, 2011 12:00 AM EST


STARS Euler3D CFD

The STARS Euler3D CFD benchmark became popular thanks to Scott of Techreport.com. It is a computational fluid dynamics (CFD) benchmark based on the STARS Euler3D structural analysis routines developed at CASELab, the Computational AeroServoElasticity Laboratory at Oklahoma State University. Since Scott has used the benchmark for years, we felt it was a good place to start our HPC benchmarking adventure: we could check whether our results are in the right ballpark.

The benchmark is downloadable and described in great detail here. The benchmark score is reported as a CFD cycle frequency in Hertz, with higher results being better.
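As a rough illustration of what a score in Hertz means, the sketch below divides a number of completed CFD cycles by the elapsed wall-clock time. The function name and the sample numbers are hypothetical and are not taken from the benchmark's own code or output.

```python
def cycle_frequency_hz(cycles_completed: int, elapsed_seconds: float) -> float:
    """Return completed cycles per second (Hz); higher is better.

    Hypothetical helper: the real benchmark computes and prints its own score.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return cycles_completed / elapsed_seconds


if __name__ == "__main__":
    # Made-up example: 10 cycles finished in 4 seconds of wall-clock time.
    print(cycle_frequency_hz(10, 4.0))
```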

Stars Euler 3D CFD: maximum score

The Xeon E7 scales quite nicely, provided that you disable Hyper-Threading. The benchmark can take advantage of Hyper-Threading, as the dual Xeon system shows. However, all threads work on the same data grid, so the more threads there are, the more lock contention rears its ugly head. Here's a more detailed look at scaling with the number of threads:

The Hyper-Threading enabled Xeon X5670 performs worse than the non-HT setup until we run more than 12 threads. Beyond that point, Hyper-Threading offers a decent performance boost (17%). The benchmark, however, does not scale well enough to take advantage of 80 threads. Hyper-Threading improves resource utilization, but that does not offset the overhead of running 80 threads. Once we pass 40 threads on the E7-4870, performance starts to level off and even drops.
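For readers who want to run the same kind of scaling analysis on their own results, here is a minimal sketch of the arithmetic involved: the percentage gain between two scores, the parallel efficiency at a given thread count, and the thread count at which the score peaks. All function names are hypothetical, and the scores in the example are placeholders chosen for illustration, not our measured results.

```python
from typing import Mapping


def percent_gain(baseline_hz: float, candidate_hz: float) -> float:
    """Relative improvement of candidate over baseline, in percent."""
    return (candidate_hz / baseline_hz - 1.0) * 100.0


def parallel_efficiency(score_1_thread: float, score_n_threads: float, n: int) -> float:
    """Fraction of ideal linear scaling achieved with n threads (1.0 = perfect)."""
    return score_n_threads / (score_1_thread * n)


def best_thread_count(scores: Mapping[int, float]) -> int:
    """Thread count with the highest score; beyond it, scaling has leveled off."""
    return max(scores, key=lambda threads: scores[threads])


if __name__ == "__main__":
    # Placeholder scores in Hz, keyed by thread count (NOT measured data).
    scores = {1: 1.0, 12: 9.0, 24: 10.53, 40: 10.0}
    print(f"gain from 12 to 24 threads: {percent_gain(scores[12], scores[24]):.1f}%")
    print(f"efficiency at 12 threads: {parallel_efficiency(scores[1], scores[12], 12):.2f}")
    print(f"best thread count: {best_thread_count(scores)}")
```

Plotting efficiency against thread count makes the point where contention starts to dominate easier to spot than raw scores alone.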

Of course, you are probably more interested in the other server result. What happened to the Opteron scores? Why is the 48 core Opteron five times slower than the 40 core Xeon E7? Let's investigate further.


derrickg - Friday, September 30, 2011 - link
Would love to see them benchmarked using such a powerful machine.
JohanAnandtech - Friday, September 30, 2011 - link
Suggestions how to get this done?
derrickg - Friday, September 30, 2011 - link
simple benchmarking: http://www.linuxhaxor.net/?p=1346

I am sure there are much more advanced ways of taking benchmarks on chess engines, but I have long since dropped out of those circles. Chess engines usually scale very well from 1P and up.
JPQY - Saturday, October 1, 2011 - link
Hi Johan,

Here is my link showing how people can test with chess calculations in a very simple way!

http://www.xtremesystems.org/forums/showthread.php...

If you are interested you can always contact me.

Kind regards,
Jean-Paul.
JohanAnandtech - Monday, October 3, 2011 - link
Thanks Jean-Paul, Derrick, I will check your suggestions. Great to see the community at work :-).
fredisdead - Monday, April 23, 2012 - link
http://www.theinquirer.net/inquirer/review/2141735...

dear god, at last the truth. Interlagos is 30% faster

hey anand, whats up with YOUR testing.
fredisdead - Monday, April 23, 2012 - link
everybody, the opteron is 30% faster

http://www.theinquirer.net/inquirer/review/2141735...

follow the intel ad bucks ... lol
anglesmith - Friday, September 30, 2011 - link
i was in a similar situation on a 48 core opteron machine.

without numa my app was twice as slow as on a 4 core i7 920. then did a test with same number of threads but with 2 sockets (24 cores), the app became faster than with 48 cores :~
then found the issue is all with numa, which is not a big issue if you are using a 2 socket machine.

once i coded the app to be numa aware the app is 6 times faster.

i know there are few apps that are both numa aware and scale to 50 or so cores but ...
tynopik - Friday, September 30, 2011 - link
benhcmark

like it Phenom
JoeKan - Friday, September 30, 2011 - link
I'd love to see single core workstations used as baseline comparisons. When using a server to render, I'd be wondering which would be more cost effective for rendering animations. Maybe use an animation sequence as a render performance test.
