There are Benchmarks, and there are Benchmarks…



I’m a fan of benchmarks. I think they are very helpful in allowing consumers to make informed purchase decisions about products. But they generally have some flexibility built into them so you can focus on those elements you want. And this means you can use a benchmark to tell a number of stories – which means you can choose to tell the story you want.

For example, take a recent review by AnandTech entitled “Sixteen Cores, Four Sockets,” published on June 17, 2008. The article featured Quad-Core AMD Opteron™ processor-based systems, and one of its performance evaluations was a SPECjbb2005 benchmark estimate. What is particularly interesting is that the published estimates list the four-socket server running AMD Opteron processors model 8356 as 25% faster than the competition at similar frequencies and 7% faster than the fastest competitive solution. These results differ widely from the official scores posted on the SPEC site. Now you might ask – how can that be? How can you run what is considered to be an industry-standard benchmark and get a different set of numbers? That can’t be right!

Taking a closer look at the SPECjbb2005 benchmark helps to unravel this mystery. SPECjbb2005 is a memory-intensive benchmark that is intended to evaluate the performance of servers running typical Java business applications. Its results evaluate the interaction of the CPU, caches, memory hierarchy, JVM (Java Virtual Machine), and JIT (Just-In-Time) compiler. SPECjbb2005 can be configured to run in a variety of ways, resulting in different performance outcomes. Different configuration = different story. For example, you can get different results based on the operating system used, the version of JVM used, the level of optimization of the JVM and JIT, JVM tuning options, and thread allocations.
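To make the "different configuration = different story" point concrete, here is a minimal sketch of how the same two systems can swap places depending on the setup. The scores, configuration names, and system names are all hypothetical placeholders chosen for illustration; they are not SPECjbb2005 results or estimates from any of the articles mentioned here.

```python
def percent_faster(score_a: float, score_b: float) -> float:
    """Return how much faster system A is than system B, in percent.

    Scores are throughput-style numbers where higher is better.
    A negative result means system A is slower.
    """
    if score_b <= 0:
        raise ValueError("baseline score must be positive")
    return (score_a / score_b - 1.0) * 100.0


# Hypothetical scores for the same two systems under two configurations.
# These are made-up numbers, not measured or published results.
scores = {
    "aggressively tuned": {"system_a": 300_000, "system_b": 330_000},
    "out-of-the-box": {"system_a": 250_000, "system_b": 200_000},
}


def summarize(all_scores: dict) -> dict:
    """Map each configuration to system A's lead over system B, in percent."""
    return {
        config: round(percent_faster(run["system_a"], run["system_b"]), 1)
        for config, run in all_scores.items()
    }


if __name__ == "__main__":
    for config, delta in summarize(scores).items():
        print(f"{config}: system A vs. system B = {delta:+.1f}%")
```

With these placeholder numbers, system A trails in one configuration and leads in the other, which is why any relative claim needs the configuration stated alongside it.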

The SPECjbb2005 scores published by SPEC tend to be achieved using very aggressive software tuning and processor settings. These settings help to achieve a “best possible score” but do not necessarily reflect how a system would be configured in a data center environment to provide the most stable and efficient performance. The scores published in the AnandTech article, according to the author, are more likely to reflect real-world configurations, with the same optimizations applied consistently across the different processor architectures.

Indeed – if you survey the internet, you can find references to other SPECjbb2005 scores and estimates that reflect a variety of configuration options and the resulting differences in the benchmark scores:

http://blogs.sun.com/bmseer/entry/sun_fire_x4440_best_opteron

A blog post featuring SPECjbb2005 results for the four-socket Sun Fire x4440 running quad-core AMD Opteron processors with Solaris 10 and the Sun JVM. It also highlights the power consumption of the featured systems – reminding us that in today’s economy of escalating energy costs, raw performance has less meaning to data centers than performance per watt.

http://techreport.com/articles.x/13176/4

An article by TechReport featuring SPECjbb2005 estimates for two-socket servers running quad-core processors with Windows Server 2003 x64 Edition and the Sun JVM. The author states that the goal of this performance evaluation was to test relative performance on an equal footing.

Taking a closer look at the official SPECjbb2005 scores and the estimates published in the various articles, you can see how confusing a benchmark can be. This serves as a reminder to us that benchmarks are just an indicator of performance and that a benchmark like SPECjbb2005, which allows for a wide variety of configurations, can produce a wide variety of results. And remember – the story being told is not always the one that best reflects reality….

Pat Moorhead is Vice President of Advanced Marketing at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third-party sites are provided for convenience and, unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.


  1. #1 by sal cangeloso - March 13th, 2009 at 12:36

    Nice post Pat. Benchmarking has been a problem with the hardware community for some time. I’d say we are overly reliant on the numbers, but it’s really all we have to go on most of the time. I’d love to see an open source solution that the manufacturers would get behind. There are some options out there, but nothing overly comprehensive or easy to use.

  2. #2 by Computer Ed - March 13th, 2009 at 12:37

    Pat, while I respect you, I have to disagree: benchmarks are about useless right now for most people. While they might offer a good way to evaluate the performance of two systems in direct comparison, they in no way give any kind of landmark for those numbers to have meaning.

    For example, if a benchmark throws some numbers at me, what number on the system means I will enjoy a solid gaming experience? What I mean is, how can a user make use of the numbers and thereby get a feel for what might provide the better experience with their computing, and then be able to compare the price?

    Let’s say you have two computers. Both will play your favorite game at the same detail level and both will run faster than 70 FPS. Now the speed is less important than the price, since both hit the experience you want.

    Microsoft is BLOWING IT with the WIE system. They have a chance to have a SIMPLE benchmark that will be usable by an average user. Tie this to the games and make them put on the WIE needed for proper play, and suddenly people have an idea of what will best provide for their needs because they have a number with a solid, quantifiable meaning. Next they can look at features and price.

  3. #3 by Toby - March 13th, 2009 at 12:38

    Pat, I noticed in a recent multiple 4870×2 review a stunning setup with a huge number of monitors and with perhaps a game spread over all of them – 2-deep in the center.

    I think that perhaps a moving show like this at dealers’ showrooms (I suppose permanent might be too expensive) would really knock a lot of people’s socks off and promote the capability of the all-AMD gaming platform.

    I’m sure the computer monitor suppliers would be in favor of it, and perhaps chip in. :)

    Regards
