
It’s All About the Cores

by John Fruehe

As we embark on an interesting 2010 and even more interesting 2011, it is clear that the age of clock speed is behind us; it’s all about cores for the next few years until “Fusion-based” computing hits the server market.

And let’s be clear – AMD is designing its AMD Opteron™ processors to have the cores that customers need to drive their enterprise applications.

To begin with, let’s dissect the difference between threads and cores. Cores are physical blocks of logic in the processor that can run applications. In the old world, it was simple: one CPU = one core. Today, Six-Core AMD Opteron processors (formerly code-named “Istanbul”) are quickly becoming the mainstream, and by the end of this quarter, eight and twelve will be the operative core counts per processor.

Threads, on the other hand, aren’t physical – they are software-generated tasks that can execute independently. For a program to run on multiple cores, you need to thread the program, that is, split it into multiple tasks that can run simultaneously. The operating system takes the threads spawned by the program and schedules them to run on available cores.
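To make "threading the program" concrete, here is a minimal Python sketch. The helper names `chunk` and `threaded_sum` are made up for illustration, and the workload (summing a list) is only a stand-in. Note that standard CPython runs only one thread's bytecode at a time, so treat this as a picture of how a program hands independent tasks to the operating system, not as a benchmark.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def chunk(items, n_chunks):
    """Split items into n_chunks nearly equal contiguous slices."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def threaded_sum(items, n_threads=None):
    """Sum items using one software thread per chunk (illustrative helper)."""
    # Default to one thread per logical CPU reported by the OS.
    n_threads = n_threads or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        # Each chunk is an independent task; the OS decides which
        # core runs each thread and when.
        partial_sums = pool.map(sum, chunk(items, n_threads))
        return sum(partial_sums)
```

One detail worth knowing: `os.cpu_count()` reports logical CPUs, so on a processor with SMT enabled it counts hardware threads rather than physical cores.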

So – cores are like bikes, threads are the riders. Running more threads increases throughput for applications as long as you have available cores. If you have threads waiting to be scheduled and no available cores – you have a bottleneck.
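The bottleneck can be put in numbers with a deliberately simple model. It assumes every thread takes the same amount of time and each core runs exactly one thread at a time; both are simplifications, and the function names are hypothetical.

```python
import math


def scheduling_rounds(n_threads, n_cores):
    """Rounds needed when each core runs one equal-length thread at a time."""
    if n_cores < 1:
        raise ValueError("need at least one core")
    return math.ceil(n_threads / n_cores)


def waiting_threads(n_threads, n_cores):
    """Threads left without a core during the first round."""
    return max(0, n_threads - n_cores)
```

Under these assumptions, 12 threads on 12 cores finish in a single round, while 12 threads on 4 cores need three rounds, with 8 riders waiting for a bike during the first one.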

There are two major strategies for getting more efficiency out of your server. The first is the simple, straightforward way – feed that application more cores. That is why you are seeing 4+ cores in processors today. Nobody will argue against the point that giving applications more real cores will help increase overall throughput. However, some see another answer and wonder why AMD has chosen not to go down that path.

Simultaneous Multithreading (SMT) is a method for squeezing two threads into one core. SMT was first researched by IBM in 1968 and introduced to x86 processors by Intel in 2002 under the name Hyper-Threading. That sounds great in concept. Carpooling is more efficient than giving everyone their own car, right?

Well, carpooling falls apart if the two employees live too far from each other and the office is close. If Bob lives 3 miles north of the office and Mary lives 2 miles south of the office, it really doesn’t make sense for them to carpool. In the bike and rider example above, think of SMT as a tandem bike. Yes, it can move two riders, but not as quickly or efficiently as two separate bikes.

The challenge with SMT is that as a technology, it forces two threads to share a single  physical core.  

Consider a software thread running on a hardware thread, where a second runnable software thread is then executed on another hardware thread on the same core. This could be triggered by an event like a stall due to a cache miss. The second thread does not necessarily thrash the cache; in fact, there are situations where the cache lines used by both threads are shared, resulting in little cache churn. However, in many cases the second thread causes the cache to be refilled with its own data, requiring the first thread to refill the cache in turn when it resumes execution. This competition for shared core resources is what can produce diminishing returns on SMT-based processors or, worse, negative performance. (This paragraph was updated for clarity and to correct a statement that could have been misinterpreted…)

Generally speaking, SMT can give applications as much as an extra 10-20% increase in performance, which feels like that mythical “free lunch” that you were always told doesn’t exist.  Well, don’t start eating yet, because there is a dark side to SMT. What if adding that extra thread actually decreased your throughput?  What if 8 threads on 4 cores provided worse throughput than 4 threads on 4 cores?
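A back-of-the-envelope model shows where the gain turns into a loss. The only input, `per_thread_speed`, is a hypothetical figure: how fast each of two threads runs while sharing a core, as a fraction of its speed with the core to itself. The function names are illustrative and the numbers are not measurements of any real processor.

```python
def total_throughput(n_cores, threads_per_core, per_thread_speed):
    """Relative throughput, where 1.0 is one thread alone on one core."""
    return n_cores * threads_per_core * per_thread_speed


def smt_throughput_change(per_thread_speed):
    """Fractional throughput change from putting two threads on one core.

    per_thread_speed: each thread's speed while sharing the core,
    relative to running alone (hypothetical input, between 0 and 1).
    """
    return 2 * per_thread_speed - 1
```

In this model, a shared-core speed of 0.6 works out to roughly a 20% gain and 0.55 to roughly 10%, which lines up with the range quoted above. Break-even is 0.5. If contention pushes each thread below half speed, 8 threads on 4 cores deliver less than 4 threads on 4 cores.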

Here are a few examples of opinions on the other side of the SMT discussion:

There are more examples, but the “free lunch” is obviously not quite as tasty as you might have originally expected.

So, if SMT (or “core sharing”) yields both positive and negative results, what is the better answer?  How about more cores?  When you add more cores, you add more throughput. Period.

When you run multiple threads over multiple cores, you can expect better performance, and that is the AMD strategy. With “Magny Cours” we’re planning 8 and 12 cores per processor running 8 and 12 threads, not 8 or 12 threads sharing 4 or 6 cores.  No sharing needed, every thread can be as selfish as it needs to be. Then in 2011, we plan to introduce “Interlagos” and increase the core count again, to 12 and 16. With “Interlagos” we’re designing  some shared components that help reduce power consumption and die size, but you won’t see us sharing integer pipelines, the “meat” of the core.

By keeping discrete integer cores, and delivering more of those cores per CPU, AMD is building processors designed to help you get more throughput for your enterprise applications.

Here’s AMD’s Core Commitment for servers:

  1. AMD is working to deliver more cores for your business critical applications and a wider choice of core configurations.  From 4 cores through 12 cores per processor planned for 2010 and 6 to 16 cores planned in 2011, AMD is working to deliver more of the resources that you need to drive your business forward.
  2. Our cores are real. Threads can run faster when they have their own core underneath rather than having to share. If you have to run 12 threads, we know you would rather have 12 cores with unfettered access than worry about sharing cores.

Of course, there are those who will say, “Well, things like SMT can be implemented inexpensively and don’t consume that much power.” To them I ask: historically, hasn’t AMD been the one committed to delivering better value and lower power? Why would we stray from our core principles?

If you can get all the cores you need at the price you need and the power envelope that you need, then why would you ever consider anything else? Why would you ever compromise? Have your cake, and eat it too.  THAT is your free lunch.  And it’s delicious.

John Fruehe is the Director of Product Marketing for Server/Workstation products at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.

32 Comments

  • Pingback: Interesting Reading #402 – killer robots, racist cameras, fake Blu-ray players, Self-Aware Mercedes and much more… – The Blogs at HowStuffWorks

  • Shaun January 25, 2010

    RE Se’s comment with Single threaded Performance.

    But can you still clarify in the server workspace whether or not single threaded performance is greater than current K8 designs?

    • John Fruehe January 27, 2010

      A 16-core Interlagos will have 33% more cores than a 12-core Magny Cours processor. The performance increase will be greater than 33%, so, from that aspect, you get greater “per core” performance.

      The “per core” performance is really a poor way of looking at the processor. For instance, somewhere in my product stack I will have a 12-core processor that will be priced at the same price as my competitor’s 4-core processors. In that case, my “per core price” is only 1/3 the price of my competitor’s core. Are you comparing the per core price as well? If you head down the trap of looking at per core performance, you have to go the whole way and look at per core price and per core power.

      Regardless of where we end up with per core performance, I can assure you that per core price and per core power will be SIGNIFICANTLY lower for AMD. That spells far better value in my book.

  • cad January 26, 2010

    Why not add it in, but have specific calls from instruction sets to use the most efficient way?

    like you said, if you can gain up to 20% increase, then why not?

    • John Fruehe January 27, 2010

      It is a philosophical question. You don’t just “bolt it on” to the existing technology. You have to make architectural choices about how you design the architecture. We chose a path that gets higher scalability and performance.

  • redisnidma January 29, 2010

    “somewhere in my product stack I will have a 12-core processor that will be priced at the same price as my competitor’s 4-core processors.”

    If that’s the case, that’s a shame, because it means that even with 8 more cores, your product is underperforming against that of your competitor. I would expect otherwise, not you guys doing so.

    John: I know that you’re a marketing guy and your job is to make AMD look good no matter the scenario, but you can’t deny that a resource-constrained company like AMD won’t make any serious money selling bigger/hotter processors (more cores and at bigger nodes) at less than half the price of its competitor. Also, remember this: You guys don’t have fabs any more!!!

    • John Fruehe January 30, 2010

      I said they would be priced the same, I did not say that they will perform the same. I anticipate much better performance out of my processor.

      And hotter? I don’t think so. We will have very good per core power, so don’t expect that a 12-core has to have more power consumption than my competitor’s 4-core.

      As a customer, if you can get more cores, more performance and the same power at a better price, why wouldn’t you want that?

  • redisnidma January 31, 2010

    John Fruehe :

    As a customer, if you can get more cores, more performance and the same power at a better price, why wouldn’t you want that?

    Have you ever thought that indeed you might not only be talking with a customer, but also with a shareholder? ;)

    • John Fruehe January 31, 2010

      It is always highly likely. But I let the folks in investor relations handle the shareholders, my portfolio is a good example of how I know more about servers and processors than stocks :)

  • kmpst February 7, 2010

    I’m about to build a small render farm for 3d renderings. I was thinking of buying 8x quad cores for a total of 32 cores.(3400usd)

    My questions are:
    - will the Magny-cours 12 cores be available in march 2010/ *a in europe (switzerland)

    -will I be able to get for the same amount of money at least double the amount of cores?

    -will I be able to integrate the chip sets into blades? *we were thinking of the supermicro 6026tt

    How would you build a render farm with as many cores as possible with a budget of 8000usd? how many cores would your farm have and how would you build it?

    I’m in quite a clinch since we need to invest rather sooner than later – but I really don’t feel like spending all this money in the next two weeks if in 4 weeks there will be a lot more cost efficient solutions around …..

    thanks++

    • John Fruehe February 8, 2010

      1. yes
      2. I can’t speak to prices prior to launch, but expect excellent value
      3. I can’t speak to SuperMicro’s products, but expect to see G34-based blades

      If you are looking for the largest core density at the best price, hands down you will find that from AMD when our products launch this quarter.

      • rabah Khamis February 24, 2010

        Hi John,

        Thanks for simplifying your point of view. But a few things confused me. For one, I like the analogy you presented with SMT being the free lunch. I agree with you: SMT may not be the best, and cores usually give more performance than SMT, but at a premium price. Here is where I differ with you. As for the free lunch, my company started providing free fruit, which I previously did not eat every day. I started picking up the fruit and eating it, and while it is not the best fruit, it helped me add fruit to my diet, and it tastes just fine. Secondly, I do not blame you for the bias against SMT that your article steers readers toward. But you seemed to pick the negative comments from Microsoft and others regarding SMT and did not balance them with benchmarks on apps that actually do better.

        Thirdly, in this day and age, SMT and the number of cores do not seem to make much difference to most client applications/users (and I understand your statement that you are not a client guy). But here is reality: the 4 core Xeon 5560 still beats the 6 core Opteron in most applications, and the 2 core processors Intel released recently beat the 4 core Athlon in many, many applications. So your argument that increasing the number of cores is the best strategy may not hold ground after all. Are you promising the 8-core and 12 core procs coming soon will top performance, performance/watt or whatever measuring scale you use?

        Finally, you probably should put a disclaimer about the 8 and 12 core being the best in the market when they come. That disclaimer should be in a context of assuming no other products are coming out of your competitors.

        • John Fruehe February 24, 2010

          Both companies make philosophical decisions about what their architecture will be.

          We believe that larger numbers of smaller cores is the best way to design a processor. Our competitor believes that fewer larger cores with SMT is the way to go.

          I believe that our upcoming products will beat not only our competitor’s current products, but also our competitor’s soon-to-be-released products. No disclaimer is necessary.

          It is somewhat simplistic to look at performance as the only criterion. People look at performance, price and power consumption. Those who believe performance is the only key metric ignore that more than 95% of the server processors sold are NOT the top bin speed. As a matter of fact, the most popular models are generally a few steps down the stack.

          Yes, I believe we will have better performance than Westmere, but what will really shake up the market is the combination of performance, price and power efficiency. Those three combine to show true value, and that is what we will deliver.

  • Pingback: Intel launches new generation of server chips to make the cloud more efficient | reinstein TV

  • Pingback: AMD Best Practices Series: Understanding the Bigger Picture of VMmark Benchmarks | The Virtualization Blog
