The Value 4P – Courtesy of the AMD Opteron™ 6000 Series Platform
When is a 4P server not a 4P server? When it is priced like a 2P server.
The launch of the AMD Opteron™ 6000 Series platform rings in a new era of computing that sees the removal of the “4P Tax.” Since we are so close to tax day in the US, this is a good time to discuss this tax and how eliminating it actually gives your business more options than ever before.
Traditionally, 4P capable processors have sold at a pretty stiff premium to their 2P cousins. Back before AMD entered the server business, a typical 2P processor might have been ~$1000USD and the typical 4P capable processor could have been as much as ~$3500USD. That is 3.5X the price, but unfortunately it did not yield 3.5X the performance. In reality, the 4P capable processor was often essentially the same silicon. Processor companies would build a business model around assuming a certain mix of volume 2P processors and high margin 4P processors.
When AMD got into the server market in 2003, this price was pushed down (for our products at least) to closer to ~$2500USD. Still a premium, but a smaller bite out of your wallet.
Fast forward to 2010. AMD introduces the new AMD Opteron 6000 Series platform, and we actually smash the 4P tax at the same time. All AMD Opteron 6100 processors are 2P and 4P capable. There is no longer the distinction between them.
What does this mean for customers? Performance actually scales cleanly and price-performance follows that same trend. No longer does it take a huge jump up on the price scale when 4P technologies are deployed.
Why do this? This is the most common question from people. It is almost as if they believe customers want to pay more for 4P capable processors. I have yet to meet any server buyer that would willingly pay a premium if they didn’t have to. Now they don’t.
AMD believes that by removing this tax, we can help the 4P market. Sound familiar? The 4P market has been in decline since 2000. Under extreme pressure from the 2P market, where performance is increasing at a faster pace, the 4P market is struggling to remain viable. And this is at a time when workloads are becoming more demanding and can really utilize the CPU, I/O, and memory capabilities that were traditionally only delivered by 4P. We believe the AMD Opteron 6000 Series platform is the “shot in the arm” that the 4P market needs.
Here is a quick example of how this new strategy can pay dividends for customers. I have dropped in a table with processor prices to prove a point.
Below we are comparing our competitor’s top bin processor pricing for 2 processors to our pricing on 4 mid-bin processors (AMD Opteron processors Model 6136 at $744 each). As you can see, we are almost 50% higher in integer throughput performance while also being 11% lower in total processor cost. Truly, these economics start to change the game when it comes to what platform you will deploy for which applications.
Considering that differential, customers may be very interested in a Value 4P configuration, offering 32 total cores, instead of a 2P that only offers 12 physical cores.
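The arithmetic behind that comparison can be sketched in a few lines of Python. This is only an illustration using the figures quoted above ($744 per processor, four processors, 32 total cores, "almost 50%" more integer throughput, 11% lower total processor cost). The 50% figure is rounded up to exactly 1.5x, and the competitor's total is back-calculated from the 11% figure rather than taken from a price list, so treat both as assumptions.

```python
# Figures quoted in the post for the Value 4P configuration.
AMD_UNIT_PRICE_USD = 744      # AMD Opteron Model 6136, per processor
AMD_SOCKETS = 4
AMD_TOTAL_CORES = 32          # as stated in the post

# Assumptions for illustration: "almost 50%" is treated as exactly 1.5x,
# and "11% lower" is applied to total processor cost.
RELATIVE_THROUGHPUT = 1.5     # 4P integer throughput relative to the 2P system
RELATIVE_COST = 1.0 - 0.11    # 4P processor cost relative to the 2P system


def total_processor_cost(unit_price, sockets):
    """Total spend on processors for one server."""
    return unit_price * sockets


def throughput_per_dollar_ratio(relative_throughput, relative_cost):
    """How much more throughput each processor dollar buys on the 4P."""
    return relative_throughput / relative_cost


amd_total = total_processor_cost(AMD_UNIT_PRICE_USD, AMD_SOCKETS)
# Back-calculated from the 11% figure; not a published price.
implied_competitor_total = amd_total / RELATIVE_COST
ratio = throughput_per_dollar_ratio(RELATIVE_THROUGHPUT, RELATIVE_COST)

print(f"4P processor cost: ${amd_total}")
print(f"Implied 2P processor cost: ${implied_competitor_total:.0f}")
print(f"Cost per core on the 4P: ${amd_total / AMD_TOTAL_CORES:.2f}")
print(f"Throughput per processor dollar, 4P vs 2P: {ratio:.2f}x")
```

Under those assumptions, the throughput advantage and the cost advantage compound, so each processor dollar buys roughly two-thirds more integer throughput on the Value 4P.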
So where does a system like this play? Not everywhere, but there are a few places.
- Highly parallel HPC - Clusters where compute density is critical and the fabric interconnect is not saturated can be great targets. Imagine being able to use one 4P server instead of two 2P servers. You can probably cut down on the cost of your fabric interconnect (cards, cables and switches), and may be able to reduce power and management costs by having fewer nodes.
- Virtualization – One of the main goals of virtualization is to consolidate resources. Many virtualized platforms are 2P today because of cost pressures. Many of these customers might gladly jump to a more scalable 4P as the main focus of consolidation if they could just make the economics work. The core and memory capabilities of the value 4P server enable the robust VMs needed to run demanding workloads like database and web serving.
- Database – a natural workload for the Value 4P server. Databases like SQL Server are highly threaded and can easily consume the up to 48 cores you can have on the Value 4P server. You could also consolidate databases (with or without virtualization) on these platforms, and run Business Intelligence workloads with the analyst and database engines on the same server, helping to cut down on network traffic.
- Growing applications - Nobody ever wants to be the most expensive house on the block. It is the same with servers. Do you want to buy the top end 2P, with the fastest processors, all full of memory, if you think that the application will need more scalability in a year? With a Value 4P, you could buy in at the middle of the neighborhood and still have expandability for the future.
There are plenty of compelling reasons why Value 4P servers make a lot of sense, and it looks like today, only one company is willing to remove the 4P tax for you, so be sure that when you are looking for value, you look for the AMD Opteron processor.
John Fruehe is the Director of Product Marketing for Server/Workstation products at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.



The 4P has 2x # CPUs and 1.5x to 2x # memory modules sucking power, so it’ll be interesting if you have power consumption at the wall for the 4P vs 2P as well.
In a 1:1 replacement, 2P to 4P, the power for a 4P will be higher, but the performance will scale much higher than the power. Because of that, absolute performance per watt will be better in most environments. The only area where this would not be the case would be applications that are not scaling well beyond the # of cores that a 2P provides. In that case, a 2P is the better choice. You always need to look at your applications when making decisions like this.
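The rule of thumb in that reply can be written as a one-line check: performance per watt improves when performance scales by more than power does. The sketch below is illustrative only. The function name and the ratios passed to it are hypothetical and are not measurements of any real system.

```python
def perf_per_watt_gain(perf_ratio, power_ratio):
    """Return 4P performance per watt relative to 2P.

    perf_ratio:  4P performance divided by 2P performance
    power_ratio: 4P power draw divided by 2P power draw
    A result above 1.0 means the 4P delivers more work per watt.
    """
    if power_ratio <= 0:
        raise ValueError("power_ratio must be positive")
    return perf_ratio / power_ratio


# Hypothetical ratios, chosen only to show both outcomes.
well_scaling = perf_per_watt_gain(perf_ratio=1.9, power_ratio=1.6)
poorly_scaling = perf_per_watt_gain(perf_ratio=1.1, power_ratio=1.6)

print(f"Application that scales well: {well_scaling:.2f}x perf per watt")
print(f"Application that scales poorly: {poorly_scaling:.2f}x perf per watt")
```

The second case is the one described above: if the application does not scale beyond the cores a 2P provides, the extra power buys little, and the 2P is the better choice.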
Storage platform nirvana! As enterprise densities compound, centralized storage is under increasing pressure to sink and source I/O.
Open storage platforms are leveraging industry standard PC servers to manage massive disk pools – bypassing proprietary RAID controllers in favour of direct I/O to disk and SSD. Next, that disk and cache must be redirected to demanding I/O consumers – typically running multi-core, high-I/O platforms. High bandwidth and low latency are keys to success here.
To do this efficiently and with high performance, more cores, memory and I/O capability are needed: sounds like a great fit for MC/4P. With the massive core count, I/O and memory potential, the 4P, G34 platform offers power for intensive storage processes like in-line dedupe, compression, checksum calculation and SSD off-load.
I hope AMD continues to work with its partners to deliver MB and systems that can fulfill the promise of dense computing beyond HPC. The Open Storage community needs robust numbers of PCI-E 2.0 slots for SAS2, 10GE and CNA I/O, not just boards with memory and processor slots.
BTW, on the history of the 2P/4P taxes, I wrote this in another comment:
“I am surprised no one mentioned the history behind it. The [original Pentium had built-in support for SMP with 2 processors and the] Pentium Pro could support SMP with 4 processors. The Pentium II crippled it to only two processors. For 4 processor SMP systems, Intel later introduced the Xeon line with the Pentium II Xeon. Later the Pentium 4 was crippled to only one CPU, and later a separate P4-based Xeon DP line was introduced for dual processors. Later the DP Xeon line got renamed to Xeon 5000 series, and the MP Xeon line got renamed to Xeon 7000 series, with a new single processor only Xeon 3000 series being introduced later that is mostly versions of desktop CPUs with enhanced features like ECC in the most recent Xeon 3400/3500 series (they even use the same socket).”
Hi John!
According to people from Intel, Magny Cours might really be the winner in performance/watt/price, but it loses badly when it comes to security.
Take a look at this benchmark.
http://images.anandtech.com/graphs/amd12core_032610044429/22224.png
According to them, it makes Magny Cours useless compared to their solution for anything related to encryption.
They say that it is being widely adopted, even by US federal agencies. You can check that information here:
http://en.wikipedia.org/wiki/Advanced_Encryption_Standard
Daniel,
It is interesting that Intel is putting so much on AES encryption without mentioning the following:
1. Nehalem does not have it
2. Beckton does not have it
3. Software applications need to be recompiled to take advantage of the new instructions
So, either Beckton and Nehalem are useless (I am guessing that they aren’t about to step up and say that) or someone is really overplaying their hand on this one.
As for applications being recompiled, I’m sure that you would agree that security companies are not the first guys to jump on new instructions and new programming because of the sensitivity of what they do. These new instructions will get integrated in over time, once the software vendors have tested and feel comfortable with the solution. You won’t see them rush products out to market. So, eventually these instructions will be important. And we will be there with support in Bulldozer.
Enterprise technologies are slow to be adopted (nobody slept out in front of Best Buy to get the first Westmere or Magny Cours). All of this takes time, evaluation periods, and a solid business need.
For years AMD Opteron had a significant advantage over Xeon in AES performance (software based) and Intel generally dismissed this as irrelevant and niche whenever it was brought up. Now that it is integrated into their processor, if you don’t have it, your chip is “useless”? Hardly.
And, I do agree with your first statement. While the AES performance might be great, is someone really going to spend 42% more on their processors in order to get that?
Plus, there’s more to security than AES (for an immediate example, http://www.anandtech.com/print/2978). Note that, although Intel’s latest chip wins hands-down on AES performance, AMD beats it on SHA (i.e. when you no longer have the special instructions). And there are a large number of non-AES crypto algos out there, e.g. Blowfish and RSA (the ssh manpage lists a number).
In short, AES and security are hardly synonymous.
As a side note, I really like the blender benchmark there, which *reverses* (Opteron vs Xeon; the inter-generational trends are still there) when you switch from Windows to Linux!
The HPC benchies there are tasty too.
It’s probably very true that the commercial sector validates its products for the new instructions for at least a product cycle, which could be anything from six months to a few years. I wonder how many commercial banks, for example, have implemented the fixes for the SSL man-in-the-middle problem since they became available.
That said, a user like the aforementioned and slightly mysterious US federal agency might use some custom codes and hardware (fpgas, asics, gpus) already so the impact on sales for this kind of clients might not be so large, and if the question is about the unit price they might as well use a product like the VIA C5 or Nano.
The use cases for encryption acceleration in general seem to be more client-focused at this point (secure video conferencing, with minimal power consumption, for example).
Quite frankly, for most of these critical security environments, there are in-line security devices (SSL, AES, etc.) that are employed. Rarely do they rely on the server processor; the traffic encryption functions are offloaded.
I think this platform makes a lot of sense pricewise – but I can’t find ANY motherboard with 48 DIMM sockets (12 DIMM/CPU) to make it competitive for our use (large DB) with lower-end Xeon 7500 4P servers.
Are they currently available or at least expected soon?
There are several coming, can’t comment right now.
And what the time frame may be – this month or next month or later?
Who won the contest?
Right now AMD is really in a better position to engage in the multi-core race: 61xx core IPC is lower than i7, but the cores are lighter and the multi-core design scales well. Intel does realize that light cores are the way to go for massive multi-core designs (the 40-core announcement is an indicator of this). Yet the majority of current applications are not massively multithreaded, so higher IPC will remain an important Intel advantage, though it is going to diminish as multithreading improves. Software companies with advanced multithreading designs are going to be “hot assets”. We will witness quite a race (thanks to AMD).
StefanBanev
In the client space IPC is far more important than in the server space. The server space focuses on throughput.
>In the client space IPC is far more important than in the server space. The server space focuses on throughput.
For current client applications that is definitely a correct statement, but “vast deposits” of perfectly multithreaded algorithms simply did not have hardware MIMD ground to exist on beyond niche markets. Ray tracing is one such example (a SIMD-friendly GPU is definitely not the right hardware for that). Besides, even the existing software development environment is quite up to developing efficiently multithreaded applications; it requires a higher IQ from developers, but coming multithreading tools will soften that (as has happened in the past).
StefanBanev