AMD Opteron A1100

The 28nm eight-core AMD Opteron A1100 is a lot more modest and aims at the low-end Xeon E3s. Stephen has described the chip in more detail. To ensure a quick time to market, the AMD Opteron A1100 is made of existing building blocks already designed by ARM: the Cortex-A57 core and the Cache Coherent Network, or CCN.

AMD is one of the few vendors that use the ARM interconnect. ARM put a lot of work into this design to enable its licensees to build SoCs with lots of accelerators and cores. CCN is thus a way of attaching all kinds of cores, processors, and co-processors ("accelerators") coherently to a fast crossbar, which also connects to four 64-bit memory controllers, integrated NICs, and L3 cache. CCN is very comparable to the ring bus found inside all Xeon processors beginning with "Sandy Bridge". The top model is the CCN-512, which supports up to 12 clusters of four cores each. This could result, for example, in an SoC with 32 (8x4) A57 cores, with the remaining four cluster slots used for accelerators.
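The 32-core example above is simple slot arithmetic, sketched below in Python. The function name `ccn_budget` and the idea that one accelerator occupies exactly one cluster slot are illustrative assumptions, not anything ARM or AMD specifies.

```python
def ccn_budget(total_slots, core_clusters, cores_per_cluster=4):
    """Split an interconnect's cluster slots between CPU clusters and accelerators.

    Assumption for illustration: every slot not used by a CPU cluster
    hosts exactly one accelerator.
    """
    if core_clusters > total_slots:
        raise ValueError("more core clusters than available slots")
    cores = core_clusters * cores_per_cluster
    accelerators = total_slots - core_clusters
    return cores, accelerators


# CCN-512 example from the text: 12 slots, 8 of them quad-core clusters.
print(ccn_budget(12, 8))  # (32, 4)
```

With four slots all used for quad-core clusters, the same function gives 16 cores and no accelerator slots.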

AMD would not tell us which CCN they are using, but we suspect that it is the CCN-504. This CCN was available around the time work started on the Opteron A1100, and AMD mentions the ARM bus architecture AMBA 5 in their slides. It also makes sense: the CCN-504 supports the Cortex-A57 and up to four clusters of four cores (4x4).

It was rumored that the A1100 still used the CCI-400 interconnect, which is used by smartphone SoCs, but that interconnect uses the AMBA 4 architecture. Meanwhile, the CCN-502 was announced in October 2014, far too late to be inside the A1100.

The AMD Opteron A1100 consists of four pairs of "standard" triple-issue Cortex-A57 cores, each pair sharing 1MB of L2 cache, backed by 8MB of L3 cache.

The key differentiator is the cryptographic coprocessor, which can accelerate RSA (the secure connection handshake), AES (encrypting the data you send and receive), and SHA (part of the authentication). Intel uses the PCIe Quick Assist 89xx-SCC add-in card or the special Intel Communication chipset to provide a cryptographic coprocessor. These coprocessors are mostly used in professional firewalls and routers. As far as we know, such cryptographic processors are of limited use in most HTTPS web services. Most modern x86 cores now support AES-NI, and these instructions are well supported in software. As a result, current x86 CPUs from AMD and Intel outperform many coprocessors when it comes to real-world AES encryption and decryption of data streams.
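Whether a given core exposes AES instructions can be checked from software. The sketch below assumes Linux-style `/proc/cpuinfo` text, where x86 kernels list the capability as `aes` on the `flags` line and 64-bit ARM kernels list it on the `Features` line. The helper name `has_aes_flag` and the sample strings are made up for illustration.

```python
def has_aes_flag(cpuinfo_text):
    """Return True if a 'flags' (x86) or 'Features' (ARM) line lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() in ("flags", "features"):
            # Match the whole token so that e.g. 'vaes' alone does not count.
            if "aes" in value.split():
                return True
    return False


# Hand-written sample lines, not captured from real hardware.
x86_sample = "processor\t: 0\nflags\t\t: fpu sse2 aes avx\n"
arm_sample = "processor\t: 0\nFeatures\t: fp asimd aes sha1 sha2\n"
print(has_aes_flag(x86_sample), has_aes_flag(arm_sample))
```

On a Linux machine, the same function could be fed the contents of `/proc/cpuinfo` instead of the sample strings.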

A cryptographic coprocessor could still be useful for the asymmetric RSA handshake, but it remains to be seen whether offloading the handshakes will really be faster than letting the CPU take care of them, as each offload operation incurs overhead of its own (such as a system call). A cryptographic coprocessor sitting on the same coherent network as the main cores could be a lot more efficient than a PCIe device, though. It has a lot of potential, but AMD could not give us much information on the current state of software support.
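The offload trade-off can be framed as a simple break-even check: offloading only wins if the coprocessor's time plus the per-operation overhead is lower than the CPU's own time. The sketch below is a toy model. All parameter names and numbers are hypothetical placeholders, not measurements of the A1100 or of any Intel part.

```python
def offload_pays_off(cpu_us, accel_us, overhead_us):
    """True if offloading one operation beats doing it on the CPU.

    cpu_us:      time for the CPU to do the operation itself
    accel_us:    time for the coprocessor to do the operation
    overhead_us: fixed cost per offload (system call, data transfer, ...)
    """
    return accel_us + overhead_us < cpu_us


def break_even_overhead(cpu_us, accel_us):
    """Largest per-operation overhead at which offloading still breaks even."""
    return cpu_us - accel_us


# Placeholder figures: an expensive handshake versus a cheap bulk operation.
print(offload_pays_off(cpu_us=1000, accel_us=200, overhead_us=100))  # True
print(offload_pays_off(cpu_us=50, accel_us=10, overhead_us=100))     # False
```

In this model, a coprocessor on the coherent network would show up as a smaller `overhead_us` than a PCIe device, which is why it could pay off for shorter operations.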


78 Comments


  • aryonoco - Wednesday, December 17, 2014 - link

    I just wanted to thank you Johan De Gelas for this very insightful and interesting article.

    Hugely enjoyed reading it and your thoughts on the subject.

    Good to see high quality content continue to be published at AT now that Anand has left.
  • JohanAnandtech - Wednesday, December 17, 2014 - link

    aryonoco, Jann Thanks for letting me know. A good motivation to always push a bit harder to make sure I don't let my readers down :-).
  • jann5s - Wednesday, December 17, 2014 - link

    Thank you Johan, for writing this very interesting article!
  • przemo_li - Wednesday, December 17, 2014 - link

    Very well written walk through current and possible CPU/SOC parts.

    Will there be similar piece for software?
    ARM (embedded) folks aren't famous for quality drivers/code.

    It must change, so it will change. But for now such overview would be great!
  • bobbozzo - Wednesday, December 17, 2014 - link

    Typo on page2:
    "(4 Slots x 8 DIMMs)" - change 8 to 8GB

    Thanks
  • bobbozzo - Wednesday, December 17, 2014 - link

    and page 4:
    "you will be able to choose between SoCs that have 100 Gbit Ethernet and 10GBit Ethernet."

    should 100 be 40?
  • bobbozzo - Wednesday, December 17, 2014 - link

    Page 12:
    "Most of them are the usual IPSec, TPC offloading engines"

    Should that be TCP?

    Also, are there still accelerators for AntiVirus engines and IDS/IPS search (there were some back in 2005).

    Thanks
  • bobbozzo - Wednesday, December 17, 2014 - link

    ...
    I guess that's what the RegEx would be useful for.

    However, not all IDS/IPS / A/V patterns use RegEx, and there are other means of acceleration.
  • eanazag - Wednesday, December 17, 2014 - link

    Welcome back Johan.

    Glad to see you're still writing here. Good stuff in the article.
  • JKflipflop98 - Wednesday, December 17, 2014 - link

    I simply don't get where this whole "microserver" thing is coming from.

    By the time you cluster up enough ARM processors to match the processing power of an Intel/AMD solution, you're burning just as much power and spent just as much money as you would have by using x86 in the first place. Except now you have to use some janky middleware solution because all your software is x86 and you're running on ARM cores.
