Platform Strategy: 4x4, Torrenza, Trinity, and Raiden

It wouldn't be fair to completely ignore AMD Live!, as there was a fair amount of time spent talking about it. Unfortunately, AMD Live! is much like Intel's VIIV. That is to say, the "technology" is more of a suggestion about what components to include in computers built for a specific purpose in order to assist in the marketing of an idea. Certainly, the "computer as media center" idea isn't something new. Intel and AMD simply enabled the magic co-branding fairy to make end users feel all warm and squishy inside about their purchase. To be fair, mindshare is a large part of the game, and Centrino has served Intel very well on the mobile front (though I wish this early Centrino I've got had been a Pentium-M with onboard 802.11g).

Moving on, there were some very high-powered (as in power draw) announcements. First off, AMD is pushing a new high-end enthusiast platform consisting of dual-socket motherboards for dual-core processors combined with quad-GPU solutions. In an incredibly unoriginal moment of indiscretion, this platform has been dubbed 4x4. Uninspired, yet very appropriate: the platform will very likely be large, loud, and so power-hungry we will need a gas-powered generator to run it. That doesn't mean we wouldn't want to own a system. We just aren't sure we'd want to pay for it.

So, the first question we asked about 4x4 was: how different is this from taking an off-the-shelf 2P board and dropping in a couple of 2xx-series Opteron processors with NVIDIA's quad SLI? Unfortunately, we haven't gotten any answer other than that there is something that makes it different. From what we understand, 4x4 will support unbuffered DIMMs (while Opterons still require registered memory), and the platform will be focused on tweakable motherboards. We are looking into all the details. While the "more power!" kick is always interesting, we have to wonder if there will be any software in the near term that can really harness all this raw potential.

Stepping past the enthusiast platform, we have arguably the most exciting announcement of the day: Torrenza. Along with K8L, AMD plans on openly licensing its (until now proprietary) coherent HyperTransport technology. At first glance, this may not seem exciting, but AMD is throwing in a little twist: HTX slots. These HTX slots will be standard interfaces connected directly to an AMD CPU's HyperTransport link. If both of these links are coherent, the device and the CPU will be able to communicate directly with each other with cache coherency. Latency can also be reduced greatly compared with other buses, enabling hardware vendors to begin to create true coprocessor technology once again.

In addition to allowing such "accelerators" (as AMD calls them) to be added via HTX slots, the architecture of the K8L line will be flexible enough that AMD could choose to incorporate some of these coprocessor technologies on a CPU package, or even on a CPU die itself. This is possible because the interconnect interface is the same at any level of integration. Not only will companies be able to develop their own unique solutions to extend the capabilities of the system processor, but it may even be possible to see such technology integrated into future AMD parts at a more fundamental level.

The next two platform-level technologies AMD spoke on are named Trinity and Raiden. At many levels, Raiden seems more like an AMD Live! style initiative enabled by Trinity and other technologies, but we're getting ahead of ourselves. At its core, Trinity is AMD's platform-level support for hardware virtualization. Beyond the previously introduced Pacifica technology, AMD is working with the PCI-SIG to develop advanced I/O virtualization while also enhancing the security and manageability of virtualized hardware at every level. The actual hardware that will enable Trinity wasn't explicitly expounded upon, but we did get these two slides with a brief description of how security, manageability and virtualization can't be handled as three separate problems.

Moving on to Raiden, AMD wishes to change the way businesses look at how they provide computing to their employees. Rather than hardware, AMD believes businesses would be better served by focusing on compute cycles. Server and PC hardware can be set up in blade-like configurations, and employees can run thin clients which stream their OS from the compute server. Ideally, where their "compute power" actually comes from won't matter to end users as long as there is no difference in experience. Having a large number of underutilized computers is a cost companies could avoid by sharing the processing power of fewer machines among a large number of people.
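To put rough numbers on the consolidation argument, here is a back-of-the-envelope sketch. Nothing in it comes from AMD: the function name `machines_needed` and every figure in the usage lines are hypothetical, and the model assumes load averages out evenly across users, which ignores peak demand, memory limits, and virtualization overhead.

```python
def machines_needed(users: int, avg_utilization_pct: int, target_utilization_pct: int) -> int:
    """Estimate how many shared machines could carry the same aggregate load.

    Hypothetical model: each user keeps a dedicated PC busy for
    avg_utilization_pct percent of the time, and each shared machine is
    allowed to run at up to target_utilization_pct percent.
    """
    if users < 0:
        raise ValueError("users must be non-negative")
    if not 0 <= avg_utilization_pct <= 100:
        raise ValueError("avg_utilization_pct must be between 0 and 100")
    if not 0 < target_utilization_pct <= 100:
        raise ValueError("target_utilization_pct must be between 1 and 100")

    # Total demand, expressed in "percent of one machine" units.
    aggregate_load = users * avg_utilization_pct

    # Integer ceiling division, so a partial machine still counts as one.
    return -(-aggregate_load // target_utilization_pct)


# Illustrative figures only: 200 desktops that sit at 10% utilization,
# consolidated onto shared machines that are allowed to run at 50%.
print(machines_needed(200, 10, 50))
```

Under those made-up assumptions, the aggregate load fits on a fraction of the original machine count. Whether real workloads behave this neatly is exactly the question Raiden would have to answer.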

If there is any technology that is Raiden-specific, AMD was not forthcoming. From what we can tell, AMD will leverage the current enthusiasm over blade systems and its Trinity virtualization platform to push customers toward a centralized computational model on the basis of power and cost savings. Certainly the benefits are there if the technology can support it, and hopefully we will be able to get some clarification on how Raiden translates to actual hardware.


  • MrKaz - Monday, June 5, 2006 - link

    Did you ask anyone at AMD whether someone is interested in (or going to build) an SQL accelerator, a CAD calculation accelerator, or even a multimedia accelerator?

    It would be nice to boost the performance of SQL by 2X, or even media encoding from minutes to seconds...
  • DerekWilson - Tuesday, June 6, 2006 - link

    they are certainly talking heavily about the possibility of hardware like that, but no hardware designers have committed to building anything yet.
  • IntelUser2000 - Sunday, June 4, 2006 - link


    As with K8, K8L will have 3 ALUs (arithmetic logic units) and 3 AGUs (address generation units). Combined with cache enhancements and the new ability to reorder loads, K8L has a shot at outpacing Core in integer performance.

    No. Because Core Duo(Yonah) with inferior decoder configuration, inferior memory bandwidth(which won't matter a lot but will make slight difference) and platform, still manages to outperform the current K8's. The Pentium M, which is even worse than Core Duo(slightly) still manages to outperform the K8's in integer. Now put Core with integrated memory controller, and comparison will look like Core Duo against Athlon XP.

    Core microarchitecture will exceed K8's in general integer architecture, and at least equal in K8L's ability. Integer superiority is still gonna be there, K8L will be faster than Core in FP and SSE performance because of low latency integrated memory controller with lots more real-world bandwidth(well that depends on how AMD implements SSE, Intel may still have an advantage if AMD puts a poor implementation like they did with Athlon XP's SSE, or at least it looked poor).
  • JarredWalton - Sunday, June 4, 2006 - link

    If ~33% of all instructions are Loads, and K8 pretty much totally lacks the ability to reorder Loads, adding that feature could substantially boost performance. It definitely "has a shot" at beating Core, but it may also fall short. Anyone making blanket statements one way or the other - i.e. it *will* beat Core, or it *won't* come close - needs to take a step back and check what they really know and what they are just assuming.

    At present, AMD is saying K8L is going to have the ability to reorder Loads. They might only do minor reordering, or they might go so far as to have something similar to Conroe's memory disambiguation. Given that AMD hasn't done a major update to K8 in over 3 years (no, DDR2 controller and going dual core don't really count as major updates to the underlying architecture), K8L could be a lot of things. It might only match Core Duo 2 on a clock-for-clock basis; it might fall short; it might even come out ahead. Also, there has been no indication that Intel is seriously planning an on-die memory controller in the near future, probably to continue to protect their chipset market.

    Personally, I really hope AMD manages to basically match CD2 performance, because runaway performance leads don't help the consumer. In the end, theoretical integer, FP, SSE, etc. performance isn't as important as real-world application performance. Right now, it's just too soon to declare a victor in the Core Duo 2 vs. K8L match-up. CD2 vs. K8 is already pretty much a done deal, though, and there's no indication that AMD will be able to come out on top in that rivalry. K8L is their "counterattack", and that's the architecture that needs to compete with CD2.
  • IntelUser2000 - Sunday, June 4, 2006 - link


    If ~33% of all instructions are Loads, and K8 pretty much totally lacks the ability to reorder Loads, adding that feature could substantially boost performance. It definitely "has a shot" at beating Core, but it may also fall short. Anyone making blanket statements one way or the other - i.e. it *will* beat Core, or it *won't* come close - needs to take a step back and check what they really know and what they are just assuming.

    It's easy to see the performance in integer against Core. Core has ability to reorder loads, but Core Duo is in same situation as K8, it doesn't really have the ability either. Other than that, on the basic block diagram, K8 is superior architecturally to Core Duo, yet Integer performance is somewhat better on Core Duo. The difference probably goes deeper than that. One of the articles mention K7/K8 has similar technique to Intel's micro op fusion. It could be Intel's is much better, etc. If a K8 with substantially better microarchitecture(+ODMC) can't beat integer performance of Core Duo, will K8L with basically same microarchitecture(or may be worse) beat Core?? It's simple to see it probably won't.
  • DerekWilson - Tuesday, June 6, 2006 - link

    core duo can reorder loads as the Pentium M could reorder loads --

  • MrKaz - Monday, June 5, 2006 - link

    P3 on steroids may beat the K7 on steroids in performance.
    But performance isn’t everything, or Intel employees would have been out of a job since K7 came out and beat P3 and P4. And Intel hasn't recovered yet!

    I didn’t see any presentation of Intel's new architecture, but I bet even the Hammer looks better than anything Intel will release.

    4MB cache, 128bit SSE that tells me nothing. Other than the P3 started with PC100 SDRAM, 256Kb cache and SSE and it's now at DDR2 667, 4MB cache and SSE4.
  • Sceptor - Saturday, June 3, 2006 - link

    Finally a real interconnect that can be used for a serious co-processor...perhaps a physics co-pro not limited by the PCI bus would help smooth transition to more realistic games.

    Or a dedicated video co-pro to use with Cad or 3D Modeling programs...
  • od4hs - Friday, June 2, 2006 - link

    -> UK firm to unveil wall-socket PC

    The Jack PC thin client fits into a wall socket and is so energy-efficient it can get its power over Ethernet
  • lopri - Friday, June 2, 2006 - link

    I totally agree that the "direct connect" is the most desirable way but I cannot help but think AMD is somewhat daydreaming. That is, what's showing in the slides seems way ahead of today's "practicality".

    I mean, we've had this PCI Express which has been strongly pushed by core logic vendors, but so far all we practically have are video cards. I sometimes think all these mobo makers pay more attention to the "aesthetic" point when they design PCI-E slots so the boards look prettier. (lol)

    If my understanding is correct, AMD will introduce a new type of slot, HTX, on motherboards. Will other technology/market follow? Or will it just give another chance to graphics card manufacturers to push us to buy new cards? On today's desktop boards, basically everything is "integrated", sans video. I know that a video card has its own core and frame buffer, and transfers data via Hyper Transport, but if a physics card can utilize the HTX, what stops a video card from connecting directly to CPU, without passing the core logic or system memory?

    I think this will also be closely related to the available bandwidth of HTX per CPU core (or cores), and I can't really think of any add-in board that'll prioritize the bandwidth other than video cards, (OK and the physics cards) even though the HTX will be an open standard. (look at the lazy/lame Creative)

    A very desirable case would be where storage (hard disks) can take advantage of this "direct" connection but then again there is such a thing called "memory", so my imagination stops there. (maybe solid-state/I-Ram type of storage can make use of the HTX? Then what's the use of memory? Taking care of I/O?) Talking about I/O, I just thought it'd be interesting to see keyboards/mice connect to CPU via HTX. (Sorry I couldn't resist)

    All in all, like the article says, this roadmap seems just too broad/ambiguous/futuristic. I'm not a CPU engineer so my thinking could be totally off, though. If so, please enlighten.

