Simultaneous MultiThreading (SMT)

Zen will be AMD’s first foray into a true simultaneous multithreading structure, and certain parts of the core will act differently depending on their implementation. There are many ways to manage threads, particularly to avoid stalls where one thread blocks another and the system ends up hanging or crashing. The drivers that communicate with the OS also have to distinguish between placing a thread on an idle core and placing it on a core that is already occupied: to achieve maximum throughput, four threads should be spread across four cores, but for efficiency where speed isn’t a factor, perhaps packing them onto two cores and power gating/clock gating the other half of the CCX is a good idea.
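To make the placement trade-off concrete, here is a minimal sketch of the two policies for a four-core CCX with two threads per core. The function names and the "spread"/"pack" labels are our own illustration, not anything AMD or an OS scheduler exposes.

```python
from typing import List


def place_threads(n_threads: int, n_cores: int = 4, smt_ways: int = 2,
                  policy: str = "spread") -> List[List[int]]:
    """Return, for each core, the list of thread ids placed on it."""
    if n_threads > n_cores * smt_ways:
        raise ValueError("more threads than hardware thread slots")
    cores: List[List[int]] = [[] for _ in range(n_cores)]
    for tid in range(n_threads):
        if policy == "spread":
            # Throughput: pick the least-loaded core so threads avoid sharing.
            target = min(range(n_cores), key=lambda c: len(cores[c]))
        elif policy == "pack":
            # Efficiency: fill the first core that still has a free SMT slot.
            target = next(c for c in range(n_cores) if len(cores[c]) < smt_ways)
        else:
            raise ValueError(f"unknown policy: {policy}")
        cores[target].append(tid)
    return cores


def idle_cores(cores: List[List[int]]) -> List[int]:
    """Cores with no threads, i.e. candidates for power or clock gating."""
    return [i for i, threads in enumerate(cores) if not threads]
```

With four threads, "spread" leaves no core idle, while "pack" leaves two cores free to be gated.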

There are a number of ways that AMD will deal with thread management. The basic way is time slicing, giving each thread an equal share of the pie. This is not always the best policy, especially when you have one performance-dominant thread, or one thread that creates a lot of stalls, or a thread where latency is vital. In some methodologies the importance of a thread can be tagged or determined, and this is what we get here, though some of the structures in the core have to revert to a basic model.

AMD performs internal analysis on each thread’s data stream to see which thread has algorithmic priority. This means that a thread requiring more resources can be given them, or that a branch miss can be prioritized to avoid long stall delays. The elements in blue (Branch Prediction, INT/FP Rename) operate on this methodology.

A thread can also be tagged with higher priority. This is important for latency-sensitive operations, such as touch-screen input or other elements that need an immediate response to the user. The Translation Lookaside Buffers work in this way, to prioritize looking for recent virtual memory address translations. The Load Queue works the same way, as low-latency workloads typically need their data as soon as possible.
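As a rough illustration of priority tagging, the sketch below picks which thread's request a shared structure serves next. The `Request` type, the numeric `priority` tag, and the tie-break rule are assumptions made for the example; AMD has not described the actual arbitration logic at this level.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    thread_id: int
    priority: int  # hypothetical tag: a larger value means more urgent


def arbitrate(requests: List[Request],
              last_winner: Optional[int] = None) -> Optional[Request]:
    """Serve the highest-priority request; on a tie, avoid the last winner."""
    if not requests:
        return None
    top = max(r.priority for r in requests)
    candidates = [r for r in requests if r.priority == top]
    for r in candidates:
        if r.thread_id != last_winner:
            return r
    # Every candidate belongs to the thread that won last time.
    return candidates[0]
```

A latency-sensitive thread tagged with a higher value always wins, and equal tags fall back to alternating between the threads.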

Certain parts of the core are statically partitioned, giving each thread an equal share. This is implemented mostly for anything that is typically processed in-order, such as anything coming out of the micro-op queue, the retire queue and the store queue. However, when running in SMT mode but only with a single thread, the statically partitioned parts of the core can end up as a bottleneck, as they are idle half the time.
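The single-thread bottleneck falls straight out of the arithmetic. The sketch below assumes static partitioning means fixed, alternating cycle slots per thread; a design that splits queue entries rather than time would show the same halving. The function name is ours.

```python
from typing import Set


def static_slots_used(n_cycles: int, active_threads: Set[int],
                      n_threads: int = 2) -> int:
    """Count cycle slots that do useful work under a fixed alternating split."""
    used = 0
    for cycle in range(n_cycles):
        owner = cycle % n_threads  # slot ownership never changes
        if owner in active_threads:
            used += 1  # the owning thread is present and can use its slot
    return used
```

Over 100 cycles, two active threads use all 100 slots, while a lone thread only gets the 50 slots it owns.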

The rest of the core uses competitive scheduling: each cycle, a thread that demands more resources will try to claim them first, provided there is space to do so.
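By contrast, a competitively shared structure lets a hungry thread take whatever its sibling does not want. This is a simplified model of one cycle, with entries handed out one at a time in thread-id order; the real hardware policy is not public, and the names here are illustrative.

```python
from typing import Dict


def competitive_allocate(free_entries: int,
                         demands: Dict[int, int]) -> Dict[int, int]:
    """Grant entries from a shared pool until it is empty or demand is met."""
    granted = {tid: 0 for tid in demands}
    remaining = free_entries
    while remaining > 0:
        wanting = [tid for tid in sorted(demands) if granted[tid] < demands[tid]]
        if not wanting:
            break  # every thread has what it asked for
        for tid in wanting:
            if remaining == 0:
                break
            granted[tid] += 1
            remaining -= 1
    return granted
```

With eight free entries, a thread asking for seven next to a thread asking for two ends up with six and two, and a lone thread can take all eight.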

New Instructions

AMD has a couple of tricks up its sleeve for Zen. Along with including the standard ISA, there are a few new custom instructions that are AMD only.

Some of the new instructions match ones that Intel already supports, such as RDSEED for random number generation, or SHA1/SHA256 for cryptography (even with the recent breakthrough in security). The two AMD-only additions are CLZERO and PTE Coalescing.

The first, CLZERO, clears a cache line and is aimed more at the data center and HPC crowds. This allows a thread to clear a poisoned cache line atomically (in one cycle) in preparation for zeroed data structures. It also allows a level of repeatability when the cache line is filled with expected data. CLZERO support will be determined by a CPUID bit.
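A small software model shows the visible effect of the instruction: the whole aligned line containing an address reads back as zero. The 64-byte line size matches Zen's caches. The CPUID location used in `has_clzero` (bit 0 of EBX for function 0x8000_0008) is our reading of AMD's programmer documentation and should be checked against the manual. Both function names are hypothetical.

```python
CACHE_LINE_BYTES = 64  # cache line size assumed for this model


def clzero(memory: bytearray, address: int) -> None:
    """Zero the aligned cache line that contains `address`."""
    base = address & ~(CACHE_LINE_BYTES - 1)  # round down to the line start
    if base < 0 or base + CACHE_LINE_BYTES > len(memory):
        raise IndexError("cache line falls outside the modelled memory")
    memory[base:base + CACHE_LINE_BYTES] = bytes(CACHE_LINE_BYTES)


def has_clzero(cpuid_8000_0008_ebx: int) -> bool:
    """Decode the assumed CLZERO feature bit from a raw EBX value."""
    return bool(cpuid_8000_0008_ebx & 0x1)
```

Calling `clzero` with any address inside a line clears that line only; its neighbours are untouched.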

PTE (Page Table Entry) Coalescing is the ability to combine the entries for small 4K pages into a single 32K entry, and is transparent to software. This reduces the number of entries needed in the TLBs and the queues, but requires the data used within the branch predictor to meet certain criteria.
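The saving is easy to estimate, since one 32K entry covers eight 4K pages. The sketch below counts the TLB entries a mapping would need. The coalescing rule it uses (a full, aligned group of eight virtual pages mapped to eight contiguous, aligned physical frames) is our assumption about the "certain criteria" and not AMD's stated rule.

```python
from typing import Dict, List

PAGES_PER_GROUP = 8  # 32K / 4K


def can_coalesce(page_table: Dict[int, int], group: int) -> bool:
    """Assumed rule: all 8 pages present, physically contiguous and aligned."""
    base_vpn = group * PAGES_PER_GROUP
    base_pfn = page_table.get(base_vpn)
    if base_pfn is None or base_pfn % PAGES_PER_GROUP != 0:
        return False
    return all(page_table.get(base_vpn + i) == base_pfn + i
               for i in range(PAGES_PER_GROUP))


def tlb_entries_needed(page_table: Dict[int, int]) -> int:
    """Count entries for a map of virtual page number to physical frame."""
    groups: Dict[int, List[int]] = {}
    for vpn in page_table:
        groups.setdefault(vpn // PAGES_PER_GROUP, []).append(vpn)
    return sum(1 if can_coalesce(page_table, g) else len(vpns)
               for g, vpns in groups.items())
```

Eight contiguous pages collapse into one entry, while the same eight pages scattered in physical memory still need eight.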

574 Comments

  • Meteor2 - Friday, March 3, 2017 - link

    ...In which case you'd be better off with a 7700K, looking at the benchmark results. Cheaper too.
  • ddriver - Thursday, March 2, 2017 - link

    Ryzen offers the same performance at half the cost. More pci-e lanes is good for io, however quad channel memory is pretty much pointless, aside from pointless synthetic benches. Ryzen might not make it to my personal workstation due to the low pci-e lane count, but it has enough to replace my aging 3770k farm nodes, to which it will be a significant upgrade, provided the chip and platform turn out to be stable and bug free.

    Intel has gotten lazy and sloppy, bricking products, chipset bugs, they haven't really done anything new architecture wise for years, milking the same old cow.

    It is rather silly to assume that gaming dictates CPU prices, this IS NOT a gaming product, if your ass-logic is to be followed, the intel needs to drop the 7700k price to 168, because in games it is barely any faster than the i3-7350K, and has the same pathetic, even lower than ryzen, number of pci-e lanes.

    This is a chip for HPC, which gaming is NOT. Go back to the kiddie garden, eight core chips are for grown ups ;)
  • imaheadcase - Thursday, March 2, 2017 - link

    People compare it to gaming, because it's the main driving for these type of CPUs, it's not even gaming specific but VR, Graphics modeling, etc. You honestly think people are buying these for offices or industry for complex math problems? lol
  • ddriver - Thursday, March 2, 2017 - link

    It is not "people" but "fanboys", and they cling to gaming because it is the only workload where intel can offer better performance for the price, albeit by comparing products from different tiers, which is quite frankly moronic.

    Cars are faster than trucks, so who in the world needs to spend money on trucks? That's the kind of retarded logic you are advocating...

    Smart people buy whatever suits their needs. Obviously, if all you do is play games you wouldn't be buying ryzen or a lga2011 system. Just get an unlocked i5 and overclock it, best bang for the buck. You must realize that even if you don't, other people use computers for tasks other than gaming. And for a large portion of them ryzen will be the best deal, because it is versatile - it is good enough for gaming too, while still offering significant performance advantage compared to an intel quad in tasks that are time consuming, and is very much competitive with intel's 8 and 10 core chips while delivering more than twice the value, which is important for everyone who doesn't have money to throw away.

    Claiming that "gaming is the main driving for these type of CPUs" is foolish to say the least, because games don't benefit from that particular type of CPUs. Most of the games can't even properly utilize 4 threads. And this is not likely to change soon, because the overhead of complexity and thread synchronization is not worth it for non-performance demanding tasks such as games.
  • Lord-Bryan - Thursday, March 2, 2017 - link

    That's one really well thought out argument
  • rarson - Thursday, March 2, 2017 - link

    Ryzen's versatility and price are the two biggest factors that make it so good. It might not beat the very best gaming CPU that Intel has, or the very best multi-threaded monster that Intel has in every scenario, but it's competitive with both at half the price of the high-end stuff. Hence, while I do game some and want to build a computer to use for gaming, I also do other stuff like audio production that benefits greatly from Ryzen's multi-threaded performance. To me, it's a no-brainer: Ryzen right now is the best bang-for-the-buck chip for someone who wants all-around high-end performance, by far. Maybe not the 1800X, I kind of think the 1700X is a better value, but still, for most people who want multiple-use performance instead of absolute maximum gaming performance, Ryzen is the clear choice.

    Ryzen's max clock speeds seem, like Intel's, to be hindered by the total number of cores on chip, so it should be extremely interesting to see how the 4- and 6-core chips overclock once they arrive, and what kind of performance they'll achieve. I actually think that, like Intel, a 4-core Ryzen might be a better gaming chip than the 8-core, and if that's the case, then it might be really darn close to Intel's best Kaby Lake, because like you pointed out, most games aren't threaded well at all.

    Additionally, from a gaming perspective, it seems like AMD has done more to push technology forward in that respect than anyone else. They've worked on Mantle, Vulkan, FreeSync, TrueAudio, and others. They've always tried to give performance value by offering more cores, but software has been slow to take advantage of them. Intel is content to stagnate by offering extremely incremental increases because performance is "good enough" so developers have no reason to really try to take advantage of extra cores aside from outside use cases. With Ryzen, AMD is pushing chips towards higher core counts (much like they did with the Athlon X2) but this time, they're trying harder to get developers on board and help them achieve good results. So while it always takes forever for software to better utilize the hardware, once the hardware becomes more common the software will start to follow and you'll see the actual gaming performance improve. Is that a valid reason to buy Ryzen today if your sole focus is gaming? Of course not, but it does bode well for Ryzen owners in the future. The performance can only get better. Can the same be said about Intel? Well, probably not if you're using one of the 4-core chips. It's pretty much a known quantity.

    I had high hopes for Bulldozer and Ryzen is the exact opposite of what Bulldozer was. I feel like the CPU market has been stagnant for years and now suddenly there's a reason to be excited. This makes AMD competitive again, which will be good for pricing even if you're an Intel fan. It's been a long wait, but it was worth it, this is a good product.
  • Notmyusualid - Friday, March 3, 2017 - link

    http://www.gamersnexus.net/hwreviews/2822-amd-ryze...
  • Makaveli - Thursday, March 2, 2017 - link

    +1 ddriver you destroyed that kid with your logic well done.
  • khanikun - Friday, March 3, 2017 - link

    Gaming definitely isn't the main driving force for CPUs, as use case changes. I bought a 7700K for my gaming rig. I'd get a 1700 for a VM host, as I'd like to start building a lab again. It won't be this round though. I'd rather wait for AMD to iron out any kinks and buy the next generation. It's something the 7700K could do, but more cores would definitely make it a much better lab.
  • Meteor2 - Friday, March 3, 2017 - link

    Nobody buys mid/high-end consumer chips for HPC. They buy them for gaming. A few for video production. That's it.
