This is a very volatile time for Intel. In an ARM-less vacuum, Intel’s Haswell architecture would likely be the most amazing thing to happen to the tech industry in years. In mobile, Haswell is slated to bring about the single largest improvement in battery life in Intel history. In graphics, Haswell completely redefines the expectations for processor graphics. There are even some versions that come with an on-package 128MB L4 cache. And on the desktop, Haswell is the epitome of polish and evolution of the Core microprocessor architecture. Everything is better, faster and more efficient.

There’s very little to complain about with Haswell. Sure, the days of insane overclocks without touching voltage knobs are long gone. With any mobile-first, power optimized architecture, any excess frequency at default voltages is viewed as wasted power. So Haswell won’t overclock any better than Ivy Bridge, at least without exotic cooling.

You could also complain that, for a tock, the CPU performance gains aren’t large enough. Intel promised 5 - 15% gains over Ivy Bridge at the same frequencies, and most of my tests agree with that. It’s still forward progress, without substantial increases in power consumption, but it’s not revolutionary. We compare the rest of the industry to Intel’s excellent single-threaded performance and generally come away disappointed. The downside to being on top is that virtually all improvements appear incremental.

The fact of the matter is that the most exciting implementations of Haswell exist outside of the desktop parts. Big gains in battery life, power consumption and even a broadening of the types of form factors the Core family of processors will fit into all apply elsewhere. Over the coming weeks and months we’ll be seeing lots of that, but today, at least in this article, the focus is on the desktop.

Haswell CPU Architecture Recap

Haswell is Intel’s second 22nm microprocessor architecture, a tock in Intel’s nomenclature. I went through a deep dive on Haswell’s Architecture late last year after IDF, but I’ll offer a brief summary here.

At the front end of the pipeline, Haswell improved branch prediction. The execution engine is where Intel spent most of its time, however. Intel significantly increased the sizes of buffers and data structures within the CPU core. The out-of-order window grew to feed an even more parallel set of execution resources.

Intel added two new execution ports (8 vs 6), a first since the introduction of the Core microarchitecture back in 2006.

On the ISA side, Intel added support for AVX2 along with FMA instructions (FMA3, technically a separate extension introduced alongside AVX2), which considerably increase the FP throughput of the machine. With a doubling of peak FP throughput, Intel doubled L1 cache bandwidth to feed the beast. Intel also added support for transactional memory instructions (TSX) on some Haswell SKUs.
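
To put a number on that doubling, here is a back-of-the-envelope sketch of peak single-precision FLOPs per cycle per core. The helper name and the issue assumptions are mine, not from Intel's material: I'm assuming 256-bit vectors holding eight 32-bit floats, one vector add plus one vector multiply per cycle on Ivy Bridge, and two FMAs per cycle on Haswell, with each FMA counted as two operations per lane.

```python
# Rough peak FP throughput per core, single precision.
# Assumptions (illustrative, not taken from the article):
#   - 256-bit vector registers, 32-bit floats -> 8 lanes
#   - Ivy Bridge: 2 FP units per cycle (1 add + 1 multiply), 1 op per lane each
#   - Haswell:    2 FP units per cycle (both FMA), 2 ops per lane each

def peak_flops_per_cycle(vector_bits, element_bits, units, ops_per_unit):
    """Hypothetical helper: lanes * units issued per cycle * ops per unit."""
    lanes = vector_bits // element_bits
    return lanes * units * ops_per_unit

ivy_bridge_sp = peak_flops_per_cycle(256, 32, units=2, ops_per_unit=1)
haswell_sp = peak_flops_per_cycle(256, 32, units=2, ops_per_unit=2)

print(ivy_bridge_sp, haswell_sp, haswell_sp / ivy_bridge_sp)
```

These are theoretical peaks only. Real code has to keep both FMA units fed every cycle, which is exactly why the L1 bandwidth increase matters.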

The L3 cache is now back on its own power/frequency plane, although most of the time it seems to run in lockstep with the CPU cores. There appears to be a 2 - 3 cycle access penalty as a result of decoupling the L3 cache.

Power Improvements

  • bji - Monday, June 3, 2013 - link

    +10 false dichotomy. Look it up.
  • kenjiwing - Saturday, June 1, 2013 - link

    Any reviews comparing this gen to a 980x??
  • Ryan Smith - Saturday, June 1, 2013 - link

    It's available in Bench.
  • owikh84 - Saturday, June 1, 2013 - link

    4560K??? Not 4770K & 4670K?
  • karasaj - Saturday, June 1, 2013 - link

    4670K is the Haswell equivalent of a 3570K.
  • hellcats - Saturday, June 1, 2013 - link

    I read with some concern that the TSX instructions aren't going to be available on all SKUs. This is the main thing that I've been looking forward to on Haswell! Not providing the capability across the family is reminiscent of the 486SX/DX debacle. TSX could be huge for game physics as it would allow for far more consistent scaling. I know it is supposed to be backwards compatible, but what's the point of coding to it if it isn't always there?
  • zanon - Saturday, June 1, 2013 - link

    Agreed, TSX is one of the most interesting parts of Haswell so I'm sorry not to see it get more discussion. And as you say (and like with VT-d or other tech) I think Intel is being stupid and self-defeating by trying to make it an artificial differentiator. Unlike general basics of a chip such as clock rate, cache, hyperthreading or raw execution resources these sorts of features are only as valuable as the software that's coded for them, and nothing kills adoption amongst developers like "well maybe it'll be there but maybe not." If they can't depend on it, then it's not worth spending much extra time with and tremendously limits what it can be used for. That principle shows up over and over, it's why consoles can typically hold their own for so long. Even though on paper they get creamed, in reality developers are actually able to aim for 100% usage of all resources because there will never be any question about what is available.

    For features like this Intel should aim for as broad adoption as possible, or what's the point? They can differentiate just fine with pure performance, power, and physical properties. Disappointing as always.
  • penguin42 - Saturday, June 1, 2013 - link

    Agreed! I'd also be interested in seeing performance comparisons with a transactionally optimised piece of code.
  • Johnmcl7 - Saturday, June 1, 2013 - link

    Definitely, I was a bit puzzled reading the review to find barely a mention of TSX when I thought it was meant to be one of the ground breaking new features on Haswell. Even if there was only a synthetic benchmark for now it would be extremely interesting to see if it works anything like as well as promised.

  • bji - Sunday, June 2, 2013 - link

    TSX is so esoteric in its applicability that I think you'd be very hard pressed to a) find a benchmark that could actually exercise it in a meaningful way and b) have any expectation that this benchmark would translate into any actual perceived performance gain in any application run by 99.999% of users.

    In other words - TSX is only going to help performance in some very rare and obscure types of software that "normal" users will never even come close to using, let alone caring about the performance of.

    However I am intrigued by your speculation that TSX will be beneficial for physics simulation, which I guess could translate to perceivable performance increases for software that end users might actually use in the form of game physics. I found a paper that described techniques for using transactional memory to improve performance for physics simulation but it only found a 27% performance increase, which is not exactly earth shattering (I wouldn't call it "huge for game physics" personally).
