An Update on Apple’s A7: It's Better Than I Thought

When I reviewed the iPhone 5s I didn’t have much time to do the sort of in-depth investigation into Cyclone (Apple’s 64-bit custom ARMv8 core) that I did with Swift (Apple’s custom ARMv7 core from A6) the year before. I had heard rumors that Cyclone was substantially wider than its predecessor, but I didn’t have any proof other than hearsay so I left it out of the article. Instead I surmised in the 5s review that the A7 was likely an evolved Swift core rather than a brand new design. After all, what sense would it make to design a new CPU core and then do it all over again for the next one? It turns out I was quite wrong.

Armed with a bit of custom code and a bunch of low level tests I think I have a far better idea of what Apple’s A7 and Cyclone cores look like now than I did a month ago. I’m still toying with the idea of doing a much deeper investigation into A7, but I wanted to share some of my findings here.

The first task is to understand the width of the machine. With Swift I got lucky in that Apple had left a bunch of public LLVM documentation uncensored, referring to Swift’s 3-wide design. It turns out that although the design might be capable of decoding, issuing and retiring up to three instructions per clock, in most cases it behaved like a 2-wide machine. Mix FP and integer code and you’re looking at a machine that’s more like 1.5 instructions wide. Obviously Swift did very well in the market and its competitors at the time, including Qualcomm’s Krait 300, were similarly capable.

With Cyclone Apple is in a completely different league. As far as I can tell, peak issue width of Cyclone is 6 instructions. That’s at least 2x the width of Swift and Krait, and at best more than 3x the width depending on instruction mix. Limitations on co-issuing FP and integer math have also been lifted as you can run up to four integer adds and two FP adds in parallel. You can also perform up to two loads or stores per clock.
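The ratios above follow from simple division. Here is a quick sketch, taking the effective widths quoted for Swift (3 at peak, about 2 in most code, about 1.5 with mixed FP and integer code) against the 6-wide peak estimated for Cyclone. The names are mine and purely illustrative:

```python
# Effective issue widths quoted in the text (instructions per clock).
CYCLONE_PEAK = 6.0
SWIFT_WIDTHS = {
    "peak (decode/issue/retire)": 3.0,
    "typical code": 2.0,
    "mixed FP + integer": 1.5,
}

def width_ratio(new_width, old_width):
    """How many times wider the new core is than the old one."""
    return new_width / old_width

ratios = {case: width_ratio(CYCLONE_PEAK, w) for case, w in SWIFT_WIDTHS.items()}

for case, ratio in ratios.items():
    print(f"Cyclone vs. Swift, {case}: {ratio:.1f}x")

# The co-issue example from the text: four integer adds plus two FP adds
# account for all six issue slots in a single clock.
assert 4 + 2 == CYCLONE_PEAK
```

These are ratios of peak or estimated widths only; sustained throughput on real code depends on much more than issue width.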

I don’t yet have a good understanding of the number of execution ports and how they’re mapped, but Cyclone appears to be the widest ARM architecture we’ve ever seen at this point. I’m talking wider than Qualcomm’s Krait 400 and even ARM’s Cortex A15.

I did have some low level analysis in the 5s review, where I pointed out the significantly reduced memory latency and increased bandwidth to the A7. It turns out that I was missing a big part of the story back then as well…

A Large System Wide Cache

In our iPhone 5s review I pointed out that the A7 now featured more computational GPU power than the 4th generation iPad. For a device running at roughly a quarter of the iPad's resolution, the A7’s GPU either meant that Apple had an application that needed tons of GPU performance or it planned on using the A7 in other, higher resolution devices. I speculated it would be the latter, and it turns out that’s indeed the case. For the first time since the iPad 2, Apple once again shares common silicon between the iPhone 5s, iPad Air and iPad mini with Retina Display.

As Brian found out in his investigation after the iPad event last week all three devices use the exact same silicon with the exact same internal model number: S5L8960X. There are no extra cores, no change in GPU configuration and the biggest one: no increase in memory bandwidth.

Previously both the A5X and A6X featured a 128-bit wide memory interface, with half of it seemingly reserved for GPU use exclusively. The non-X parts by comparison only had a 64-bit wide memory interface. The assumption was that a move to such a high resolution display demanded a substantial increase in memory bandwidth. With the A7, Apple takes a step back in memory interface width - so is it enough to hamper the performance of the iPad Air with its 2048 x 1536 display?
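To put the interface widths in perspective, peak theoretical DRAM bandwidth is just the bus width in bytes multiplied by the transfer rate. The sketch below assumes data rates that are not stated in this article: LPDDR2-1066 for the 128-bit A6X and LPDDR3-1600 for the 64-bit A7. Treat both rates, and the function name, as assumptions and substitute your own figures if you have them:

```python
def peak_bandwidth_gbs(bus_width_bits, mega_transfers_per_s):
    """Peak theoretical bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9

# ASSUMED data rates (not given in the article).
a6x = peak_bandwidth_gbs(128, 1066)  # 128-bit interface
a7 = peak_bandwidth_gbs(64, 1600)    # 64-bit interface

print(f"A6X (assumed LPDDR2-1066): {a6x:.1f} GB/s")
print(f"A7  (assumed LPDDR3-1600): {a7:.1f} GB/s")
```

Under these assumptions the narrower A7 interface gives up roughly a quarter of the A6X's peak bandwidth, with the faster memory making up part of what halving the width takes away.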

The numbers alone tell us the answer is no. In all available graphics benchmarks the iPad Air delivers better performance at its native resolution than the outgoing 4th generation iPad (as you'll soon see). Now many of these benchmarks are bound more by GPU compute than by memory bandwidth, a side effect of the relative lack of memory bandwidth on modern day mobile platforms. Across the board though I couldn’t find a situation where anything was smoother on the iPad 4 than the iPad Air.

There’s another part of this story. Something I missed in my original A7 analysis. When Chipworks posted a shot of the A7 die many of you correctly identified what appeared to be a 4MB SRAM on the die itself. It's highlighted on the right in the floorplan diagram below:


A7 Floorplan, Courtesy Chipworks

While I originally assumed that this SRAM might be reserved for use by the ISP, it turns out that it can do a lot more than that. If we look at memory latency (from the perspective of a single CPU core) vs. transfer size on A7 we notice a very interesting phenomenon between 1MB and 4MB:

That SRAM is indeed some sort of a cache before you get to main memory. It’s not the fastest thing in the world, but it’s appreciably quicker than going all the way out to main memory. Available bandwidth is also pretty good:

We’re only looking at bandwidth seen by a single CPU core, but even then we’re talking about 10GB/s. Lookups in this third level cache don’t happen in parallel with main memory requests, so the impact on worst case memory latency is additive unfortunately (a tradeoff of speed vs. power).
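Latency-versus-size curves like the one discussed above are commonly produced with a pointer-chasing loop: build a random single-cycle chain through a buffer, then follow it so that every load depends on the previous one. What follows is a minimal sketch of that general technique, not the custom code used for this article, and all names in it are hypothetical. In Python the interpreter adds tens of nanoseconds per step, so only large effects such as falling out to main memory are likely to show; a serious measurement would use C or assembly.

```python
import random
import time
from array import array

def make_cycle(n, seed=0):
    """Sattolo's algorithm: a random permutation that is one n-long cycle."""
    rng = random.Random(seed)
    nxt = array("l", range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)  # j < i guarantees a single cycle
        nxt[i], nxt[j] = nxt[j], nxt[i]
    return nxt

def chase(nxt, steps, start=0):
    """Follow the chain; each load depends on the previous one."""
    i = start
    for _ in range(steps):
        i = nxt[i]
    return i

def ns_per_access(n, steps=200_000):
    """Average time per dependent access over a buffer of n elements."""
    nxt = make_cycle(n)
    chase(nxt, n)  # warm-up pass touches every element once
    t0 = time.perf_counter()
    chase(nxt, steps)
    return (time.perf_counter() - t0) / steps * 1e9

if __name__ == "__main__":
    itemsize = array("l").itemsize
    for kib in (16, 256, 1024, 4096, 8192):
        n = kib * 1024 // itemsize
        print(f"{kib:6d} KiB: {ns_per_access(n):7.1f} ns/access")
```

The random single cycle matters: a sequential or strided walk would let hardware prefetchers hide the latency, and a permutation with short cycles would keep the working set smaller than intended.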

I don’t yet have the tools needed to measure the impact of this on-die memory on GPU accesses, but in the worst case scenario it’ll help free up more of the memory interface for use by the GPU. It’s more likely that some graphics requests are cached here as well, with intelligent allocation of bandwidth depending on what type of application you’re running.

That’s the other aspect of what makes A7 so very interesting. This is the first Apple SoC that’s able to deliver good amounts of memory bandwidth to all consumers. A single CPU core can use up 8GB/s of bandwidth. I’m still vetting other SoCs, but so far I haven’t come across anyone in the ARM camp that can compete with what Apple has built here. Only Intel is competitive.

 


444 Comments


  • Wilco1 - Thursday, October 31, 2013 - link

    The "cheats" do not prevent thermal throttling down at all - they simply switch to maximum frequency immediately. The issue is that many of the popular benchmarks run for such short periods (~10 milliseconds) that the DVFS has no time to switch to maximum frequency. A long running benchmark always runs at the maximum frequency.
  • KoolAidMan1 - Wednesday, October 30, 2013 - link

    CPU throttling for thermal reasons, one of the most common functions in CPUs and GPUs, is the exact opposite of an application boosting clock speed based on triggers from specific benchmarks and applications in order to game performance numbers.

    Blown away by the lack of thinking in the comments
  • KPOM - Wednesday, October 30, 2013 - link

    Maybe because he had previously commented that Apple, Motorola, and Google don't cheat on benchmarks.
  • Graag - Wednesday, October 30, 2013 - link

    Look, lilo, we all know that you're an anti-Apple shill based on your posting history, so it's probably too much for you to actually understand what you're writing.

    The kind of cheating you are incorrectly remembering occurred when a device that *usually* throttled its frequency to save battery power did not throttle its frequency when it detected that a benchmark program was running.

    Obviously, that's not what's going on here.
  • tential - Wednesday, October 30, 2013 - link

    Dunno where all the hate is on this review.
    I read the Engadget review first, and was meh about it. I come here and I read a full hardware review. It's just amazing the amount of effort that goes into these reviews.
    My only complaint is that anandtech doesn't review more products! Haha. I'd really like to see some ultrabook/lower end laptops reviewed. Some more routers as well. This is a great review though, I really don't even want to waste my money now on a new tablet/phone when apple seems to have gone full steam ahead to destroy competition in performance. It's too bad the average consumer knows nothing about this stuff though.
  • tential - Wednesday, October 30, 2013 - link

    To further elaborate since people are saying this review favors Apple. The review of the Nexus 7 was also VERY NICE in terms of what google did. Anandtech is praising apple in this review, but they praise companies that do a good job period. I don't think this review favors apple at all. It does point out a number of shortcomings as well.

    I think the iPad Air is great, but Google Nexus 7 definitely holds its own at the price point given and clearly there is an advantage for Apple with its focus on its own ARM processor development because they develop their own processor for their own new OS while Snapdragon and Google work independently (to my knowledge at least).
  • Streamlined - Wednesday, October 30, 2013 - link

    It's because the web is full of paid shills who post comments for money from Apple's competitors.
  • KoolAidMan1 - Thursday, October 31, 2013 - link

    It's because tech forums are filled with paid shills from Apple's competitors and anti-Apple fanboys who will jump through as many mental hoops as possible to deny any positives Apple's products and ecosystem have.
  • prashy21 - Wednesday, October 30, 2013 - link

    Anand,
    A very nice and detailed review (as always).
    I have a couple of points:
    "Touch ID" should have been part of it (it's a deal breaker for me to upgrade my 3rd gen iPad).
    As you pointed out, 32GB should be the norm now, as Apple charges a premium for their product.
  • azazel1024 - Wednesday, October 30, 2013 - link

    Good overall review.

    However, I have to disagree a lot with the comment on speaker location.

    Yes, you have trade-offs between both on top or bottom if held in portrait position or one on top one on the bottom in portrait position. However, I do think Apple chose poorly with this. I can't believe I am the only one who pretty much watches movies 100% in landscape position. On top of that probably 90% of my gaming is in landscape with the rare exception of a game that is only portrait (Cut the Rope for example). Probably more than 75% of my music listening is also in landscape (either because it is on a stand in landscape, or I am browsing the web in landscape).

    So for me personally, and I'd bet most other users, the times they'd be using the speakers, they are going to be using their iPad in landscape mode...which means they get no stereo.

    The RF window could have been relocated to what is the side in portrait mode, or top when in landscape to accommodate repositioning of the speakers. Well, it should have anyway, though it might not be feasible with the internal layout of battery and boards.

    One of those things that interests me in the T100 (other than, well, full Windows) is proper placement of the speakers for stereo sound when I'd actually be using the speakers 90+% of the time (and better speakers it sounds like?).

    Apple has kind of lost me through no real fault of their own though. The iPad was amazing when it first came out. The iPad 2 was great. The Retina iPad 3 had some big trade offs but an amazing screen (weight and charging time). The iPad 4 is...uh...an increment. The Air looks great and seems to be a huge improvement over the iPad 3/4.

    The problem is that others have improved their game. Android has gotten much better as an OS over the intervening years and OEM/ODMs have really stepped up their game in Android tablet designs.

    Windows 8 was an okay touch OS. 8.1 was a better (if not great, an okay) touch OS. Intel finally came out with a good enough (and in some ways, damned good) x86 Atom processor in Silvermont/Bay Trail.

    iOS just doesn't have productivity potential in the areas I am interested in and zero compatibility with the productivity apps I use on the desktop/laptop (Lightroom being one of the main ones). For general content consumption, Android is cheaper with some rather good designs. Windows tablets, with the T100...are also cheaper and so much more productivity potential when needed.

    I'll keep my iPhone, thank you very much. For a phone and phone OS, iPhones and iOS are great IMHO. For a tablet though, now that the hardware has continuously improved, I feel like it can finally deliver on productivity like I've always wanted it to. So an "appliance" operating system is no longer appropriate to me in a tablet (well, not a tablet I'd buy). The hardware is great, but the OS not so much anymore (for a tablet).
