Custom Code to Understand a Custom Core

Section by Anand Shimpi

All Computer Engineers at NCSU had to take mandatory programming courses. Given that my dad is a Computer Science professor, I always had exposure to programming, but I never considered it my strong suit. Perhaps my gravitating towards hardware was some passive form of rebellion. Either way, I knew that in order to really understand Swift, I'd have to do some coding on my own. The only problem? I have zero experience writing Objective-C code for iOS, and not enough time to go through a crash course.

I had code that I wanted to time/execute in C, but I needed it ported to a format that I could easily run/monitor on an iPhone. I enlisted the help of a talented developer friend who graduated around the same time I did from NCSU, Nirdhar Khazanie. Nirdhar has been working on mobile development for years now, and he quickly made the garbled C code I wanted to run into something that executed beautifully on the iPhone. He gave me a framework where I could vary instructions as well as data set sizes, which made this next set of experiments possible. It's always helpful to know a good programmer.

So what did Nirdhar's app let me do? Let's start at the beginning. ARM's Cortex A9 has two independent integer ALUs; does Swift have more? To find out, I created a loop of independent integer adds. The variables are all independent of one another, which should allow for some great instruction level parallelism. The code loops many times, which should make for some easily predictable branches. My code is hardly optimal, but I did keep track of how many millions of adds were executed per second. I also reported how long each iteration of the loop took, on average.
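To make the idea concrete, here is a minimal C sketch of a loop like the one described above. It is not the code Nirdhar ported: the function name, the result struct, the choice of eight accumulators and the `KEEP` macro (a GCC/Clang-specific trick to stop the optimizer from collapsing the adds) are all assumptions made for illustration.

```c
#include <stdint.h>
#include <time.h>

/* Empty asm that tells the compiler x may have changed, so the adds below
 * cannot be folded into a single operation. GCC/Clang extension (assumed). */
#define KEEP(x) __asm__ volatile("" : "+r"(x))

enum { ADDS_PER_ITER = 8 };

typedef struct {
    double seconds;       /* CPU time spent in the loop */
    double madds_per_sec; /* millions of adds per second */
    double ns_per_iter;   /* average nanoseconds per loop iteration */
    uint32_t checksum;    /* sum of all accumulators, so the work is used */
} int_add_result;

int_add_result time_int_adds(uint64_t iterations)
{
    uint32_t a = 0, b = 1, c = 2, d = 3, e = 4, f = 5, g = 6, h = 7;

    clock_t start = clock();
    for (uint64_t i = 0; i < iterations; i++) {
        /* Eight separate accumulators: no add depends on another add in
         * the same iteration, only on its own value from the previous one. */
        a += 1; b += 2; c += 3; d += 4;
        e += 5; f += 6; g += 7; h += 8;
        KEEP(a); KEEP(b); KEEP(c); KEEP(d);
        KEEP(e); KEEP(f); KEEP(g); KEEP(h);
    }
    clock_t stop = clock();

    int_add_result r;
    r.seconds = (double)(stop - start) / CLOCKS_PER_SEC;
    r.checksum = a + b + c + d + e + f + g + h;
    r.madds_per_sec = 0.0;
    r.ns_per_iter = 0.0;
    if (r.seconds > 0.0 && iterations > 0) {
        r.madds_per_sec = (double)iterations * ADDS_PER_ITER / r.seconds / 1e6;
        r.ns_per_iter = r.seconds * 1e9 / (double)iterations;
    }
    return r;
}
```

Calling `time_int_adds` with a large iteration count and reading `madds_per_sec` gives a figure of the same kind as the adds-per-second numbers reported here, and `ns_per_iter` is the average time per loop iteration. Note that `clock()` measures CPU time at a fairly coarse resolution, so short runs will report zero.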

Integer Add Code

| | Apple A5 (2 x Cortex A9 @ 800MHz) | Apple A5 Scaled (2 x Cortex A9 @ 1300MHz) | Apple A6 (2 x Swift @ 1300MHz) | Swift / A9 Perf Advantage @ 1300MHz |
|---|---|---|---|---|
| Integer Add Test | 207 MIPS | 336 MIPS | 369 MIPS | 9.8% |
| Integer Add Latency in Clocks | 23 clocks | | 21 clocks | |

The code here should be fairly bound by the integer execution path. We're showing a 9.8% increase in performance. Average latency is improved slightly by 2 clocks, but we're not seeing the sort of ILP increase that would come from having a third ALU that can easily be populated. The slight improvement in performance here could be due to a number of things. A quick look at some of Apple's own documentation confirms what we've seen here: Swift has two integer ALUs and can issue 3 operations per cycle (implying a 3-wide decoder as well). I don't know if the third decoder is responsible for the slight gains in performance here or not.
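The scaled A5 column and the advantage figure come from simple arithmetic, sketched below in C. The helper names are made up for illustration, and the scaling assumes performance rises linearly with clock speed, which is optimistic for real hardware.

```c
/* Project a score measured at from_mhz to to_mhz, assuming performance
 * scales linearly with clock speed (an optimistic assumption). */
double scale_score(double score, double from_mhz, double to_mhz)
{
    return score * to_mhz / from_mhz;
}

/* Percentage advantage of score over baseline. */
double advantage_pct(double score, double baseline)
{
    return (score / baseline - 1.0) * 100.0;
}

/* Convert an average time per loop iteration (in nanoseconds) into
 * CPU clocks at the given frequency. */
double ns_to_clocks(double ns_per_iter, double mhz)
{
    return ns_per_iter * mhz / 1000.0;
}
```

With the integer add numbers, `scale_score(207, 800, 1300)` gives 336.375, which rounds to the 336 MIPS shown for the scaled A5, and `advantage_pct(369, 336)` comes out at roughly 9.8.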

What about floating point performance? ARM's Cortex A9 only has a single issue port for FP operations which seriously hampers FP performance. Here I modified the code from earlier to do a bunch of single and double precision FP multiplies:

FP Multiply Code

| | Apple A5 (2 x Cortex A9 @ 800MHz) | Apple A5 Scaled (2 x Cortex A9 @ 1300MHz) | Apple A6 (2 x Swift @ 1300MHz) | Swift / A9 Perf Advantage @ 1300MHz |
|---|---|---|---|---|
| FP Mul Test (single precision) | 94 MFLOPS | 153 MFLOPS | 143 MFLOPS | -7% |
| FP Mul Test (double precision) | 87 MFLOPS | 141 MFLOPS | 315 MFLOPS | 123% |

There's actually a slight regression if we look at single precision FP multiply performance, likely because performance wouldn't scale perfectly linearly from 800MHz to 1.3GHz, which makes the scaled A5 number optimistic. Notice what happens when we move to double precision FP multiplies, though: performance more than doubles on Swift but remains roughly unchanged on the Cortex A9. Given the support for ARM's VFPv4 extensions, Apple likely has a second FP unit in Swift that can help with FMAs or improve double precision FP performance. It's also possible that Swift is a 128-bit wide NEON machine and my DP test compiles down to NEON code, which enjoys the benefits of a wider engine. I ran the same test with FP adds and didn't notice any changes to the data above.
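For reference, an FP variant of the test could be sketched in C along these lines. Again, this is an illustrative assumption rather than the code that was actually run: the macro, the function names, the four-chain layout and the multipliers 4.0 and 0.25 (chosen so values stay exact and never overflow) are all hypothetical.

```c
#include <time.h>

typedef struct {
    double seconds;  /* CPU time spent in the loop */
    double mflops;   /* millions of multiplies per second */
    double checksum; /* sum of the four chains, keeps the results live */
} fp_mul_result;

enum { MULS_PER_ITER = 8 };

/* Read through a volatile so the starting values are unknown at build time. */
static volatile double fp_seed = 1.0;

/* Generates one timing function per FP type. Each pass sends four
 * independent chains through a x4 and a x0.25 multiply, so the values
 * return to where they started and never overflow. */
#define DEFINE_FP_MUL_TEST(name, type)                                  \
    fp_mul_result name(unsigned long iterations)                        \
    {                                                                   \
        type a = (type)fp_seed, b = a + 1, c = a + 2, d = a + 3;        \
        const type up = (type)4.0, down = (type)0.25;                   \
        clock_t start = clock();                                        \
        for (unsigned long i = 0; i < iterations; i++) {                \
            a *= up;   b *= up;   c *= up;   d *= up;                   \
            a *= down; b *= down; c *= down; d *= down;                 \
        }                                                               \
        clock_t stop = clock();                                         \
        fp_mul_result r;                                                \
        r.seconds = (double)(stop - start) / CLOCKS_PER_SEC;            \
        r.checksum = (double)a + (double)b + (double)c + (double)d;     \
        r.mflops = 0.0;                                                 \
        if (r.seconds > 0.0)                                            \
            r.mflops = (double)iterations * MULS_PER_ITER               \
                       / r.seconds / 1e6;                               \
        return r;                                                       \
    }

DEFINE_FP_MUL_TEST(time_fp32_muls, float)
DEFINE_FP_MUL_TEST(time_fp64_muls, double)
```

Comparing `time_fp32_muls` and `time_fp64_muls` at the same iteration count gives the single versus double precision comparison. A sketch like this should be built without options such as `-ffast-math`, and it is worth checking the disassembly, since the compiler may still vectorize the four chains, which is exactly the kind of effect speculated about above.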

Sanity Check with Linpack & Passmark

Section by Anand Shimpi

Not completely trusting my own code, I wanted some additional data points to help understand the Swift architecture. I first turned to the iOS port of Linpack and graphed FP performance vs. problem size:

Even though I ran the benchmark for hundreds of iterations at each data point, the curves didn't come out as smooth as I would've liked them to. Regardless, there's a clear trend. Swift maintains a huge performance advantage even at small problem sizes, which supports the theory of having two ports to dedicated FP hardware. There's also a much smaller relative drop in performance when going out to main memory. If you do the math on the original unscaled 4S scores, you get the following data:

Linpack Throughput: Cycles per Operation

| | Apple Swift @ 1300MHz (iPhone 5) | ARM Cortex A9 @ 800MHz (iPhone 4S) |
|---|---|---|
| ~300KB Problem Size | 1.45 cycles | 3.55 cycles |
| ~8MB Problem Size | 2.08 cycles | 6.75 cycles |
| Increase | 43% | 90% |

Swift is simply able to hide memory latency better than the Cortex A9. Concurrent FP/memory operations seem to do very well on Swift...
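The "math" behind the cycles-per-operation table can be sketched as two small C helpers. The function names are hypothetical, and the conversion assumes the Linpack score is reported in MFLOPS and that the core runs at its nominal clock for the whole test.

```c
/* Average CPU cycles spent per floating point operation:
 * (millions of cycles per second) / (millions of operations per second). */
double cycles_per_op(double clock_mhz, double mflops)
{
    return clock_mhz / mflops;
}

/* Relative increase in cost when moving from the small problem size to
 * the large one, in percent. */
double pct_increase(double small_cycles, double large_cycles)
{
    return (large_cycles / small_cycles - 1.0) * 100.0;
}
```

Using the table's figures, `pct_increase(1.45, 2.08)` is about 43 for Swift and `pct_increase(3.55, 6.75)` is about 90 for the Cortex A9, matching the Increase row.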

As the last sanity check I used Passmark, another general purpose iOS microbenchmark.

Passmark CPU Performance

| | Apple A5 (2 x Cortex A9 @ 800MHz) | Apple A5 Scaled (2 x Cortex A9 @ 1300MHz) | Apple A6 (2 x Swift @ 1300MHz) | Swift / A9 Perf Advantage @ 1300MHz |
|---|---|---|---|---|
| Integer | 257 | 418 | 614 | 47.0% |
| FP | 230 | 374 | 813 | 118% |
| Primality | 54 | 87 | 183 | 109% |
| String qsort | 1065 | 1730 | 2126 | 22.8% |
| Encryption | 38.1 | 61.9 | 93.5 | 51.0% |
| Compression | 1.18 | 1.92 | 2.26 | 17.9% |

The integer math test uses a large dataset and performs a number of add, subtract, multiply and divide operations on the values. The dataset measures 240KB per core, which is enough to stress the L2 cache of these processors. Note the 47% increase in performance over a scaled Cortex A9.

The FP test is identical to the integer test (including size) but it works on 32 and 64-bit floating point values. The performance increase here despite facing the same workload lends credibility to the theory that there are multiple FP pipelines in Swift.

The Primality benchmark is branch heavy and features a lot of FP math and compares. Once again we see huge scaling compared to the Cortex A9.

The qsort test features integer math and is very branch heavy. The memory footprint of the test is around 5MB, but the gains here aren't as large as we've seen elsewhere. It's possible that Swift features a much larger branch mispredict penalty than the A9.

The Encryption test works on a very small dataset that can easily fit in the L1 cache but is very heavy on the math. Performance scales very well here, almost mirroring the integer benchmark results.

Finally the compression test shows us the smallest gains once you take into account Swift's higher operating frequency. There's not much more to conclude here other than we won't always see greater than generational scaling from Swift over the previous Cortex A9.

276 Comments

  • rarson - Thursday, October 18, 2012

    Me too, other than the stupid proprietary connection that jacks the price of everything up.
  • Spunjji - Friday, October 19, 2012

    I sit pretty firmly in this camp, too. Despite the physical durability flaws, I do find the overall package of the iPhone 4/4S/5 to be superior to most comparable 'Droid handsets. I just find the software to be unbearably obstructive to my desired use patterns.
  • steven75 - Wednesday, October 17, 2012

    Maybe some people want a still larger display but keep the industry leading app support, industry leading hardware ecosystem, airplay, apple store support, industry leading resale value, industry leading OS upgrade support, and without any carrier bloatware?

    Seems pretty possible to me.
  • GotThumbs - Tuesday, October 16, 2012

    "we've just got to deal with it." Wrong. You have to "deal with it".

    Believe it or not....everyone does NOT own one of these phones. The idea of getting a brand new item, be it a car, camera, laptop, tablet or phone and having to deal with the fact that the company's quality controls are sub-standard is one of the lamest things I've heard....Oh! besides the number one example...... "You're holding it wrong" - Steve Jobs.

    I try not to let this kind of monologue that reeks of Apple fanism bother me...but come on! Talk about romancing about a freaking phone. Please keep it to a level of unemotional comparisons and then feel free to give your personal thoughts and not assume to speak for everyone else.

    The fact that you felt compelled to write 5 or more paragraphs on the anodizing process is just pathetic. I stopped reading about it after the first paragraph and skipped to the bottom. I thought this is supposed to be a phone review, not a discovery channel episode on the anodizing process. I can't speak for you or anyone else, but I'm pretty comfortable telling you most consumers probably don't care about the process of anodizing, they just expect a quality product for their money.

    People are paying good money...in a bad economy and you're saying all they can do is "deal with it"? How about having an open mind and mentioning that they have the choice to buy a different phone or wait for Apple to fix it in their next generation. Your only position appears to be....suck it up, it's Apple and that's just part of being in the collective.

    Again,,,,, YOU do not speak for everyone so please drop the "WE".

    Rant done.

    Best wishes to all on your choices in life.
  • crankerchick - Tuesday, October 16, 2012

    Talk about overreaction. Keep the statement within the context of the article. iPhone users have to deal with it if they want to remain iPhone users. Anand is an iPhone user, as are more than a few people reading his review--thus the use of the word "we" instead of the use of the term "iPhone users."

    That said, he should take care to lose the "we", but wow, what a rant for something that one can easily use common sense and say, "No I'm not stuck with it." Other bloggers and review sites do the same thing.

    LOL, everyone is always looking for someone to point the fanboy finger at.
  • KPOM - Tuesday, October 16, 2012

    Apparently you have never read an AnandTech review. They go into that kind of detail all the time. That's what people like about them. You aren't going to read that in a CNet, The Verge, or Engadget article. You might get some of that at Ars Technica. But AnandTech goes into excruciating detail.
  • VivekGowri - Tuesday, October 16, 2012

    I mean, it's not like Apple is going to radically alter it a month after production starts, so if you want an iPhone 5, your options are to either put a case on it, or suck it up and live with the scratches. Alternatively, you could buy a 4S (if you want iOS) or any other phone that floats your boat.

    I'm a guy that daily drives a Galaxy Nexus, so accusing me of iOS fanboyism isn't necessarily the most productive way of going about your day.
  • phillyry - Sunday, October 21, 2012

    Yes!

    Well said Vivek.

    But ya, Apple should still have the pressure put on them. So, I could see how people might take it the wrong way. 'Cause it could seem like you're just like, "It's all good Apple, we'll just suck it up." When, in actuality, your ideas are as you stated here. As per the OP's rant, I definitely thought it was off-base but could see where he would draw that conclusion, as it came across that way to me too. And, perhaps like me, he has a hard time keeping track of who uses which phone from the podcasts.

    Again, well retorted though.
  • jiffylube1024 - Tuesday, October 16, 2012

    You are seriously complaining about the depth the review went into on the anodizing process? You're reading a review of a product and you're complaining that you're being given more information? How about you just skip over that section if it doesn't interest you.

    I lol'ed that you called that kind of serious scientific investigation into the anodizing process (which I found incredibly informative) "pathetic". Real, fact-based journalism apparently bores you; you'd just rather read opinion pieces and pass judgment on them. How high minded of you!

    Other reviews don't even mention anything about the anodizing process other than that it's there. I don't get why you'd even bother reading a review on AnandTECH if you don't care about the technology...

    As for the author's position to "deal with it" (the anodization scratch issue) -- what more can the author do? He can't fix the problem or even address it from a manufacturing standpoint. The review points out the issues with it; the decision making process is up to the consumer and the fixing of the problem is up to the manufacturer.
  • GotThumbs - Wednesday, October 17, 2012

    "Other reviews don't even mention anything about the anodizing process other than that it's there. I don't get why you'd even bother reading a review on AnandTECH if you don't care about the technology..."

    Anodizing a piece of aluminum does not constitute "technology" when compared to the design of a SoC or camera, at least in my opinion. I see it as a finishing process. My point is that a side link to more detailed information on the anodizing process would have sufficed and kept the reader on track with the hardware review.

    I visit Anandtech on a daily basis and have been reading/visiting this site from the early years when Anand was still in High School. I thoroughly enjoy reading/learning about how new technologies in hardware are evolving and when they are compared to other current hardware available in the marketplace. But I feel there has been a growing tendency in Apple product reviews to have a hint of personal/emotional input rather than sticking to an analytical/technical assessment and letting each reader digest the information without the personal emotional spin. It's like today's "News" casters interjecting their personal thoughts/opinions on a news story. I prefer to get the facts and come to my own conclusion.

    In case you haven't realized, more and more in today's society, we are "Marketed to" in ways that are growing exponentially. Today's marketing companies continue to market to us using methods not just like Product Placement in TV shows, Reality shows, Movies, Red Carpet runways, etc., but on FB, Twitter, blogs, and weak "tech reviews" like CNN's (Read more like product ads than a review) etc. Because of this bombardment of marketing from every possible source imaginable and newly evolving, I don't think it's wrong to call out a reviewer when I feel there is even a whiff of non-neutrality. They can take it with a grain of salt or ponder on their next review to be sure they are approaching it in a clear and unbiased manner.

    Complacency in a society and lowering of one's expectations is not something to embrace; it's to be challenged and called out.

    Listen, no one is perfect and yes I may have been a little high strung in my post, but it was fueled by emotion and passion and I won't apologize for that.

    I do apologize to Vivek Gowri if I offended him in any way. It was not my intent.

    Best wishes