Broadwell-U: On Performance

No Broadwell-U launch would be complete without a list of performance metrics direct from Intel indicating how Broadwell-U improves over Haswell-U. Without hardware on hand to test for ourselves it is hard to verify the numbers, but they provide a number of interesting talking points, particularly in how they compare to the previous Intel presentations leading up to this.

Core Improvements

We covered the transistor numbers on the previous page, but Intel’s direct performance metrics are most important when we consider graphics and battery life. Moving from Haswell-U to Broadwell-U will not be that much of a jump in terms of productivity, as it is a similar architecture on a different process node. The shrink allows Intel to pick the low-hanging fruit and move the IPC up by around 5%, achieved by the following:

Larger OoO scheduler
Faster store-to-load forwarding
Larger (+50%) L2 translation lookaside buffer (TLB)
New dedicated 1GB page mode for L2 TLB
2nd TLB page miss handler
Faster FP multiplier
Faster Radix-1024 divider
Improved address prediction for branches and returns
Targeted cryptography instruction acceleration

The node adjustment carries more weight when it comes to power saving, as a lower voltage is required for similar performance. Combined with Intel’s 2:1 policy for Broadwell (+2% performance uses at most +1% power), the result is good all around.
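As a rough illustration of how a 2:1 rule of this kind filters design features, here is a minimal sketch. The function name, the feature names and their percentages are all hypothetical; Intel has not published per-feature figures, so only the 2:1 ratio itself comes from the text above.

```python
def meets_two_to_one(perf_gain_pct: float, power_cost_pct: float) -> bool:
    """Return True if a feature gains at least 2% performance per 1% extra power."""
    if power_cost_pct <= 0:
        # A feature that costs no extra power passes as long as it does not hurt performance.
        return perf_gain_pct >= 0
    return perf_gain_pct / power_cost_pct >= 2.0


# Hypothetical feature candidates: (name, performance gain %, power cost %)
candidates = [
    ("feature_a", 2.0, 1.0),  # exactly 2:1, accepted
    ("feature_b", 1.5, 1.0),  # only 1.5:1, rejected
    ("feature_c", 0.8, 0.3),  # small gain, but cheap enough to pass
]

accepted = [name for name, perf, power in candidates if meets_two_to_one(perf, power)]
print(accepted)
```

The point of the sketch is that the rule is about the ratio rather than the absolute gain: a small improvement can pass if it is cheap enough in power.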

However, the bigger change is on the GPU side. Intel is quoting a +22% synthetic graphics improvement from HD 4400 to HD 5500 with 3DMark and +50% for Cyberlink MediaEspresso for video conversion.

One might consider that Intel should alternate CPU and GPU performance improvements each U-series cycle, to give each platform a serious talking point. Haswell’s graphics were only a half-generation increment in the naming scheme, after all (Gen7 to Gen7.5), while the CPU architecture was new compared to Ivy Bridge.

Intel is also a fan of looking at historical improvements. If you consider that a number of users are upgrading a 2-4 year old system, it makes a good amount of sense to see where the multi-generation improvements add up. On the other hand, when a person does upgrade, you would hope that every area has been improved over the 2-3 generations in the interim.

Naturally, in order to give the best comparison data, we look back at the oldest reasonable product; in this case Intel pitted an i5-5300U (HD 5500, GT2 with 24 EUs) against an i5-520UM. In the time between these two platforms, the concept of attacking mobile devices has changed significantly because of the base performance. If we put the 4.5W equivalent of the i5-520UM into a fanless tablet, for example, the quality and features we know today would (I assume) feel excruciatingly slow. One argument is that back then, in 2010-ish (and before), our concept of software features and gaming was not at the level of detail it is today (which is true), and the same comparison will most likely be made in four years looking back at this era. Not only does the hardware improve, but so does the understanding of the market and the concept of user experience.

Nevertheless, now we have devices that wake from sleep in fractions of a second rather than seconds, or turn on in seconds rather than minutes. Battery life has improved because integrated graphics are a bigger portion of the equation and we have thrown the graphics card away for most devices that need a sense of mobility. My old 8lb brick of a mobile 15-inch 1200p workstation used a 45W GPU with a 35W CPU, which was a nightmare for working on-the-go. The 11-inch netbook wasn’t a lot better, with the low 1366x768 resolution and underwhelming performance. As I am writing this review, my sub-3lb UX301 laptop is in a low power mode and on this flight I have managed three hours of active writing time, looking at text on white backgrounds, and still have half of the battery remaining. At this point four years ago, I would be getting out my charger for my 8lb brick with its extended battery and then wondering if I have exceeded the power limit for the flight socket. A popular feeling is to look back fondly to the past, but when it comes to the combination of laptop battery life with performance, the only way is forward.

Battery Life and the Audio DSP

Almost all the Intel suggested use scenarios, outside static All-In-Ones and mini-desktops, rely on some form of battery, so it makes sense that power efficiency is one card in play for Broadwell-U. In the past this has been expressed in terms of actual performance per watt, but it also relates to time-to-sleep, especially when parts of the system can be put into a lower power state or shut off completely when not in use. This complicates designs, for example through the disconnected clock domains introduced in previous generations.

The test for battery life is also important, because users typically do not run blank screens at idle when performing daily tasks. Intel has provided two metrics: one is a 100 nit display at idle with Windows 8.1, and the other is local HD video playback.

For the former, Intel is quoting +60 minutes of battery life on their test platform at idle, equivalent to +11.0%. Most of this power saving comes from the SoC using better power saving techniques, but the rest of the platform, such as the PCH, also reduces its power use to around half.

During the (local) video playback, a 90 minute difference equates to a substantial +20.8% gain in battery life. A small amount of this is from the SoC and platform, but the biggest saving by far is the audio. Broadwell-Y and Broadwell-U both integrate Intel’s audio DSP (Digital Signal Processor) into the PCH. This removes a couple of Realtek components from the motherboard and allows Intel to bring it under their own manufacturing process, as well as configure the power gating needed.
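Intel’s slides give both a minute gain and a percentage gain for each test, which is enough to back out the implied runtimes. A quick sketch of that arithmetic, assuming each percentage is measured against the older platform’s runtime on the same test (the function name is ours, not Intel’s):

```python
def implied_baseline_minutes(gain_minutes: float, gain_percent: float) -> float:
    """Baseline runtime implied by an absolute gain and the same gain as a percentage."""
    return gain_minutes / (gain_percent / 100.0)


# Figures quoted by Intel: (test, minutes gained, percent gained)
quoted = [
    ("100 nit idle", 60, 11.0),
    ("local HD video playback", 90, 20.8),
]

idle_base = implied_baseline_minutes(60, 11.0)
video_base = implied_baseline_minutes(90, 20.8)

for label, minutes, percent in quoted:
    base = implied_baseline_minutes(minutes, percent)
    print(f"{label}: baseline ~{base / 60:.1f} h, new ~{(base + minutes) / 60:.1f} h")
```

Under that assumption the idle test works out to roughly nine hours before the gain, and the video test to a little over seven hours, which at least shows the two quoted figures are plausible for the same test platform.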

The DSP is more powerful, presumably equating to good race-to-sleep performance as well as dealing with HD audio under a lower power budget. Interestingly enough, I would point out that the power usage of the DSP will be directly related to how much data is flowing through. If an HD video with little to no audio is involved, then the power usage will be quite low anyway. I would like to perhaps put a SYL metal live-show DVD through its paces to see how this affects power consumption.

As we mentioned back during the Core M discussions, the audio DSP lends itself to being a configurable and programmable entity, much in the same way that AMD’s solution is actively promoted. Similar to the response we had back then, Intel is considering opening it up with a public SDK, although that side of the equation is not on the roadmap as of yet.

Broadwell-U Platform Controller Hub (PCH)

As a writer, my bread and butter at AnandTech these past four years has revolved around motherboards, and thus examining the connectivity provided by a chipset is always interesting. Because Intel bundles both the processor and the PCH on the same package, it allows manufacturers to save space in their design, but it also allows Intel to control power consumption more tightly to give better performance or longer battery life as a whole. There is still room for manufacturers to differentiate in their IO offerings, which is a good thing for consumers.

The new PCH for Broadwell-U focuses on that power consumption, especially when it comes to throttling sections and data pathways when not in use. The ‘Dynamic Power and Thermal Framework’ entry for the 5th Gen PCH should allow the performance to either respond as a function of battery life or skin temperature. This means throttling where necessary to reduce temperature or increase battery life. Wake on Voice is also a target for Intel, allowing devices to maintain a super-low power state but still respond without direct touch.

When it comes to direct connectivity, the PCH offers four SATA 6 Gbps ports, four USB 3.0 ports (two of which are muxed similar to a hub), eight USB 2.0 ports, TPM, a PCIe 2.0 x4 link and another 12 PCIe 2.0 lanes split into six ports, allowing six devices maximum. We asked Intel about PCIe storage support for RST, and were told that with additional hardware support (remapping logic), Broadwell-U can support one PCIe 2.0 x2 storage device. This means that if a Broadwell-U device with PCIe storage and RST capabilities came to market, it would cost a bit more than the base model. Also worth noting is that Broadwell-U is still using PCIe 2.0. On the PCH side this is perhaps not such a big deal, and when asked about PCIe 3.0 Intel reiterated their stance on not commenting on possible future plans, but said they are monitoring demands and industry trends.
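For a sense of scale on that PCIe 2.0 x2 storage link, here is a rough sketch of the per-direction line-rate arithmetic. It uses the standard signalling rates and encoding overheads (PCIe 2.0 at 5 GT/s with 8b/10b, PCIe 3.0 at 8 GT/s with 128b/130b, SATA 6 Gbps with 8b/10b); the function name and the table of links are purely illustrative, and real-world throughput will be lower once protocol overhead is included.

```python
def usable_mb_per_s(gigatransfers_per_s, payload_bits, encoded_bits, lanes=1):
    """Per-direction bandwidth in MB/s after line-encoding overhead."""
    bits_per_s = gigatransfers_per_s * 1e9 * payload_bits / encoded_bits
    return bits_per_s / 8 / 1e6 * lanes


links = {
    "SATA 6 Gbps": usable_mb_per_s(6.0, 8, 10),
    "PCIe 2.0 x2": usable_mb_per_s(5.0, 8, 10, lanes=2),
    "PCIe 2.0 x4": usable_mb_per_s(5.0, 8, 10, lanes=4),
    "PCIe 3.0 x2": usable_mb_per_s(8.0, 128, 130, lanes=2),  # not offered on Broadwell-U
}

for name, mb_s in links.items():
    print(f"{name}: ~{mb_s:.0f} MB/s per direction")
```

On these numbers a PCIe 2.0 x2 device has around 1 GB/s of line rate against 600 MB/s for SATA, while a PCIe 3.0 x2 link would come close to what PCIe 2.0 needs four lanes to deliver.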

On the DRAM front, we got confirmation that Broadwell-U will support a maximum of 16GB of DDR3L/DDR3L-RS or LPDDR3 memory. No comment was made on a move towards two modules per channel memory or DDR4. Regarding video connectivity, Broadwell-U was too early for HDMI 2.0 and thus has HDMI 1.4b.

WiDi 5.1

Also new on the table is WiDi 5.1, which brings support for 4K to the ecosystem.

A part of WiDi that has been lacking has been the business features, and as a result Intel is focusing on security, privacy and controls needed for a professional environment. These will need a driver update for the ultra-early adopters of Broadwell, but Intel is driving down the costs of the WiDi adapters to a more palatable price point. My Belkin WiDi receiver, for example, retailed at 120 GBP-ish back in 2013 and requires an external power supply. Compare that to the product Intel promoted with their conference call - the Actiontec Mini2 which uses HDMI and is only $40.

Intel Wireless AC-7265

While not strictly speaking new to the market, Intel is promoting its new low power WiFi solution to the manufacturers to use in conjunction with Broadwell. The AC 7265 is an upgrade over the AC 7260 that was used extensively in Haswell from mobile devices all the way up to big desktop partners, and the AC 7265 brings about both performance and power benefits.

The form factor specifically for Broadwell-U is provided as a BGA M.2 part, with the package being 12mm x 16mm (given by the 1216 form factor designation). Low powered wireless is an important part of lower performance systems, as without the right configuration a sustained network load can eat up a portion of the processor performance. Intel’s partners with Broadwell-U are presumably not bound to use the AC 7265 and can use other products based on other performance metrics, but Intel is targeting networking as a source of power drain and working to correct that issue.
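The 1216 designation follows the usual M.2 naming convention, where the first two digits give the width in millimetres and the remaining digits give the length. A small sketch of that decoding (the helper name is our own, not part of any Intel or PCI-SIG tooling):

```python
from typing import Tuple


def decode_m2_size(code: str) -> Tuple[int, int]:
    """Split an M.2 size code into (width_mm, length_mm)."""
    if not code.isdigit() or len(code) < 4:
        raise ValueError(f"not a valid M.2 size code: {code!r}")
    # First two digits are the width, everything after is the length.
    return int(code[:2]), int(code[2:])


print(decode_m2_size("1216"))   # soldered-down module size used for the AC 7265
print(decode_m2_size("2280"))   # common socketed SSD size, for comparison
```

The same rule covers five-digit codes such as 22110, where the length simply runs to three digits.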

Devices! Where and When?

Most of AnandTech are here in Vegas, attending CES 2015 and (almost literally) running between meetings, press events and product showcases. Broadwell-U is high on our priority list, and we know several devices are due for announcement this week. Watch this space.


85 Comments

  • kpb321 - Monday, January 5, 2015 - link

    We will have to wait and see. There might be more of a performance difference for Haswell than in the past because they decreased the # of EU's per slice from 10 down to 8 and increased the cache size. That should mean a lot more cache available per EU which should help keep it from being as bandwidth limited as in the past. It will probably still be bandwidth limited but hopefully just not as much making the GT3 version without eDRAM more reasonable.

    With that said, integrated GPUs will always be behind dedicated GPUs in performance because graphics is so parallel that it scales easily with more units, but those additional units mean higher power and bandwidth requirements. That's why you see high end GPUs using 200+ watts and very wide/fast memory interfaces, both of which are much higher than can be reasonably handled in an integrated GPU setting.
  • III-V - Monday, January 5, 2015 - link

    Gen8 actually makes a lot of changes that reduce its reliance on external memory. Take a look at the bit on caches in this article. It'll still be constrained by bandwidth, but not as much as you seem to be expecting.
  • texasti89 - Monday, January 5, 2015 - link

    It is nice to see the audio DSP element integrated into the PCH. I hope to see more and more integration in the near future. The power charts show clearly that the display panel still makes the major contribution to overall platform power consumption. I think Intel and other SoC players have reached the point where SoCs can no longer provide pronounced improvements in overall power saving given demand for higher display resolution. IGZO display technology can cut the display power by at least half, which will give further opportunity for SoC designers to effectively improve efficiency.
  • thunderising - Monday, January 5, 2015 - link

    So, the fastest Intel Core i7, which costs a lot of $$, and spends nearly 70% of its die space on graphics, produces 844.8 GFlops.

    Whereas, NVIDIA's Tegra X1 outputs 1024 GFlops.

    *Claps*
  • Pork@III - Monday, January 5, 2015 - link

    NVIDIA's Tegra X1 outputs 1024 GFlops

    >(in FP16)< But we already live in 2015 and work with FP32 and FP64 mostly
  • TiGr1982 - Monday, January 5, 2015 - link

    Talking FP64, Tegra X1 may not even have it at all, or, at best, I suppose, it may have it at the same ratio, as GM204, which is just 1/32. So, I bet, FP64 capability does not really apply seriously to Tegra X1. FP16 and FP32 to be used there.
  • III-V - Monday, January 5, 2015 - link

    I'm sure it'll have some FP64 support... Probably at 1/32, 1/48, or 1/64 rate. It'd be ludicrous for it to not support it at all.
  • TiGr1982 - Monday, January 5, 2015 - link

    I suppose, FP64 can be at 1/32, like I said, is the case for GM204. But that's not a lot, certainly.
  • TiGr1982 - Monday, January 5, 2015 - link

    X1 gives these flops for FP16 (half precision). Don't be fooled by the usual nV marketing, and compare "apples to apples".
    However, this is not to say that this Broadwell-U is very impressive. To me, it looks like just one more evolutionary step over Haswell-U. Nothing special, I would say. Still dual core x86, as a lot of people complain here - for some reason Intel strongly believes quad core is not needed in the -U segment. Instead, they beef up only the GPU, which may be bottle-necked anyway by DDR3 just as in the AMD Kaveri case.
    And all of these Broadwell-U i5 and i7 are offered for big $$$, as usual in Intel's case. Somewhat disappointing - I agree with some other posters in this thread.
  • DigitalFreak - Monday, January 5, 2015 - link

    It is a node shrink, so you shouldn't expect anything major over Haswell. Now if Skylake doesn't bring the goods, then they'll have an issue.
