What are the Benefits of a Larger Cache?

Bigger is better, right? So a 512KB L2 cache must be better than a 256KB one - after all, AMD wouldn't spend 17 million transistors for no gain. Although a larger cache is generally beneficial, the real question is how beneficial and in what situations. To answer that question, we should have a quick lesson in caches and what makes them so useful.

Think of a cache as a bridge between two entities - a slower and a faster one. In this case, the cache we are talking about is part of a multilevel cache system and it helps to bridge the gap between the CPU and main memory.

It's no surprise that main memory runs significantly slower than today's CPUs. Not only does memory run at much lower clock speeds (e.g. 200MHz for DDR400), but main memory is also physically located very far away from the processor. Our multi-gigahertz CPUs have to waste well over 100 clock cycles to retrieve data from main memory, as their requests must cross slow front-side buses, pass through an external memory controller, reach the memory and come all the way back. Making this trip can wreak havoc on performance, especially for CPUs with very long pipelines, as these pipelines generally sit idle while the data needed to fill them is fetched from main memory.

The idea behind a processor's caches is that you store important data in these high speed memories (now located on the processor's die itself), so that most of the time, your CPU doesn't have to make the long trip to main memory. The reason caches are split into multiple levels is that the larger a cache is, the longer it takes to fetch data from it. As a result, pairing one smaller but very low latency cache with a larger and somewhat higher latency (but still significantly quicker than main memory) cache provides the best balance of performance in today's microprocessors. These two caches are the Level 1 (L1) and Level 2 (L2) caches you hear about all the time.
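To put rough numbers on that balance, here is a minimal sketch of the usual average-access-time arithmetic for a two-level cache. The function name and every latency and miss rate in it are illustrative assumptions, not measurements of Barton or any other CPU.

```python
def average_access_cycles(l1_hit, l1_miss_rate, l2_hit, l2_miss_rate, memory):
    """Average cycles per memory access with an L1 backed by an L2.

    Every access pays the L1 latency; L1 misses also pay the L2 latency;
    L2 misses additionally pay the trip to main memory.
    """
    return l1_hit + l1_miss_rate * (l2_hit + l2_miss_rate * memory)


# Hypothetical figures chosen only to show the shape of the trade-off.
L1_HIT, L2_HIT, MEMORY = 3, 20, 150      # latencies in CPU cycles (assumed)
L1_MISS, L2_MISS = 0.05, 0.20            # miss rates (assumed)

with_l2 = average_access_cycles(L1_HIT, L1_MISS, L2_HIT, L2_MISS, MEMORY)
# No L2 at all: model it as a zero-latency level that always misses.
without_l2 = average_access_cycles(L1_HIT, L1_MISS, 0, 1.0, MEMORY)

print(f"with L2:    {with_l2:.1f} cycles per access")
print(f"without L2: {without_l2:.1f} cycles per access")
```

With these made-up inputs, the average cost drops from about 10.5 cycles to about 5.5 cycles per access once the L2 catches most of the L1's misses. Real figures depend heavily on the workload.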

Caches work based on two major principles - spatial and temporal locality. These two principles are simple: spatial locality states that if you access a piece of data, the data around it will likely be accessed soon, and temporal locality states that if you access a piece of data, chances are that you'll access that same piece of data again soon. In practice, this means that frequently accessed data is kept in cache, as well as the data stored next to it in memory. Since caches are relatively small (rightfully so, as it would be cost and performance prohibitive to have main memory-sized caches), the algorithms they use to make sure that the right information remains in the cache are even more critical to performance than the sheer size of the cache.
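To illustrate why the choice of algorithm matters at a fixed cache size, here is a small sketch that compares two textbook replacement policies, LRU (evict the least recently used line) and FIFO (evict the oldest line), on the same access trace. The `hit_rate` function, the four-line cache and the trace are all hypothetical; neither policy is claimed to be what any particular CPU implements.

```python
from collections import OrderedDict


def hit_rate(trace, capacity, policy="lru"):
    """Hit rate of a tiny fully associative cache holding `capacity` lines."""
    cache = OrderedDict()            # oldest entry first
    hits = 0
    for line in trace:
        if line in cache:
            hits += 1
            if policy == "lru":
                cache.move_to_end(line)    # a hit refreshes recency under LRU
            # under FIFO a hit leaves the eviction order unchanged
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the entry at the front
            cache[line] = True
    return hits / len(trace)


# One "hot" line (0) is touched between accesses to lines that are never reused.
trace = []
for stream_line in range(1, 1001):
    trace += [0, stream_line]

lru = hit_rate(trace, capacity=4, policy="lru")
fifo = hit_rate(trace, capacity=4, policy="fifo")
print(f"LRU hit rate:  {lru:.4f}")
print(f"FIFO hit rate: {fifo:.4f}")
```

Both caches are the same size, yet LRU keeps the hot line resident while FIFO periodically throws it out, so LRU ends up with the higher hit rate on this trace.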

With Barton, AMD left the L1 the same as before, but increased the L2 cache size by 256KB, to 512KB. AMD didn't change any of the other specifications of the cache (e.g. it is still a 16-way set associative L2 cache). Luckily, AMD increased the cache size without sacrificing access time, but where will the added L2 cache help?
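As a rough sketch of what "same associativity, twice the size" means for the cache's layout, the arithmetic below splits a set-associative cache into sets. The 16 ways come from the text above; the 64-byte line size and the `l2_geometry` helper are assumptions made for illustration.

```python
def l2_geometry(cache_bytes, ways=16, line_bytes=64):
    """Split a set-associative cache into sets and address bit fields.

    ways=16 is the figure quoted in the article; line_bytes=64 is an
    assumption. All sizes are assumed to be powers of two.
    """
    sets = cache_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1   # log2(line_bytes)
    index_bits = sets.bit_length() - 1          # log2(sets)
    return sets, offset_bits, index_bits


for size_kb in (256, 512):
    sets, offset_bits, index_bits = l2_geometry(size_kb * 1024)
    print(f"{size_kb}KB: {sets} sets, {offset_bits} offset bits, {index_bits} index bits")
```

Under those assumptions, going from 256KB to 512KB doubles the number of sets from 256 to 512, while each set still holds 16 lines.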

Let's look at those two principles we mentioned before, spatial and temporal locality. If an application's usage pattern does not abide by either one of these principles, then it doesn't matter how much cache you add, the performance will not improve. So what are some examples of applications that are and are not cache-friendly?

For starters, let's talk about things that don't abide by the principle of temporal locality - mainly multimedia applications and, more specifically, encoding applications. If you think about how encoding works, the data is never reused; it is simply encoded on a bit-by-bit basis and then the original data is never touched again. At the other end of the spectrum, we have things like office applications that happily abide by the principle of temporal locality. In these sorts of applications, you are often re-using data, performing very similar tasks on it over and over again and thus making great use of larger caches.
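The contrast between the two kinds of workload can be sketched with a toy cache simulator. Everything here is a simplification made for illustration: true LRU replacement, 64-byte lines, no L1 in front of the L2, and two hypothetical access patterns (a single pass over 2MB standing in for an encoder, and four passes over a 384KB working set standing in for an application that reuses its data).

```python
from collections import OrderedDict

KB = 1024
LINE = 64    # bytes per cache line (assumed)
WAYS = 16    # associativity quoted in the article


def simulate(addresses, cache_bytes):
    """Hit rate of a set-associative LRU cache for a stream of byte addresses."""
    num_sets = cache_bytes // (LINE * WAYS)
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = total = 0
    for addr in addresses:
        line = addr // LINE
        ways = sets[line % num_sets]
        total += 1
        if line in ways:
            hits += 1
            ways.move_to_end(line)        # mark as most recently used
        else:
            if len(ways) == WAYS:
                ways.popitem(last=False)  # evict the least recently used line
            ways[line] = True
    return hits / total


def sweep(size_bytes, passes, stride=16):
    """Walk a buffer front to back `passes` times, one access every `stride` bytes."""
    for _ in range(passes):
        yield from range(0, size_bytes, stride)


results = {}
for cache_kb in (256, 512):
    results[("streaming", cache_kb)] = simulate(sweep(2048 * KB, 1), cache_kb * KB)
    results[("reuse", cache_kb)] = simulate(sweep(384 * KB, 4), cache_kb * KB)

for (workload, cache_kb), rate in results.items():
    print(f"{workload:9s} {cache_kb}KB L2: hit rate {rate:.4f}")
```

In this toy model, the streaming pass gets the same hit rate from either cache size, while the looped 384KB working set only gets a higher hit rate once the cache grows to 512KB and the whole working set fits.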

The principle of spatial locality applies to a much wider range of applications, including multimedia encoding applications, because data is generally stored contiguously in main memory and is thus very cache-friendly. Spatial locality is why you will see some improvement from larger caches even in applications that don't exhibit much temporal locality.


