Preparing For Future Software: AMD's Lightweight Profiling Proposalby Ryan Smith on August 16, 2007 12:00 AM EST
- Posted in
The Importance of Profiling
To understand what exactly AMD is attempting to accomplish, we first need to backtrack somewhat and talk about performance analysis, the task AMD is intending to improve with the LWP. Because of the sheer complexity of modern high-performance processors and high-performance applications, searching for performance bottlenecks solely through an understanding of code is virtually impossible. Due to this limitation, developers have over the years written toolsets that attempt to straddle the hardware/software boundary by recording and interpret the actions of the hardware, to identify what code is being run and what the hardware is doing that is greatly impacting the performance of an application. This is performance analysis, and the tools used to do this task are called profilers.
Traditional profilers attempt to measure performance via timing, hardware interrupts, and other low-level tricks to coerce the hardware in to giving additional information on its state and current instructions being run. This kind of performance analysis can be extremely effective, but it also has an inherent downside: these profilers slow down the system and interrupt the program being profiled. The hardware may be acting differently because of the profiler, conflicting with the intended goal.
Unfortunately for this problem, profiling is increasingly necessary as developers continue to move to higher-level languages and parallel processing technologies. C and C++, the traditional high-level languages for high-performance applications are being usurped by managed environments such as Java and the Microsoft .Net framework. Managed code can offer better security, improved threading, and write-once run-anywhere functionality through virtual machines which in turn avoids many issues with porting a program to other platforms.
Meanwhile the entire imperative programming model that is behind the design of C, Java, Visual Basic, C#, and the other major programming languages is poor suited for multithreading. Some of the most pessimistic predictions for game development for example put the development time of a multithreaded engine at three times that of a single threaded engine. Profilers help in this regard by helping developers catch stalls and other problems that result from managing multiple threads.
It's all of these problems that AMD wants to resolve with their Lightweight Profiling Proposal. What AMD proposes is a section of silicon on a CPU dedicated to assisting with profiling, and a new set of instructions to work with the hardware. The profiling hardware would be able to properly monitor the rest of the CPU, as opposed to the guessing done by software profilers, and return this more precise information to the developer.
It's a fundamentally simple concept, the proposal calls for all of this being done with only two instructions: LLWPCB and SLWPCB to enable/disable profiling and retrieve the data respectively. Yet the potential results could be extremely useful, allowing developers to identify the precise latency of certain operations, count cache hits, or retrieve the exact instruction being processed. Furthermore all of this occurs while triggering a fraction of the interrupts (the reasoning behind the "lightweight" name) and without causing the processor to act different as software profiling tools can cause, all of this leading to better profiling that should translated in to more finely optimized applications.
Even wilder ideas for using these instructions exist in the realm of managed code. Because LWP is lightweight and real time, the possibility is left open that Just-In-Time(JIT) compilers used by managed environments could use the profilers on themselves and change how they're compiling code and handing data to improve performance on the fly. As we'll see there are some outstanding issues with AMD's proposal that would specifically affect this use, but the potential is there.