Managing Caches for SMT and CMP (correct submission 2)


Power-Efficient Cache Technologies

Extensive discussions in our "Power-Efficient Cache Technologies" cluster have led to new approaches for managing caches in SMT and CMP environments. One visit has already taken place in Patras, and we have laid the groundwork for further advances. In the meantime, we discovered significant overlap with the reconfigurable-cache work done at the University of Edinburgh by Marcelo Cintra and Mike O'Boyle. The purpose of expanding the "Power-Efficient Cache Technologies" cluster is twofold: first, to broaden its scope to a more general cache management approach and, second, to include Edinburgh as a third member.
The new cluster will be open for further expansion and we already see overlap with other potential clusters.

Technical: Managing the caches in an SMT or CMP environment means developing policies and mechanisms to control the cache requirements of multiple threads that share a single large L2 (SMT and CMP cases) or a collection of distributed L2s (CMP case only). Instead of letting multiple threads "fight it out" in the cache (which leads to significant destructive interference), we propose to develop policies that let us know (predict, compute, or detect) the cache requirements of a thread and what happens if we restrict or expand its cache footprint. We then develop mechanisms that implement the decisions of our policies. We have already identified both policies and mechanisms, and we are about to implement them in a simulator to produce our initial results. Below are the most important policies and mechanisms that we will examine first:

Policies:

1. StatCache (Uppsala): allows us to know what would happen to the miss rate of a thread if we gave it a certain footprint in the cache.

2. Compiler-driven (Edinburgh): allows us to compute the cache footprint/miss rate in loops or other regular code.
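
As a rough illustration of what such a policy provides, the sketch below estimates the miss ratio of one thread as a function of the footprint it is given. It is not StatCache's statistical model and not the Edinburgh compiler analysis: an exact LRU stack-distance computation over a fully associative cache is swapped in as a simple stand-in, and the names stack_distances and miss_ratio are hypothetical.

```python
from collections import OrderedDict


def stack_distances(trace):
    """Return the LRU stack distance of each access (None on first touch).

    The stack distance is the number of distinct other lines touched
    since the previous access to the same line.
    """
    stack = OrderedDict()  # least recently used first, most recent last
    distances = []
    for line in trace:
        if line in stack:
            order = list(stack.keys())
            distances.append(len(order) - 1 - order.index(line))
            stack.move_to_end(line)
        else:
            distances.append(None)
            stack[line] = True
    return distances


def miss_ratio(trace, footprint_lines):
    """Miss ratio of a non-empty trace in a fully associative LRU cache.

    An access hits only if its stack distance is below the footprint
    (in cache lines); first touches always miss.
    """
    distances = stack_distances(trace)
    misses = sum(1 for d in distances if d is None or d >= footprint_lines)
    return misses / len(distances)
```

For the cyclic trace [0, 1, 2, 0, 1, 2], a footprint of 3 lines gives a miss ratio of 0.5 (only the first touches miss), while a footprint of 2 lines gives 1.0. This is the kind of "what if we restrict the footprint" answer a policy would hand to a mechanism.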

Mechanisms:

1. Cache Decay (Patras): Cache decay is a working-set mechanism that allows us to control, in a flexible and non-strict manner, the _average_ cache footprint (active ratio) of a thread. Each thread has its own decay interval, through which we control how much of the cache the thread will occupy according to the wishes of our high-level policy. Cache decay ties in extremely well with StatCache, where we can integrate the decay interval directly into the StatCache statistical model, as well as with compiler approaches when the compiler can identify cache-line reuse distances.

2. A general reconfigurable cache (Edinburgh) that can be partitioned dynamically, but in a strict way, among the threads. The difference from cache decay is that when we partition a reconfigurable cache among the threads, each thread gets its own space (e.g., its own ways). With cache decay, threads are steered toward some average space but are not held to any hard limits (e.g., hard limits on size or number of ways).
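
The contrast between the two mechanisms can be sketched as follows. This is a minimal model under stated assumptions, not the Patras or Edinburgh design: time advances one tick per access, threads do not share lines, replacement inside a partition is LRU, and the names average_footprints and WayPartitionedCache are hypothetical. The first function shows the soft control of decay (a per-thread interval shapes the average number of live lines); the class shows the hard control of way partitioning (a thread can never evict another thread's lines).

```python
from collections import OrderedDict


def average_footprints(trace, decay_interval):
    """Soft control: average number of live (non-decayed) lines per thread.

    trace          -- non-empty list of (thread_id, line), one access per tick
    decay_interval -- dict: thread_id -> ticks a line stays live after
                      its last access
    """
    last_access = {}  # (thread_id, line) -> tick of most recent access
    totals = {tid: 0 for tid in decay_interval}
    for now, (tid, line) in enumerate(trace):
        last_access[(tid, line)] = now
        # Count every line still inside its owner's decay interval.
        for (owner, _), then in last_access.items():
            if now - then < decay_interval[owner]:
                totals[owner] += 1
    return {tid: total / len(trace) for tid, total in totals.items()}


class WayPartitionedCache:
    """Hard control: each thread may only occupy its own ways in every set."""

    def __init__(self, num_sets, ways_per_thread):
        self.num_sets = num_sets
        self.ways_per_thread = dict(ways_per_thread)
        # One LRU-ordered tag store per (set, thread); oldest entry first.
        self.sets = [
            {tid: OrderedDict() for tid in self.ways_per_thread}
            for _ in range(num_sets)
        ]

    def access(self, tid, line):
        """Return True on a hit; on a miss, install the line and return False."""
        index, tag = line % self.num_sets, line // self.num_sets
        partition = self.sets[index][tid]
        if tag in partition:
            partition.move_to_end(tag)  # mark as most recently used
            return True
        quota = self.ways_per_thread[tid]
        if quota > 0:
            if len(partition) >= quota:
                partition.popitem(last=False)  # evict this thread's own LRU line
            partition[tag] = True
        return False
```

In this model, shortening a thread's decay interval shrinks its average footprint without imposing a ceiling, whereas a thread with a quota of one way can thrash only its own way and leaves the other threads' lines untouched.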

 

The concrete examples above are a result of our initial cluster and our interactions with Edinburgh. We are currently exploring more policies and mechanisms for a general approach to cache management. We believe that cache management is an important direction that can be integrated very well with other clusters, such as the existing Patras-Chalmers cluster "Adaptive Prediction Techniques for branching, prefetching, and coherence", which examines related mechanisms such as prefetching.

We have already laid the groundwork for the three institutions (Uppsala, Patras, Edinburgh) to work together, and we maintain compatibility with Chalmers via the Patras-Chalmers cluster. We have specific short-term goals (publications) and a longer-term vision for cache work in SMT and CMP architectures.

 

 


Research cluster

Requested: € 16000
Granted: € 16000

Requested: € 0
Granted: € 0

We want to cover our collaboration expenses for the year 2006. We believe that member visits are most productive when graduate students accompany the members (since the students do the actual work), so we plan for 2-person visits. Initially, we plan for at least 2 meetings in 2006 of all the cluster members (Uppsala; Edinburgh, with 2 HiPEAC members; Patras). This corresponds to 16000 euro (divided equally among the 4 HiPEAC members): 2 meetings x 4 members x 2 persons x 1000 euro, assuming a bare minimum of 1000 euro per person per trip.


Requested: 12 month(s)
Granted: 12 month(s), starting on: Tue, January 1, 1980

CINTRA Marcelo (Edinburgh University) (--member--)
HAGERSTEN Erik (Uppsala University) (--colleague--)
KAXIRAS Stefanos (University of Patras) (--member--)
O'BOYLE Michael (Edinburgh University) (--member--)