EVALUATING SHARED-CACHE PERFORMANCE WITH MICROBENCHMARKS AND REUSE DISTANCE ANALYSIS
The emergence of multicore architectures has opened up new opportunities for thread-level parallelism and dramatically increased the theoretical peak performance of current systems. However, achieving a high fraction of peak performance requires careful orchestration of many architecture-sensitive parameters. In particular, the presence of shared caches on multicore architectures makes it necessary to consider, in concert, issues related to both parallelism and data locality. This research evaluates the shared-cache performance of several scientific kernels. A synthetic microbenchmark, together with hardware performance counter measurements, is used to estimate cache sharing among multiple threads in parallel applications. A novel reuse-distance-based algorithm is developed to identify correlations between reuse-distance patterns and shared-cache utilization.
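For readers unfamiliar with the term, reuse distance is commonly defined as the number of distinct memory locations accessed between two consecutive accesses to the same location. The sketch below illustrates that textbook definition only; it is not the novel algorithm developed in this work, and the names `reuse_distances` and `lru_hits`, along with the toy trace, are assumptions made for illustration.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of every access in `trace`.

    The distance is the number of distinct addresses touched since the
    previous access to the same address. First-time accesses get None.
    This is a simple O(N * M) illustration, not an optimized analyzer.
    """
    stack = OrderedDict()  # LRU stack: least recent first, most recent last
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Count the distinct addresses accessed more recently than addr.
            distances.append(len(keys) - 1 - keys.index(addr))
            del stack[addr]
        else:
            distances.append(None)
        stack[addr] = True  # addr becomes the most recently used entry
    return distances


def lru_hits(distances, capacity):
    """Count accesses that would hit in a fully associative LRU cache
    holding `capacity` blocks (a hit needs distance < capacity)."""
    return sum(1 for d in distances if d is not None and d < capacity)


# Hypothetical toy trace of block addresses.
trace = ["a", "b", "c", "a", "b", "b"]
dists = reuse_distances(trace)
print(dists)
print(lru_hits(dists, capacity=2))
```

In this simplified model, an access hits in a fully associative LRU cache exactly when its reuse distance is smaller than the cache capacity, which is why reuse-distance histograms are often used to reason about cache utilization.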