Dynamic Feedback-Driven Thread Migration for Energy-Efficient Execution of Multithreaded Workloads
Multicore architectures require sound thread-to-core mapping policies in order to exploit the efficiency and parallelism that multithreaded programs offer. Traditionally, the operating system scheduler focuses on temporal aspects of performance such as execution time and latency, disregarding other factors that may have a significant impact on the system. For example, judicious thread migration decisions can provide significant power savings. Typical schedulers, however, fail to make power-aware migration decisions. This master's thesis compares the effects of using resource-aware analytical models and a machine learning model to make power-aware thread migration decisions.
The first analytical model uses a greedy algorithm, and aims to balance the load on processors, based on each core’s utilization level, via thread migration. We use a novel approach to derive core utilization levels, utilizing the dynamic feedback provided by performance counters, as well as a modified utilization metric that more accurately reflects the state of a processor.
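As an illustration of the greedy balancing idea only, the following minimal sketch migrates threads from the busiest core to the idlest one while doing so narrows the gap between them. It is not the thesis's algorithm: the function name `greedy_balance`, the per-thread utilization shares, and the rule of picking the thread closest to half the gap are all assumptions, and the sketch takes utilization values as given rather than deriving them from performance counters.

```python
from typing import Dict, List, Tuple


def greedy_balance(
    core_threads: Dict[int, Dict[str, float]], max_moves: int = 100
) -> List[Tuple[str, int, int]]:
    """Greedily migrate threads from the busiest core to the idlest core.

    core_threads maps a core id to {thread name: utilization share} and is
    modified in place. Both the data layout and the selection rule are
    hypothetical choices made for this sketch.
    Returns the list of (thread, source core, destination core) migrations.
    """
    moves: List[Tuple[str, int, int]] = []
    for _ in range(max_moves):
        # Core load is the sum of the utilization shares of its threads.
        load = {core: sum(threads.values()) for core, threads in core_threads.items()}
        busiest = max(load, key=load.get)
        idlest = min(load, key=load.get)
        gap = load[busiest] - load[idlest]

        # Moving a thread with share u changes the gap to |gap - 2u|,
        # which is smaller than gap only when 0 < u < gap.
        candidates = {
            name: share
            for name, share in core_threads[busiest].items()
            if 0 < share < gap
        }
        if not candidates:
            break  # no single migration improves the balance

        # Greedy choice: the thread whose share is closest to half the gap.
        name = min(candidates, key=lambda n: abs(candidates[n] - gap / 2))
        core_threads[idlest][name] = core_threads[busiest].pop(name)
        moves.append((name, busiest, idlest))
    return moves


if __name__ == "__main__":
    cores = {0: {"a": 0.5, "b": 0.4, "c": 0.3}, 1: {}}
    print(greedy_balance(cores))
    print(cores)
```

On a real system, each returned migration would still have to be applied through an affinity mechanism, which this sketch leaves out.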
The second analytical model is aware of the processors’ shared resources and aims to reduce contention for them via thread consolidation. Hardware performance counters that reflect high miss rates in certain shared resources are evaluated and compared to established thresholds; the model then recommends either consolidating threads or preserving the default scheduling. The new affinity configuration, if any, is expected to promote greater power savings.
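The threshold comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `recommend_affinity`, the resource keys `llc` and `l2`, and the numeric thresholds are hypothetical and are not taken from the thesis.

```python
from typing import Dict


def recommend_affinity(
    miss_rates: Dict[str, float], thresholds: Dict[str, float]
) -> str:
    """Return "consolidate" if any monitored miss rate exceeds its threshold.

    miss_rates and thresholds map a shared-resource name to a miss rate in
    [0, 1]. Resources without a measured miss rate are treated as 0.0.
    The names and the any-exceeds rule are assumptions made for this sketch.
    """
    exceeded = [
        resource
        for resource, limit in thresholds.items()
        if miss_rates.get(resource, 0.0) > limit
    ]
    return "consolidate" if exceeded else "default"


if __name__ == "__main__":
    # Hypothetical thresholds for two shared cache levels.
    limits = {"llc": 0.20, "l2": 0.35}
    print(recommend_affinity({"llc": 0.31, "l2": 0.10}, limits))
    print(recommend_affinity({"llc": 0.05, "l2": 0.10}, limits))
```

A single exceeded threshold triggers consolidation here; a real policy might instead weight the resources or require several consecutive samples before acting.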
The last evaluated model uses a machine learning approach to recommend a final affinity configuration for a workload at runtime. The novelty of this approach again lies in the use of hardware performance counters for model training. Four metrics derived from a subset of the available counters make up the model's feature vector. Three algorithms are employed with the model: a decision tree to aid with visualization, a support vector machine, which provides a categorical approach, and a statistically based Bayesian model.
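To make the idea of classifying a four-element feature vector concrete, the sketch below trains a Gaussian naive Bayes classifier in plain Python. Gaussian naive Bayes is used here only as one simple example of a Bayesian classifier; the abstract does not say which Bayesian model the thesis uses. The feature values, the labels `consolidate` and `default`, and the function names are likewise hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Model = Dict[str, Tuple[float, List[float], List[float]]]


def train(samples: Sequence[Sequence[float]], labels: Sequence[str]) -> Model:
    """Fit per-class log priors, feature means, and feature variances."""
    grouped = defaultdict(list)
    for features, label in zip(samples, labels):
        grouped[label].append(features)

    model: Model = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(col) / len(col) for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-6
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(len(rows) / len(samples)), means, variances)
    return model


def predict(model: Model, features: Sequence[float]) -> str:
    """Return the label with the highest log posterior for the features."""

    def log_posterior(label: str) -> float:
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(features, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=log_posterior)


if __name__ == "__main__":
    # Hypothetical training set: four counter-derived metrics per workload.
    X = [
        [0.9, 0.8, 0.7, 0.9],
        [0.8, 0.9, 0.8, 0.8],
        [0.1, 0.2, 0.1, 0.2],
        [0.2, 0.1, 0.2, 0.1],
    ]
    y = ["consolidate", "consolidate", "default", "default"]
    model = train(X, y)
    print(predict(model, [0.85, 0.85, 0.75, 0.85]))
```

The same feature vectors could be passed to a decision tree or a support vector machine from an existing library; the pure-Python version is used here only to keep the example self-contained.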
The three models are evaluated with various single- and multi-program workloads, where each workload differs in certain tunable parameters such as initial affinity configuration or thread count. Results compare execution times obtained when applying the models with those obtained under the default OS scheduler. Additional comparisons of power consumption reveal the strengths and weaknesses of each approach, and a final evaluation recommends the most beneficial approach for conserving power.