Improving Carbon, Cost, and Energy Efficiency of Large Scale Systems via Workload Analysis

Date

2022-05

Authors

Everman, Brad

Abstract

The global COVID-19 pandemic has transformed the way businesses use digital technologies, with reliance on cloud resources increasing as work shifted from traditional offices to work-from-home models. Cloud computing resources are expected to expand by 14.8% annually from 2022 to 2030, roughly a three-fold increase overall, driven by growing reliance on decentralization and the changing workplace. As the need for large scale systems continues to grow, their cost, energy consumption, and carbon footprint have increased at unprecedented rates. The digital industry is expected to contribute 14% of global greenhouse gas emissions by 2040. It is therefore essential to put sustainability at the core of digital technologies and to reduce their operating cost and negative impact on the environment. Over the past decades, scientists and industry pioneers have made tremendous efforts to improve the energy efficiency of various digital technologies. Exemplary achievements include, but are not limited to, using more energy efficient hardware such as GPUs, FPGAs, and ASICs to solve appropriate problems, using Power Usage Effectiveness (PUE) as a metric to measure the energy efficiency of data centers, using the big.LITTLE architecture to balance the high performance and low power needs of mobile applications, using Dynamic Voltage and Frequency Scaling (DVFS) to decrease energy consumption based on overall system load, using virtual machines to share resources in the cloud, using carbon-aware scheduling to allocate jobs to the least wasteful or most carbon efficient resources, and using neuromorphic computing, which mimics the human brain, to minimize the energy consumption of AI applications. All of this work has significantly advanced the research and industry practice of sustainable computing. However, ever-growing data volumes and more complex workloads running on large scale systems have raised the challenge to a whole new level.
How to improve the carbon, cost, and energy efficiency of large scale systems from the perspective of big data and workload analysis has not been fully studied in the literature. This dissertation explores ways to improve energy efficiency via workload analysis, which provides the additional benefits of reducing carbon emissions and lowering operational costs for both large systems and the end-users relying on those systems. More specifically, it investigates three typical workloads that have high energy requirements and are widely deployed: website workloads, cloud workloads, and AI workloads. The study of website workloads monitored the power consumption of five different types of web servers and recorded the quality of service (QoS) provided by those servers under simulated real user load. The results demonstrated that a low-powered web server can provide QoS comparable to that of a higher-powered one in many instances. For private cloud workloads, the 2017 and 2018 Alibaba cluster traces were analyzed, and a simulator was designed to test the effectiveness of decreasing the number of servers while maintaining the required level of performance. The simulation results showed that decreasing the number of servers by 5% had a negligible impact on performance while lowering yearly electricity costs. A public cloud workload analysis was conducted using the 2019 Microsoft Azure trace, which revealed that a large portion of VMs were underutilized, thus wasting a significant amount of energy and resources in the cloud. A recommendation algorithm was proposed to help cloud users reduce cost without compromising QoS. Lastly, the energy efficiency and carbon emissions of several foundation AI models were analyzed using the recently released industry standard, Software Carbon Intensity (SCI), which provides an effective methodology for evaluating the environmental impact of large-scale AI models and sheds light on the future design of green AI.
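
For readers unfamiliar with the SCI metric mentioned above, the following is a minimal sketch of how an SCI score is computed, based on the Green Software Foundation's published formula SCI = ((E × I) + M) per R. The function name, parameter names, and all numeric values are hypothetical and chosen for illustration only; they are not taken from the dissertation and do not reflect its measurements.

```python
def sci_score(energy_kwh, grid_intensity_g_per_kwh, embodied_g, functional_units):
    """Return SCI = ((E * I) + M) per R, in gCO2e per functional unit.

    E: energy consumed by the software (kWh)
    I: carbon intensity of the electricity used (gCO2e/kWh)
    M: embodied hardware emissions attributed to the software (gCO2e)
    R: number of functional units (e.g. requests, inferences, users)
    """
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    operational_g = energy_kwh * grid_intensity_g_per_kwh  # E * I
    return (operational_g + embodied_g) / functional_units


# Hypothetical inputs, for illustration only (not results from the dissertation).
score = sci_score(
    energy_kwh=2.0,
    grid_intensity_g_per_kwh=400.0,
    embodied_g=200.0,
    functional_units=1000,
)
print(f"SCI: {score} gCO2e per functional unit")
```

Because SCI is a rate rather than a total, two models with the same total emissions can score differently depending on the functional unit chosen, which is why the choice of R should be stated alongside any reported score.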

Keywords

Cloud waste, Cloud cost, Big data analysis, Server utilization, Data center efficiency

Citation

Everman, B. (2022). Improving carbon, cost, and energy efficiency of large scale systems via workload analysis (Unpublished dissertation). Texas State University, San Marcos, Texas.
