Energy efficiency analysis and optimization of relational and NoSQL databases
As big data becomes the norm in industrial applications, the complexity of database workloads and database system design has increased significantly. To address these challenges, conventional relational databases have been continually improved, and NoSQL databases such as MongoDB and Cassandra have been proposed and implemented to compete with SQL databases. In addition to traditional metrics such as response time, throughput, and capacity, modern database systems face stricter requirements on energy efficiency because of the large volume of data that needs to be stored, queried, updated, and analyzed. While decades of research in the database and data processing communities have produced a wealth of literature on optimizing for performance, optimization for energy efficiency has historically been overlooked, and only a few studies have investigated the energy efficiency of database systems. To the best of our knowledge, no comprehensive study currently analyzes the impact of query optimizations on performance and energy efficiency across both SQL and NoSQL databases. In fact, the energy behavior of many basic database operations (e.g., insertion, deletion, searching, update, and indexing) remains largely unknown due to the lack of accurate power measurement methodologies for various databases and queries. In this thesis, we developed a tool that can accurately measure the real-time power consumption of queries running on both SQL and NoSQL databases, and we investigated a series of query optimization techniques for improving the energy efficiency of both relational databases and NoSQL parallel databases. We used both widely accepted benchmarks (e.g., the Yahoo! Cloud Serving Benchmark) and customized datasets (converted from 100 GB of Twitter data) in our experiments to evaluate the effectiveness of the optimization techniques.
We performed a cross-database analysis of an SQL database (MySQL) and two NoSQL databases (MongoDB and Cassandra) to compare their performance and energy efficiency. Additionally, we studied a variety of optimization techniques that can improve energy efficiency without compromising performance on the databases derived from the Twitter data. Using these techniques, we were able to achieve significant energy savings without performance degradation.