Master's Thesis

Reliable Resource Demand Estimation

University of Würzburg, Am Hubland, Informatikgebäude, 97074 Würzburg, Germany, Master's Thesis, October 2016

Abstract

Resource demands are key parameters of performance models used to predict the behavior of data centers. They describe the amount of time a request spends occupying a limited resource, such as the CPU. Requests can be grouped into different workload classes. Measuring these resource demands directly is usually infeasible in practice. Therefore, several approaches exist for estimating the resource demands of different workload classes. However, the individual properties of each use case influence the accuracy of the estimators. Among other factors, the number of workload classes to estimate is known to impact solution quality, and it affects some approaches more than others. Additionally, most approaches offer specific parameters for configuring and optimizing the estimators. Optimizing the parameters of one estimation approach or choosing the best estimator for a given scenario therefore requires either expert knowledge or exhaustive testing. While some works compare different approaches and configurations, we extend this line of work by learning on a given training set and specifically adapting the estimation approaches to optimize performance for the target scenario. We simplify automated resource demand estimation by designing a framework for ready-to-use, reliable resource demand estimation. To this end, we first develop generic algorithms that autonomously optimize the parameter configurations of black-box estimation approaches on a given training set. Second, machine learning algorithms analyze the behavior of the resource demand estimators on different training traces and automatically pick the best approach for a previously unseen trace. The framework is modular and configurable and can be trained on any kind of trace data. We implement different algorithms for both optimization and machine learning and evaluate them on a training set containing measurements of a real system. The results show that parameter optimization is very promising and can increase the accuracy of single approaches by up to 10%. Recommending a single approach, as opposed to running all approaches simultaneously, achieves comparable results while saving more than 50% of the runtime. However, a combination of both techniques does not appear useful on our data set.
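
For context, a classic family of estimation approaches infers per-class demands from easily measurable quantities via the Utilization Law, U ≈ Σ_c λ_c · D_c, where λ_c is the arrival rate and D_c the resource demand of workload class c. The sketch below fits this law with ordinary least squares; the trace values and variable names are illustrative assumptions, not the estimators evaluated in the thesis.

```python
import numpy as np

# Hypothetical trace: per-interval arrival rates (requests/s) for each
# workload class, plus the measured CPU utilization in each interval.
# Each column of `arrival_rates` corresponds to one workload class.
arrival_rates = np.array([
    [10.0,  5.0],
    [20.0,  2.0],
    [15.0, 10.0],
    [ 5.0, 20.0],
])
utilization = np.array([0.36, 0.45, 0.61, 0.69])

# Utilization Law: U = sum_c(lambda_c * D_c). An ordinary least-squares
# fit of the measured utilization against the arrival rates yields one
# demand estimate D_c (in seconds) per workload class.
demands, residuals, *_ = np.linalg.lstsq(arrival_rates, utilization, rcond=None)
for c, d in enumerate(demands):
    print(f"class {c}: estimated demand = {d * 1000:.1f} ms")
```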
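
The first building block, autonomous parameter optimization of black-box estimators, can be pictured as a search loop that evaluates candidate configurations against a training set. Below is a minimal sketch using random search; the `estimate` interface, the error metric, and the toy estimator are assumptions made for illustration, and the thesis itself may use different optimization algorithms.

```python
import random

def optimize_parameters(estimate, param_space, training_set, n_trials=200, seed=0):
    """Random search over the parameter space of a black-box estimator.

    `estimate(trace, **params)` and the (trace, true_demand) pairs in
    `training_set` are illustrative interface assumptions.
    """
    rng = random.Random(seed)
    best_params, best_error = None, float("inf")
    for _ in range(n_trials):
        # Draw one candidate configuration from the search space.
        params = {name: rng.choice(values) for name, values in param_space.items()}
        # Mean relative error of the estimates over the whole training set.
        error = sum(abs(estimate(trace, **params) - true) / true
                    for trace, true in training_set) / len(training_set)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error

# Toy demonstration with a fake estimator whose `window` parameter
# controls how many of the most recent samples it averages over.
def toy_estimator(trace, window):
    return sum(trace[-window:]) / window

training_set = [([0.01, 0.02, 0.03, 0.02], 0.02),
                ([0.05, 0.04, 0.05, 0.06], 0.05)]
params, err = optimize_parameters(toy_estimator, {"window": [1, 2, 3, 4]}, training_set)
print(params, round(err, 4))
```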
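
The second building block, recommending one estimator for a previously unseen trace, amounts to a classification problem over trace features. The following is a hedged sketch with a scikit-learn decision tree; the features, estimator names, and labeling rule are hypothetical stand-ins for whatever the framework actually learns.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def trace_features(trace):
    """Illustrative summary statistics of a trace (assumed to be a
    1-D array of per-interval utilization samples)."""
    return [np.mean(trace), np.std(trace), np.max(trace), float(len(trace))]

# Hypothetical training data: each trace is labeled with the estimator
# that achieved the lowest error on it (estimator names are made up).
rng = np.random.default_rng(0)
training_traces = [rng.uniform(0.1, 0.9, size=60) for _ in range(20)]
best_estimator = ["regression" if np.std(t) < 0.23 else "kalman_filter"
                  for t in training_traces]

X = [trace_features(t) for t in training_traces]
recommender = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, best_estimator)

# For a previously unseen trace, run only the recommended estimator
# instead of all candidates, which is where the runtime saving comes from.
unseen = rng.uniform(0.1, 0.9, size=60)
print("recommended estimator:", recommender.predict([trace_features(unseen)])[0])
```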
