Using Probabilistic Reasoning to Automate Software Tuning
David Gerard Sullivan, Ph.D. Thesis, Harvard University, September 2003.
Complex software systems typically include a set of parameters that can be adjusted to improve the system’s performance. System designers expose these parameters, which are often referred to as knobs, because they realize that no single configuration of the system can adequately handle every possible workload. Therefore, users are allowed to tune the system, reconfiguring it to perform well on a specific workload. However, manually tuning a software system can be an extremely difficult task, and it is often necessary to dynamically retune the system as workload characteristics change over time. As a result, many systems are run using either the default knob settings or non-optimal alternate settings, and potential performance improvements go unrealized. Ideally, the process of software tuning should be automated, allowing software systems to determine their own optimal knob settings and to reconfigure themselves as needed in response to changing conditions.
This thesis demonstrates that probabilistic reasoning and decision-making techniques can be used as the foundation of an effective, automated approach to software tuning. In particular, the thesis presents a methodology for automated software tuning that employs the influence diagram formalism and related learning and inference algorithms, and it uses this methodology to tune four knobs from the Berkeley DB embedded database system. Empirical results show that an influence diagram can effectively generalize from training data for this domain, producing considerable performance improvements on a varied set of workloads and outperforming an alternative approach based on linear regression.
The thesis provides a detailed roadmap for applying the proposed software-tuning methodology to an arbitrary software system. It also proposes novel methods of addressing three challenges associated with using an influence diagram for software tuning: modeling the performance of the software system in a tractable manner, determining effective discretizations of continuous variables in the model, and estimating parameters of the model for which no training data is available. In addition, the thesis discusses the design of workload generators that can produce the necessary training data, and it proposes a technique for ensuring that a workload generator captures the steady-state performance of the system being tuned.