Configuration Sampling Strategies
Because measuring every configuration is infeasible (the configuration space grows exponentially with the number of options), we must strategically select a subset of configurations as training samples.
Binary Options (On/Off)
- Option-wise (OW): Selects a set where every option is enabled at least once, while minimising other active options to reduce unknown interactions. The sample size is linear in the number of options.
- T-wise Strategy: Covers all combinations of t options (where t is the interaction degree). For example, 2-wise sampling ensures every possible pair of options is tested together. The sample size grows exponentially with t.
- Negative Option-wise: Starts with an “all-enabled” configuration and then creates samples where exactly one option is disabled. (A sketch of all three strategies follows this list.)
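A minimal Python sketch of the three binary strategies. The option names are illustrative, and the t-wise function is a naive enumeration; a production t-wise sampler would compute a minimised covering array instead:

```python
from itertools import combinations

def option_wise(options):
    # One sample per option: that option enabled, everything else off.
    return [{o: (o == target) for o in options} for target in options]

def negative_option_wise(options):
    # Start from all-enabled, then disable exactly one option per sample.
    samples = [{o: True for o in options}]
    samples += [{o: (o != target) for o in options} for target in options]
    return samples

def naive_t_wise(options, t=2):
    # Enable each subset of t options. A real t-wise sampler compresses
    # this set into a covering array rather than enumerating all subsets.
    return [{o: (o in combo) for o in options}
            for combo in combinations(options, t)]
```

For three options, `option_wise` produces three samples and `naive_t_wise` with t=2 produces one sample per pair.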
Non-Binary/Numeric Options
- One-Factor-At-A-Time (OFAT): Varies one numeric option while keeping all others at their centre value. It assumes no interactions between options (see the sketch after this list).
- Box-Behnken Design (BBD): A subset of a “Full Factorial Design” that captures quadratic influences and 2-wise interactions. It uses three levels per option: minimum, centre, and maximum.
- Central Composite Design (CCD): Combines a factorial design with “star points” at a specific distance from the centre to capture curvature in the performance landscape.
- Plackett-Burman Design (PBD): Used when interactions are negligible. It uses a “seed” configuration and shifts it to identify individual influences.
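A sketch of OFAT sampling, assuming each numeric option is given as a (min, max) range; the three levels (min, centre, max) echo BBD’s levels, but no interactions are sampled:

```python
def ofat(numeric_options):
    # numeric_options: dict of name -> (min, max)
    centre = {n: (lo + hi) / 2 for n, (lo, hi) in numeric_options.items()}
    samples = [dict(centre)]                  # the all-centre baseline
    for name, (lo, hi) in numeric_options.items():
        for level in (lo, hi):                # vary one option at a time
            sample = dict(centre)
            sample[name] = level
            samples.append(sample)
    return samples
```

For two options, e.g. `ofat({"threads": (1, 8), "buffer_mb": (16, 256)})`, this yields five samples: the all-centre baseline plus min and max for each option.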
Tip
To handle systems with both binary and numeric options, sample each type independently and then combine the results (e.g., via their Cartesian product) to create the final training set.
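A sketch of that combination step, reading “combine the results” as the Cartesian product of the two sample sets:

```python
from itertools import product

def combine(binary_samples, numeric_samples):
    # Cross every binary sample with every numeric sample.
    return [{**b, **n} for b, n in product(binary_samples, numeric_samples)]
```

Note the resulting set grows multiplicatively, so in practice either side is often subsampled first.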
Data Encoding Schemes
Before training a model, configuration data must be converted into numeric formats that machines can process.
- Label Encoding: Assigns a unique integer to each option value (e.g., Strategy_A=0, Strategy_B=1).
- Scaled Label Encoding: Similar to label encoding, but values are normalised between 0 and 1 using min-max scaling.
- One-Hot Encoding: Creates a binary column for every possible value of an option. If an option has three levels, it becomes a vector of length three.
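Minimal, dependency-free sketches of the three schemes (in practice a library such as scikit-learn provides equivalents):

```python
def label_encode(values):
    # Map each distinct value to a unique integer.
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]

def scaled_label_encode(values):
    # Label-encode, then min-max scale into [0, 1].
    encoded = label_encode(values)
    lo, hi = min(encoded), max(encoded)
    span = (hi - lo) or 1                     # guard against a single level
    return [(e - lo) / span for e in encoded]

def one_hot_encode(values):
    # One binary column per distinct level.
    levels = sorted(set(values))
    return [[int(v == level) for level in levels] for v in values]
```

For example, `one_hot_encode(["A", "B", "C"])` returns `[[1,0,0],[0,1,0],[0,0,1]]`, a vector of length three per value.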
Machine Learning for Performance
DeepPerf (Deep Neural Networks)
DeepPerf uses deep neural networks (DNNs) with multiple hidden layers, designed to handle small, sparse datasets.
- Architecture: Consists of an Input Layer, multiple Hidden Layers (typically using ReLU activation), and an Output Layer (Linear activation for regression).
- L1 Regularisation (Lasso): Used at the first layer to penalise less relevant features. This helps the model ignore “noise” from options that don’t impact performance.
- Backpropagation: The process of updating weights by propagating errors from the output back through the network to minimise prediction error.
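A Keras sketch of a DeepPerf-style network. The width, depth, and L1 strength here are illustrative placeholders; the published DeepPerf searches for these hyperparameters rather than fixing them:

```python
import tensorflow as tf

def build_model(n_options, width=128, n_hidden=3, l1_strength=0.01):
    model = tf.keras.Sequential([tf.keras.Input(shape=(n_options,))])
    # L1 (Lasso) on the first layer pushes the weights of irrelevant
    # options towards zero, so "noise" options are effectively ignored.
    model.add(tf.keras.layers.Dense(
        width, activation="relu",
        kernel_regularizer=tf.keras.regularizers.l1(l1_strength)))
    for _ in range(n_hidden - 1):
        model.add(tf.keras.layers.Dense(width, activation="relu"))
    # Linear output for regression.
    model.add(tf.keras.layers.Dense(1, activation="linear"))
    # Backpropagation: Adam minimises the mean-squared prediction error.
    model.compile(optimizer="adam", loss="mse")
    return model
```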
DaL (Divide-and-Learn)
Standard DNNs struggle with sample sparsity (limited data points in a complex landscape). DaL solves this by splitting the problem.
- Dividing: Uses a CART (Classification and Regression Tree) to split data into “divisions” where performance changes are smoother.
- Training: Trains a dedicated local DNN for each specific division.
- Predicting: A Random Forest classifier directs new configurations to the correct local model for prediction (see the sketch below).
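A simplified scikit-learn sketch of the divide/train/predict pipeline. The MLP stands in for DaL’s local DNNs, and the tree depth is illustrative:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPRegressor

def train_dal(X, y, depth=2):
    # Divide: a shallow CART assigns every sample to a leaf ("division")
    # within which the performance landscape is smoother.
    cart = DecisionTreeRegressor(max_depth=depth).fit(X, y)
    divisions = cart.apply(X)                 # leaf index per sample
    # Train: one local learner per division.
    local = {d: MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
                 .fit(X[divisions == d], y[divisions == d])
             for d in np.unique(divisions)}
    # A classifier learns to route unseen configurations to a division.
    router = RandomForestClassifier().fit(X, divisions)
    return router, local

def predict_dal(router, local, X_new):
    labels = router.predict(X_new)
    return np.array([local[d].predict(row.reshape(1, -1))[0]
                     for d, row in zip(labels, X_new)])
```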
Cross-Environment Learning (BEETLE)
Software performance changes across different environments (e.g., different hardware or OS). Retraining a model for every environment is too expensive.
Transfer learning uses knowledge from a “Source” environment to predict performance in a “Target” environment.
The Bellwether Effect
BEETLE identifies a Bellwether Environment: the single environment that serves as the best “source” for predicting performance in all the others.
- The Racing Algorithm:
- Sample a small amount of data from all environments.
- Build a model using each environment as the source and the others as targets.
- Rank the sources by accuracy and eliminate the worst performers.
- Repeat with more data until the ranking stabilises.
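A simplified sketch of the racing loop. The `eval_error` callback and the budget schedule are placeholders, and the elimination rule here (drop the worst half each round) is a simplification of BEETLE’s actual ranking:

```python
def find_bellwether(envs, eval_error, budgets=(10, 20, 40)):
    # envs: environment identifiers. eval_error(src, tgt, n) returns the
    # prediction error of a model trained on n samples from src and
    # evaluated on tgt (hypothetical helper supplied by the caller).
    candidates = list(envs)
    for n in budgets:
        scores = {src: sum(eval_error(src, tgt, n)
                           for tgt in envs if tgt != src)
                  for src in candidates}
        candidates.sort(key=scores.__getitem__)   # lowest error first
        candidates = candidates[:max(1, len(candidates) // 2)]
        if len(candidates) == 1:
            break
    return candidates[0]                          # the bellwether
```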
White-Box Learning (Comprex)
Unlike “black-box” methods that only look at inputs and outputs, white-box learning analyses the source code itself.
- Taint Analysis: Tracks how configuration options (sources) flow through the code to reach control-flow decisions such as `if` or `while` statements (sinks).
- Process:
- Decompose: Use taint analysis to identify code “regions” influenced by specific options.
- Sample: Test configurations that trigger different execution paths within those regions.
- Compose: Build a small linear model for each region and add them together to create a global model:
M_total = M_region1 + M_region2 + ...
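A sketch of the composition step, assuming each region model is a simple linear function of the options that reach it; the tuple representation is illustrative, not Comprex’s actual data structures:

```python
def predict_region(region_model, config):
    # region_model: (intercept, {option_name: weight}) for one code region.
    intercept, weights = region_model
    return intercept + sum(w * config[opt] for opt, w in weights.items())

def predict_global(region_models, config):
    # M_total = M_region1 + M_region2 + ...
    return sum(predict_region(m, config) for m in region_models)
```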