04 Intelligent Software Testing

Fundamentals of Software Testing

Software testing is a critical process to ensure the reliability and functional/non-functional performance of a system under various conditions. It provides developers with the confidence that the software will behave as intended.

Core Definitions:

Test Case: A specific sequence of inputs and corresponding expected outputs.
Test Suite: A collection of test cases designed to exercise the code thoroughly and efficiently.
The Goal: To find a test suite that achieves maximum code coverage (exercising every statement, branch, or path) while remaining fast to execute

Branch vs Path Coverage

Branch Coverage: Ensures every possible outcome of every decision point (e.g., if-else statements) is executed at least once.
Path Coverage: A more exhaustive metric requiring every possible sequence of execution through a method to be tested.
- Note: Path coverage often requires significantly more test cases than branch coverage because it accounts for all combinations of branches.

The Need for Automation

Testing is one of the most expensive phases of software development, with companies spending an average of $2 million annually. Developers often spend 1/3 of their effort just writing test cases.

Key

Manual testing is unscalable for large systems due to astronomical numbers of possible input conditions. Automating the generation and evaluation of test suites is essential to reduce costs and human error.

Evolutionary Algorithms (EA) in Testing

Evolutionary Algorithms, particularly Genetic Algorithms (GA), are the foundation of many automated testing tools. They mimic biological evolution to “search” for optimal test cases.

The Evolutionary Cycle

Initialisation: Create a random starting population of “test case” individuals
Evaluation: Assign a Fitness Value to each individual based on how well it meets objectives (e.g., high coverage)
Parent Selection: Choose the best individuals to reproduce, often using Tournament Selection(picking $k$ random members and selecting the best).
Recombination (Crossover): Combine “genes” from two parents to often create offspring, hoping to merge the best traits.
Mutation: Apply small, random changes to offspring to maintain diversity and discover new traits.
Survivor Selection: Decide which individuals move to the next generation (e.g., Age-based or Fitness-based).
Termination: Stop once a specific coverage goal is met or a generation limit is reached.

Biological Mapping

Biological Term	Software Testing Equivalent
Genotype	The representation/encoding of the test case.
Phenotype	The actual behavior/execution of the test case.
Fitness	How close the test case gets to covering a target branch.

Case Study: EvoSuite (Test Case Generation)

EvoSuite is a classic tool that generates whole test suites for Java classes using Genetic Algorithms.

Representation of a Test Case

In EvoSuite, a test case is a sequence of statements:

Primitive: Numeric or string variables (e.g., int x = 5).
Constructor: Creating instances (e.g., Stack s = new Stack()).
Field/Method: Accessing members or invoking methods.

Fitness and Bloat Control

The fitness function rewards branch coverage. To prevent Bloat (test cases becoming infinitely long and consuming memory), EvoSuite limits the maximum number and length of test cases and rejects offspring that do not improve coverage.

Test Case Prioritisation

Even with a good suite, the order of the execution matters for efficiency. The goal is to detect faults as early as possible so repairs can begin.

Metric (APFD): The Average Percentage of Faults Detected measures the rate at which a prioritised suite uncovers bugs
Search Strategy: GAs are used to find the optimal sorting of test cases without adding or discarding any.

Multi-Objective Testing (Sapienz & NSGA-II)

Real-world testing often requires balancing multiple, often conflicting, goals. For example, a tool might want to maximise coverage, minimise the number of test cases, and maximise the number of crashes found.

NSGA-II (Non-dominated Sorting Genetic Algorithm II)

This is a standard algorithm for multi-objective optimisation

Pareto Dominance: Solution A dominates B if it is better in at least one objective and no worst in all others.
Non-dominated Sorting: Solutions are ranked into “fronts”. The first front contains all solutions not dominated by any others
Crowding Distance: Within a front, the algorithm prefers solution that are “most different” from others to maintain diversity in the search.

Sapienz

Developed for Android apps (later bought by Facebook), Sapienz uses NSGA-II to generate test sequences consisting of Atomic Events (touches, rotations) and Motif Events (complex inputs like filling forms).

Testing for AI Fairness

AI testing goes beyond functional correctness to focus on equity.

Fairness Bug: Occurs when an AI model gives different predictions for two inputs that differ only by a sensitive feature (e.g., race, age, gender).
Objective: To ensure all groups are treated equitably through metrics like Demographic Parity and Equal Opportunity.

Ayush Acharjya's Notes

Explorer

04 Intelligent Software Testing

Fundamentals of Software Testing

Core Definitions:

Branch vs Path Coverage

The Need for Automation

Evolutionary Algorithms (EA) in Testing

The Evolutionary Cycle

Biological Mapping

Case Study: EvoSuite (Test Case Generation)

Representation of a Test Case

Fitness and Bloat Control

Test Case Prioritisation

Multi-Objective Testing (Sapienz & NSGA-II)

NSGA-II (Non-dominated Sorting Genetic Algorithm II)

Sapienz

Testing for AI Fairness

Graph View

Table of Contents

Backlinks