Fundamentals of Software Testing
Software testing is a critical process to ensure the reliability and functional/non-functional performance of a system under various conditions. It provides developers with the confidence that the software will behave as intended.
Core Definitions:
- Test Case: A specific sequence of inputs and corresponding expected outputs.
- Test Suite: A collection of test cases designed to exercise the code thoroughly and efficiently.
- The Goal: To find a test suite that achieves maximum code coverage (exercising every statement, branch, or path) while remaining fast to execute
Branch vs Path Coverage
- Branch Coverage: Ensures every possible outcome of every decision point (e.g.,
if-elsestatements) is executed at least once. - Path Coverage: A more exhaustive metric requiring every possible sequence of execution through a method to be tested.
- Note: Path coverage often requires significantly more test cases than branch coverage because it accounts for all combinations of branches.
The Need for Automation
Testing is one of the most expensive phases of software development, with companies spending an average of $2 million annually. Developers often spend 1/3 of their effort just writing test cases.
Key
Manual testing is unscalable for large systems due to astronomical numbers of possible input conditions. Automating the generation and evaluation of test suites is essential to reduce costs and human error.
Evolutionary Algorithms (EA) in Testing
Evolutionary Algorithms, particularly Genetic Algorithms (GA), are the foundation of many automated testing tools. They mimic biological evolution to “search” for optimal test cases.
The Evolutionary Cycle
- Initialisation: Create a random starting population of “test case” individuals
- Evaluation: Assign a Fitness Value to each individual based on how well it meets objectives (e.g., high coverage)
- Parent Selection: Choose the best individuals to reproduce, often using Tournament Selection(picking random members and selecting the best).
- Recombination (Crossover): Combine “genes” from two parents to often create offspring, hoping to merge the best traits.
- Mutation: Apply small, random changes to offspring to maintain diversity and discover new traits.
- Survivor Selection: Decide which individuals move to the next generation (e.g., Age-based or Fitness-based).
- Termination: Stop once a specific coverage goal is met or a generation limit is reached.
Biological Mapping
| Biological Term | Software Testing Equivalent |
|---|---|
| Genotype | The representation/encoding of the test case. |
| Phenotype | The actual behavior/execution of the test case. |
| Fitness | How close the test case gets to covering a target branch. |
Case Study: EvoSuite (Test Case Generation)
EvoSuite is a classic tool that generates whole test suites for Java classes using Genetic Algorithms.
Representation of a Test Case
In EvoSuite, a test case is a sequence of statements:
- Primitive: Numeric or string variables (e.g.,
int x = 5). - Constructor: Creating instances (e.g.,
Stack s = new Stack()). - Field/Method: Accessing members or invoking methods.
Fitness and Bloat Control
The fitness function rewards branch coverage. To prevent Bloat (test cases becoming infinitely long and consuming memory), EvoSuite limits the maximum number and length of test cases and rejects offspring that do not improve coverage.
Test Case Prioritisation
Even with a good suite, the order of the execution matters for efficiency. The goal is to detect faults as early as possible so repairs can begin.
- Metric (APFD): The Average Percentage of Faults Detected measures the rate at which a prioritised suite uncovers bugs
- Search Strategy: GAs are used to find the optimal sorting of test cases without adding or discarding any.
Multi-Objective Testing (Sapienz & NSGA-II)
Real-world testing often requires balancing multiple, often conflicting, goals. For example, a tool might want to maximise coverage, minimise the number of test cases, and maximise the number of crashes found.
NSGA-II (Non-dominated Sorting Genetic Algorithm II)
This is a standard algorithm for multi-objective optimisation
- Pareto Dominance: Solution A dominates B if it is better in at least one objective and no worst in all others.
- Non-dominated Sorting: Solutions are ranked into “fronts”. The first front contains all solutions not dominated by any others
- Crowding Distance: Within a front, the algorithm prefers solution that are “most different” from others to maintain diversity in the search.
Sapienz
Developed for Android apps (later bought by Facebook), Sapienz uses NSGA-II to generate test sequences consisting of Atomic Events (touches, rotations) and Motif Events (complex inputs like filling forms).
Testing for AI Fairness
AI testing goes beyond functional correctness to focus on equity.
- Fairness Bug: Occurs when an AI model gives different predictions for two inputs that differ only by a sensitive feature (e.g., race, age, gender).
- Objective: To ensure all groups are treated equitably through metrics like Demographic Parity and Equal Opportunity.