Overfitting Solutions
- Use more data
- Reduce model capacity (fewer parameters)
- Early Stopping: Stop training before the model starts memorising noise.
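A minimal sketch of the early-stopping idea, assuming we already have a per-epoch validation-loss history (the function name and `patience` parameter are illustrative, not from a specific library):

```python
def train_with_early_stopping(val_losses, patience=3):
    """Return (best_epoch, best_loss): the point where training would stop."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs: stop training
    return best_epoch, best_loss

# Validation loss improves, then rises as the model starts memorising noise.
losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7, 0.8]
print(train_with_early_stopping(losses))  # → (3, 0.5)
```

In practice you would also checkpoint the model weights at `best_epoch` and restore them after stopping.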
Data Augmentation
Applying realistic random transformations (flips, rotations, brightness changes) to create “many” training examples from a single image, while keeping the label the same.
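A minimal NumPy sketch of this idea (real pipelines would use a library such as torchvision; the `augment` helper here is hypothetical):

```python
import numpy as np

def augment(image, rng):
    """Randomly flip and adjust brightness: one image in, a new variant out."""
    if rng.random() < 0.5:
        image = np.fliplr(image)           # random horizontal flip
    factor = rng.uniform(0.8, 1.2)         # random brightness scaling
    return np.clip(image * factor, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))            # stand-in for a real photo in [0, 1]
augmented = [augment(image, rng) for _ in range(4)]  # "many" images from one
```

Each call produces a slightly different but equally valid training example, so the model sees more variation without any new labelled data.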
Dropout Regularisation
Randomly turning off neurons with probability p during training (rescaling the survivors so the expected activation is unchanged) to prevent co-adaptation between neurons.
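A minimal NumPy sketch of inverted dropout, the common formulation (frameworks such as PyTorch implement this inside `nn.Dropout`):

```python
import numpy as np

def dropout(activations, p, rng, training=True):
    """Inverted dropout: zero units with probability p, rescale the rest."""
    if not training or p == 0.0:
        return activations                  # identity at test time
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)   # keeps the expected value unchanged

rng = np.random.default_rng(0)
h = np.ones((4, 8))
print(dropout(h, p=0.5, rng=rng))           # roughly half zeros, rest scaled to 2.0
```

Because surviving activations are scaled by 1/(1-p), the layer's expected output matches the test-time behaviour, so no rescaling is needed at inference.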
Skip Connections (ResNet)
- Adds the input back to the output of a weight layer: y = F(x) + x.
- Benefit: It “smooths” the loss surface, making deep networks (e.g., 34 layers) much easier to train than “plain” deep networks.
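A minimal NumPy sketch of one residual block (biases and normalisation omitted for brevity; the two-layer branch mirrors the basic ResNet block):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """y = F(x) + x, where F is two weight layers with a ReLU in between."""
    f = relu(x @ W1) @ W2    # the residual branch F(x)
    return relu(f + x)       # skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 16))
W1 = rng.standard_normal((16, 16)) * 0.01   # near-zero weights, so F(x) ≈ 0
W2 = rng.standard_normal((16, 16)) * 0.01
y = residual_block(x, W1, W2)
# With tiny weights the block is close to the identity: a deep stack of such
# blocks starts out well-behaved, which is why it is easier to optimise.
```

The key point: the block only has to learn the *residual* F(x) on top of the identity, rather than the whole mapping from scratch.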
Transfer Learning
Taking a model pre-trained on a large dataset (e.g., ImageNet) and “fine-tuning” it on a new task. This reduces computation time and requires less data for the new task.
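A minimal NumPy sketch of the simplest fine-tuning setup, a "linear probe": the pre-trained backbone is frozen (here faked with a fixed random projection) and only a new classification head is trained on the small target dataset. The `frozen_backbone` function and the toy data are illustrative, not a real pre-trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pre-trained feature extractor: weights never updated.
W_backbone = rng.standard_normal((4, 8))

def frozen_backbone(x):
    return np.maximum(x @ W_backbone, 0.0)   # fixed features, no gradients

# Small target dataset for the new task.
X = rng.standard_normal((64, 4))
y = (X[:, 0] > 0).astype(float)
feats = frozen_backbone(X)                   # computed once; backbone is frozen

def loss(w):
    p = 1.0 / (1.0 + np.exp(-(feats @ w)))
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Train only the new head (logistic regression) by gradient descent.
w = np.zeros(8)
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(feats @ w)))
    w -= 0.1 * feats.T @ (p - y) / len(y)    # gradient step on the head only
```

Because only the small head is optimised, training is fast and needs little data; full fine-tuning would additionally unfreeze some or all backbone layers with a small learning rate.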