Overfitting Solutions

  • Use more data
  • Reduce model capacity (fewer parameters)
  • Early Stopping: Stop training before the model starts memorising noise.
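Early stopping is usually implemented by watching the validation loss with a "patience" window. A minimal sketch (the loss values and the `patience=2` setting are illustrative, not from the notes):

```python
# Early-stopping sketch: stop when the validation loss has not
# improved for `patience` consecutive epochs.
def early_stop_epoch(val_losses, patience=2):
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch       # new best: keep going
        elif epoch - best_epoch >= patience:
            return epoch                          # stalled: stop here
    return len(val_losses) - 1                    # never stalled

# Validation loss starts rising after epoch 2 (memorising noise),
# so training halts at epoch 4 rather than running to the end.
losses = [0.9, 0.6, 0.5, 0.52, 0.55, 0.60]
print(early_stop_epoch(losses))
```

In practice you would also restore the weights saved at the best epoch, not just stop.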

Data Augmentation

Applying realistic random transformations (flips, rotations, brightness) to create “many” training pairs from a single image.
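A toy version of such a pipeline, using numpy arrays in place of real images (the 0.5 flip chance and the 0.8–1.2 brightness range are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Random horizontal flip plus random brightness scaling."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]                 # horizontal flip
    out = out * rng.uniform(0.8, 1.2)      # brightness jitter
    return np.clip(out, 0.0, 1.0)          # keep valid pixel range

img = rng.random((4, 4))                   # one "image"
views = [augment(img) for _ in range(3)]   # "many" training examples from it
```

Each call yields a slightly different, still-realistic view of the same image with the same label.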

Dropout Regularisation

Randomly turning off neurons with some probability p during training to prevent co-adaptation (neurons relying on each other's presence).
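A sketch of the standard "inverted dropout" formulation, where survivors are rescaled during training so that no scaling is needed at inference:

```python
import numpy as np

def dropout(x, p=0.5, training=True, rng=np.random.default_rng(0)):
    """Zero each activation with probability p; scale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return x                           # identity at inference time
    mask = rng.random(x.shape) >= p        # keep with probability 1 - p
    return x * mask / (1.0 - p)            # rescale so the mean is unchanged

acts = np.ones(8)
print(dropout(acts, p=0.5))                # mix of 0.0 and 2.0 values
```

The 1/(1-p) rescaling keeps the expected activation the same with and without dropout.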

Skip Connections (ResNet)

  • Adds the input back to the output of a stack of weight layers: y = F(x) + x.
  • Benefit: It “smooths” the loss surface, making deep networks (e.g., 34 layers) much easier to train than “plain” deep networks.
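A toy residual block in numpy (random weight matrices stand in for a trained ResNet layer):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """Compute relu(F(x) + x), where F is two weight layers."""
    f = relu(x @ W1) @ W2      # the "plain" weight layers F(x)
    return relu(f + x)         # skip connection: add the input back

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
W1, W2 = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
y = residual_block(x, W1, W2)
```

Note that if the weights are all zero, F(x) = 0 and the block passes (non-negative) inputs through unchanged; learning a small correction to the identity is easier than learning a whole mapping from scratch, which is one intuition for why deep residual stacks train well.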

Transfer Learning

Taking a model pre-trained on a large dataset (e.g., ImageNet) and “fine-tuning” it on a new task. This reduces computation time and requires less data for the new task.
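The common "freeze the backbone, train a new head" recipe can be sketched in numpy; here random weights stand in for the pretrained feature extractor, and the head is fit by least squares rather than gradient descent, purely for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
W_pre = rng.standard_normal((8, 4))      # "pretrained" weights: frozen, never updated

def features(x):
    """Frozen feature extractor (stands in for a pretrained backbone)."""
    return np.maximum(x @ W_pre, 0.0)

# Small new-task dataset: transfer learning needs far less of it.
X = rng.standard_normal((32, 8))
y = rng.standard_normal(32)

# Fine-tune only the new linear head on top of the frozen features.
F = features(X)
w_head, *_ = np.linalg.lstsq(F, y, rcond=None)
```

Only `w_head` is fit on the new task; since the backbone's weights stay fixed, both the compute cost and the amount of new-task data required are much smaller than training from scratch.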