Data splitting strategy

  1. Training set: Used by the model to learn the task.
  2. Validation set: Used to tune hyperparameters (such as the learning rate) and select the best model version.
  3. Test set: Used only once, at the end, to evaluate true generalisation.
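The three-way split above can be sketched in plain Python; the function name, the 80/10/10 proportions, and the fixed seed are illustrative choices, not a standard recipe:

```python
import random

def three_way_split(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle the data once, then carve off test, validation, and
    training subsets so every example lands in exactly one split."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed => reproducible split
    n = len(items)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))  # 80 10 10
```

Shuffling before slicing matters: if the data is ordered (say, by class), slicing without a shuffle would give splits with very different distributions.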

Ethical “Don’ts” in Data Splitting

  • Do not train on the test set: the model will have seen the test data, making measured performance look better than true generalisation.
  • Do not train on the validation set: you will fail to notice when the model begins to overfit.
  • Do not validate on the training set: it does not show whether the model generalises.
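A cheap way to guard against all three mistakes is to assert that the splits share no examples before training starts. This is a minimal sketch with a hypothetical helper name, assuming each example has a unique identifier:

```python
def assert_disjoint(train_ids, val_ids, test_ids):
    """Raise AssertionError if any example appears in more than one split."""
    assert set(train_ids).isdisjoint(val_ids), "train/validation overlap"
    assert set(train_ids).isdisjoint(test_ids), "train/test overlap"
    assert set(val_ids).isdisjoint(test_ids), "validation/test overlap"

# No overlap: passes silently.
assert_disjoint([0, 1, 2], [3, 4], [5, 6])
```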

Advanced Techniques

  • Early Stopping: Monitor the validation set during training; pick the model state where validation loss is lowest, just before overfitting starts.
  • Regularisation (Weight Decay): Adds a penalty to the loss function to prevent weights from becoming too large: L_total = L + λ·Σᵢ wᵢ², where λ is a hyperparameter controlling the penalty strength.
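Both techniques can be sketched in a few lines of plain Python. The function names and the `patience` threshold are illustrative, not a fixed API:

```python
def early_stopping(val_losses, patience=2):
    """Return the epoch with the lowest validation loss, stopping once
    it has not improved for `patience` consecutive epochs."""
    best_epoch, best_loss, waited = 0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_epoch, best_loss, waited = epoch, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # overfitting has set in; keep the earlier best state
    return best_epoch

def penalised_loss(base_loss, weights, lam=0.01):
    """L_total = L + lam * sum(w^2): the weight-decay penalty."""
    return base_loss + lam * sum(w * w for w in weights)

# Validation loss falls, then rises as overfitting begins:
print(early_stopping([0.9, 0.7, 0.5, 0.6, 0.8]))  # 2
```

In practice a real training loop would also snapshot the model parameters at the best epoch, since that is the state early stopping restores.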