Data Splitting Strategy
- Training set: Used by the model to learn the task.
- Validation set: Used to tune the hyperparameters (like learning rate) and pick the best model version.
- Test set: Used only at the end to evaluate true generalisation.
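The three-way split above can be sketched as follows. This is a minimal example assuming NumPy, synthetic data, and a 70/15/15 split ratio (the notes do not specify ratios, so those numbers are illustrative):

```python
import numpy as np

# Illustrative data: 1000 samples, 5 features (assumed, not from the notes).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 2, size=1000)

# Shuffle indices, then carve out 70% train, 15% validation, 15% test.
indices = rng.permutation(len(X))
n_train = int(0.70 * len(X))
n_val = int(0.15 * len(X))

train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]

X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(len(X_train), len(X_val), len(X_test))  # 700 150 150
```

Shuffling before slicing matters: if the data is ordered (e.g. by class or by time), contiguous slices would give unrepresentative splits.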
Ethical “Don’ts” in Data Splitting
- Do not train on the test set: Unethical; the model will overfit the test data, making performance appear better than it is.
- Do not train on the validation set: You will fail to notice when the model begins to overfit.
- Do not validate on the training set: It does not show if the model generalises.
Advanced Techniques
- Early Stopping: Monitor the validation set during training; pick the model state where validation loss is lowest, just before overfitting starts.
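Early stopping with a patience window can be sketched as below. The loop, the `patience` parameter, and the toy loss curve are all illustrative assumptions, not a specific library API:

```python
def train_with_early_stopping(train_step, val_loss_fn, max_epochs=100, patience=5):
    """Train until validation loss stops improving for `patience` epochs.

    train_step(epoch) runs one epoch and returns the model state;
    val_loss_fn(state) evaluates that state on the validation set.
    """
    best_loss = float("inf")
    best_state = None
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        state = train_step(epoch)      # one epoch of training
        loss = val_loss_fn(state)      # monitor the validation set
        if loss < best_loss:
            best_loss = loss
            best_state = state         # checkpoint the best state so far
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                  # validation loss has stopped improving
    return best_state, best_loss

# Toy usage: validation loss falls, then rises as the model overfits.
losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7]
state, loss = train_with_early_stopping(
    train_step=lambda e: e,            # "state" is just the epoch index here
    val_loss_fn=lambda s: losses[s],
    max_epochs=7, patience=2)
print(state, loss)  # 3 0.5 — the state just before overfitting starts
```

Returning the checkpointed `best_state` (not the final state) is the key point: training continues a few epochs past the minimum only to confirm the loss is genuinely rising.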
- Regularisation (Weight Decay): Adds a penalty to the loss function to prevent weights from becoming too large: L_total = L_data + λ‖w‖², where λ is a hyperparameter controlling the strength of the penalty.
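The penalised loss can be written out directly. This sketch assumes a squared-error data loss and a linear model, purely for illustration; the weight-decay term itself is just λ times the sum of squared weights:

```python
import numpy as np

def regularised_loss(w, X, y, lam=0.01):
    """L_total = L_data + lam * ||w||^2 for a linear model X @ w.

    lam (λ) is the weight-decay hyperparameter; larger values
    push the optimiser toward smaller weights.
    """
    residual = X @ w - y
    data_loss = np.mean(residual ** 2)   # assumed squared-error data loss
    penalty = lam * np.sum(w ** 2)       # weight-decay penalty λ‖w‖²
    return data_loss + penalty

# Example: w = [1, -2], X = identity, y = 0
# data loss = (1 + 4) / 2 = 2.5, penalty = 0.01 * 5 = 0.05
print(regularised_loss(np.array([1.0, -2.0]), np.eye(2), np.zeros(2)))  # 2.55
```

Because the gradient of λ‖w‖² is 2λw, each update shrinks the weights slightly, which is where the name "weight decay" comes from.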