Data Splitting Strategy
- Training set: Used by the model to learn the task.
- Validation set: Used to tune the hyperparameters (like learning rate) and pick the best model version.
- Test set: Used only at the end to evaluate true generalisation.
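The three-way split above can be sketched as follows. This is a minimal example assuming NumPy, synthetic data, and a 70/15/15 split ratio (the notes do not specify ratios, so those numbers are illustrative):

```python
import numpy as np

# Illustrative data: 1000 samples, 5 features (assumed, not from the notes).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = rng.integers(0, 2, size=1000)

# Shuffle indices, then carve out 70% train, 15% validation, 15% test.
indices = rng.permutation(len(X))
n_train = int(0.70 * len(X))
n_val = int(0.15 * len(X))

train_idx = indices[:n_train]
val_idx = indices[n_train:n_train + n_val]
test_idx = indices[n_train + n_val:]

X_train, y_train = X[train_idx], y[train_idx]
X_val, y_val = X[val_idx], y[val_idx]
X_test, y_test = X[test_idx], y[test_idx]

print(len(X_train), len(X_val), len(X_test))  # 700 150 150
```

Shuffling before slicing matters: if the data is ordered (e.g. by class or by time), contiguous slices would give unrepresentative splits.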
Ethical “Don’ts” in Data Splitting
- Do not train on the test set: Unethical; the model will overfit the test data, making performance appear better than it is.
- Do not train on the validation set: You will fail to notice when the model begins to overfit.
- Do not validate on the training set: It does not show if the model generalises.
Advanced Techniques
- Early Stopping: Monitor the validation set during training; pick the model state where validation loss is lowest, just before overfitting starts.
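Early stopping with a patience window can be sketched as below. The loop, the `patience` parameter, and the toy loss curve are all illustrative assumptions, not a specific library API:

```python
def train_with_early_stopping(train_step, val_loss_fn, max_epochs=100, patience=5):
    """Train until validation loss stops improving for `patience` epochs.

    train_step(epoch) runs one epoch and returns the model state;
    val_loss_fn(state) evaluates that state on the validation set.
    """
    best_loss = float("inf")
    best_state = None
    epochs_without_improvement = 0
    for epoch in range(max_epochs):
        state = train_step(epoch)      # one epoch of training
        loss = val_loss_fn(state)      # monitor the validation set
        if loss < best_loss:
            best_loss = loss
            best_state = state         # checkpoint the best state so far
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break                  # validation loss has stopped improving
    return best_state, best_loss

# Toy usage: validation loss falls, then rises as the model overfits.
losses = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.7]
state, loss = train_with_early_stopping(
    train_step=lambda e: e,            # "state" is just the epoch index here
    val_loss_fn=lambda s: losses[s],
    max_epochs=7, patience=2)
print(state, loss)  # 3 0.5 — the state just before overfitting starts
```

Returning the checkpointed `best_state` (not the final state) is the key point: training continues a few epochs past the minimum only to confirm the loss is genuinely rising.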
- Regularisation (Weight Decay): Adds a penalty to the loss function to prevent weights from becoming too large: L_total = L_data + λ‖w‖², where λ is a hyperparameter controlling the strength of the penalty.
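The penalised loss can be written out directly. This sketch assumes a squared-error data loss and a linear model, purely for illustration; the weight-decay term itself is just λ times the sum of squared weights:

```python
import numpy as np

def regularised_loss(w, X, y, lam=0.01):
    """L_total = L_data + lam * ||w||^2 for a linear model X @ w.

    lam (λ) is the weight-decay hyperparameter; larger values
    push the optimiser toward smaller weights.
    """
    residual = X @ w - y
    data_loss = np.mean(residual ** 2)   # assumed squared-error data loss
    penalty = lam * np.sum(w ** 2)       # weight-decay penalty λ‖w‖²
    return data_loss + penalty

# Example: w = [1, -2], X = identity, y = 0
# data loss = (1 + 4) / 2 = 2.5, penalty = 0.01 * 5 = 0.05
print(regularised_loss(np.array([1.0, -2.0]), np.eye(2), np.zeros(2)))  # 2.55
```

Because the gradient of λ‖w‖² is 2λw, each update shrinks the weights slightly, which is where the name "weight decay" comes from.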