Invariance vs. Equivariance:

  • Convolutional Layers achieve **shift equivariance** (the output shifts as the input shifts).
  • Pooling/Fully Connected Layers achieve shift invariance (the classification remains the same regardless of small shifts in input).
  • Semantic Segmentation Idea: using ā€œFully Convolutionalā€ networks to make a prediction for every pixel simultaneously.
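A small sketch of shift equivariance, using a hand-rolled 1-D cross-correlation (a stand-in for a convolutional layer; the function name and signals are illustrative, not from the notes):

```python
import numpy as np

def conv1d_valid(x, k):
    """'Valid' 1-D cross-correlation: slide kernel k across signal x."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

x = np.array([0., 0., 1., 2., 3., 0., 0., 0.])
k = np.array([1., -1.])

y = conv1d_valid(x, k)
x_shifted = np.roll(x, 1)            # shift the input right by one step
y_shifted = conv1d_valid(x_shifted, k)

# Equivariance: convolving the shifted input equals shifting the output.
print(np.allclose(np.roll(y, 1), y_shifted))  # → True
```

By contrast, an invariant operation (e.g. global pooling followed by a classifier) would return the *same* output for `x` and `x_shifted`.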

Understanding 4D Data

Deep learning frameworks typically process data as 4D tensors with dimensions (N, C, H, W):

  1. Batch Index (N): which training example in the batch (the ā€œArchiveā€).
  2. Channel (C): which feature map or colour channel (the ā€œCabinetā€).
  3. Row (H): vertical coordinate.
  4. Column (W): horizontal coordinate.
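A minimal illustration of the (N, C, H, W) layout with NumPy (the sizes and indices here are arbitrary examples):

```python
import numpy as np

# A batch of 2 RGB images, each 4x4 pixels: shape (N, C, H, W).
batch = np.zeros((2, 3, 4, 4))

# Set one value: image 1, channel 0, row 2, column 3.
n, c, h, w = 1, 0, 2, 3
batch[n, c, h, w] = 1.0

print(batch.shape)        # (2, 3, 4, 4)
print(batch[1, 0].shape)  # one feature map of one image: (4, 4)
```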

Weight Initialisation

  • Crucial Rule: Never initialise all weights to the same value (e.g. all zeros). If all weights are equal, all derivatives will be the same, making neurons ā€œinterchangeableā€ and preventing learning.
  • Random Initialisation: Generally centred around 0.
  • He/Kaiming Initialisation: Best for ReLU activation functions.
  • Xavier/Glorot Initialisation: Best for Sigmoid/Tanh functions.
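A rough sketch of the two schemes with plain NumPy (the layer sizes are arbitrary; real frameworks provide these initialisers built in, e.g. `torch.nn.init.kaiming_normal_`):

```python
import numpy as np

rng = np.random.default_rng(0)
fan_in, fan_out = 256, 128   # example layer: 256 inputs, 128 outputs

# He/Kaiming: variance 2/fan_in, compensating for ReLU zeroing half the activations.
w_he = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

# Xavier/Glorot: variance 2/(fan_in + fan_out), keeping signal scale stable
# for roughly-linear-around-zero activations like sigmoid/tanh.
w_xavier = rng.normal(0.0, np.sqrt(2.0 / (fan_in + fan_out)), size=(fan_out, fan_in))

print(round(w_he.std(), 3))      # ā‰ˆ sqrt(2/256) ā‰ˆ 0.088
print(round(w_xavier.std(), 3))  # ā‰ˆ sqrt(2/384) ā‰ˆ 0.072
```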