Appendix A: Tensor Shapes Cheat Sheet¶
“Because debugging starts with dimensions.”
Task / Layer | Expected Shape |
---|---|
Single image (grayscale) | [1, H, W] |
Single image (RGB) | [3, H, W] |
Batch of grayscale images | [B, 1, H, W] |
Batch of RGB images | [B, 3, H, W] |
Fully connected input | [B, features] |
LSTM input | [seq_len, batch, input_size] or [B, T, F] (batch_first=True ) |
Transformer input | [B, seq_len] or [B, seq_len, emb_dim] |
nn.Embedding input |
[B, T] (int64 tokens) → Output: [B, T, emb_dim] |
CNN output to FC layer | [B, C, H, W] → x.view(B, -1) |
Classification output | [B, num_classes] |
Regression output | [B, 1] or [B] |
Always
print(.shape)
at each step of your model to avoid dimensional disasters.