Mastering Convolutional Neural Networks

Table of Contents

Vision in Code: Mastering Convolutional Neural Networks for Real-World Image Modeling

A Practical Guide to CNN Implementation with PyTorch and TensorFlow

Contents

📖 Preface

Part I – Foundations of Image Tensors and Preprocessing

Chapter 1: How a Neural Network Sees an Image

1.1 What is an image (JPEG, PNG, etc.) in memory?

1.2 From pixel data → NumPy array → Tensor

1.3 RGB channels, 8-bit scale, float conversion

1.4 [H, W, C] vs [C, H, W] — framework differences explained

1.5 Why model input shape matters

1.6 Visual walkthrough of image-to-input pipeline

Chapter 2: What is a Tensor (in Code and in Mind)?

2.1 Tensor shapes and memory layout

2.2 Dimensionality intuition

2.3 PyTorch: torch.tensor, .permute(), .view(), .reshape()

2.4 TensorFlow: tf.Tensor, tf.reshape(), tf.transpose(), broadcasting

2.5 Visual walkthroughs of shape manipulations

Chapter 3: From Pixels to Model Input

3.1 Full image input pipeline

3.2 RGB loading → float32 conversion → normalization

3.3 Resizing and reshaping to expected input size

3.4 Batch dimension handling: unsqueeze() vs expand_dims()

3.5 Feeding tensors into Dense or Conv2D layers

3.6 Debugging mismatched shapes

3.7 Framework comparison of entire image → tensor → model flow

Part II – Preprocessing and Input Pipelines

Chapter 4: Standard Image Preprocessing

4.1 Resize, Normalize, Augment

4.2 Mean-std normalization vs 0–1 scaling

4.3 Format mismatches and their impact on accuracy

4.4 PIL vs OpenCV vs tf.image

4.5 Visualizing preprocessing effects

4.6 Matching preprocessing between training and inference

Chapter 5: Preprocessing for Pretrained Models

5.1 Matching pretrained model expectations: MobileNetV2, EfficientNet, ResNet, etc.

5.2 transforms.Normalize vs tf.keras.applications.*.preprocess_input()

5.3 PyTorch: torchvision.models

5.4 TensorFlow: keras.applications

5.5 Inference vs training preprocessing pitfalls

5.6 Side-by-side code snippets for each model

Chapter 6: Image Datasets: Getting Data Into the Network

6.1 Folder structure conventions

6.2 PyTorch: Dataset, DataLoader, ImageFolder, transforms

6.3 TensorFlow: tf.data.Dataset, image_dataset_from_directory()

6.4 Label mapping, batching, and shuffling

6.5 Visualizing batches from both frameworks

Chapter 7: Data Augmentation Techniques (Expanded)

7.1 Common augmentations: RandomCrop, ColorJitter, Cutout

7.2 Advanced augmentations: Mixup, CutMix (optional)

7.3 PyTorch: torchvision.transforms

7.4 TensorFlow: tf.image, Keras preprocessing layers

7.5 Before/after visualization of augmentation effects

Part III – CNN Architectures and Concepts

Chapter 8: Understanding CNN Layers

8.1 Kernels, filters, channels, strides, padding

8.2 Pooling (Max, Average), ReLU, BatchNorm

8.3 PyTorch: nn.Conv2d, nn.MaxPool2d, nn.BatchNorm2d, etc.

8.4 TensorFlow: Conv2D, MaxPooling2D, BatchNormalization, etc.

8.5 Conceptual breakdown + syntax comparison

Chapter 9: The CNN Vocabulary (Terms Demystified)

9.1 Key terms: kernel, convolution, stride, padding

9.2 Input/output channels, feature maps

9.3 Convolutional layer vs residual block

9.4 Layer variants: ReflectionPad2d, InstanceNorm2d, AdaptiveAvgPool2d

9.5 Visual and code-based examples

Chapter 10: Writing the forward() / call() Function

10.1 PyTorch: forward(), self.features, self.classifier

10.2 TensorFlow: call(), subclassing Model

10.3 Layer-by-layer flow visualized

10.4 Common mistakes in model building

Chapter 11: Model Summary and Parameter Inspection

11.1 PyTorch: model.parameters(), summary(), state_dict()

11.2 TensorFlow: .summary(), get_weights(), trainable_variables

11.3 How to freeze/unfreeze layers for fine-tuning

Chapter 12: Building Your First CNN: Patterns and Pitfalls

12.1 Simple architectures: LeNet-style, Mini-VGG

12.2 Choosing filter sizes, kernel shapes, stride

12.3 Stacking layers: when and why

12.4 Common design mistakes (too few filters, wrong input shape, etc.)

Part IV – Training and Fine-Tuning

Chapter 13: Loss Functions and Optimizers

13.1 PyTorch: loss_fn(), .backward(), optimizer.step()

13.2 TensorFlow: GradientTape, optimizer.apply_gradients()

13.3 Common losses: CrossEntropy

13.4 Optimizers: SGD, Adam

13.5 Visualizing gradient flow

Chapter 14: Training Loop Mechanics

14.1 PyTorch: full training loop with train_loader

14.2 TensorFlow: model.fit() vs custom training loop

14.3 Logging loss and metrics

14.4 Checkpoint saving, early stopping

14.5 Adding visuals for debugging and learning

Chapter 15: Training Strategies and Fine-Tuning Pretrained CNNs

15.1 When to Fine-Tune vs Freeze

15.2 Adapting Pretrained Models

15.3 Regularization Techniques

15.4 Training Strategies for Generalization

15.5 Recognizing Overfitting and Underfitting

Part V – Inference, Evaluation, and Visual Debugging

Chapter 16: Train vs Eval Mode

16.1 PyTorch: model.train(), model.eval(), no_grad()

16.2 TensorFlow: training=True/False

16.3 Dropout and BatchNorm behavior

16.4 Impact of mode on inference

Chapter 17: Visualizing Feature Maps and Filters

17.1 Getting intermediate layer outputs

17.2 PyTorch: forward hooks, manual slicing

17.3 TensorFlow: defining sub-models

17.4 Visualizing what the model is focusing on

Part VI – Deployment-Ready Insights

Chapter 18: Inference Pipeline Design

18.1 Keeping preprocessing consistent (train vs inference)

18.2 Reusable preprocess functions

18.3 Input validation, test-time augmentation

Chapter 19: Common Errors and How to Debug Them

19.1 Model always predicts one class? Check normalization

19.2 Input shape mismatch? Check dataloader

19.3 Nothing’s working? Try a single image pipeline

19.4 Debugging checklist for CNN-based models

Appendices

A. PyTorch vs TensorFlow Cheatsheet

B. Troubleshooting Image Model Failures

C. Glossary of Key Terms

D. Pretrained Model Reference Table (with links)

E. Sample Projects and Mini-Exercises per Chapter

Chapter Format

Each chapter closes with the following sections:

Conceptual Breakdown
PyTorch Implementation
TensorFlow Implementation
Framework Comparison Table
Use Case or Mini-Exercise