

Classical Machine Learning

A Builder’s Guide to Mastering Traditional Algorithms with scikit-learn


Contents


📖 Preface


Part I – Foundations

     Chapter 1: What Is Machine Learning?
        1.1 Supervised vs Unsupervised Learning
        1.2 Types of models (classification, regression, clustering)
        1.3 Typical ML pipeline
        1.4 Role of scikit-learn

     Chapter 2: Anatomy of scikit-learn
        2.1 How fit, predict, transform, score work
        2.2 Pipelines and cross-validation
        2.3 Hyperparameters vs parameters
        2.4 API consistency


Part II – Core Algorithms (Supervised Learning)

     Chapter 3: Dummy Classifiers – The Baseline
        3.1 Math Intuition: No math; random guessing or majority-class voting.
        3.2 Code Walkthrough: Implement on Iris dataset; compare strategies.
        3.3 Parameter Explanations: Strategy options (most_frequent, stratified).
        3.4 Source Code Dissection of DummyClassifier.

     Chapter 4: Logistic & Linear Regression
        4.1 Math Intuition + Geometry: Sigmoid function, log-odds, decision boundary.
        4.2 Code Walkthrough: Binary/multi-class on Wine dataset.
        4.3 Parameter Explanations: C (regularization), solvers, multi_class.
        4.4 Model Tuning + Diagnostics: Grid search C; check coefficients for interpretability.
        4.5 Source Code Dissection of LogisticRegression.

        4.6 Math Intuition + Geometry: Least squares, hyperplanes; Ridge/Lasso penalties.
        4.7 Code Walkthrough: Predict Boston Housing prices; compare OLS vs Ridge.
        4.8 Parameter Explanations: Alpha for regularization, degree for polynomial.
        4.9 Model Tuning + Diagnostics: Cross-validate alpha; plot residuals.
        4.10 Source Code Dissection of LinearRegression.

     Chapter 5: K-Nearest Neighbors (KNN)
        5.1 Math Intuition + Geometry: Distance metrics (Euclidean), voting in feature space.
        5.2 Code Walkthrough: Classify on Iris dataset with varying k.
        5.3 Parameter Explanations: n_neighbors, weights, metric.
        5.4 Model Tuning + Diagnostics: Elbow plot for k; curse of dimensionality.
        5.5 Source Code Dissection of KNeighborsClassifier.

     Chapter 6: Decision Trees
        6.1 Math Intuition + Geometry: Entropy/Gini, recursive splitting.
        6.2 Code Walkthrough: Build on HAR dataset; visualize tree.
        6.3 Parameter Explanations: max_depth, min_samples_split, criterion.
        6.4 Model Tuning + Diagnostics: Prune with CV; feature importance.
        6.5 Source Code Dissection of DecisionTreeClassifier.

     Chapter 7: Support Vector Machines (SVM)
        7.1 Math Intuition + Geometry: Margins, kernels, Lagrange multipliers.
        7.2 Code Walkthrough: RBF SVM on HAR dataset with PCA.
        7.3 Parameter Explanations: C, gamma, kernel types.
        7.4 Model Tuning + Diagnostics: Grid search; plot decision boundaries.
        7.5 Deep Dive: Advanced kernel math.
        7.6 Source Code Dissection of SVC.

     Chapter 8: Naive Bayes Classifiers
        8.1 Math Intuition + Geometry: Bayes theorem, conditional independence.
        8.2 Code Walkthrough: Text classification on a simple dataset.
        8.3 Parameter Explanations: Alpha (smoothing), priors.
        8.4 Model Tuning + Diagnostics: Handle zero probabilities; compare variants.
        8.5 Source Code Dissection of GaussianNB.

     Chapter 9: Random Forests and Bagging
        9.1 Math Intuition + Geometry: Bootstrap aggregating, ensemble voting.
        9.2 Code Walkthrough: Random Forest on Wine dataset.
        9.3 Parameter Explanations: n_estimators, max_features, bootstrap.
        9.4 Model Tuning + Diagnostics: OOB score; feature importance.
        9.5 Source Code Dissection of RandomForestClassifier.

     Chapter 10: Gradient Boosting (HistGradientBoostingClassifier)
        10.1 Math Intuition + Geometry: Gradient descent on residuals, additive trees.
        10.2 Code Walkthrough: Boost on HAR dataset.
        10.3 Parameter Explanations: learning_rate, max_depth, early_stopping.
        10.4 Model Tuning + Diagnostics: Monitor loss; avoid overfitting.
        10.5 Deep Dive: XGBoost comparison.


Part III – Core Algorithms (Unsupervised Learning)

     Chapter 11: K-Means Clustering
        11.1 Math Intuition + Geometry: Centroids, within-cluster sum of squares.
        11.2 Code Walkthrough: Cluster Iris dataset; elbow method for k.
        11.3 Parameter Explanations: n_clusters, init, n_init.
        11.4 Model Tuning + Diagnostics: Silhouette scores; visualize clusters.
        11.5 Source Code Dissection of KMeans.

     Chapter 12: Hierarchical Clustering
        12.1 Math Intuition + Geometry: Dendrograms, linkage methods.
        12.2 Code Walkthrough: Agglomerative clustering on Wine dataset.
        12.3 Parameter Explanations: linkage, affinity, n_clusters.
        12.4 Model Tuning + Diagnostics: Cut dendrogram; compare linkages.
        12.5 Source Code Dissection of AgglomerativeClustering.

     Chapter 13: DBSCAN and Density-Based Clustering
        13.1 Math Intuition + Geometry: Core points, density reachability.
        13.2 Code Walkthrough: Detect clusters in noisy data.
        13.3 Parameter Explanations: eps, min_samples.
        13.4 Model Tuning + Diagnostics: Handle noise; parameter sensitivity.
        13.5 Source Code Dissection of DBSCAN.


Part IV – Model Evaluation & Tuning

     Chapter 14: Model Evaluation Metrics
        14.1 Accuracy, precision, recall, F1
        14.2 Confusion Matrix, ROC, PR Curves
        14.3 When metrics disagree

     Chapter 15: Cross-Validation & StratifiedKFold
        15.1 Why we need CV
        15.2 KFold vs Stratified
        15.3 cross_validate, GridSearchCV, RandomizedSearchCV

     Chapter 16: Hyperparameter Tuning
        16.1 Grid search vs random search
        16.2 Search space design
        16.3 Practical examples with SVM and RF

     Chapter 17: Probability Calibration
        17.1 Why predicted probabilities can lie
        17.2 Platt scaling (sigmoid), isotonic regression
        17.3 CalibratedClassifierCV explained

     Chapter 18: Choosing Decision Thresholds
        18.1 Predicting probabilities vs predicting classes
        18.2 Optimizing for F1, cost-sensitive thresholds
        18.3 Manual threshold tuning with plots


Part V – Data Engineering & Preprocessing

     Chapter 19: Feature Scaling and Transformation
        19.1 StandardScaler, MinMaxScaler
        19.2 When to scale and why
        19.3 Scaling inside pipelines

     Chapter 20: Dimensionality Reduction
        20.1 PCA: Math and scikit-learn usage
        20.2 Using PCA with pipelines
        20.3 Visualization

     Chapter 21: Dealing with Imbalanced Datasets
        21.1 What is imbalance?
        21.2 SMOTE and oversampling
        21.3 Class weights vs resampling


Part VI – Advanced Topics

     Chapter 22: Pipelines and Workflows
        22.1 Building maintainable ML pipelines
        22.2 Pipeline, ColumnTransformer, custom steps

     Chapter 23: Under the Hood of scikit-learn
        23.1 How fit is structured
        23.2 Estimator base classes
        23.3 Digging into the source


Appendices & Templates

A. Glossary of ML terms
B. scikit-learn cheat sheet
C. Tips for debugging models
D. Further reading and learning roadmap