Book Description
Deep learning uses multi-layer neural networks to model complex data patterns. Large models—with millions or even billions of parameters—are trained on massive datasets.
This approach has produced revolutionary advances in image, text, and speech recognition, and has also been applied successfully in a range of other fields such as engineering, finance, mathematics, and medicine.
What you will learn: The book "Mathematical Foundations of Deep Learning Models and Algorithms" aims to serve as an introduction to the mathematical theory underpinning recent advances in deep learning. Detailed derivations and mathematical proofs are presented for many of the models and optimization methods commonly used in machine learning and deep learning. Applications, code, and practical approaches to training models are also included.
The book is designed for advanced undergraduates, graduate students, practitioners, and researchers. Divided into two parts, it begins with mathematical foundations before tackling advanced topics
in approximation, optimization, and neural network training.
- Part 1 offers a mathematical introduction to deep learning and is written for a general audience, including students in mathematics, statistics, computer science, data science, or engineering.
- Part 2 contains advanced topics and convergence results in deep learning.
Together, Part 1 and Part 2 form an ideal foundation for an introductory course on the mathematics of deep learning. Our hope is that the combination of both parts offers a deeper understanding of the very exciting topic of deep learning!
Thoughtfully designed exercises and a companion website with code examples enhance both theoretical understanding and practical skills, preparing readers to engage more deeply with this fast-evolving field.
Who should read this: This book provides a rigorous, yet accessible, mathematical foundation for deep learning models and algorithms. It is intended for advanced undergraduate students, graduate students, researchers, and practitioners who seek a deeper mathematical foundation for modern deep learning models and algorithms!
Free sample material: Freely available components of the book can be downloaded from the publisher's webpage here.
All code and Jupyter notebooks are hosted on GitHub: Access all Python code associated with the book!
Book Citation
To cite the book, use the following BibTeX entry:
@book{MathDLBook-2025,
  title     = {Mathematical Foundations of Deep Learning Models and Algorithms},
  author    = {Konstantinos Spiliopoulos and Richard Sowers and Justin Sirignano},
  publisher = {American Mathematical Society},
  note      = {\url{MathDL.github.io}},
  year      = {2025}
}
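For example, the entry can be referenced from a LaTeX document as in the minimal sketch below, which assumes the entry above has been saved in a file named references.bib (a hypothetical file name):

\documentclass{article}
\usepackage{url}  % the note field uses the \url command
\begin{document}
Deep learning is treated rigorously in \cite{MathDLBook-2025}.
\bibliographystyle{plain}
\bibliography{references}  % looks for the entry in references.bib
\end{document}

Compile with pdflatex, then bibtex, then pdflatex twice more to resolve the citation.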
Table of Contents
The book is organized as follows:
- Table of Contents (detailed)
- Preface
- Notation
- Website
- Chapter 1. Introduction
- Part 1. Mathematical Introduction to Deep Learning
- Chapter 2. Linear Regression
- Chapter 3. Logistic Regression
- Chapter 4. From the Perceptron Model to Kernels to Neural Networks
- Chapter 5. Feed Forward Neural Networks
- Chapter 6. Backpropagation
- Chapter 7. Basics on Stochastic Gradient Descent
- Chapter 8. Stochastic Gradient Descent for Multi-layer Networks
- Chapter 9. Regularization and Dropout
- Chapter 10. Batch Normalization
- Chapter 11. Training, Validation and Testing
- Chapter 12. Feature Importance
- Chapter 13. Recurrent Neural Networks for Sequential Data (includes the Attention Mechanism and the Transformer Architecture)
- Chapter 14. Convolutional Neural Networks
- Chapter 15. Variational Inference and Generative Models
- Part 2. Advanced Topics and Convergence Results in Deep Learning
- Transitioning from Part 1 to Part 2
- Chapter 16. Universal Approximation Theorem
- Chapter 17. Convergence Analysis of Gradient Descent
- Chapter 18. Convergence Analysis of Stochastic Gradient Descent
- Chapter 19. The Neural Tangent Kernel Regime
- Chapter 20. Optimization in the Feature Learning Regime: Mean Field Scaling
- Chapter 21. Reinforcement Learning
- Chapter 22. Neural Differential Equations
- Chapter 23. Distributed Training
- Chapter 24. Automatic Differentiation
- Part 3. Appendix
- Appendix A. Background Material in Probability
- Appendix B. Background Material in Analysis
- Bibliography
- Index
Code and Exercises
The Python code and datasets accompanying the chapters of the book can be found on the dedicated GitHub site. They provide reproducible experiments and solutions to selected exercises.
Access all Python code associated with the book!
A number of exercises have been included to aid the reader's comprehension of the material. A solutions manual is available upon request from the publisher to instructors using this book in a course.
Errata
Errata in the published editions will be maintained here:
Collecting Errata.