CS247 Deep Learning
Table of contents
- 1 Introduction
- 2 Linear Algebra
- 2.1 Scalars, Vectors, Matrices, and Tensors
- 2.2 Multiplying Matrices and Vectors
- 2.3 Identity and Inverse Matrices
- 2.4 Linear Dependence and Span
- 2.5 Norms
- 2.6 Special Kinds of Matrices and Vectors
- 2.7 Eigendecomposition
- 2.8 Singular Value Decomposition
- 2.9 The Moore-Penrose Pseudoinverse
- 2.10 The Trace Operator
- 2.11 The Determinant
- 2.12 Example: Principal Components Analysis
- 3 Probability and Information Theory
- 3.1 Why Probability?
- 3.2 Random Variables
- 3.3 Probability Distributions
- 3.4 Marginal Probability
- 3.5 Conditional Probability
- 3.6 The Chain Rule of Conditional Probabilities
- 3.7 Independence and Conditional Independence
- 3.8 Expectation, Variance, and Covariance
- 3.9 Common Probability Distributions
- 3.10 Useful Properties of Common Functions
- 3.11 Bayes’ Rule
- 3.12 Technical Details of Continuous Variables
- 3.13 Information Theory
- 3.14 Structured Probabilistic Models
- 4 Numerical Computation
- 5 Machine Learning Basics
- 5.1 Learning Algorithms
- 5.2 Capacity, Overfitting, and Underfitting
- 5.3 Hyperparameters and Validation Sets
- 5.4 Estimators, Bias and Variance
- 5.5 Maximum Likelihood Estimation
- 5.6 Bayesian Statistics
- 5.7 Supervised Learning Algorithms
- 5.8 Unsupervised Learning Algorithms
- 5.9 Stochastic Gradient Descent
- 5.10 Building a Machine Learning Algorithm
- 5.11 Challenges Motivating Deep Learning
- 6 Deep Feedforward Networks
- 7 Regularization for Deep Learning
- 7.1 Parameter Norm Penalties
- 7.2 Norm Penalties as Constrained Optimization
- 7.3 Regularization and Under-Constrained Problems
- 7.4 Dataset Augmentation
- 7.5 Noise Robustness
- 7.6 Semi-Supervised Learning
- 7.7 Multi-Task Learning
- 7.8 Early Stopping
- 7.9 Parameter Tying and Parameter Sharing
- 7.10 Sparse Representations
- 7.11 Bagging and Other Ensemble Methods
- 7.12 Dropout
- 7.13 Adversarial Training
- 7.14 Tangent Distance, Tangent Prop, and Manifold Tangent Classifier
- 8 Optimization for Training Deep Models
- 9 Convolutional Networks
- 9.1 The Convolution Operation
- 9.2 Motivation
- 9.3 Pooling
- 9.4 Convolution and Pooling as an Infinitely Strong Prior
- 9.5 Variants of the Basic Convolution Function
- 9.6 Structured Outputs
- 9.7 Data Types
- 9.8 Efficient Convolution Algorithms
- 9.9 Random or Unsupervised Features
- 9.10 The Neuroscientific Basis for Convolutional Networks
- 9.11 Convolutional Networks and the History of Deep Learning
- 10 Sequence Modeling: Recurrent and Recursive Nets
- 10.1 Unfolding Computational Graphs
- 10.2 Recurrent Neural Networks
- 10.3 Bidirectional RNNs
- 10.4 Encoder-Decoder Sequence-to-Sequence Architectures
- 10.5 Deep Recurrent Networks
- 10.6 Recursive Neural Networks
- 10.7 The Challenge of Long-Term Dependencies
- 10.8 Echo State Networks
- 10.9 Leaky Units and Other Strategies for Multiple Time Scales
- 10.10 The Long Short-Term Memory and Other Gated RNNs
- 10.11 Optimization for Long-Term Dependencies
- 10.12 Explicit Memory
- 11 Practical Methodology
- 12 Applications
- 13 Linear Factor Models
- 14 Autoencoders
- 14.1 Undercomplete Autoencoders
- 14.2 Regularized Autoencoders
- 14.3 Representational Power, Layer Size and Depth
- 14.4 Stochastic Encoders and Decoders
- 14.5 Denoising Autoencoders
- 14.6 Learning Manifolds with Autoencoders
- 14.7 Contractive Autoencoders
- 14.8 Predictive Sparse Decomposition
- 14.9 Applications of Autoencoders
- 15 Representation Learning
- 16 Structured Probabilistic Models for Deep Learning
- 17 Monte Carlo Methods
- 18 Confronting the Partition Function
- 19 Approximate Inference
- 20 Deep Generative Models
- 20.1 Boltzmann Machines
- 20.2 Restricted Boltzmann Machines
- 20.3 Deep Belief Networks
- 20.4 Deep Boltzmann Machines
- 20.5 Boltzmann Machines for Real-Valued Data
- 20.6 Convolutional Boltzmann Machines
- 20.7 Boltzmann Machines for Structured or Sequential Outputs
- 20.8 Other Boltzmann Machines
- 20.9 Back-Propagation through Random Operations
- 20.10 Directed Generative Nets
- 20.11 Drawing Samples from Autoencoders
- 20.12 Generative Stochastic Networks
- 20.13 Other Generation Schemes
- 20.14 Evaluating Generative Models
- 20.15 Conclusion
- Bibliography
- Index