English | MP4 | AVC 1920×1080 | AAC 44KHz 2ch | 41h 15m | 18.7 GB

Deep Learning and Neural Networks: Theory and Applications with PyTorch! Including Transformers, BERT, and GPT!

This course is a comprehensive guide to Deep Learning and Neural Networks. The theory is explained in depth and in a friendly manner. After that come the hands-on sessions, where we learn how to code Neural Networks in PyTorch, a very powerful and widely used deep learning framework!

The course includes the following Sections:

Section 1 – How Neural Networks and Backpropagation Work

In this section, you will gain a deep, friendly understanding of how neural networks and the backpropagation algorithm work. We will walk through an example and do the calculations step by step. We will also discuss the activation functions used in Neural Networks, with their advantages and disadvantages!
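
To make the mechanics concrete, here is a minimal sketch (not the course's own worked example) of the forward pass, chain-rule backward pass, and gradient descent update for a single sigmoid neuron; all numbers are made up for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.0])   # input features (illustrative values)
y = 1.0                     # target
w = np.array([0.1, 0.2])    # weights
b = 0.0                     # bias
lr = 0.1                    # learning rate

for step in range(3):
    # Forward pass
    z = w @ x + b
    a = sigmoid(z)
    loss = 0.5 * (a - y) ** 2        # squared error

    # Backward pass: apply the chain rule term by term
    grad_a = a - y                   # dLoss/da
    grad_z = grad_a * a * (1 - a)    # da/dz is the sigmoid derivative
    grad_w = grad_z * x              # dz/dw = x
    grad_b = grad_z                  # dz/db = 1

    # Gradient descent update
    w -= lr * grad_w
    b -= lr * grad_b
    print(f"step {step}: loss = {loss:.4f}")
```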

Section 2 – Loss Functions

In this section, we will introduce the most common loss functions used in Deep Learning and Neural Networks. We will walk through how they work and when to use each one.
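
As a quick preview, a sketch of how several of these losses are called in PyTorch (tensor values are illustrative; `HuberLoss` assumes PyTorch 1.9 or newer):

```python
import torch
import torch.nn as nn

# Regression losses take predictions and targets of the same shape.
pred = torch.tensor([2.5, 0.0, 1.8])
target = torch.tensor([3.0, -0.5, 2.0])
print(nn.MSELoss()(pred, target))      # mean squared error
print(nn.L1Loss()(pred, target))       # mean absolute error
print(nn.HuberLoss()(pred, target))    # quadratic near zero, linear far away

# Multi-class classification: CrossEntropyLoss expects raw logits and
# class indices, and applies log-softmax internally.
logits = torch.tensor([[1.2, 0.3, -0.8], [0.1, 2.0, 0.4]])
labels = torch.tensor([0, 1])
print(nn.CrossEntropyLoss()(logits, labels))

# Binary classification: BCEWithLogitsLoss fuses sigmoid + BCE for stability.
bin_logits = torch.tensor([0.7, -1.2])
bin_labels = torch.tensor([1.0, 0.0])
print(nn.BCEWithLogitsLoss()(bin_logits, bin_labels))
```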

Section 3 – Optimization

In this section, we will discuss the optimization techniques used in Neural Networks to reach the optimal point, including Gradient Descent, Stochastic Gradient Descent, Momentum, RMSProp, Adam, AMSGrad, Weight Decay and Decoupled Weight Decay, learning rate schedulers, and others.
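
A brief sketch of how these optimizers and schedulers are set up in PyTorch; the model and all hyperparameters below are placeholders:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)  # stand-in model for illustration

# SGD with momentum and (coupled) L2 weight decay
sgd = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9, weight_decay=1e-4)

# Adam, its AMSGrad variant, and AdamW (decoupled weight decay)
adam = torch.optim.Adam(model.parameters(), lr=1e-3)
amsgrad = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)
adamw = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)

# A scheduler wraps an optimizer; StepLR decays the LR every step_size epochs.
scheduler = torch.optim.lr_scheduler.StepLR(sgd, step_size=10, gamma=0.1)

for epoch in range(3):
    # ... usual loop: sgd.zero_grad(); loss.backward(); sgd.step() ...
    scheduler.step()  # advance the schedule once per epoch
```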

Section 4 – Weight Initialization

In this section, we will introduce you to the concept of weight initialization in neural networks, and we will discuss some weight initialization techniques, including Xavier initialization and He initialization.
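
For reference, both schemes are one-liners in PyTorch's `torch.nn.init` module (layer sizes here are arbitrary):

```python
import torch.nn as nn

layer = nn.Linear(128, 64)

# Xavier (Glorot) initialization: keeps activation variance stable for
# symmetric activations such as tanh.
nn.init.xavier_uniform_(layer.weight)

# He (Kaiming) initialization: accounts for ReLU zeroing half its inputs.
nn.init.kaiming_normal_(layer.weight, nonlinearity='relu')

# Biases are commonly initialized to zero.
nn.init.zeros_(layer.bias)
```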

Section 5 – Regularization Techniques

In this section, we will introduce you to the regularization techniques used in neural networks. We will first introduce overfitting, and then show how to prevent it with regularization techniques including L1, L2, and Dropout. We'll also talk about normalization, including Batch Normalization and Layer Normalization.
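
A minimal sketch of these techniques combined in a PyTorch block; layer sizes and the dropout rate are illustrative, and L2 regularization is typically applied through the optimizer's `weight_decay` argument rather than in the model itself:

```python
import torch.nn as nn

block = nn.Sequential(
    nn.Linear(256, 128),
    nn.BatchNorm1d(128),   # batch normalization over the feature dimension
    nn.ReLU(),
    nn.Dropout(p=0.5),     # randomly zero 50% of activations at train time
    nn.Linear(128, 10),
)

block.train()  # dropout and batch-statistics active
block.eval()   # dropout disabled, batchnorm uses running statistics
```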

Section 6 – Introduction to PyTorch

In this section, we will introduce the deep learning framework we'll be using throughout this course: PyTorch. We will show you how to install it, how it works, and why it's special. We will then code some PyTorch tensors, show you some operations on tensors, and demonstrate Autograd in code!
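
A small taste of what's ahead, assuming PyTorch is installed:

```python
import torch

# Tensors behave much like NumPy arrays, with optional GPU support.
a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.ones(2, 2)
print(a + b)        # elementwise addition
print(a @ b)        # matrix multiplication
print(a.mean())     # reduction

# Autograd: with requires_grad=True, PyTorch records the computation
# graph, so gradients come from a single backward() call.
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x  # y = x^2 + 3x
y.backward()
print(x.grad)       # dy/dx = 2x + 3 = 7
```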

Section 7 – Practical Neural Networks in PyTorch – Application 1

In this section, you will apply what you've learned to build a Feed Forward Neural Network that classifies handwritten digits. This is the first application of Feed Forward Networks we will show.
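
A minimal sketch of such a classifier for 28x28 digit images; the hidden-layer sizes here are illustrative, not necessarily the ones used in the course:

```python
import torch.nn as nn

class FeedForward(nn.Module):
    """Flatten the image, pass it through two hidden layers, output 10 logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),               # (N, 1, 28, 28) -> (N, 784)
            nn.Linear(28 * 28, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 10),          # one logit per digit class
        )

    def forward(self, x):
        return self.net(x)
```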

Section 8 – Practical Neural Networks in PyTorch – Application 2

In this section, we will build a Feed Forward Neural Network to classify whether a person has diabetes or not. We will train the network on a large diabetes dataset!

Section 9 – Visualize the Learning Process

In this section, we will visualize how neural networks are learning, and how good they are at separating non-linear data!

Section 10 – Implementing a Neural Network from Scratch with Python and NumPy

In this section, we will understand and code up a neural network without using any deep learning library (from scratch, using only Python and NumPy). This is necessary to understand how the underlying machinery works.
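
As a preview, a compact sketch of one training step (forward pass, backpropagation, and update) for a one-hidden-layer regression network in plain NumPy; all shapes and sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))            # 4 samples, 3 features
Y = rng.normal(size=(4, 1))            # 4 targets
W1, b1 = rng.normal(size=(3, 5)), np.zeros(5)
W2, b2 = rng.normal(size=(5, 1)), np.zeros(1)

sigmoid = lambda z: 1 / (1 + np.exp(-z))

# Forward propagation
Z1 = X @ W1 + b1
A1 = sigmoid(Z1)
Y_hat = A1 @ W2 + b2
loss = np.mean((Y_hat - Y) ** 2)

# Backpropagation: chain rule applied layer by layer
dY_hat = 2 * (Y_hat - Y) / len(X)
dW2 = A1.T @ dY_hat
db2 = dY_hat.sum(axis=0)
dA1 = dY_hat @ W2.T
dZ1 = dA1 * A1 * (1 - A1)              # sigmoid derivative
dW1 = X.T @ dZ1
db1 = dZ1.sum(axis=0)

# Gradient descent update
lr = 0.1
W1 -= lr * dW1; b1 -= lr * db1
W2 -= lr * dW2; b2 -= lr * db2
print(f"loss after one step target: {loss:.4f}")
```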

Section 11 – Convolutional Neural Networks

In this section, we will introduce you to Convolutional Networks, which are used for images. We will first show you their relationship to Feed Forward Networks, and then introduce the concepts of Convolutional Networks one by one!
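
For a first feel of the moving parts, here is a single convolution-plus-pooling stage in PyTorch, along with the output-size formula we'll derive in this section:

```python
import torch
import torch.nn as nn

# 1 input channel, 16 filters of size 3x3, stride 1, padding 1.
conv = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3,
                 stride=1, padding=1)
pool = nn.MaxPool2d(kernel_size=2)

x = torch.randn(8, 1, 28, 28)    # batch of 8 grayscale 28x28 images
out = pool(torch.relu(conv(x)))
print(out.shape)                 # torch.Size([8, 16, 14, 14])

# Output spatial size follows (W - K + 2P) / S + 1:
# (28 - 3 + 2*1) / 1 + 1 = 28, then halved to 14 by the pooling layer.
```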

Section 12 – Practical Convolutional Networks in PyTorch

In this section, we will apply Convolutional Networks to classify handwritten digits. This is our first CNN application.

Section 13 – Deeper into CNNs: Improving and Plotting

In this section, we will improve the CNN we built in the previous section, as well as show you how to plot the training and testing results! Moreover, we will show you how to classify your own handwritten images with the network!

Section 14 – CNN Architectures

In this section, we will introduce the CNN architectures that are widely used across deep learning applications: AlexNet, VGGNet, Inception networks, Residual Networks, and Densely Connected Networks. We will also discuss some object detection architectures.
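
For reference, torchvision ships implementations of these architectures (assuming torchvision is installed; the exact API can vary between versions), so you can instantiate them in one line each:

```python
from torchvision import models

alexnet = models.alexnet()        # AlexNet
vgg = models.vgg16()              # VGGNet (16-layer variant)
resnet = models.resnet50()        # Residual Network
densenet = models.densenet121()   # Densely Connected Network
```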

Section 15 – Residual Networks

In this section, we will dive deep into the details and theory of Residual Networks, and then we’ll build a Residual Network in PyTorch from scratch!
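
The core idea fits in a few lines: a residual block learns a residual F(x) and outputs F(x) + x via a skip connection. A minimal sketch (channel count illustrative, downsampling variant omitted):

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)   # the skip connection: F(x) + x
```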

Section 16 – Transfer Learning in PyTorch – Image Classification

In this section, we will apply transfer learning to a Residual Network to classify ants and bees. We will also show you how to use your own dataset and apply image augmentation. After completing this section, you will be able to classify any images you want!
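
The typical recipe looks like this sketch, which uses a pretrained ResNet-18 from torchvision (the `weights=` argument assumes a recent torchvision version):

```python
import torch.nn as nn
from torchvision import models

# Load a pretrained ResNet and freeze its backbone.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False          # keep pretrained weights fixed

# Replace the final fully connected layer with a 2-class head
# (e.g. ants vs. bees); only this new layer will be trained.
model.fc = nn.Linear(model.fc.in_features, 2)
```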

Section 17 – Convolutional Networks Visualization

In this section, we will visualize what neural networks output and what they are really learning. We will observe the feature maps at every layer of the network!
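
One common way to capture intermediate feature maps is a forward hook; a sketch using a torchvision ResNet as a stand-in model:

```python
import torch
from torchvision import models

model = models.resnet18()
feature_maps = {}

def save_output(name):
    # A forward hook receives (module, inputs, output) after each forward pass.
    def hook(module, inputs, output):
        feature_maps[name] = output.detach()
    return hook

model.layer1.register_forward_hook(save_output('layer1'))

x = torch.randn(1, 3, 224, 224)
model(x)
print(feature_maps['layer1'].shape)   # torch.Size([1, 64, 56, 56])
```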

Section 18 – YOLO Object Detection (Theory)

In this section, we will learn about one of the most famous object detection frameworks: YOLO! This section covers the theory of YOLO in depth.

Section 19 – Autoencoders and Variational Autoencoders

In this section, we will cover Autoencoders and Denoising Autoencoders. We will then see the problem they face and learn how to mitigate it with Variational Autoencoders.
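
The basic shape of an autoencoder is just an encoder that compresses and a decoder that reconstructs; a minimal sketch for 28x28 images, with illustrative sizes:

```python
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),             # the bottleneck code
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 28 * 28), nn.Sigmoid(),  # pixel values in [0, 1]
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
```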

Section 20 – Recurrent Neural Networks

In this section, we will introduce you to Recurrent Neural Networks and all their concepts. We will then discuss backpropagation through time and the vanishing gradient problem, and finally cover Long Short-Term Memory (LSTM) networks, which solve the problems vanilla RNNs suffer from.
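
In PyTorch, running an LSTM over a batch of sequences is a single module call; a minimal sketch with illustrative dimensions:

```python
import torch
import torch.nn as nn

# With batch_first=True, the input is (batch, seq_len, features).
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, batch_first=True)

x = torch.randn(4, 15, 10)   # 4 sequences of length 15
output, (h_n, c_n) = lstm(x)
print(output.shape)          # torch.Size([4, 15, 20]) - hidden state at every step
print(h_n.shape)             # torch.Size([2, 4, 20]) - final hidden state per layer
```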

Section 21 – Word Embeddings

In this section, we will discuss how words are represented as features. We will then show you some Word Embedding models. We will also show you how to implement word embedding in PyTorch!
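
In PyTorch, an embedding layer is a trainable lookup table mapping word indices to dense vectors; a minimal sketch with an illustrative vocabulary size and dimension:

```python
import torch
import torch.nn as nn

embedding = nn.Embedding(num_embeddings=10000, embedding_dim=300)

word_ids = torch.tensor([[12, 47, 3]])   # a sentence as token indices
vectors = embedding(word_ids)
print(vectors.shape)                     # torch.Size([1, 3, 300])
```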

Section 22 – Practical Recurrent Networks in PyTorch

In this section, we will apply Recurrent Neural Networks using LSTMs in PyTorch to generate text similar to the story of Alice in Wonderland! You can just replace the story with any other text you want, and the RNN will be able to generate text similar to it!

Section 23 – Sequence Modelling

In this section, we will learn about Sequence-to-Sequence Modelling. We will see how Seq2Seq models work and where they are applied. We’ll also talk about Attention mechanisms and see how they work.
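
The core computation behind attention is short: score the encoder states against the current decoder state, softmax the scores into weights, and take a weighted sum. A sketch of the dot-product variant (one of several scoring functions), with illustrative shapes:

```python
import torch
import torch.nn.functional as F

decoder_state = torch.randn(1, 256)        # (batch, hidden)
encoder_states = torch.randn(1, 10, 256)   # (batch, src_len, hidden)

# Dot-product score of every encoder state against the decoder state.
scores = torch.bmm(encoder_states, decoder_state.unsqueeze(2))  # (1, 10, 1)
weights = F.softmax(scores, dim=1)          # attention weights over src_len
context = (weights * encoder_states).sum(dim=1)                 # (1, 256)
```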

Section 24 – Practical Sequence Modelling in PyTorch – Build a Chatbot

In this section, we will apply what we learned about sequence modeling and build a Chatbot with Attention Mechanism.

Section 25 – Saving and Loading Models

In this section, we will show you how to save and load models in PyTorch, so you can use these models either for later testing, or for resuming training!
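
The recommended pattern saves the `state_dict` rather than the whole model; a sketch using a stand-in model and placeholder file names:

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                 # stand-in model for illustration
optimizer = torch.optim.Adam(model.parameters())

# Save only the weights, then load them into a fresh model of the
# same architecture.
torch.save(model.state_dict(), 'model.pt')

restored = nn.Linear(10, 2)
restored.load_state_dict(torch.load('model.pt'))
restored.eval()                          # set eval mode for inference

# To resume training later, checkpoint the optimizer state as well.
torch.save({'model': model.state_dict(),
            'optimizer': optimizer.state_dict()}, 'checkpoint.pt')
ckpt = torch.load('checkpoint.pt')
restored.load_state_dict(ckpt['model'])
optimizer.load_state_dict(ckpt['optimizer'])
```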

Section 26 – Transformers

In this section, we will cover the Transformer, which is the current state-of-the-art model for NLP and language modeling tasks. We will go through each component of a Transformer.
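
At the heart of every component sits scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V; a single-head sketch with illustrative shapes:

```python
import math
import torch
import torch.nn.functional as F

def attention(Q, K, V, mask=None):
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)  # (batch, seq, seq)
    if mask is not None:
        # Masked positions get -inf so softmax assigns them zero weight.
        scores = scores.masked_fill(mask == 0, float('-inf'))
    return F.softmax(scores, dim=-1) @ V

Q = K = V = torch.randn(2, 8, 64)   # (batch, seq_len, d_k): self-attention
out = attention(Q, K, V)
print(out.shape)                    # torch.Size([2, 8, 64])
```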

Section 27 – Build a Chatbot with Transformers

In this section, we will apply everything we learned in the previous section to build a Chatbot using Transformers.

What you’ll learn

- Understand How Neural Networks Work (Theory and Applications)
- Understand How Convolutional Networks Work (Theory and Applications)
- Understand How Recurrent Networks and LSTMs work (Theory and Applications)
- Learn how to use PyTorch in depth
- Understand how the Backpropagation algorithm works
- Understand Loss Functions in Neural Networks
- Understand Weight Initialization and Regularization Techniques
- Code up a Neural Network from Scratch using NumPy
- Apply Transfer Learning to CNNs
- Visualize Convolutional Networks
- Learn the CNN Architectures that are widely used nowadays
- Understand Residual Networks in Depth
- Understand YOLO Object Detection in Depth
- Visualize the Learning Process of Neural Networks
- Learn how to Save and Load trained models
- Learn Sequence Modeling with Attention Mechanisms
- Build a Chatbot with Attention
- Understand Transformers
- Build a Chatbot with Transformers
- Understand BERT
- Build an Image Captioning Model

## Table of Contents

**How Neural Networks and Backpropagation Work**

1 What Can Deep Learning Do

2 The Rise of Deep Learning

3 The Essence of Neural Networks

4 The Perceptron

5 Gradient Descent

6 The Forward Propagation

7 Backpropagation Part 1

8 Backpropagation Part 2

9 Before Proceeding with the Backpropagation

10 BEFORE STARTING…PLEASE READ THIS

**Loss Functions**

11 Mean Squared Error (MSE)

12 L1 Loss (MAE)

13 Huber Loss

14 Binary Cross Entropy Loss

15 Cross Entropy Loss

16 Softmax Function

17 KL divergence Loss

18 Contrastive Loss

19 Hinge Loss

20 Triplet Ranking Loss

21 Practical Loss Functions Note

22 Softmax with Temperature – Controlling your Distribution

**Activation Functions**

23 Why we need activation functions

24 Sigmoid Activation

25 Tanh Activation

26 ReLU and PReLU

27 Exponentially Linear Units (ELU)

28 Gated Linear Units (GLU)

29 Swish Activation

30 Mish Activation

**Regularization and Normalization**

31 Overfitting

32 L1 and L2 Regularization

33 Dropout

34 DropConnect

35 Normalization

36 Batch Normalization

37 Layer Normalization

38 Group Normalization

39 DropBlock in CNNs

40 Note on Weight Decay

**Optimization**

41 Batch Gradient Descent

42 Stochastic Gradient Descent

43 Mini-Batch Gradient Descent

44 Exponentially Weighted Average Intuition

45 Exponentially Weighted Average Implementation

46 Bias Correction in Exponentially Weighted Averages

47 Momentum

48 RMSProp

49 Adam Optimization

50 SWATS – Switching from Adam to SGD

51 Weight Decay

52 Decoupling Weight Decay

53 AMSGrad

**Hyperparameter Tuning and Learning Rate Scheduling**

54 Introduction to Hyperparameter Tuning and Learning Rate Recap

55 Step Learning Rate Decay

56 Cyclic Learning Rate

57 Cosine Annealing with Warm Restarts

58 Batch Size vs Learning Rate

**Weight Initialization**

59 Normal Distribution

60 What happens when all weights are initialized to the same value

61 Xavier Initialization

62 He Norm Initialization

63 Practical Weight Initialization Note

**Introduction to PyTorch**

64 CODE FOR THIS COURSE

65 Computation Graphs and Deep Learning Frameworks

66 Installing PyTorch and an Introduction

67 How PyTorch Works

68 Torch Tensors – Part 1

69 Torch Tensors – Part 2

70 Numpy Bridge, Tensor Concatenation and Adding Dimensions

71 Automatic Differentiation

72 Loss Functions in PyTorch

73 Weight Initialization in PyTorch

**Practical Neural Networks in PyTorch – Application 1: Diabetes**

74 Part 1 Data Preprocessing

75 Part 2 Data Normalization

76 Part 3 Creating and Loading the Dataset

77 Part 4 Building the Network

78 Part 5 Training the Network

79 Download the Dataset

**Visualize the Learning Process**

80 Visualize Learning Part 1

81 Visualize Learning Part 2

82 Visualize Learning Part 3

83 Visualize Learning Part 4

84 Visualize Learning Part 5

85 Visualize Learning Part 6

86 Neural Networks Playground

**Implementing a Neural Network from Scratch with NumPy**

87 The Dataset and Hyperparameters

88 Understanding the Implementation

89 Forward Propagation

90 Loss Function

91 Prediction

92 Backpropagation Equations

93 Backpropagation

94 Initializing the Network

95 Training the Model

96 Notebook for the following Lecture

**Practical Neural Networks in PyTorch – Application 2: Handwritten Digits**

97 Code Details

98 Importing and Defining Parameters

99 Defining the Network Class

100 Creating the network class and the network functions

101 Training the Network

102 Testing the Network

103 The MNIST Dataset

**Convolutional Neural Networks**

104 Prerequisite Filters

105 Introduction to Convolutional Networks and the need for them

106 Filters and Features

107 Convolution over Volume Animation

108 More on Convolutions

109 Quiz Solution Discussion

110 A Tool for Convolution Visualization

111 Activation, Pooling and FC

112 CNN Visualization

113 Important formulas

114 CNN Characteristics

115 Regularization and Batch Normalization in CNNs

116 DropBlock Dropout in CNNs

117 Softmax with Temperature

118 Convolution over Volume Animation Resource

**Practical Convolutional Networks in PyTorch – Image Classification**

119 Loading and Normalizing the Dataset

120 Visualizing and Loading the Dataset

121 Building the CNN

122 Defining the Model

123 Understanding the Propagation

124 Training the CNN

125 Testing the CNN

126 Plotting and Putting into Action

127 Predicting an image

128 Classifying your own Handwritten images

**CNN Architectures**

129 CNN Architectures Part 1

130 Residual Networks Part 1

131 Residual Networks Part 2

132 CNN Architectures Part 2

133 Densely Connected Networks

134 Squeeze-Excite Networks

135 Separable Convolutions

136 Transfer Learning

137 Note on Residual Networks Implementation

**Practical Residual Networks in PyTorch**

138 Practical ResNet Part 1

139 Practical ResNet Part 2

140 Practical ResNet Part 3

141 Practical ResNet Part 4

**Transposed Convolutions**

142 Introduction to Transposed Convolutions

143 Convolution Operation as Matrix Multiplication

144 Transposed Convolutions

**Transfer Learning in PyTorch – Image Classification**

145 Data Augmentation

146 Loading the Dataset

147 Modifying the Network

148 Understanding the data

149 Finetuning the Network

150 Testing and Visualizing the results

**Convolutional Networks Visualization**

151 Data and the Model

152 Processing the Model

153 Visualizing the Feature Maps

**YOLO Object Detection (Theory)**

154 YOLO Theory Part 1

155 YOLO Theory Part 2

156 YOLO Theory Part 3

157 YOLO Theory Part 4

158 YOLO Theory Part 5

159 YOLO Theory Part 6

160 YOLO Theory Part 7

161 YOLO Theory Part 8

162 YOLO Theory Part 9

163 YOLO Theory Part 10

164 YOLO Theory Part 11

165 YOLO Theory Part 12

166 YOLO Code Note

**Autoencoders and Variational Autoencoders**

167 Autoencoders

168 Denoising Autoencoders

169 The Problem in Autoencoders

170 Variational Autoencoders

171 Probability Distributions Recap

172 Loss Function Derivation for VAE

173 Deep Fake

**Practical Variational Autoencoders in PyTorch**

174 Practical VAE Part 1

175 Practical VAE Part 2

176 Practical VAE Part 3

**Neural Style Transfer**

177 NST Theory Part 1

178 NST Theory Part 2

179 NST Theory Part 3

**Practical Neural Style Transfer in PyTorch**

180 NST Practical Part 1

181 NST Practical Part 2

182 NST Practical Part 3

183 NST Practical Part 4

184 Fast Neural Style Transfer

**Recurrent Neural Networks**

185 Why do we need RNNs

186 Vanilla RNNs

187 Quiz Solution Discussion

188 Backpropagation Through Time

189 Stacked RNNs

190 Vanishing and Exploding Gradient Problem

191 LSTMs

192 Bidirectional RNNs

193 GRUs

194 CNN-LSTM

**Word Embeddings**

195 What are Word Embeddings

196 Visualizing Word Embeddings

197 Measuring Word Embeddings

198 Word Embeddings Models

199 Word Embeddings in PyTorch

**Practical Recurrent Networks in PyTorch**

200 Creating the Dictionary

201 Processing the Text

202 Defining and Visualizing the Parameters

203 Creating the Network

204 Training the Network

205 Generating Text

206 Download the Dataset

**Saving and Loading Models**

207 Saving and Loading Part 1

208 Saving and Loading Part 2

209 Saving and Loading Part 3

**Sequence Modelling**

210 Sequence Modeling

211 Image Captioning

212 Attention Mechanisms

213 How Attention Mechanisms Work

**Practical Sequence Modelling in PyTorch – Chatbot Application**

214 Introduction

215 Understanding the Encoder

216 Defining the Encoder

217 Understanding Pack Padded Sequence

218 Designing the Attention Model

219 Designing the Decoder Part 1

220 Designing the Decoder Part 2

221 Teacher Forcing

222 Download the Dataset

**Practical Sequence Modelling in PyTorch – Image Captioning**

223 Implementation Details

224 Utility Functions

225 Accuracy Calculation

226 Constructing the Dataset Part 1

227 Constructing the Dataset Part 2

228 Creating the Encoder

229 Creating the Decoder Part 1

230 Creating the Decoder Part 2

231 Creating the Decoder Part 3

232 Train Function

233 Defining Hyperparameters

234 Evaluation Function

235 Training

236 Results

**Transformers**

237 Introduction to Transformers

238 Input Embeddings

239 Positional Encoding

240 MultiHead Attention Part 1

241 MultiHead Attention Part 2

242 Concat and Linear

243 Residual Learning

244 Layer Normalization

245 Feed Forward

246 Masked MultiHead Attention

247 MultiHead Attention in Decoder

248 Cross Entropy Loss

249 KL Divergence Loss

250 Label Smoothing

251 Dropout

252 Learning Rate Warmup

253 SANITY CHECK ON PREVIOUS SECTIONS

**Build a Chatbot with Transformers**

254 Dataset Preprocessing Part 1

255 Dataset Preprocessing Part 2

256 Dataset Preprocessing Part 3

257 Dataset Preprocessing Part 4

258 Dataset Preprocessing Part 5

259 Data Loading and Masking

260 Embeddings

261 MultiHead Attention Implementation Part 1

262 MultiHead Attention Implementation Part 2

263 MultiHead Attention Implementation Part 3

264 Feed Forward Implementation

265 Encoder Layer

266 Decoder Layer

267 Transformer

268 AdamWarmup

269 Loss with Label Smoothing

270 Defining the Model

271 Training Function

272 Evaluation Function

273 Main Function and User Evaluation

274 Action

275 CODE

276 SANITY CHECK ON PREVIOUS SECTIONS

**Universal Transformers**

277 Universal Transformers

278 Practical Universal Transformers Modifying the Transformers code

279 Transformers for other tasks

280 SANITY CHECK ON PREVIOUS SECTIONS

**Google Colab and Gradient Accumulation**

281 Running your models on Google Colab

282 Gradient Accumulation

**BERT**

283 What is BERT and its structure

284 Masked Language Modelling

285 Next Sentence Prediction

286 Fine-tuning BERT

287 Exploring Transformers

**Vision Transformers**

288 Vision Transformer Part 1

289 Vision Transformer Part 2

290 Vision Transformer Part 3

291 SANITY CHECK ON PREVIOUS SECTIONS

**GPT**

292 GPT Part 1

293 GPT Part 2

294 Zero-Shot Predictions with GPT

295 Byte-Pair Encoding

296 Technical Details of GPT

297 Playing with HuggingFace models

298 Implementation
