Deep Reinforcement Learning 2.0

English | MP4 | AVC 1920×1080 | AAC 44KHz 2ch | 9h 38m | 3.90 GB

The smartest combination of Deep Q-Learning, Policy Gradient, Actor Critic, and DDPG

Welcome to Deep Reinforcement Learning 2.0!

In this course, we will learn and implement a new, highly capable AI model called the Twin-Delayed DDPG (TD3), which combines state-of-the-art techniques in Artificial Intelligence, including continuous Double Deep Q-Learning, Policy Gradient, and Actor-Critic. The model is strong enough that, for the first time in our courses, we are able to solve the most challenging virtual AI applications (training an ant/spider and a half humanoid to walk and run across a field).
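As a rough illustration of how these pieces fit together, the sketch below shows two ingredients usually associated with Twin-Delayed DDPG: adding clipped noise to the target actor's action, and building the learning target from the smaller of two critic estimates. It is a minimal NumPy sketch, not the course's implementation. The function names and the default values (discount 0.99, noise_std 0.2, noise_clip 0.5) are assumptions chosen for illustration.

```python
import numpy as np


def smoothed_target_action(target_action, max_action, rng,
                           noise_std=0.2, noise_clip=0.5):
    """Add clipped Gaussian noise to the target actor's action.

    All names and default values here are illustrative assumptions.
    """
    target_action = np.asarray(target_action, dtype=float)
    noise = rng.normal(0.0, noise_std, size=target_action.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    # Keep the noisy action inside the valid action range.
    return np.clip(target_action + noise, -max_action, max_action)


def td3_target(reward, done, next_q1, next_q2, discount=0.99):
    """Build the critic target from the smaller of two Q estimates."""
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    # Taking the minimum of the twin critics limits overestimation.
    min_q = np.minimum(next_q1, next_q2)
    # No bootstrapping from the next state when the episode has ended.
    return reward + discount * (1.0 - done) * min_q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    action = smoothed_target_action(np.array([0.9, -0.9]), 1.0, rng)
    target = td3_target([1.0, 2.0], [0.0, 1.0],
                        np.array([10.0, 5.0]), np.array([8.0, 7.0]))
    print(action, target)
```

The "delayed" part of the name refers to updating the actor less often than the critics, for example once every two critic updates. That schedule is left out here to keep the sketch short.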

To approach this model the right way, we structured the course in three parts:

Part 1: Fundamentals
In this part, we will study the fundamentals of Artificial Intelligence that will allow you to understand and master the AI of this course. These include Q-Learning, Deep Q-Learning, Policy Gradient, Actor-Critic, and more.

Part 2: The Twin-Delayed DDPG Theory
We will study in depth the whole theory behind the model. You will see the whole construction and training process of the AI through a series of clear visualization slides. Not only will you learn the theory in detail, but you will also build a strong intuition of how the AI learns and works. The fundamentals in Part 1, combined with the very detailed theory of Part 2, will make this highly advanced model accessible to you, and you will eventually be one of the very few people who can master this model.

Part 3: The Twin-Delayed DDPG Implementation
We will implement the model from scratch, step by step, through interactive sessions, a new feature of this course that will have you practice on many coding exercises while we implement the model. By doing them, you will follow the course actively rather than passively, which will help you improve your skills effectively. And last but not least, we will do the whole implementation on Colaboratory, or Google Colab, a free hosted notebook platform that allows you to code and train AIs without installing any packages on your machine. In other words, you can be confident that when you press the execute button, the AI will start to train, and you will get the videos of the spider and humanoid running at the end.

What you’ll learn

  • Q-Learning
  • Deep Q-Learning
  • Policy Gradient
  • Actor Critic
  • Deep Deterministic Policy Gradient (DDPG)
  • Twin-Delayed DDPG (TD3)
  • The Foundation Techniques of Deep Reinforcement Learning
  • How to implement a state-of-the-art AI model that excels at the most challenging virtual applications

Table of Contents

Part 1 – Fundamentals
1 Welcome
2 Q-Learning
3 Deep Q-Learning
4 Policy Gradient
5 Actor-Critic
6 Taxonomy of AI models
7 BONUS 5 Advantages of DRL
8 BONUS Learning Path
9 BONUS RL Algorithms Map
10 Get the materials
11 Some resources before we start

Part 2 – Twin Delayed DDPG Theory
12 Introduction and Initialization
13 The Q-Learning part
14 The Policy Learning part
15 The whole training process

Part 3 – Twin Delayed DDPG Implementation
16 Beginning
17 Implementation – Step 1
18 Implementation – Step 2
19 Implementation – Step 3
20 Implementation – Step 4
21 Implementation – Step 5
22 Implementation – Step 6
23 Implementation – Step 7
24 Implementation – Step 8
25 Implementation – Step 9
26 Implementation – Step 10
27 Implementation – Step 11
28 Implementation – Step 12
29 Implementation – Step 13
30 Implementation – Step 14
31 Implementation – Step 15
32 Implementation – Step 16
33 Implementation – Step 17
34 Implementation – Step 18
35 Implementation – Step 19
36 Implementation – Step 20
37 The whole code folder of the course with all the implementations

The Final Demo!
38 Demo – Training
39 Demo – Inference

Annex 1 – Artificial Neural Networks
40 Plan of Attack
41 The Neuron
42 The Activation Function
43 How do Neural Networks Work
44 How do Neural Networks Learn
45 Gradient Descent
46 Stochastic Gradient Descent
47 Backpropagation

Annex 2 – Q-Learning
48 Plan of Attack
49 What is Reinforcement Learning
50 The Bellman Equation
51 The Plan
52 Markov Decision Process
53 Policy vs Plan
54 Living Penalty
55 Q-Learning Intuition
56 Temporal Difference
57 Q-Learning Visualization

Annex 3 – Deep Q-Learning
58 Plan of Attack
59 Deep Q-Learning Intuition – Step 1
60 Deep Q-Learning Intuition – Step 2
61 Experience Replay
62 Action Selection Policies

Bonus Lectures
63 YOUR SPECIAL BONUS
