BERT, GPT, Deep Learning, Machine Learning, & NLP with Hugging Face, Attention in Python, TensorFlow, PyTorch, & Keras
Ever since Transformers arrived on the scene, deep learning hasn’t been the same.
- Machine learning is able to generate text essentially indistinguishable from that created by humans
- We’ve reached new state-of-the-art performance in many NLP tasks, such as machine translation, question-answering, entailment, named entity recognition, and more
- We’ve created multi-modal (text and image) models that can generate amazing art using only a text prompt
- We’ve solved a longstanding problem in molecular biology known as “protein structure prediction”
In this course, you will learn very practical skills for applying transformers and, if you want, the detailed theory behind how transformers and attention work.
This is different from most other resources, which only cover the former.
The course is split into 3 major parts:
- Using Transformers
- Fine-Tuning Transformers
- Transformers In-Depth
PART 1: Using Transformers
In this section, you will learn how to use transformers that have already been trained for you. Training these models from scratch costs millions of dollars, so it’s not something you want to try by yourself!
We’ll see how these prebuilt models can already be used for a wide array of tasks, including:
- text classification (e.g. spam detection, sentiment analysis, document categorization)
- named entity recognition
- text summarization
- machine translation
- generating (believable) text
- masked language modeling (article spinning)
- zero-shot classification
This is already very practical.
If you need to do sentiment analysis, document categorization, entity recognition, translation, summarization, etc. on documents at your workplace or for your clients – you already have the most powerful state-of-the-art models at your fingertips with very few lines of code.
One of the most amazing applications is “zero-shot classification”, where you will observe that a pretrained model can categorize your documents, even without any training at all.
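As a rough illustration of the "very few lines of code" claim, here is a minimal sketch of zero-shot classification with the Hugging Face `pipeline` API. The helper names (`build_zero_shot_classifier`, `best_label`, `categorize`) are hypothetical and not taken from the course. The sketch assumes the `transformers` package is installed and that the pipeline's default model can be downloaded on first use.

```python
from typing import Sequence, Tuple


def build_zero_shot_classifier():
    """Create a Hugging Face zero-shot classification pipeline.

    Assumption: `transformers` is installed; a default pretrained model
    is downloaded the first time this runs.
    """
    from transformers import pipeline  # imported lazily so the helpers below work without it

    return pipeline("zero-shot-classification")


def best_label(result: dict) -> Tuple[str, float]:
    """Pick the highest-scoring (label, score) pair from a pipeline result.

    The zero-shot pipeline returns a dict with parallel "labels" and
    "scores" lists.
    """
    pairs = zip(result["labels"], result["scores"])
    return max(pairs, key=lambda pair: pair[1])


def categorize(classifier, text: str, candidate_labels: Sequence[str]) -> Tuple[str, float]:
    """Classify `text` into one of `candidate_labels` with no task-specific training."""
    result = classifier(text, candidate_labels=list(candidate_labels))
    return best_label(result)
```

With the library installed, a call such as `categorize(build_zero_shot_classifier(), "The team won the championship last night", ["sports", "politics", "technology"])` returns the most likely label and its score. The labels are supplied at call time, which is why no training on your own documents is needed.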
PART 2: Fine-Tuning Transformers
In this section, you will learn how to improve the performance of transformers on your own custom datasets. By using “transfer learning”, you can leverage the millions of dollars of training that have already gone into making transformers work very well.
You’ll see that you can fine-tune a transformer with relatively little work (and little cost).
We’ll cover how to fine-tune transformers for the most practical tasks in the real world, like text classification (sentiment analysis, spam detection), entity recognition, and machine translation.
PART 3: Transformers In-Depth
In this section, you will learn how transformers really work. The previous sections are nice, but a little too nice. Libraries are OK for people who just want to get the job done, but they don’t work if you want to do anything new or interesting.
Let’s be clear: this is very practical.
How practical, you might ask?
Well, this is where the big bucks are.
Those who have a deep understanding of these models and can do things no one has ever done before are in a position to command higher salaries and prestigious titles. Machine learning is a competitive field, and a deep understanding of how things work can be the edge you need to come out on top.
We’ll also look at how to implement transformers from scratch.
As the great Richard Feynman once said, “what I cannot create, I do not understand”.
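To give a flavor of what "from scratch" can look like, here is a minimal NumPy sketch of single-head scaled dot-product self-attention with an optional attention mask. It is an illustration under stated assumptions, not the course's own implementation: the function names, the toy dimensions, and the random weights are all chosen for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def scaled_dot_product_attention(Q, K, V, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V.

    Q, K: arrays of shape (seq_len, d_k); V: array of shape (seq_len, d_v).
    mask: optional boolean array of shape (seq_len, seq_len);
          True means "this position may be attended to".
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    if mask is not None:
        # Masked positions get a very negative score, so their weight becomes ~0.
        scores = np.where(mask, scores, -1e9)
    weights = softmax(scores, axis=-1)
    return weights @ V, weights


def self_attention(X, W_q, W_k, W_v, mask=None):
    """Self-attention: queries, keys, and values are all projections of X."""
    return scaled_dot_product_attention(X @ W_q, X @ W_k, X @ W_v, mask)


# Toy example: 4 tokens with 8-dimensional embeddings (illustrative sizes).
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

# Causal mask: each token attends only to itself and earlier tokens.
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=bool))
output, weights = self_attention(X, W_q, W_k, W_v, mask=causal_mask)
print(output.shape)
```

Each row of `weights` sums to 1 and says how much one token draws on every other token. A full transformer block adds multiple heads, positional encodings, residual connections, and a feed-forward layer on top of this core operation.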
What you’ll learn
- Apply transformers to real-world tasks with just a few lines of code
- Fine-tune transformers on your own datasets with transfer learning
- Sentiment analysis, spam detection, text classification
- NER (named entity recognition), part-of-speech tagging
- Build your own article spinner for SEO
- Generate believable human-like text
- Neural machine translation and text summarization
- Question-answering (e.g. SQuAD)
- Zero-shot classification
- Understand self-attention and in-depth theory behind transformers
- Implement transformers from scratch
- Use transformers with both TensorFlow and PyTorch
- Understand BERT, GPT, GPT-2, and GPT-3, and where to apply them
- Understand encoder, decoder, and seq2seq architectures
- Master the Hugging Face Python library
Table of Contents
3 Where to get the code and data
4 How to use Github & Extra Coding Tips (Optional)
5 Are You Beginner, Intermediate, or Advanced? All are OK!
6 Beginner’s Corner Section Introduction
7 From RNNs to Attention and Transformers – Intuition
8 Sentiment Analysis
9 Sentiment Analysis in Python
10 Text Generation
11 Text Generation in Python
12 Masked Language Modeling (Article Spinner)
13 Masked Language Modeling (Article Spinner) in Python
14 Named Entity Recognition (NER)
15 Named Entity Recognition (NER) in Python
16 Text Summarization
17 Text Summarization in Python
18 Neural Machine Translation
19 Neural Machine Translation in Python
20 Question Answering
21 Question Answering in Python
22 Zero-Shot Classification
23 Zero-Shot Classification in Python
24 Beginner’s Corner Section Summary
25 Fine-Tuning Section Introduction
26 Text Preprocessing and Tokenization Review
27 Models and Tokenizers
28 Models and Tokenizers in Python
29 Transfer Learning & Fine-Tuning (pt 1)
30 Transfer Learning & Fine-Tuning (pt 2)
31 Transfer Learning & Fine-Tuning (pt 3)
32 Fine-Tuning Sentiment Analysis and the GLUE Benchmark
33 Fine-Tuning Sentiment Analysis in Python
34 Fine-Tuning Transformers with Custom Dataset
35 Hugging Face AutoConfig
36 Fine-Tuning with Multiple Inputs (Textual Entailment)
37 Fine-Tuning Transformers with Multiple Inputs in Python
38 Fine-Tuning Section Summary
Named Entity Recognition (NER) and POS Tagging
39 Token Classification Section Introduction
40 Data & Tokenizer (Code Preparation)
41 Data & Tokenizer (Code)
42 Target Alignment (Code Preparation)
43 Create Tokenized Dataset (Code Preparation)
44 Target Alignment (Code)
45 Data Collator (Code Preparation)
46 Data Collator (Code)
47 Metrics (Code Preparation)
48 Metrics (Code)
49 Model and Trainer (Code Preparation)
50 Model and Trainer (Code)
51 POS Tagging & Custom Datasets (Exercise Prompt)
52 POS Tagging & Custom Datasets (Solution)
53 Token Classification Section Summary
Seq2Seq and Neural Machine Translation
54 Translation Section Introduction
55 Data & Tokenizer (Code Preparation)
56 Data & Tokenizer (Code)
57 Aside: Seq2Seq Basics (Optional)
58 Model Inputs (Code Preparation)
59 Model Inputs (Code)
60 Translation Metrics (BLEU Score & BERT Score) (Code Preparation)
61 Translation Metrics (BLEU Score & BERT Score) (Code)
62 Train & Evaluate (Code Preparation)
63 Train & Evaluate (Code)
64 Translation Section Summary
Transformers and Attention Theory (Advanced)
65 Theory Section Introduction
66 Basic Self-Attention
67 Self-Attention & Scaled Dot-Product Attention
68 Attention Efficiency
69 Attention Mask
70 Multi-Head Attention
71 Transformer Block
72 Positional Encodings
73 Encoder Architecture
74 Data Links
Setting Up Your Environment FAQ
75 Anaconda Environment Setup
76 How to install Numpy, Scipy, Matplotlib, Pandas, IPython, Theano, and TensorFlow
Extra Help With Python Coding for Beginners FAQ
77 How to Code by Yourself (part 1)
78 How to Code by Yourself (part 2)
79 Proof that using Jupyter Notebook is the same as not using it
Effective Learning Strategies for Machine Learning FAQ
80 How to Succeed in this Course (Long Version)
81 Is this for Beginners or Experts? Academic or Practical? Fast or Slow-Paced?
82 Machine Learning and AI Prerequisite Roadmap (pt 1)
83 Machine Learning and AI Prerequisite Roadmap (pt 2)
Appendix / FAQ Finale
84 What is the Appendix?