Natural Language Processing with Transformers in Python

English | MP4 | AVC 1280×720 | AAC 44 kHz 2ch | 11.5 Hours | 3.28 GB

Learn next-generation NLP with transformers using PyTorch, TensorFlow, and HuggingFace!

Transformer models are the de facto standard in modern NLP. They have proven themselves the most expressive, powerful models for language by a large margin, topping the major language benchmarks time and time again.

In this course, you will learn everything you need to start building high-performance NLP applications with transformer models such as Google AI's BERT and Facebook AI's DPR.

We cover several key NLP frameworks, including:

  • HuggingFace’s Transformers
  • TensorFlow 2
  • PyTorch
  • spaCy
  • NLTK
  • Flair
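As a minimal sketch of the first of these, HuggingFace's Transformers: loading a pretrained model and encoding a sentence takes only a few lines (the `bert-base-uncased` checkpoint is an illustrative choice, and PyTorch is assumed as the backend):

```python
# Load a pretrained BERT tokenizer and model with HuggingFace Transformers,
# then encode a sentence into contextual hidden states.
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# Tokenize into PyTorch tensors and run a forward pass.
inputs = tokenizer("Transformers are the de facto standard in NLP.",
                   return_tensors="pt")
outputs = model(**inputs)

# BERT-base produces one 768-dimensional vector per token.
print(outputs.last_hidden_state.shape)
```

The same `AutoTokenizer`/`AutoModel` pattern works across checkpoints, which is why it comes up throughout the course.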

You will also learn how to apply transformers to some of the most popular NLP use cases:

  • Language classification/sentiment analysis
  • Named entity recognition (NER)
  • Question answering (Q&A)
  • Similarity/comparative learning
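As a taste of the first use case, sentiment analysis can be run in a few lines with the Transformers `pipeline` API (a sketch assuming PyTorch is installed; with no model specified, the library falls back to its default English sentiment checkpoint, downloaded on first use):

```python
# Sentiment analysis via the high-level pipeline API.
# With no model argument, Transformers uses its default English
# sentiment checkpoint (a DistilBERT fine-tuned on SST-2).
from transformers import pipeline

classifier = pipeline("sentiment-analysis")
result = classifier("This course made transformers finally click for me!")
print(result)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

Later sections go beyond this one-liner, building sentiment models from the tokenizer and model classes directly.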

For each of these use cases we work through a variety of examples that show what transformers can do, how to use them, and why they matter. Alongside these sections we also build two full-size NLP projects: one for sentiment analysis of financial Reddit data, and another for a fully fledged open-domain question-answering application.

All of this is supported by several additional sections covering how to better design, implement, and measure the performance of our models, including:

  • History of NLP and where transformers come from
  • Common preprocessing techniques for NLP
  • The theory behind transformers
  • How to fine-tune transformers

We cover all this and more. I look forward to seeing you in the course!

What you’ll learn

  • How to use transformer models for NLP
  • Modern natural language processing technologies
  • An overview of recent developments in NLP
  • Python
  • Machine Learning
  • Natural Language Processing
  • TensorFlow
  • PyTorch
  • Transformers
  • Sentiment Analysis
  • Question answering
  • Named Entity Recognition

Table of Contents

1 Introduction
2 Course Overview
3 Environment Setup
4 CUDA Setup

NLP and Transformers
5 The Three Eras of AI
6 Pros and Cons of Neural AI
7 Word Vectors
8 Recurrent Neural Networks
9 Long Short-Term Memory
10 Encoder-Decoder Attention
11 Self-Attention
12 Multi-head Attention
13 Positional Encoding
14 Transformer Heads

Preprocessing for NLP
15 Stopwords
16 Tokens Introduction
17 Model-Specific Special Tokens
18 Stemming
19 Lemmatization
20 Unicode Normalization – Canonical and Compatibility Equivalence
21 Unicode Normalization – Composition and Decomposition
22 Unicode Normalization – NFD and NFC
23 Unicode Normalization – NFKD and NFKC

Attention
24 Attention Introduction
25 Alignment With Dot-Product
26 Dot-Product Attention
27 Self Attention
28 Bidirectional Attention
29 Multi-head and Scaled Dot-Product Attention

Language Classification
30 Introduction to Sentiment Analysis
31 Prebuilt Flair Models
32 Introduction to Sentiment Models With Transformers
33 Tokenization And Special Tokens For BERT
34 Making Predictions

[Project] Sentiment Model With TensorFlow and Transformers
35 Project Overview
36 Getting the Data (Kaggle API)
37 Preprocessing
38 Building a Dataset
39 Dataset Shuffle, Batch, Split, and Save
40 Build and Save
41 Loading and Prediction

Long Text Classification With BERT
42 Classification of Long Text Using Windows
43 Window Method in PyTorch

Named Entity Recognition (NER)
44 Introduction to spaCy
45 Extracting Entities
46 NER Walkthrough
47 Authenticating With The Reddit API
48 Pulling Data With The Reddit API
49 Extracting ORGs From Reddit Data
50 Getting Entity Frequency
51 Entity Blacklist
52 NER With Sentiment
53 NER With RoBERTa

Question Answering
54 Open Domain and Reading Comprehension
55 Retrievers, Readers, and Generators
56 Intro to SQuAD 2.0
57 Processing SQuAD Training Data
58 (Optional) Processing SQuAD Training Data with Match-Case
59 Processing SQuAD Dev Data
60 Our First Q&A Model

Metrics For Language
61 Q&A Performance With Exact Match (EM)
62 ROUGE in Python
63 Applying ROUGE to Q&A
64 Recall, Precision and F1
65 Longest Common Subsequence (LCS)
66 Q&A Performance With ROUGE

Retriever-Reader QA With Haystack
67 Intro to Retriever-Reader and Haystack
68 What is Elasticsearch
69 Elasticsearch Setup (Windows)
70 Elasticsearch Setup (Linux)
71 Elasticsearch in Haystack
72 Sparse Retrievers
73 Cleaning the Index
74 Implementing a BM25 Retriever
75 What is FAISS
76 FAISS in Haystack
77 What is DPR
78 The DPR Architecture
79 Retriever-Reader Stack

[Project] Open-Domain QA
80 ODQA Stack Structure
81 Creating the Database
82 Building the Haystack Pipeline

Similarity
83 Introduction to Similarity
84 Extracting The Last Hidden State Tensor
85 Sentence Vectors With Mean Pooling
86 Using Cosine Similarity
87 Similarity With Sentence-Transformers

Fine-Tuning Transformer Models
88 Visual Guide to BERT Pretraining
89 Introduction to BERT For Pretraining Code
90 BERT Pretraining – Masked-Language Modeling (MLM)
91 BERT Pretraining – Next Sentence Prediction (NSP)
92 The Logic of MLM
93 Fine-tuning with MLM – Data Preparation
94 Fine-tuning with MLM – Training
95 Fine-tuning with MLM – Training with Trainer
96 The Logic of NSP
97 Fine-tuning with NSP – Data Preparation
98 Fine-tuning with NSP – DataLoader
99 Setup the NSP Fine-tuning Training Loop
100 The Logic of MLM and NSP
101 Fine-tuning with MLM and NSP – Data Preparation
102 Setup DataLoader and Model Fine-tuning For MLM and NSP