Advanced Machine Learning with Spark 2.x

English | MP4 | AVC 1920×1080 | AAC 48KHz 2ch | 2h 24m | 473 MB

Get in-depth knowledge of Machine Learning libraries, analytics, and prediction with Apache Spark

This course aims to provide a practical understanding of advanced Machine Learning algorithms in Apache Spark for making predictions and recommendations and for deriving insights from large distributed datasets. It starts with an introduction to the key concepts and data types that are fundamental to understanding distributed data processing and Machine Learning with Spark.

From there, we provide practical recipes that demonstrate some of the most popular algorithms in Spark, leading to the creation of sophisticated Machine Learning pipelines and applications. The final sections are dedicated to more advanced use cases for Machine Learning: streaming, Natural Language Processing, and Deep Learning. In each section, we briefly establish the theoretical basis of the topic under discussion and then cement our understanding with practical use cases.

The video course starts with an introduction to Machine Learning libraries, data types, and key concepts that apply to Machine Learning pipelines with Apache Spark. Then, via recipes including practical Kaggle examples, we examine the different components of Machine Learning applications and the considerations involved in building, evaluating, and deploying various Machine Learning models. With the understanding of Machine Learning models gained in the previous sections, we move on to more sophisticated use cases such as streaming with the Structured Streaming library in Spark 2.x. Other special use cases we cover include Natural Language Processing and Deep Learning using external libraries with Spark.

What You Will Learn

  • Get introduced to Machine Learning libraries and data types in Spark: MLlib, ML, vectors, matrices, labeled points, rating data types, and more.
  • Understand the key components of Machine Learning applications.
  • Learn to evaluate, fine-tune, save and deploy models along with pipelines.
  • Deploy Machine Learning models in a typical streaming application.
  • Understand Natural Language Processing in Spark.
  • Understand Deep Learning workflows in Spark.

Table of Contents

Introduction to Key Concepts and Data Types
1 The Course Overview
2 Spark Data Structures — RDD, DataFrames, and Datasets
3 Dense and Sparse Vectors
4 Labeled Points, Matrix, and Other Data Types
5 Key Concepts, Machine Learning Pipelines, and Operations

Machine Learning at Scale
6 Feature Engineering
7 Supervised Learning – Classification, Regression
8 Unsupervised Learning
9 Recommendation Engines

ML Pipelines, Evaluation, Tuning, and Deployment
10 Deep Dive into Regression Models
11 Deep Dive into Decision Tree Models
12 Evaluating and Tuning Our Model
13 Saving and Deploying Our Model

Spark Machine Learning and Streaming
14 Overview of Spark Streaming
15 Your Own Streaming Application with Kafka
16 Your First Streaming Application
17 Analyzing Sensors Data in a Streaming Way

Advanced Topics: Natural Language Processing
18 Natural Language Processing Overview
19 Feature Generation from Text — CountVectorizer, TF-IDF, and LDA
20 Feature Generation from Text — Word Embeddings
21 NLP Document Classification Application

Deep Learning with Spark
22 The Spark Versus Deep Learning Use Case
23 Spark for Parallelizing Deep Learning Evaluation
24 Deep Learning as a Feature Generator for Existing Spark ML Algorithms
25 Spark_Deep Learning Made Easy