Learn Big Data: The Hadoop Ecosystem Masterclass

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 6 Hours | 1.55 GB

Master the Hadoop ecosystem using HDFS, MapReduce, Yarn, Pig, Hive, Kafka, HBase, Spark, Knox, Ranger, Ambari, Zookeeper

In this course you will learn Big Data using the Hadoop Ecosystem. Why Hadoop? It is one of the most sought-after skills in the IT industry. The average salary in the US is $112,000 per year, rising to an average of $160,000 in San Francisco (source: Indeed).

The course is aimed at Software Engineers, Database Administrators, and System Administrators who want to learn about Big Data. Other IT professionals can also take this course, but they might have to do some extra research to understand some of the concepts.

You will learn how to use the most popular software in the Big Data industry at the moment, covering batch processing as well as realtime processing. This course will give you enough background to talk about real problems and solutions with experts in the industry. Updating your LinkedIn profile with these technologies will attract recruiters and help you get interviews at the most prestigious companies in the world.

The course is very practical, with more than 6 hours of lectures. You will want to try out everything yourself, which adds multiple hours of learning. If you get stuck with the technology while trying, support is available: I will answer your messages on the message boards, and we have a Facebook group where you can post questions.

What you’ll learn

  • Process Big Data using batch processing
  • Process Big Data using realtime processing
  • Be familiar with the technologies in the Hadoop Stack
  • Be able to install and configure the Hortonworks Data Platform (HDP)

Table of Contents

1 Course Introduction
2 Course Guide

What is Big Data and Hadoop
3 What is Big Data
4 Examples of Big Data
5 What is Data Science
6 What is Hadoop
7 Hadoop Distributions

Introduction to Hadoop
8 Hadoop Installation
9 Demo Hortonworks Sandbox
10 Demo Hadoop Installation – Part 1
11 Demo Hadoop Installation – Part 2
12 Introduction to HDFS
13 DataNode Communications
14 Demo HDFS – Part 1
15 Demo HDFS – Part 2 – Using Ambari
16 MapReduce WordCount Example
17 Demo MapReduce WordCount
18 Lines that span blocks
19 Introduction to Yarn
20 Demo Yarn and ResourceManager UI
21 Ambari API and Blueprints
22 Demo Ambari API and Blueprints
23 ETL Processing in Hadoop

24 Introduction to Pig
25 Demo Part 1 – Pig Installation
26 Demo Part 2 – Pig Commands
27 Demo Part 3 – More Pig Commands

Apache Spark
28 Introduction to Apache Spark
29 Spark WordCount
30 Demo Spark installation and WordCount
31 RDDs
32 Demo RDD Transformations and Actions
33 Overview of RDD Transformations and Actions
34 Spark MLLib

35 Introduction to Hive
36 Hive Queries
37 Demo Hive Installation and Hive Queries
38 Hive Partitioning, Buckets, UDFs, and SerDes
39 The Stinger Initiative
40 Hive in Spark

Real Time Processing
41 Introduction to Realtime Processing

42 Introduction to Kafka
43 Kafka Topics
44 Kafka Messages and Log Compaction
45 Kafka Use Cases and Usage
46 Demo Kafka Installation and Usage

47 Introduction to Storm
48 A Storm Topology
49 Demo Storm installation and Example Topology
50 Storm Message Processing and Reliability
51 Trident

Spark Streaming
52 Introduction to Spark Streaming
53 Spark Streaming Architecture
54 Spark Receivers and WordCount Streaming Example
55 Demo Spark Streaming with Kafka
56 Spark Streaming State and Checkpointing
57 Demo Stateful Spark Streaming
58 More Spark Streaming Features

59 Introduction to HBase
60 HBase Tables
61 The HBase Meta Table
62 HBase Writes
63 HBase Reads
64 Compactions
65 Crash Recovery
66 Region Splits
67 Hotspotting
68 Demo HBase Install
69 Demo HBase Shell
70 Demo Spark HBase

71 Introduction to Phoenix
72 Salting, Compression, and Indexes in Phoenix
73 JOINs, VIEWs, and Phoenix in Spark
74 Demo Phoenix

Hadoop Security
75 Introduction to Kerberos
76 Kerberos on Hadoop
77 Kerberos Terminology
78 Demo Enabling Kerberos
79 Introduction to SPNEGO
80 Demo SPNEGO
81 Introduction to Knox

82 Introduction to Ranger
83 Demo Ranger Installation
84 Demo Ranger with Hive

HDFS Encryption
85 Introduction to HDFS Transparent Encryption
86 Demo HDFS Encryption using Ranger KMS

Advanced Topics
87 Yarn Schedulers
88 Demo Capacity Scheduler
89 Label based scheduling
90 Yarn Sizing
91 Hive Query Optimizations
92 Join Strategies
93 Spark Optimizations
94 NameNode High Availability
95 Demo NameNode High Availability Setup
96 Database High Availability

Thank You
97 Thank You
98 Bonus Lecture: My Other Courses