Clustering and Classification with Machine Learning in R

English | MP4 | AVC 1920×1080 | AAC 44KHz 2ch | 7h 42m | 1.34 GB

The underlying patterns in your data hold vital insights; unearth them with cutting-edge clustering and classification techniques in R

This course is a complete guide to both supervised and unsupervised learning in R. It covers the main aspects of practical data science, so you will not need other courses or books on R-based data science. In this age of big data, companies across the globe use R to sift through the avalanche of information at their disposal. By becoming proficient in unsupervised and supervised learning in R, you can give your company a competitive edge and take your career to the next level.

Over the course of research, the author realized that almost all the R data science courses and books out there do not take account of the multidimensional nature of the topic. This course gives you a robust grounding in two main aspects of machine learning: clustering and classification. Unlike other R instructors, the author digs deep into R’s machine learning features and gives you a one-of-a-kind grounding in data science. You will go all the way from reading and cleaning data to implementing powerful machine learning algorithms and evaluating their performance in R.

The following topics will be covered:

  • A full introduction to the R Framework for data science
  • Data structures and reading in R, including CSV, Excel, and HTML data
  • How to pre-process and clean data by removing NAs (missing values), plus data visualization
  • Machine learning, supervised learning, and unsupervised learning in R
  • Model building and selection and much more!
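
To give a flavor of the reading and cleaning topics listed above, here is a minimal sketch in base R. The inline CSV text and its column names are hypothetical stand-ins for a real file, and dropping every incomplete row with `na.omit()` is only one of several possible strategies, not necessarily the one taught in the course.

```r
# Hypothetical inline CSV standing in for a real file on disk
csv_text <- "id,height,weight
1,170,65
2,NA,72
3,182,NA
4,165,58"

# read.csv() accepts a `text` argument, so no file is needed for this sketch
raw <- read.csv(text = csv_text)

# Drop every row that contains at least one NA
clean <- na.omit(raw)

# Count how many rows were removed by the cleaning step
n_dropped <- nrow(raw) - nrow(clean)

print(clean)
print(n_dropped)
```

With a real file, the same pattern would apply by passing a path to `read.csv()` instead of the `text` argument.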

The course will help you implement methods using real data obtained from different sources. Many courses use made-up data that does not prepare students to apply R-based data science in real life. After taking this course, you’ll be able to use data science packages such as caret to work with real data in R. You’ll also understand concepts such as unsupervised learning, dimension reduction, and supervised learning.
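
As a rough illustration of the supervised learning workflow mentioned above, the sketch below fits a binary classifier and computes its accuracy on held-out data. It is an assumption-laden example: it uses base R's `glm()` and the built-in `iris` dataset rather than caret or the course's own data, so that it runs without installing any packages. The 70/30 split and the 0.5 probability cutoff are arbitrary choices made for illustration.

```r
set.seed(1)  # make the random train/test split reproducible

# Reduce iris to a two-class problem: versicolor vs. virginica
df <- droplevels(subset(iris, Species != "setosa"))

# Hold out 30% of the rows for testing (split ratio is an assumption)
idx <- sample(nrow(df), size = 0.7 * nrow(df))
train <- df[idx, ]
test <- df[-idx, ]

# Logistic regression; with a factor response, glm() models the
# probability of the second level ("virginica")
fit <- glm(Species ~ Petal.Length + Petal.Width,
           data = train, family = binomial)

# Predicted probabilities on unseen rows, thresholded at 0.5
prob <- predict(fit, newdata = test, type = "response")
pred <- factor(ifelse(prob > 0.5, "virginica", "versicolor"),
               levels = levels(test$Species))

# Confusion matrix and overall accuracy on the test set
conf_mat <- table(predicted = pred, actual = test$Species)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

print(conf_mat)
print(accuracy)
```

Evaluating on rows the model never saw during fitting is what makes the accuracy figure meaningful; scoring on the training rows would tend to overstate performance.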

Learn

  • Read data into the R environment from different sources
  • Carry out basic data pre-processing and wrangling in RStudio
  • Implement unsupervised/clustering techniques such as K-means clustering
  • Implement dimension reduction techniques (PCA) and feature selection
  • Implement supervised learning techniques/classification such as Random Forests
  • Evaluate model performance and learn the best practices for evaluating machine learning model accuracy
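
Two of the techniques in the list above, K-means clustering and PCA, can be sketched with base R alone. This is a minimal example under stated assumptions: it uses the built-in `iris` dataset, picks three clusters because the data has three species, and standardizes the features first. None of these choices is taken from the course material itself.

```r
set.seed(42)  # K-means starts from random centers, so fix the seed

# Standardize the four numeric measurements so no variable dominates
features <- scale(iris[, 1:4])

# K-means with 3 clusters; nstart = 25 reruns from 25 random starts
# and keeps the solution with the lowest within-cluster sum of squares
km <- kmeans(features, centers = 3, nstart = 25)

# Cross-tabulate cluster labels against the known species
print(table(cluster = km$cluster, species = iris$Species))

# PCA on the same variables, centered and scaled
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))
```

Comparing the cluster labels to the species column is only possible here because `iris` happens to be labeled; on genuinely unlabeled data, internal measures such as within-cluster sum of squares would be used instead.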

Table of Contents

Introduction to the Course
1 Welcome to Clustering & Classification with Machine Learning in R
2 Installing R and RStudio

Read in Data from Different Sources in R
3 Read in CSV & Excel Data
4 Read in Unzipped Folder
5 Read in Online CSV
6 Read in Googlesheets
7 Read in Data from Online HTML Tables-Part 1
8 Read in Data from Online HTML Tables-Part 2
9 Read Data from a Database

Data Pre-processing and Visualization
10 Remove Missing Values
11 More Data Cleaning
12 Introduction to dplyr for Data Summarizing-Part 1
13 Introduction to dplyr for Data Summarizing-Part 2
14 Exploratory Data Analysis (EDA) – Basic Visualizations with R
15 More Exploratory Data Analysis with xda
16 Data Exploration & Visualization With dplyr & ggplot2
17 Associations Between Quantitative Variables-Theory
18 Testing for Correlation
19 Evaluate the Relation Between Nominal Variables
20 Cramer’s V for Examining the Strength of Association Between Nominal Variables

Machine Learning for Data Science
21 How is Machine Learning Different from Statistical Data Analysis
22 What is Machine Learning (ML) About: Some Theoretical Pointers

Unsupervised Learning in R
23 K-Means Clustering
24 Other Ways of Selecting Cluster Numbers
25 Fuzzy K-Means Clustering
26 Weighted K-Means
27 Partitioning Around Medoids (PAM)
28 Hierarchical Clustering in R
29 Expectation-Maximization (EM) in R
30 DBSCAN Clustering in R
31 Cluster a Mixed Dataset
32 Should We Even Do Clustering
33 Assess Clustering Performance
34 Which Clustering Algorithm to Choose

Feature Dimension Reduction
35 Dimension Reduction-Theory
36 Principal Component Analysis (PCA)
37 More on PCA
38 Multidimensional Scaling
39 Singular Value Decomposition (SVD)

Feature Selection to Select the Most Relevant Predictors
40 Removing Highly Correlated Predictor Variables
41 Variable Selection Using LASSO Regression
42 Variable Selection with FSelector
43 Boruta Analysis for Feature Selection

Supervised Learning Theory
44 Some Basic Supervised Learning Concepts
45 Pre-processing for Supervised Learning

Supervised Learning – Classification
46 What are GLMs
47 Logistic Regression Models as Binary Classifiers
48 Binary Classifier with PCA
49 Some Pointers on Evaluating Accuracy
50 Obtain Binary Classification Accuracy Metrics
51 More on Binary Accuracy Measures
52 Linear Discriminant Analysis
53 Our Multi-class Classification Problem
54 Classification Trees
55 More on Classification Tree Visualization
56 Classification with Party Package
57 Decision Trees
58 Random Forest (RF) Classification
59 Examine Individual Variable Importance for Random Forests
60 GBM Classification
61 Support Vector Machines (SVM) for Classification
62 More SVM for Classification
63 Variable Importance in SVM Modelling with rminer

Additional Lectures
64 Fuzzy C-Means Clustering