SQL for Exploratory Data Analysis Essential Training

English | MP4 | AVC 1280×720 | AAC 48KHz 2ch | 0h 44m | 100 MB

Learn how to use SQL to understand the characteristics of data sets destined for data science and machine learning. The course begins with an introduction to exploratory data analysis and how it differs from hypothesis-driven statistical analysis. Instructor Dan Sullivan explains how SQL queries, statistical calculations, and visualization tools like Excel and R can help you verify data quality and avoid incorrect assumptions. Next, find out how to perform data-quality checks, reveal and recover missing values, and check business logic. Discover how to use box plots to understand non-normal distributions of data and how to use histograms to understand the frequency of data values in particular attributes. Dan also explains how to use the chi-square test to understand dependencies and measure correlations between attributes. The course concludes with a collection of tips and best practices for exploratory data analysis.
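As a rough illustration of the data-quality checks mentioned above, the sketch below runs a few SQL queries through Python's built-in sqlite3 module. The orders table, its columns, and its values are hypothetical and invented for this example; they are not taken from the course. Mean imputation is just one simple way to fill in missing values and may not be the approach the instructor uses.

```python
import sqlite3

# Hypothetical table and values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, quantity INTEGER, unit_price REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 2, 9.5), (2, None, 4.0), (3, 1, None), (4, -3, 7.25), (5, 5, 3.0)],
)

# COUNT(col) skips NULLs, so COUNT(*) - COUNT(col) counts the missing values.
missing_quantity, missing_price = conn.execute(
    "SELECT COUNT(*) - COUNT(quantity), COUNT(*) - COUNT(unit_price) FROM orders"
).fetchone()

# Business-logic check (an assumed rule): a quantity should never be negative.
bad_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM orders WHERE quantity < 0 ORDER BY id")
]

# One simple imputation choice: substitute the column mean for missing prices.
# AVG ignores NULLs, so the mean is taken over the known prices only.
imputed = conn.execute(
    "SELECT id, COALESCE(unit_price, (SELECT AVG(unit_price) FROM orders)) "
    "FROM orders ORDER BY id"
).fetchall()

print(missing_quantity, missing_price, bad_ids)
print(imputed)
```

The same queries should work with little or no change in most SQL databases, since they rely only on COUNT, AVG, and COALESCE.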

Topics include:

  • Exploratory data analysis vs. hypothesis-driven statistical analysis
  • Performing data quality checks
  • Calculating quartiles
  • Using box plots to understand the distribution of values
  • Using histograms to understand the frequency of values
  • Using chi-square to understand the correlation between values
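
To make the quartile and histogram topics concrete, here is a minimal sketch, again using Python's sqlite3 module. The readings table and its values are hypothetical. The quartiles are picked by position in the sorted data, which is only one of several common quartile conventions, and the bucket width of 10 is an arbitrary choice.

```python
import sqlite3

# Hypothetical table and values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (value INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?)",
    [(v,) for v in (21, 3, 45, 12, 18, 7, 38, 15, 24)],
)

n = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]


def value_at(offset):
    """Return the value at a zero-based position in the sorted column."""
    return conn.execute(
        "SELECT value FROM readings ORDER BY value LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()[0]


# Positional quartiles: the values one quarter, one half, and three quarters
# of the way through the sorted data (no interpolation between neighbors).
q1, median, q3 = (value_at((n - 1) * k // 4) for k in (1, 2, 3))

# Histogram: integer division groups values into buckets of width 10,
# labeled by the lower bound of each bucket.
histogram = conn.execute(
    "SELECT (value / 10) * 10 AS bucket, COUNT(*) "
    "FROM readings GROUP BY bucket ORDER BY bucket"
).fetchall()

print(q1, median, q3)
print(histogram)
```

The three quartile values, together with the minimum and maximum, are the numbers a box plot draws; the bucket counts are what a histogram chart would display.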

Table of Contents

Introduction
1 Welcome
2 What you should know

Introduction to Exploratory Data Analysis
3 Why explore data
4 Exploring data with statistics
5 Testing hypothesis with statistics

Data Quality Checks
6 Why check data
7 Types of quality checks
8 Imputing missing values
9 Identifying business logic checks

Calculating Quartiles
10 Why learn about the distribution of data
11 Minimum, maximum, and median values
12 Ordering and counting
13 Calculating quartiles
14 Introduction to box plots

Histograms
15 Introduction to histograms
16 Partitioning data
17 Calculating histograms
18 Simple histogram visualization

Checking Correlation between Attributes
19 Introduction to correlation
20 Calculating correlation with SQL

Conclusion
21 Next steps