Doing Data Science with Python

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 6h 25m | 925 MB

This course shows you how to work on an end-to-end data science project, including processing data, building and evaluating a machine learning model, and exposing the model as an API, in a standardized approach using various Python libraries.

Do you want to become a data scientist? If so, this course will equip you with the concepts and tools to get up to speed, and you can apply the skills you acquire to any data science project in a standardized way. This course, Doing Data Science with Python, takes a pragmatic approach to the end-to-end data science project cycle, from extracting data from different types of sources to exposing your machine learning model as API endpoints that can be consumed in a real-world data solution. It will help you not only to understand various data science concepts, but also to implement them in an industry-standard way using Python and related libraries.

First, you will be introduced to the stages of a typical data science project cycle and a standardized project template that you can use for any data science project. Then, you will learn to use standard libraries in the Python ecosystem, such as Pandas, NumPy, Matplotlib, Scikit-Learn, Pickle, and Flask, to tackle the different stages of a data science project: extracting data, cleaning and processing data, and building and evaluating a machine learning model. Finally, you'll dive into exposing the machine learning model as an API. You will also work through a case study that spans the whole course to learn the end-to-end execution of a data science project. By the end of this course, you will have a solid foundation for handling data science projects and the knowledge to apply various Python libraries to create your own data science solutions.
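To give a feel for the build, evaluate, and persist stages described above, here is a minimal sketch that uses only the Python standard library. It is not taken from the course material: the toy rows, the helper names (`fit_majority_baseline`, `predict`, `accuracy`), and the choice of a majority-class baseline are all assumptions made for illustration. The course itself names Scikit-Learn for modeling and Flask for serving.

```python
import pickle
from collections import Counter


def fit_majority_baseline(labels):
    """Hypothetical baseline: remember the most frequent training label."""
    majority_label = Counter(labels).most_common(1)[0][0]
    return {"majority_label": majority_label}


def predict(model, rows):
    """Predict the stored majority label for every input row."""
    return [model["majority_label"] for _ in rows]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Toy data invented for this sketch: (age, fare) rows with a 0/1 label.
train_X = [[22, 7.25], [38, 71.28], [26, 7.92], [35, 8.05], [54, 51.86]]
train_y = [0, 1, 1, 0, 0]
test_X = [[27, 11.13], [14, 30.07], [4, 16.70], [58, 26.55]]
test_y = [0, 1, 1, 0]

# Build and evaluate the baseline model.
model = fit_majority_baseline(train_y)
baseline_accuracy = accuracy(test_y, predict(model, test_X))

# Persist the fitted model to bytes and restore it, as one would before
# loading it inside an API process.
restored = pickle.loads(pickle.dumps(model))
restored_predictions = predict(restored, test_X)

print(f"baseline accuracy: {baseline_accuracy:.2f}")
```

A baseline like this only sets the score that a real classifier, such as a logistic regression, has to beat. The pickle round trip shows the same persistence step that would come before serving predictions from an API.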

Table of Contents

001 – Course Overview
002 – Course Introduction
003 – Target Audience
004 – Course Prerequisites
005 – Data Science Project Cycle Overview
006 – Why Python for Data Science
007 – Course Outline
008 – Summary
009 – Introduction
010 – Overview
011 – Python Distributions for Data Science
012 – Python 3.x vs. Python 2.x
013 – Demo – Installing Anaconda Distribution
014 – Jupyter Notebook
015 – Demo – Setting up Jupyter Notebook on Local Machine
016 – Demo – Jupyter Notebook – Basics
017 – Demo – Jupyter Notebook – Magic Functions
018 – Data Science Project Template
019 – Demo – Setting up Cookiecutter Data Science Project Template
020 – Versioning for Data Science Projects
021 – Demo – Add Project to Git
022 – Summary
023 – Introduction
024 – Overview
025 – Extracting Data from Databases
026 – Demo – Extracting Data from Databases
027 – Extracting Data Through APIs
028 – Demo – Extracting Data Through APIs
029 – Extracting Data Using Web Scraping
030 – Demo – Web Scraping Using Requests and BeautifulSoup
031 – Demo – Getting Titanic Dataset Using Requests – Part 1 – Initial Preparation
032 – Demo – Getting Titanic Dataset Using Requests – Part 2 – Downloading Data
033 – Demo – Creating Reproducible Script for Getting Titanic Data
034 – Public Datasets
035 – Committing Changes to Git
036 – Summary
037 – Introduction
038 – Overview
039 – Introduction to NumPy and Pandas
040 – EDA – Basic Structure
041 – Demo – Investigating Basic Structure
042 – Demo – Selection, Indexing, and Filtering
043 – EDA – Summary Statistics
044 – Centrality Measure
045 – Centrality Measure – Mean
046 – Centrality Measure – Median
047 – Spread Measure
048 – Spread Measure – Range
049 – Spread Measure – Percentiles and Boxplot
050 – Spread Measure – Variance and Standard Deviation
051 – Demo – Getting Summary Statistics for Numerical Features
052 – Counts and Proportions
053 – Demo – Summary Statistics for Categorical Feature
054 – Summary
055 – Introduction
056 – Overview
057 – EDA – Distributions
058 – Univariate Distribution – Histogram and KDE Plot
059 – Demo – Creating Univariate Distribution Plots
060 – Bivariate Distribution – Scatter Plot
061 – Demo – Creating Scatter Plots
062 – EDA – Grouping
063 – Demo – Grouping and Aggregation
064 – Crosstab
065 – Demo – Crosstab
066 – Pivot Table
067 – Demo – Pivot Table
068 – Summary
069 – Introduction
070 – Overview
071 – Data Munging
072 – Missing Value – Issues and Solution
073 – Missing Value Imputation Techniques
074 – Demo – Treating Missing Values Using Pandas – Part 1
075 – Demo – Treating Missing Values Using Pandas – Part 2
076 – Demo – Treating Missing Values Using Pandas – Part 3
077 – Outliers – Detection and Treatment
078 – Demo – Detecting and Treating Outliers Using Pandas and NumPy
079 – Feature Engineering
080 – Demo – Feature Creation Using Pandas and NumPy – Part 1
081 – Demo – Feature Creation Using Pandas and NumPy – Part 2
082 – Demo – Feature Creation Using Pandas and NumPy – Part 3
083 – Demo – Feature Creation Using Pandas and NumPy – Part 4
084 – Categorical Feature Encoding
085 – Categorical Feature Encoding – Binary Encoding
086 – Categorical Feature Encoding – Label Encoding
087 – Categorical Feature Encoding – One-hot Encoding
088 – Demo – Categorical Feature Encoding Using Pandas
089 – Demo – Drop and Reorder Columns Using Pandas
090 – Demo – Save Dataframe to File Using Pandas
091 – Demo – Reproducible Script for Data Processing Using Pandas and NumPy
092 – Demo – Creating Visualization Using Matplotlib
093 – Demo – Committing Changes to Git
094 – Summary
095 – Introduction
096 – Overview
097 – Machine Learning Basics
098 – Machine Learning Basics – Representation and Generalization
099 – Machine Learning Basics – Spam Classification
100 – Machine Learning Basics – Supervised Learning
101 – Machine Learning Basics – Unsupervised Learning
102 – Titanic Disaster Data Challenge
103 – Classifier
104 – Performance Metrics
105 – Performance Metrics – Accuracy
106 – Performance Metrics – Precision and Recall
107 – Classifier Evaluation
108 – Baseline Model
109 – Demo – Preparing Data for Machine Learning Model
110 – Demo – Building and Evaluating Baseline Model
111 – Demo – Making the First Kaggle Submission
112 – Linear Regression Model
113 – Logistic Regression Model
114 – Demo – Building Logistic Regression Using Scikit-Learn
115 – Demo – Making Second Kaggle Submission
116 – Summary
117 – Introduction
118 – Overview
119 – Underfitting vs. Overfitting
120 – Regularization
121 – Hyperparameter Optimization – GridSearch
122 – Cross-validation
123 – K-Fold Cross-validation
124 – Demo – Hyperparameter Optimization Using GridSearchCV
125 – Demo – Making Third Kaggle Submission
126 – Feature Normalization and Standardization
127 – Demo – Feature Normalization and Standardization Using Scikit-Learn
128 – Model Persistence
129 – Demo – Model Persistence Using Pickle
130 – Machine Learning API Development
131 – Demo – Hello World API Using Flask
132 – Demo – Machine Learning API Using Flask
133 – Demo – Committing Changes to Git
134 – Summary
135 – Where to Go from Here