**The Data Science Course 2019: Complete Data Science Bootcamp**

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 20.5 Hours | 11.2 GB

eLearning | Skill level: All Levels

Complete Data Science Training: Mathematics, Statistics, Python, Advanced Statistics in Python, Machine & Deep Learning

The Problem

Data scientist is one of the professions best suited to thrive this century. It is digital, programming-oriented, and analytical. It therefore comes as no surprise that demand for data scientists has been surging in the job market.

However, supply has been very limited. It is difficult to acquire the skills necessary to be hired as a data scientist.

And how can you do that?

Universities have been slow to create specialized data science programs (not to mention that the ones that exist are very expensive and time-consuming).

Most online courses focus on a specific topic, and it is difficult to understand how the skills they teach fit into the complete picture.

The Solution

Data science is a multidisciplinary field. It encompasses a wide range of topics.

- Understanding of the data science field and the type of analysis carried out
- Mathematics
- Statistics
- Python
- Applying advanced statistical techniques in Python
- Data Visualization
- Machine Learning
- Deep Learning

Each of these topics builds on the previous ones. And you risk getting lost along the way if you don’t acquire these skills in the right order. For example, one would struggle in the application of Machine Learning techniques before understanding the underlying Mathematics. Or, it can be overwhelming to study regression analysis in Python before knowing what a regression is.

So, in an effort to create the most effective, time-efficient, and structured data science training available online, we created The Data Science Course 2019.

We believe this is the first training program that solves the biggest challenge to entering the data science field – having all the necessary resources in one place.

Moreover, our focus is to teach topics that flow smoothly and complement each other. The course teaches you everything you need to know to become a data scientist at a fraction of the cost of traditional programs (not to mention the amount of time you will save).

The Skills

1. Intro to Data and Data Science

Big data, business intelligence, business analytics, machine learning, and artificial intelligence. We know these buzzwords belong to the field of data science, but what do they all mean?

Why learn it? As a candidate data scientist, you must understand the ins and outs of each of these areas and recognise the appropriate approach to solving a problem. This ‘Intro to data and data science’ will give you a comprehensive look at all these buzzwords and where they fit in the realm of data science.

2. Mathematics

Learning the tools is the first step to doing data science. You must first see the big picture to then examine the parts in detail.

We take a detailed look specifically at calculus and linear algebra as they are the subfields data science relies on.

Why learn it?

Calculus and linear algebra are essential for programming in data science. If you want to understand advanced machine learning algorithms, then you need these skills in your arsenal.

3. Statistics

You need to think like a scientist before you can become a scientist. Statistics trains your mind to frame problems as hypotheses and gives you techniques to test these hypotheses, just like a scientist.

Why learn it?

This course doesn’t just give you the tools you need but teaches you how to use them. Statistics trains you to think like a scientist.
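
As a rough illustration of framing a question as a hypothesis and testing it, the sketch below computes a one-sample t-statistic using only Python's standard library. The function name `one_sample_t`, the sample values, and the hypothesised mean are made up for illustration and are not taken from the course materials.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the t-statistic for the null hypothesis: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Sample standard deviation (divides by n - 1)
    sd = statistics.stdev(sample)
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


# Hypothetical measurements; the null hypothesis is that the true mean is 5.0
sample = [5.1, 4.9, 5.3, 5.2, 4.8, 5.0, 5.4, 5.1]
t_stat = one_sample_t(sample, mu0=5.0)
print(f"t = {t_stat:.3f} with {len(sample) - 1} degrees of freedom")
```

The resulting statistic would then be compared against a critical value of Student's t distribution (or converted to a p-value) to decide whether to reject the null hypothesis.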

4. Python

Python is a relatively new programming language and, unlike R, it is general-purpose. You can do anything with it! Web applications, computer games, and data science are among its many uses. That’s why, in a short space of time, it has managed to disrupt many disciplines. Extremely powerful libraries have been developed to enable data manipulation, transformation, and visualisation. Where Python really shines, however, is in machine and deep learning.

Why learn it?

When it comes to developing, implementing, and deploying machine learning models through powerful frameworks such as scikit-learn and TensorFlow, Python is a must-have programming language.

5. Tableau

Data scientists don’t just need to deal with data and solve data-driven problems. They also need to convince company executives of the right decisions to make. These executives may not be well versed in data science, so the data scientist must be able to present and visualise the data’s story in a way they will understand. That’s where Tableau comes in – and we will help you become an expert storyteller using the leading visualisation software in business intelligence and data science.

Why learn it?

A data scientist relies on business intelligence tools like Tableau to communicate complex results to non-technical decision makers.

6. Advanced Statistics

Regressions, clustering, and factor analysis are all disciplines that were invented before machine learning. However, now these statistical methods are all performed through machine learning to provide predictions with unparalleled accuracy. This section will look at these techniques in detail.

Why learn it?

Data science is all about predictive modelling, and you can become an expert in these methods through this ‘Advanced Statistics’ section.
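
To give a flavour of the regression techniques mentioned above, here is a minimal sketch of a simple (one-variable) linear regression fitted by ordinary least squares in plain Python. The function name `fit_simple_ols` and the data are hypothetical; the course itself works with libraries such as statsmodels and scikit-learn rather than hand-written code like this.

```python
def fit_simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data lying exactly on the line y = 1 + 2x
x = [1, 2, 3, 4]
y = [3, 5, 7, 9]
intercept, slope = fit_simple_ols(x, y)
print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")

# Predict the outcome for a new observation
print("prediction for x = 5:", intercept + slope * 5)
```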

7. Machine Learning

The final part of the program, and what every section has been leading up to, is deep learning. Being able to employ machine and deep learning in their work is what often separates a data scientist from a data analyst. This section covers all common machine learning techniques and deep learning methods with TensorFlow.

What you’ll learn

- The course provides the entire toolbox you need to become a data scientist
- Fill up your resume with in-demand data science skills: statistical analysis, Python programming with NumPy, pandas, matplotlib, and Seaborn, advanced statistical analysis, Tableau, machine learning with statsmodels and scikit-learn, and deep learning with TensorFlow
- Impress interviewers by showing an understanding of the data science field
- Learn how to pre-process data
- Understand the mathematics behind Machine Learning (an absolute must which other courses don’t teach!)
- Start coding in Python and learn how to use it for statistical analysis
- Perform linear and logistic regressions in Python
- Carry out cluster and factor analysis
- Be able to create Machine Learning algorithms in Python, using NumPy, statsmodels and scikit-learn
- Apply your skills to real-life business cases
- Use state-of-the-art Deep Learning frameworks such as Google’s TensorFlow
- Develop a business intuition while coding and solving tasks with big data
- Unfold the power of deep neural networks
- Improve Machine Learning algorithms by studying underfitting, overfitting, training, validation, n-fold cross validation, testing, and how hyperparameters could improve performance
- Warm up your fingers as you will be eager to apply everything you have learned here to more and more real-life situations
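
The n-fold cross validation mentioned in the list above can be sketched in a few lines of plain Python. This is only an illustration of the splitting idea: the function name `n_fold_indices` is hypothetical, and no shuffling is done here, although real projects usually shuffle the data first or use a library routine.

```python
def n_fold_indices(n_samples, n_folds):
    """Yield (train_indices, validation_indices) for each of the n folds."""
    indices = list(range(n_samples))
    # Spread samples as evenly as possible: the first `remainder` folds get one extra
    base, remainder = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < remainder else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Each sample is used for validation exactly once and for training in every other fold
for train, validation in n_fold_indices(n_samples=10, n_folds=3):
    print("train:", train, "validation:", validation)
```

A model would be trained on each `train` split and scored on the matching `validation` split, and the scores averaged to estimate how well it generalises.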

**Table of Contents**

**Part 1: Introduction**

1 A Practical Example: What You Will Learn in This Course

2 What Does the Course Cover

3 Download All Resources and Important FAQ

**The Field of Data Science – The Various Data Science Disciplines**

4 Data Science and Business Buzzwords: Why are there so many?

5 What is the difference between Analysis and Analytics

6 Business Analytics, Data Analytics, and Data Science: An Introduction

7 Continuing with BI, ML, and AI

8 A Breakdown of our Data Science Infographic

**The Field of Data Science – Connecting the Data Science Disciplines**

9 Applying Traditional Data, Big Data, BI, Traditional Data Science and ML

**The Field of Data Science – The Benefits of Each Discipline**

10 The Reason behind these Disciplines

**The Field of Data Science – Popular Data Science Techniques**

11 Techniques for Working with Traditional Data

12 Real Life Examples of Traditional Data

13 Techniques for Working with Big Data

14 Real Life Examples of Big Data

15 Business Intelligence (BI) Techniques

16 Real Life Examples of Business Intelligence (BI)

17 Techniques for Working with Traditional Methods

18 Real Life Examples of Traditional Methods

19 Machine Learning (ML) Techniques

20 Types of Machine Learning

21 Real Life Examples of Machine Learning (ML)

**The Field of Data Science – Popular Data Science Tools**

22 Necessary Programming Languages and Software Used in Data Science

**The Field of Data Science – Careers in Data Science**

23 Finding the Job – What to Expect and What to Look for

**The Field of Data Science – Debunking Common Misconceptions**

24 Debunking Common Misconceptions

**Part 2: Statistics**

25 Population and Sample

**Statistics – Descriptive Statistics**

26 Types of Data

27 Levels of Measurement

28 Categorical Variables – Visualization Techniques

29 Categorical Variables Exercise

30 Numerical Variables – Frequency Distribution Table

31 Numerical Variables Exercise

32 The Histogram

33 Histogram Exercise

34 Cross Tables and Scatter Plots

35 Cross Tables and Scatter Plots Exercise

36 Mean, median and mode

37 Mean, Median and Mode Exercise

38 Skewness

39 Skewness Exercise

40 Variance

41 Variance Exercise

42 Standard Deviation and Coefficient of Variation

43 Standard Deviation and Coefficient of Variation Exercise

44 Covariance

45 Covariance Exercise

46 Correlation Coefficient

47 Correlation Coefficient Exercise

**Statistics – Practical Example: Descriptive Statistics**

48 Practical Example: Descriptive Statistics

49 Practical Example: Descriptive Statistics Exercise

**Statistics – Inferential Statistics Fundamentals**

50 Introduction

51 What is a Distribution

52 The Normal Distribution

53 The Standard Normal Distribution

54 The Standard Normal Distribution Exercise

55 Central Limit Theorem

56 Standard error

57 Estimators and Estimates

**Statistics – Inferential Statistics: Confidence Intervals**

58 What are Confidence Intervals?

59 Confidence Intervals; Population Variance Known; z-score

60 Confidence Intervals; Population Variance Known; z-score; Exercise

61 Confidence Interval Clarifications

62 Student’s T Distribution

63 Confidence Intervals; Population Variance Unknown; t-score

64 Confidence Intervals; Population Variance Unknown; t-score; Exercise

65 Margin of Error

66 Confidence intervals. Two means. Dependent samples

67 Confidence intervals. Two means. Dependent samples Exercise

68 Confidence intervals. Two means. Independent samples (Part 1)

69 Confidence intervals. Two means. Independent samples (Part 1) Exercise

70 Confidence intervals. Two means. Independent samples (Part 2)

71 Confidence intervals. Two means. Independent samples (Part 2) Exercise

72 Confidence intervals. Two means. Independent samples (Part 3)

**Statistics – Practical Example: Inferential Statistics**

73 Practical Example: Inferential Statistics

74 Practical Example: Inferential Statistics Exercise

**Statistics – Hypothesis Testing**

75 Null vs Alternative Hypothesis

76 Further Reading on Null and Alternative Hypothesis

77 Rejection Region and Significance Level

78 Type I Error and Type II Error

79 Test for the Mean. Population Variance Known

80 Test for the Mean. Population Variance Known Exercise

81 p-value

82 Test for the Mean. Population Variance Unknown

83 Test for the Mean. Population Variance Unknown Exercise

84 Test for the Mean. Dependent Samples

85 Test for the Mean. Dependent Samples Exercise

86 Test for the mean. Independent samples (Part 1)

87 Test for the mean. Independent samples (Part 1). Exercise

88 Test for the mean. Independent samples (Part 2)

89 Test for the mean. Independent samples (Part 2) Exercise

**Statistics – Practical Example: Hypothesis Testing**

90 Practical Example: Hypothesis Testing

91 Practical Example: Hypothesis Testing Exercise

**Part 3: Introduction to Python**

92 Introduction to Programming

93 Why Python?

94 Why Jupyter?

95 Installing Python and Jupyter

96 Understanding Jupyter’s Interface – the Notebook Dashboard

97 Prerequisites for Coding in the Jupyter Notebooks

**Python – Variables and Data Types**

98 Variables

99 Numbers and Boolean Values in Python

100 Python Strings

**Python – Basic Python Syntax**

101 Using Arithmetic Operators in Python

102 The Double Equality Sign

103 How to Reassign Values

104 Add Comments

105 Understanding Line Continuation

106 Indexing Elements

107 Structuring with Indentation

**Python – Other Python Operators**

108 Comparison Operators

109 Logical and Identity Operators

**Python – Conditional Statements**

110 The IF Statement

111 The ELSE Statement

112 The ELIF Statement

113 A Note on Boolean Values

**Python – Python Functions**

114 Defining a Function in Python

115 How to Create a Function with a Parameter

116 Defining a Function in Python – Part II

117 How to Use a Function within a Function

118 Conditional Statements and Functions

119 Functions Containing a Few Arguments

120 Built-in Functions in Python

**Python – Sequences**

121 Lists

122 Using Methods

123 List Slicing

124 Tuples

125 Dictionaries

**Python – Iterations**

126 For Loops

127 While Loops and Incrementing

128 Lists with the range() Function

129 Conditional Statements and Loops

130 Conditional Statements, Functions, and Loops

131 How to Iterate over Dictionaries

**Python – Advanced Python Tools**

132 Object Oriented Programming

133 Modules and Packages

134 What is the Standard Library?

135 Importing Modules in Python

**Part 4: Advanced Statistical Methods in Python**

136 Introduction to Regression Analysis

**Advanced Statistical Methods – Linear regression**

137 The Linear Regression Model

138 Correlation vs Regression

139 Geometrical Representation of the Linear Regression Model

140 Python Packages Installation

141 First Regression in Python

142 First Regression in Python Exercise

143 Using Seaborn for Graphs

144 How to Interpret the Regression Table

145 Decomposition of Variability

146 What is the OLS?

147 R-Squared

**Advanced Statistical Methods – Multiple Linear Regression**

148 Multiple Linear Regression

149 Adjusted R-Squared

150 Multiple Linear Regression Exercise

151 Test for Significance of the Model (F-Test)

152 OLS Assumptions

153 A1: Linearity

154 A2: No Endogeneity

155 A3: Normality and Homoscedasticity

156 A4: No Autocorrelation

157 A5: No Multicollinearity

158 Dealing with Categorical Data – Dummy Variables

159 Dealing with Categorical Data – Dummy Variables

160 Making Predictions with the Linear Regression

**Advanced Statistical Methods – Logistic Regression**

161 Introduction to Logistic Regression

162 A Simple Example in Python

163 Logistic vs Logit Function

164 Building a Logistic Regression

165 Building a Logistic Regression – Exercise

166 An Invaluable Coding Tip

167 Understanding Logistic Regression Tables

168 Understanding Logistic Regression Tables – Exercise

169 What do the Odds Actually Mean

170 Binary Predictors in a Logistic Regression

171 Binary Predictors in a Logistic Regression – Exercise

172 Calculating the Accuracy of the Model

173 Calculating the Accuracy of the Model

174 Underfitting and Overfitting

175 Testing the Model

176 Testing the Model – Exercise

**Advanced Statistical Methods – Cluster Analysis**

177 Introduction to Cluster Analysis

178 Some Examples of Clusters

179 Difference between Classification and Clustering

180 Math Prerequisites

**Advanced Statistical Methods – K-Means Clustering**

181 K-Means Clustering

182 A Simple Example of Clustering

183 A Simple Example of Clustering – Exercise

184 Clustering Categorical Data

185 Clustering Categorical Data – Exercise

186 How to Choose the Number of Clusters

187 How to Choose the Number of Clusters – Exercise

188 Pros and Cons of K-Means Clustering

189 To Standardize or not to Standardize

190 Relationship between Clustering and Regression

191 Market Segmentation with Cluster Analysis (Part 1)

192 Market Segmentation with Cluster Analysis (Part 2)

193 How is Clustering Useful?

194 EXERCISE: Species Segmentation with Cluster Analysis (Part 1)

195 EXERCISE: Species Segmentation with Cluster Analysis (Part 2)

**Advanced Statistical Methods – Other Types of Clustering**

196 Types of Clustering

197 Dendrogram

198 Heatmaps

**Part 5: Mathematics**

199 What is a matrix?

200 Scalars and Vectors

201 Linear Algebra and Geometry

202 Arrays in Python – A Convenient Way To Represent Matrices

203 What is a Tensor?

204 Addition and Subtraction of Matrices

205 Errors when Adding Matrices

206 Transpose of a Matrix

207 Dot Product

208 Dot Product of Matrices

209 Why is Linear Algebra Useful?

**Part 6: Deep Learning**

210 What to Expect from this Part?

**Deep Learning – Introduction to Neural Networks**

211 Introduction to Neural Networks

212 Training the Model

213 Types of Machine Learning

214 The Linear Model (Linear Algebraic Version)

215 The Linear Model with Multiple Inputs

216 The Linear Model with Multiple Inputs and Multiple Outputs

217 Graphical Representation of Simple Neural Networks

218 What is the Objective Function?

219 Common Objective Functions: L2-norm Loss

220 Common Objective Functions: Cross-Entropy Loss

221 Optimization Algorithm: 1-Parameter Gradient Descent

222 Optimization Algorithm: n-Parameter Gradient Descent

**Deep Learning – How to Build a Neural Network from Scratch with NumPy**

223 Basic NN Example (Part 1)

224 Basic NN Example (Part 2)

225 Basic NN Example (Part 3)

226 Basic NN Example (Part 4)

227 Basic NN Example Exercises

**Deep Learning – TensorFlow: Introduction**

228 How to Install TensorFlow

229 A Note on Installing Packages in Anaconda

230 TensorFlow Outline and Logic

231 Actual Introduction to TensorFlow

232 Types of File Formats, supporting Tensors

233 Basic NN Example with TF: Inputs, Outputs, Targets, Weights, Biases

234 Basic NN Example with TF: Loss Function and Gradient Descent

235 Basic NN Example with TF: Model Output

236 Basic NN Example with TF Exercises

**Deep Learning – Digging Deeper into NNs: Introducing Deep Neural Networks**

237 What is a Layer?

238 What is a Deep Net?

239 Digging into a Deep Net

240 Non-Linearities and their Purpose

241 Activation Functions

242 Activation Functions: Softmax Activation

243 Backpropagation

244 Backpropagation picture

245 Backpropagation – A Peek into the Mathematics of Optimization

**Deep Learning – Overfitting**

246 What is Overfitting?

247 Underfitting and Overfitting for Classification

248 What is Validation?

249 Training, Validation, and Test Datasets

250 N-Fold Cross Validation

251 Early Stopping or When to Stop Training

**Deep Learning – Initialization**

252 What is Initialization?

253 Types of Simple Initializations

254 State-of-the-Art Method – (Xavier) Glorot Initialization

**Deep Learning – Digging into Gradient Descent and Learning Rate Schedules**

255 Stochastic Gradient Descent

256 Problems with Gradient Descent

257 Momentum

258 Learning Rate Schedules, or How to Choose the Optimal Learning Rate

259 Learning Rate Schedules Visualized

260 Adaptive Learning Rate Schedules (AdaGrad and RMSprop)

261 Adam (Adaptive Moment Estimation)

**Deep Learning – Preprocessing**

262 Preprocessing Introduction

263 Types of Basic Preprocessing

264 Standardization

265 Preprocessing Categorical Data

266 Binary and One-Hot Encoding

**Deep Learning – Classifying on the MNIST Dataset**

267 MNIST: What is the MNIST Dataset?

268 MNIST: How to Tackle the MNIST

269 MNIST: Relevant Packages

270 MNIST: Model Outline

271 MNIST: Loss and Optimization Algorithm

272 Calculating the Accuracy of the Model

273 MNIST: Batching and Early Stopping

274 MNIST: Learning

275 MNIST: Results and Testing

276 MNIST: Exercises

277 MNIST: Solutions

**Deep Learning – Business Case Example**

278 Business Case: Getting acquainted with the dataset

279 Business Case: Outlining the Solution

280 The Importance of Working with a Balanced Dataset

281 Business Case: Preprocessing

282 Business Case: Preprocessing Exercise

283 Creating a Data Provider

284 Business Case: Model Outline

285 Business Case: Optimization

286 Business Case: Interpretation

287 Business Case: Testing the Model

288 Business Case: A Comment on the Homework

289 Business Case: Final Exercise

**Deep Learning – Conclusion**

290 Summary on What You’ve Learned

291 What’s Further out there in terms of Machine Learning

292 An overview of CNNs

293 DeepMind and Deep Learning

294 An Overview of RNNs

295 An Overview of non-NN Approaches

296 Download All Resources

**Software Integration**

297 What are Data, Servers, Clients, Requests, and Responses

298 What are Data Connectivity, APIs, and Endpoints?

299 Taking a Closer Look at APIs

300 Communication between Software Products through Text Files

301 Software Integration – Explained

**Case Study – What’s Next in the Course?**

302 Game Plan for this Python, SQL, and Tableau Business Exercise

303 The Business Task

304 Introducing the Data Set

**Case Study – Preprocessing the ‘Absenteeism_data’**

305 What to Expect from the Following Sections?

306 Importing the Absenteeism Data in Python

307 Checking the Content of the Data Set

308 Introduction to Terms with Multiple Meanings

309 What’s Regression Analysis – a Quick Refresher

310 Using a Statistical Approach towards the Solution to the Exercise

311 Dropping a Column from a DataFrame in Python

312 EXERCISE – Dropping a Column from a DataFrame in Python

313 SOLUTION – Dropping a Column from a DataFrame in Python

314 Analyzing the Reasons for Absence

315 Obtaining Dummies from a Single Feature

316 EXERCISE – Obtaining Dummies from a Single Feature

317 SOLUTION – Obtaining Dummies from a Single Feature

318 Dropping a Dummy Variable from the Data Set

319 More on Dummy Variables: A Statistical Perspective

320 Classifying the Various Reasons for Absence

321 Using .concat() in Python

322 EXERCISE – Using .concat() in Python

323 SOLUTION – Using .concat() in Python

324 Reordering Columns in a Pandas DataFrame in Python

325 EXERCISE – Reordering Columns in a Pandas DataFrame in Python

326 SOLUTION – Reordering Columns in a Pandas DataFrame in Python

327 Creating Checkpoints while Coding in Jupyter

328 EXERCISE – Creating Checkpoints while Coding in Jupyter

329 SOLUTION – Creating Checkpoints while Coding in Jupyter

330 Analyzing the Dates from the Initial Data Set

331 Extracting the Month Value from the “Date” Column

332 Extracting the Day of the Week from the “Date” Column

333 EXERCISE – Removing the “Date” Column

334 Analyzing Several “Straightforward” Columns for this Exercise

335 Working on “Education”, “Children”, and “Pets”

336 Final Remarks of this Section

**Case Study – Applying Machine Learning to Create the ‘absenteeism_module’**

337 Exploring the Problem with a Machine Learning Mindset

338 Creating the Targets for the Logistic Regression

339 Selecting the Inputs for the Logistic Regression

340 Standardizing the Data

341 Splitting the Data for Training and Testing

342 Fitting the Model and Assessing its Accuracy

343 Creating a Summary Table with the Coefficients and Intercept

344 Interpreting the Coefficients for Our Problem

345 Standardizing only the Numerical Variables (Creating a Custom Scaler)

346 Interpreting the Coefficients of the Logistic Regression

347 Backward Elimination or How to Simplify Your Model

348 Testing the Model We Created

349 Saving the Model and Preparing it for Deployment

350 ARTICLE – A Note on ‘pickling’

351 EXERCISE – Saving the Model (and Scaler)

352 Preparing the Deployment of the Model through a Module

**Case Study – Loading the ‘absenteeism_module’**

353 Are You Sure You’re All Set?

354 Deploying the ‘absenteeism_module’ – Part I

355 Deploying the ‘absenteeism_module’ – Part II

356 Exporting the Obtained Data Set as a *.csv

**Case Study – Analyzing the Predicted Outputs in Tableau**

357 EXERCISE – Age vs Probability

358 Analyzing Age vs Probability in Tableau

359 EXERCISE – Reasons vs Probability

360 Analyzing Reasons vs Probability in Tableau

361 EXERCISE – Transportation Expense vs Probability

362 Analyzing Transportation Expense vs Probability in Tableau
