Data Analysis and Exploration with Pandas

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 5h 12m | 918 MB

Get idiomatic solutions to common data problems while working with real-world datasets, and draw surprising insights using the pandas library.

Are you looking for a gigantic boost in your productivity? Are you searching for some interesting and fun tricks to solve your data problems? If so, then this course is indeed a perfect choice for you. This course provides you with unique, idiomatic, and amazing solutions for both fundamental and advanced data manipulation tasks with pandas.

Some solutions focus on achieving a deeper understanding of basic principles, or comparing and contrasting two similar operations. A few others will delve into a particular dataset, and let you uncover new and unexpected insights along the way.

The pandas library is massive, and it’s common for frequent users to be unaware of many of its more impressive features. The official pandas documentation, while thorough, does not contain many useful examples of how to piece together multiple commands as one would do during an actual analysis. This course guides you, as if you were looking over the shoulder of an expert, through practical situations that you are highly likely to encounter. Many advanced solutions combine several different features across the pandas library to generate results.
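
As a rough illustration of what piecing together multiple commands can look like, here is a minimal sketch of a method chain. The `flights` table, its column names, and the 15-minute threshold are assumptions invented for this example; they are not taken from the course material.

```python
import pandas as pd

# Hypothetical data, invented purely for illustration.
flights = pd.DataFrame({
    "airline": ["AA", "AA", "UA", "UA", "DL"],
    "delay_min": [5.0, None, 30.0, 10.0, 0.0],
})

summary = (
    flights
    .dropna(subset=["delay_min"])                 # drop rows with a missing delay
    .assign(late=lambda d: d["delay_min"] > 15)   # add a boolean "late" flag
    .groupby("airline")["late"]                   # group the flag by airline
    .mean()                                       # share of late flights per airline
    .sort_values(ascending=False)                 # highest share first
)
print(summary)
```

Each step returns a new object, so the chain reads top to bottom as a short analysis without intermediate variables.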

This course includes interesting and illustrative examples, with a detailed explanation of each line of code. All code and dataset explanations are provided in Jupyter Notebooks, an excellent interface for exploring data. In other words, this is an easy-to-follow guide that takes a problem/solution approach to real-world datasets.

What You Will Learn

  • Master the fundamentals of pandas to quickly begin exploring any dataset
  • Explore the most crucial and common operations that you will perform during data analysis
  • Build customized functions to apply to your groups
  • Restructure and tidy data to make data analysis and visualization easier
  • Prepare real-world messy datasets for machine learning
  • Combine and merge data from different sources through pandas’ SQL-like operations
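
Several of the skills listed above can be sketched in a few lines. The example below is only an assumption-laden illustration: the `scores` and `regions` tables, their columns, and the `value_range` helper are hypothetical and do not come from the course datasets.

```python
import pandas as pd

# Hypothetical wide-format data: one column per test subject.
scores = pd.DataFrame({
    "state": ["TX", "TX", "CA"],
    "school": ["A", "B", "C"],
    "math": [500, 600, 550],
    "verbal": [480, 620, 530],
})

def value_range(s):
    """Custom aggregation: spread between the largest and smallest value."""
    return s.max() - s.min()

# Apply the custom function to each group.
math_range = scores.groupby("state")["math"].agg(value_range)

# Tidy the data: turn the subject columns into rows.
tidy = scores.melt(
    id_vars=["state", "school"],
    value_vars=["math", "verbal"],
    var_name="subject",
    value_name="score",
)

# Combine with a second (hypothetical) source, much like a SQL LEFT JOIN.
regions = pd.DataFrame({"state": ["TX", "CA"], "region": ["South", "West"]})
merged = tidy.merge(regions, on="state", how="left")

print(math_range)
print(merged)
```

The tidy (long) layout usually makes later grouping, plotting, and merging simpler than the original wide layout.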

Table of Contents

01 The Course Overview
02 Dissecting the Anatomy of a DataFrame
03 Accessing the Main DataFrame Components
04 Understanding Data Types
05 Selecting a Single Column of Data as a Series
06 Calling Series Methods
07 Working with Operators on a Series
08 Chaining Series Methods Together
09 Making the Index Meaningful
10 Renaming Row and Column Names
11 Creating and Deleting Columns
12 Selecting Multiple DataFrame Columns
13 Selecting Columns with Methods
14 Ordering Column Names Sensibly
15 Operating on the Entire DataFrame
16 Chaining DataFrame Methods Together
17 Working with Operators on a DataFrame
18 Comparing Missing Values
19 Transposing the Direction of a DataFrame Operation
20 Determining College Campus Diversity
21 Developing a Data Analysis Routine
22 Reducing Memory by Changing Data Types
23 Selecting the Smallest of the Largest
24 Selecting the Largest of Each Group by Sorting
25 Replicating nlargest with sort_values
26 Selecting Series Data
27 Selecting DataFrame Rows
28 Selecting DataFrame Rows and Columns Simultaneously
29 Selecting Data with Both Integers and Labels
30 Speeding Up Scalar Selection
31 Slicing Rows Lazily
32 Slicing Lexicographically
33 Calculating Boolean Statistics
34 Constructing Multiple Boolean Conditions
35 Filtering with Boolean Indexing
36 Replicating Boolean Indexing with Index Selection
37 Selecting with Unique and Sorted Indexes
38 Gaining Perspective on Stock Prices
39 Translating SQL WHERE Clauses
40 Determining the Normality of Stock Market Returns
41 Improving Readability of Boolean Indexing with the Query Method
42 Preserving Series with the where Method
43 Masking DataFrame Rows
44 Selecting with Booleans, Integer Location, and Labels
45 Examining the Index Object
46 Producing Cartesian Products
47 Exploding Indexes
48 Filling Values with Unequal Indexes
49 Appending Columns from Different DataFrames
50 Highlighting the Maximum Value from Each Column
51 Replicating idxmax with Method Chaining
52 Finding the Most Common Maximum
53 Defining an Aggregation
54 Grouping and Aggregating with Multiple Columns and Functions
55 Removing the MultiIndex After Grouping
56 Customizing an Aggregation Function
57 Customizing Aggregating Functions with *args and **kwargs
58 Examining the groupby Object
59 Filtering for States with a Minority Majority
60 Transforming through a Weight Loss Bet
61 Calculating Weighted Mean SAT Scores Per State with Apply
62 Grouping By Continuous Variables
63 Counting the Total Number of Flights Between Cities
64 Finding the Longest Streak of On-Time Flights
65 Tidying Variable Values as Column Names with Stack
66 Tidying Variable Values as Column Names with Melt
67 Stacking Multiple Groups of Variables Simultaneously
68 Inverting Stacked Data
69 Unstacking After a groupby Aggregation
70 Replicating pivot_table with a groupby Aggregation
71 Renaming Axis Levels for Easy Reshaping
72 Tidying When Multiple Variables are Stored as Column Names
73 Tidying When Multiple Variables are Stored as Column Values
74 Tidying When Two or More Values are Stored in the Same Cell
75 Tidying When Variables are Stored in Column Names and Values
76 Tidying When Multiple Observational Units are Stored in the Same Table
77 Appending New Rows to DataFrames
78 Concatenating Multiple DataFrames Together
79 Comparing President Trump’s and Obama’s Approval Ratings
80 Understanding the Differences Between concat, join, and merge
81 Connecting to SQL Databases
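
To give a flavor of one listed topic, passing extra arguments to a custom aggregation function, here is a minimal sketch. The `colleges` table and the `share_between` helper are hypothetical names made up for this example; the only library behavior relied on is that the groupby `agg` method forwards additional positional and keyword arguments to the function it is given.

```python
import pandas as pd

# Hypothetical data, invented for illustration.
colleges = pd.DataFrame({
    "state": ["TX", "TX", "TX", "CA", "CA"],
    "enrollment": [500, 2000, 12000, 3000, 800],
})

def share_between(s, low, high=float("inf")):
    """Fraction of values in s that fall within [low, high]."""
    return s.between(low, high).mean()

# 1000 is forwarded as the positional argument `low`,
# and high=10000 is forwarded as a keyword argument.
result = colleges.groupby("state")["enrollment"].agg(
    share_between, 1000, high=10000
)
print(result)
```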