Building Data Pipelines with Python: Understanding Pipeline Frameworks, Workflow Automation, and Python Toolset

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 3h 39m | 913 MB

This course shows you how to build data pipelines and automate workflows using Python 3. From simple task-based messaging queues to complex frameworks like Luigi and Airflow, the course delivers the essential knowledge you need to develop your own automation solutions. You’ll learn the architecture basics and get an introduction to a wide variety of popular frameworks and tools.

Designed for the working data professional who is new to the world of data pipelines and distributed solutions, the course requires intermediate-level Python experience and the ability to manage your own system setups.

  • Acquire a practical understanding of how to approach data pipelining using Python toolsets
  • Master the ability to determine when a Python framework is appropriate for a project
  • Understand workflow concepts like directed acyclic graphs, producers, and consumers
  • Learn to integrate data flows into pipelines, workflows, and task-based automation solutions
  • Understand how to parallelize data analysis, both locally and in a distributed cluster
  • Practice writing simple data tests using property-based testing
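
As a rough illustration of two of the concepts listed above (producers and consumers, and directed acyclic graphs), here is a minimal sketch using only the Python standard library. It is not taken from the course material: the function `run_pipeline`, the squaring step, and the task names `extract`, `transform`, and `load` are hypothetical, and `graphlib` assumes Python 3.9 or later.

```python
import queue
import threading
from graphlib import TopologicalSorter  # standard library since Python 3.9

_DONE = object()  # sentinel that tells the consumer to stop


def run_pipeline(items):
    """Hypothetical example: a producer feeds a queue, a consumer squares each item."""
    q = queue.Queue()
    results = []

    def producer():
        for item in items:
            q.put(item)
        q.put(_DONE)  # signal that no more work is coming

    def consumer():
        while True:
            item = q.get()
            if item is _DONE:
                break
            results.append(item * item)

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# A DAG maps each task to the set of tasks it depends on (names are illustrative).
dag = {"transform": {"extract"}, "load": {"transform"}}
# static_order() yields the tasks so that every dependency comes first.
order = list(TopologicalSorter(dag).static_order())
```

Frameworks such as Celery, Luigi, and Airflow build on the same ideas, adding scheduling, retries, and distribution across machines on top of them.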

Table of Contents

01 Welcome To The Course
02 About The Author
03 Introduction To Automation
04 Adventures With Servers
05 Being A Good Systems Caretaker
06 What Is A Queue
07 What Is A Consumer? What Is A Producer?
08 Why Celery
09 Celery Architecture & Set Up
10 Writing Your First Tasks
11 Deploying Your Tasks
12 Scaling Your Workers
13 Monitoring With Flower
14 Advanced Celery Features
15 Why Dask
16 First Steps With Dask
17 Dask Bags
18 Dask Distributed
19 What Are Data Pipelines? What Is A DAG?
20 Luigi And Airflow – A Comparison
21 First Steps With Luigi
22 More Complex Luigi Tasks
23 Introduction To Hadoop
24 First Steps With Airflow
25 Custom Tasks With Airflow
26 Advanced Airflow – Subdags And Branches
27 Using Luigi With Hadoop
28 Apache Spark
29 Apache Spark Streaming
30 Django Channels
31 And Many More
32 Introduction To Testing With Python
33 Property-Based Testing With Hypothesis
34 What’s Next