Crawling the Web with Python and Scrapy

English | MP4 | AVC 1280×720 | AAC 44KHz 2ch | 1h 32m | 239 MB

Have you ever wanted to know how to programmatically crawl websites and extract data from them? If so, then this course is for you. You will learn how to use the Scrapy framework to write spiders that are able to extract valuable data from the web.

Have you ever spent hours trying to gather high-quality data from specific websites, and wondered how you could extract this data programmatically and use it within your own applications? In this course, Crawling the Web with Python and Scrapy, you will gain the ability to write spiders that can extract data from the web, using Python and Visual Studio Code, through an advanced yet easy-to-use framework called Scrapy. First, you will learn what scraping and crawling are, and explore their implications. Next, you will discover how to scaffold a Scrapy project and write spiders. Finally, you will explore how to influence the way spiders crawl websites and how to export the extracted data in different formats. When you are finished with this course, you will have the skills and knowledge to use Scrapy with Python to programmatically crawl and scrape data from websites.
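To make the distinction between scraping and crawling concrete, here is a minimal sketch that uses only the Python standard library, not Scrapy. It is not taken from the course; the sample HTML, the class name LinkAndTitleParser, and the function extract are all hypothetical and exist only for illustration. Scraping is the step that pulls data (here, the page title) out of a page, while crawling is the step that collects links to visit next.

```python
from html.parser import HTMLParser

# Hypothetical sample page (an assumption for this sketch).
# A real spider would download the page over HTTP instead.
SAMPLE_HTML = """
<html><head><title>Example Shop</title></head>
<body>
  <a href="/products/1">Widget</a>
  <a href="/products/2">Gadget</a>
</body></html>
"""


class LinkAndTitleParser(HTMLParser):
    """Collects the page title (scraping) and all link targets (crawling)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            # attrs is a list of (name, value) pairs
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Accumulate text only while inside the <title> element
        if self._in_title:
            self.title += data


def extract(html):
    """Return the scraped title and the links a crawler could follow next."""
    parser = LinkAndTitleParser()
    parser.feed(html)
    return {"title": parser.title.strip(), "links": parser.links}


result = extract(SAMPLE_HTML)
print(result)
```

A framework such as Scrapy takes over the parts this sketch leaves out, such as downloading pages, scheduling the collected links, and exporting the results.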

Table of Contents

Course Overview
1 Course Overview

Extracting Data from the Web – Core Concepts
2 Introduction, Overview, and Prerequisites
3 Concepts
4 Legal or Illegal
5 Legal Consequences
6 General Advice
7 Why Scrapy
8 Demo Extracting Data without Scrapy
9 Summary

Scaffolding and Running Your First Scrapy Web Crawler Project
10 Introduction and Overview
11 Introduction to Scrapy
12 Scrapy Architecture
13 Beautiful Soup
14 Demo Creating and Scaffolding a New Scrapy Project
15 Summary

Achieving Common Spider Behaviors Using Built-in Classes
16 Introduction and Overview
17 Spiders Overview
18 Types of Scrapy Spiders
19 scrapy.Spider
20 CrawlSpider
21 XMLFeedSpider
22 CSVFeedSpider
23 SitemapSpider
24 Demo Implementing a scrapy.Spider
25 Demo Implementing a CrawlSpider
26 Summary

Influencing Scrapy Crawling
27 Introduction and Overview
28 Allow and Deny Rules
29 Processors
30 Item Loaders
31 Item Pipelines
32 Demo Implementing a Scraping Pipeline
33 Summary

Scrapy Outcome and Data Export
34 Introduction and Overview
35 Feed Exporter
36 Demo Using an Exporter to Save Data
37 Summary