Over the years I’ve built out many resources on the topic of web scraping. I thought I’d collect them all here for ease of access.
The success of some of my web scraping articles meant that I was getting several emails a week from folks with various web scraping questions. I decided to put together a book that walks a beginner through the skills they’ll need to build their own web scraper. It has since sold thousands of copies.
I realized that a major challenge of teaching web scraping is finding a target site that won’t mind being scraped and won’t change its markup over time. That’s when I got the idea to build out Scrape This Site.
It has a free sandbox section that uses common web interface elements you’ll see on most websites, like pagination, forms, AJAX and more. Anyone is welcome to scrape the site.
In addition to the free sandbox content, there is also a premium, members-only area where I offer video screencasts of me building web scrapers, as well as the final working code samples that members can download and use themselves.
I built it out to be the internet’s best resource for learning web scraping.
This provides a good overview of the reasons you should think about using web scraping, and also gives a crash course in a bunch of topics you’ll likely run into.
This was the very first article I wrote on web scraping, back in 2012, and it has been viewed over half a million times.
It was the top Google search result for “web scraping” for several years.
For anyone learning Python who wants to do some web scraping, this article has lots of code snippets you can copy/paste to perform common web scraping tasks.
Web Scraping Boilerplate: Everything You Need to Start Your New Python Scraping Project (Batteries Included)
No one likes having to reinvent the wheel, so I decided to open source a bunch of the common, generic code that I use to get started on a new client’s scraping projects.
One of the most common sites I get asked to scrape is Amazon. This is an article I wrote after I did my very first scrape of the site. I’ve since built dozens of data collection tools for other companies.
Everything you need to know about using proxies with your web scraping. I show you how to use them, how to figure out how many you need, where to get them, and lots more.
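To give a rough flavor of the proxy topic, here is a minimal sketch using only Python’s standard library. It is not the code from the article itself. The sizing rule (total request rate divided by a per-proxy budget), the function names, the numbers, and the proxy address are all assumptions made up for illustration.

```python
import math
import urllib.request


def proxies_needed(requests_per_hour, max_per_proxy_per_hour):
    """Estimate how many proxies keep each one under a per-IP request budget.

    Assumed rule of thumb: total rate / per-proxy budget, rounded up.
    """
    if max_per_proxy_per_hour <= 0:
        raise ValueError("max_per_proxy_per_hour must be positive")
    return max(1, math.ceil(requests_per_hour / max_per_proxy_per_hour))


def build_proxy_opener(proxy_url):
    """Return a urllib opener that routes HTTP and HTTPS requests via proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Hypothetical workload: 5,000 requests per hour, at most 600 per proxy per hour.
count = proxies_needed(5000, 600)

# Placeholder proxy address; swap in a proxy you actually control.
opener = build_proxy_opener("http://127.0.0.1:8080")

# A real request would then go through the proxy, for example:
# html = opener.open("https://example.com/", timeout=10).read()
```

The per-proxy budget is the number to tune: it depends entirely on how much traffic the target site tolerates from a single IP address, which you usually have to discover by starting slowly.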
Page last updated April 2019.