This repository contains the source code for the fourteen examples included in the book Practical Web Scraping for Data Science: Best Practices and Examples with Python by Seppe vanden Broucke and Bart Baesens.
See http://www.webscrapingfordatascience.com/ for more information, or buy the book on Amazon.
The following examples are explained in the book and available in this repository under `python-examples`:
- Scraping Hacker News, see `hacker-news` folder
- Using the Hacker News API, see `hacker-news` folder
- Quotes to Scrape, see `quotes-to-scrape` folder
- Books to Scrape, see `books-to-scrape` folder
- Scraping GitHub Stars, see `github` folder
- Scraping Mortgage Rates, see `mortgage-rates` folder
- Scraping and Visualizing IMDB Ratings, see `imdb` folder
- Scraping IATA Airline Information, see `iata` folder
- Scraping and Analyzing Web Forum Interactions, see `web-forum` folder
- Collecting and Clustering a Fashion Data Set, see `fashion-clustering` folder
- Sentiment Analysis of Scraped Amazon Reviews, see `product-reviews` folder
- Scraping and Analyzing News Articles, see `news-articles` folder
- Scraping and Analyzing a Wikipedia Graph, see `wikipedia-graph` folder
- Scraping and Visualizing a Board Members Graph, see `board-members` folder
- Breaking CAPTCHA’s Using Deep Learning, see `captcha-cracking` folder