A Year of Writing about Web Scraping in Review
I'd like to share my favorite pieces from a year of professional web scraping blogging at ScrapFly, along with my key takeaways.
Asynchronous Web Scraping: Scaling For The Moon!
Asynchronous programming can speed up web scrapers astronomically and is by far the most important scaling step when dealing with big projects. What exactly is it, how does it work, and what are the best ways to take advantage of it?
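As a rough illustration of where the speed-up comes from, here is a minimal sketch using Python's standard `asyncio`. The `fake_fetch` helper, its 0.1 second delay, and the `example.com` URLs are assumptions made up for this example: they stand in for real HTTP requests so the snippet runs without a network.

```python
import asyncio
import time


async def fake_fetch(url: str, delay: float = 0.1) -> str:
    """Hypothetical stand-in for an HTTP request: waits, then returns a fake body."""
    await asyncio.sleep(delay)  # simulates time spent waiting on the network
    return f"<html>{url}</html>"


async def scrape_sequentially(urls):
    # Each request waits for the previous one to finish.
    return [await fake_fetch(url) for url in urls]


async def scrape_concurrently(urls):
    # All requests are started together; the waits overlap.
    return await asyncio.gather(*(fake_fetch(url) for url in urls))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


urls = [f"https://example.com/page/{i}" for i in range(5)]
sequential_pages, sequential_seconds = timed(scrape_sequentially(urls))
concurrent_pages, concurrent_seconds = timed(scrape_concurrently(urls))
print(f"sequential: {sequential_seconds:.2f}s, concurrent: {concurrent_seconds:.2f}s")
```

Both versions return the same pages in the same order. The concurrent one should finish in roughly the time of a single request, because a scraper spends most of its time waiting on responses rather than computing.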
Introduction To Reverse Engineering The Web
To scrape a web resource efficiently, understanding how it works is often a vital step. Reverse engineering a website's behavior is often the first step when developing a web scraper - let's take a look at how!
Web Scraping Target Discovery
Target discovery in web scraping is how the scraper explores the target website to find scraping targets. For example, to scrape product data from an e-commerce website, we would first need to find the URL of each individual product. This step is called "discovery". What types of discovery methods are there?
Web Scraping Target Discovery: Crawling
The most common web scraping target discovery technique: recursive crawling. How does it work? What are the pros and cons, and what are the optimal execution patterns?
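To make the idea concrete, here is a minimal sketch of crawl-based discovery using only the Python standard library. Everything site-specific is an assumption for illustration: `FAKE_SITE` is a tiny in-memory stand-in for a real shop, `shop.example` is a made-up domain, and the `"/product/"` URL pattern is a hypothetical rule for recognising targets. The loop uses a queue instead of function recursion, but it follows the same principle: visit a page, collect its links, and repeat for every link not seen before.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical in-memory website: URL -> HTML body (replaces real HTTP requests).
FAKE_SITE = {
    "https://shop.example/": '<a href="/category/1">Cat</a> <a href="/about">About</a>',
    "https://shop.example/category/1": (
        '<a href="/product/1">P1</a> <a href="/product/2">P2</a> <a href="/">Home</a>'
    ),
    "https://shop.example/about": '<a href="https://other.example/">Partner</a>',
    "https://shop.example/product/1": '<a href="/product/2">Related</a>',
    "https://shop.example/product/2": '<a href="/category/1">Back</a>',
}


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(base_url, html):
    parser = LinkParser()
    parser.feed(html)
    # Resolve relative links against the page they were found on.
    return [urljoin(base_url, href) for href in parser.links]


def crawl(start_url, fetch, is_target, max_pages=100):
    """Breadth-first crawl of one domain; returns URLs for which is_target is true."""
    domain = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}  # prevents revisiting pages and looping forever
    targets = []
    pages_fetched = 0
    while queue and pages_fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)
        pages_fetched += 1
        if html is None:  # page missing or failed to load
            continue
        if is_target(url):
            targets.append(url)
        for link in extract_links(url, html):
            # Stay on the starting domain and skip anything already queued.
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return targets


product_urls = crawl(
    "https://shop.example/",
    fetch=FAKE_SITE.get,
    is_target=lambda url: "/product/" in url,
)
print(product_urls)
```

The `seen` set and the same-domain check are what keep the crawl from looping or wandering off-site. In a real scraper `fetch` would be an HTTP client call, and `max_pages` acts as a simple safety limit.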