Articles Tagged With: discovery

Web Scraping Target Discovery

Target discovery in web-scraping is how the scraper explores target website to find scraping targets. For example to scrape product data of an e-commerce website we would need to find urls to each individual product. This step is called "discovery". What types of discovery methods are there?

Web Scraping Target Discovery: Crawling

The most common web scraping target discovery technique: recursive crawling. How does it work? What are the pros and cons and the most optimal execution patterns?

Web Scraping Target Discovery: Indexes

The most common web scraping target discovery technique: recursive crawling. How does it work? What are the pros and cons and the most optimal execution patterns?

Web Scraping Target Discovery: Search API

Fundamental web-scraping reverse-engineering technique is figuring out how website's search works. Replicating web search in web scraping is a great target discovery technique. Why, when and how should it be used effectively?

Web Scraping Target Discovery: Sitemaps

There are many techniques when it comes to discovery web-scraping targets. One of the most common ones is to use website sitemap indexes. What are they and to take advantage of them in web-scraping?