
Posts

Showing posts from March, 2023

Scraping Complex Websites Made Easy: A Step-by-Step Guide - Part 2

Demystifying Web Scraping: How to Extract Data from Complex Websites

In the previous part of this tutorial, we did the groundwork: setting up the spider, collecting products, finding the price API, and a few other useful lessons. I advise you to check that out first. In Part 2 of Scraping Complex Websites, we will see how to extract the extra products that are loaded as we scroll down. The steps involved are: finding the right request in the Network tab; copying and formatting the curl request; mimicking the browser request in Scrapy; making a POST request in Scrapy; and merging the code. This part has a lot of technical detail, so I decided to present it as a video. Feel free to check the source code on GitHub.
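To give a rough idea of the "copying and formatting the curl request" step, here is a minimal sketch that splits a command copied from the browser (Network tab, "Copy as cURL") into a URL, method, headers, and body. It is not the code from the video. The function name `parse_curl`, the example.com URL, the headers, and the `{"page": 2}` payload are all placeholders made up for illustration, and only the most common curl options are handled.

```python
import json
import shlex


def parse_curl(command: str) -> dict:
    """Split a copied curl command into url, method, headers and body.

    Hypothetical helper for illustration. It only understands -H, -X and
    the -d/--data* options; any other option that takes a value would be
    misread, so treat this as a sketch rather than a full curl parser.
    """
    # Join the backslash-continued lines the browser emits, then tokenize
    # the command the way a POSIX shell would.
    tokens = shlex.split(command.replace("\\\n", " "))
    if tokens and tokens[0] == "curl":
        tokens = tokens[1:]

    parsed = {"url": None, "method": None, "headers": {}, "body": None}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("-H", "--header"):
            name, _, value = tokens[i + 1].partition(":")
            parsed["headers"][name.strip()] = value.strip()
            i += 2
        elif tok in ("-X", "--request"):
            parsed["method"] = tokens[i + 1].upper()
            i += 2
        elif tok in ("-d", "--data", "--data-raw", "--data-binary"):
            parsed["body"] = tokens[i + 1]
            i += 2
        elif tok.startswith("-"):
            i += 1  # value-less flags such as --compressed are skipped
        else:
            parsed["url"] = tok
            i += 1

    # Like curl itself, default to POST when a body is present.
    if parsed["method"] is None:
        parsed["method"] = "POST" if parsed["body"] is not None else "GET"
    return parsed


# Placeholder command; a real one would come from the browser's Network tab.
EXAMPLE_CURL = """curl 'https://example.com/api/products' \\
  -H 'content-type: application/json' \\
  -H 'accept: application/json' \\
  --data-raw '{"page": 2}' \\
  --compressed"""

request_kwargs = parse_curl(EXAMPLE_CURL)
payload = json.loads(request_kwargs["body"])  # edit the page number here
print(request_kwargs["method"], request_kwargs["url"], payload)
```

Inside a spider, the resulting pieces can be passed to `scrapy.Request(url, method="POST", headers=..., body=...)`. Scrapy also ships `Request.from_curl()`, which builds a request straight from the copied command, so a hand-written parser like this one is mainly useful for seeing what the browser actually sent.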

Scraping Complex Websites Made Easy: A Step-by-Step Guide

Demystifying Web Scraping: How to Extract Data from Complex Websites

Web scraping has become an essential skill for data analysts, researchers, and developers in various fields. It involves extracting data from websites and storing it in a structured format for analysis or use in other applications. In this tutorial, we will learn how to scrape a complex website using Scrapy, a Python-based web scraping framework. Our target website will be www.walmart.ca, a popular e-commerce website in Canada. By the end of this tutorial, you will have the skills and knowledge to scrape any website using Scrapy and handle complex website structures. Let's get started!

I. Introduction

While learning to scrape simple websites like https://quotes.toscrape.com/ is a good starting point, many beginners struggle with applying their skills to real-life websites that clients require. This is because such websites often contain complex features and structures that require additional training to handle. In...