Web Scraping Using Scrapy

Sumitkumar Pandey

What is Web Scraping?

Web scraping is the automated extraction of large amounts of data from websites. The extracted data can then be saved to a local file on your computer or to a database table.
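As a rough illustration of the two storage options mentioned above, here is a minimal sketch that uses only the Python standard library. The records, the `product`/`price` field names, and the `products` table are hypothetical and stand in for whatever a scraper actually returns.

```python
import csv
import sqlite3

# Hypothetical records standing in for data a scraper might return.
rows = [
    {"product": "Keyboard", "price": 24.99},
    {"product": "Mouse", "price": 12.50},
]


def save_to_csv(records, path):
    """Write the records to a local CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["product", "price"])
        writer.writeheader()
        writer.writerows(records)


def save_to_db(records, conn):
    """Insert the records into a `products` table, creating it if needed."""
    conn.execute("CREATE TABLE IF NOT EXISTS products (product TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products (product, price) VALUES (:product, :price)",
        records,
    )
    conn.commit()
```

For example, `save_to_csv(rows, "products.csv")` writes a local file, and `save_to_db(rows, sqlite3.connect("products.db"))` stores the same records in a SQLite database.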

We can then analyze this data. For example, we can scrape product prices from e-commerce websites and compare or track them.
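To make the price example concrete, here is a small sketch of the analysis step. The price strings are made up, and the `parse_price` and `summarize` helpers are hypothetical names; real pages may format prices differently.

```python
from statistics import mean

# Hypothetical price strings, as they might appear on product pages.
scraped_prices = ["$24.99", "$12.50", "$1,299.00"]


def parse_price(text):
    """Convert a string such as '$1,299.00' to a float."""
    return float(text.replace("$", "").replace(",", "").strip())


def summarize(price_strings):
    """Return the lowest, highest, and average price."""
    prices = [parse_price(p) for p in price_strings]
    return {"min": min(prices), "max": max(prices), "mean": mean(prices)}


print(summarize(scraped_prices))
```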

Why Web Scraping?

Data displayed on a website can normally only be viewed in the browser; there is no built-in way to save it in a structured form. To keep it, we would have to copy and paste page after page by hand, which is tedious and slow. A scraper can collect the same information automatically in a fraction of the time.

Scrapy Framework

Scrapy is a web scraping framework written in Python. It can be used for purposes such as data mining, monitoring, and automated testing. Scrapy is open source and, at the time of writing, was available for Python 2.7 and Python 3.4 and above.

Here, we will see how easily we can scrape websites using the Scrapy framework.

Steps

Requirements

Install Python 2.7 or Python 3.4 or above (current Scrapy releases require Python 3). Python can be downloaded from the official Python website.

Install Scrapy

Open your command prompt or terminal and type:

pip install scrapy
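
One way to confirm the installation worked is to ask Python which version of the package it can see. This is a small sketch using `importlib.metadata` from the standard library (Python 3.8 and above); `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string if Scrapy is installed, otherwise None.
print("Scrapy version:", installed_version("scrapy"))
```
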
Scrapy Shell

Scrapy includes an interactive shell, the Scrapy shell, which is useful for testing and debugging your scraping code. You can also fetch and scrape URLs directly from it. Once you have successfully installed Scrapy, run the following in your command prompt or terminal:

scrapy shell

Fetch

Once the Scrapy shell has started, we can begin scraping. The fetch command requests the given URL and stores the downloaded page in the response object. For this example, I am going to use my friend's website, "ugentertainment.in".

fetch("http://ugentertainment.in/")
View

The view command opens the response in your default browser.

view(response)

The scraped page opens in your default browser, where you can compare it with the original website.
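
The copy that opens is a local file rather than the live site: view saves the response body to a temporary file and opens that file. The sketch below imitates the idea with the standard library only. It is not Scrapy's own implementation, and `save_for_viewing` is a hypothetical helper name.

```python
import pathlib
import tempfile
import webbrowser


def save_for_viewing(body, open_browser=False):
    """Write HTML bytes to a temporary .html file and return its path."""
    with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as f:
        f.write(body)
        path = f.name
    if open_browser:
        # Open the saved copy in the default browser.
        webbrowser.open(pathlib.Path(path).as_uri())
    return path
```

Calling `save_for_viewing(response.body, open_browser=True)` from the shell would save and open the fetched page in a similar way.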

Scraped Website

Original Website

And with that, you have scraped your first website using Scrapy.