This project contains a set of scripts that scrape eBay product data using the Scrapy web crawling framework.

Problem

The client needs data (name, status, price, stars, ratings) for thousands of products on ebay.com to perform data analysis.

Task

Create an automated and fast solution to navigate the website, find the products by a search text, extract all the data, and save it in a user-friendly format (CSV or JSON).

Solution

I used the Scrapy web crawling framework to build a Python script that searches for products on eBay and scrapes (extracts) their data.

Results

The client was able to quickly download the data of more than 100,000 products from ebay.com in CSV and JSON format.


Source code

The solution is available on GitHub.

GitHub

How to use

Currently, the list of products to scrape is defined by a search string (the same one used on the eBay web page).

An example of the scraped data can be found in the data/ folder.

The image below shows scraped data for the “iphone X 256gb” search string on ebay.com:

ebay_iphone_x_256gb_products_sample

You will need Python 3.x to run the scripts. Python can be downloaded from python.org.

You also have to install the Scrapy framework, for example with pip:

pip install scrapy

Once Scrapy is installed, clone or download this project, open its folder in a command prompt or terminal, and run the following command:

scrapy crawl ebay -o products.csv

You can change the output format to JSON or XML by changing the output file extension (e.g., products.json).
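Once a crawl has finished, the output file can be loaded for analysis with the Python standard library alone. The following is a minimal sketch, not part of the project: the helper name load_products is hypothetical, and it assumes the CSV file has a header row and the JSON file holds a single array of objects, which is how Scrapy's feed exports normally write them.

```python
import csv
import json
from pathlib import Path


def load_products(path):
    """Load scraped products from a .csv or .json file into a list of dicts.

    Hypothetical helper: it assumes a CSV with a header row or a JSON file
    containing one array of objects.
    """
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # newline="" lets the csv module handle line endings itself.
        with path.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))
    if suffix == ".json":
        with path.open(encoding="utf-8") as f:
            return json.load(f)
    raise ValueError(f"Unsupported output format: {suffix}")
```

For example, load_products("products.csv") returns one dictionary per scraped product. Note that values read from CSV are always strings, so numeric fields such as the price need to be converted before analysis.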

Search string

The default search string is “nintendo switch console”. It can be changed on the command line with the -a flag. For example, to search for Xbox One X, use:

scrapy crawl ebay -o products.csv -a search="Xbox one X"
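Scrapy passes each -a name=value pair to the spider's constructor as a keyword argument, so the spider receives the search string as plain text. The sketch below shows one way such a string could be turned into a search URL. It is an illustration under assumptions, not the project's actual code: the helper name build_search_url is hypothetical, and the endpoint and the _nkw query parameter are assumed from the URLs eBay shows in the browser.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter name; not taken from the project's source.
EBAY_SEARCH_ENDPOINT = "https://www.ebay.com/sch/i.html"


def build_search_url(search="nintendo switch console"):
    """Return a search URL for the given search string (hypothetical helper)."""
    # urlencode escapes spaces and special characters in the search string.
    return f"{EBAY_SEARCH_ENDPOINT}?{urlencode({'_nkw': search})}"


print(build_search_url("Xbox one X"))
```

Encoding the string with urlencode rather than concatenating it by hand keeps search strings that contain spaces, quotes, or ampersands from breaking the URL.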