patkle/VGChartz-Scrapy-Project

VGChartz Scrapy Project

This project is configured to be hosted on Scrapy Cloud.

It uses Zyte Smart Proxy Manager as proxy service.

The dataset can be found here.
A Jupyter Notebook with some EDA on that data can be found here.

Game Statistics

The spider can be run with:

python3 -m scrapy crawl game_statistics -a pages=5 -O game_statistics.csv

Arguments

With -a you can specify arguments for the spider.

argument  type  description
pages     int   number of pages to scrape
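
Scrapy hands every -a value to the spider as a string, so an argument such as pages has to be converted before it can be used as a number. The following is a minimal sketch of that conversion, not this project's actual code: the helper name parse_pages and the default of 1 are assumptions made for illustration.

```python
def parse_pages(value=None, default=1):
    """Convert the raw `pages` spider argument into a positive int.

    `parse_pages` and `default` are hypothetical; they are not taken
    from this project.
    """
    if value is None:
        # Argument was not passed on the command line.
        return default
    pages = int(value)  # raises ValueError for non-numeric input
    if pages < 1:
        raise ValueError(f"pages must be >= 1, got {pages}")
    return pages


if __name__ == "__main__":
    print(parse_pages("5"))
```

A spider could call such a helper in its constructor and fail early on a bad value instead of partway through a crawl.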

Setting up locally

When setting up this project locally you must create a .env file with the following data:

setting                 description
ZYTE_SMARTPROXY_APIKEY  your Smart Proxy Manager API key
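
A .env file holds one KEY=value pair per line. How this project reads the file is not shown here, so the sketch below is only one possible approach using the standard library. The helper names load_env and get_api_key are hypothetical.

```python
import os
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=value lines from a .env file into a dict (hypothetical helper)."""
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for line in env_file.read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def get_api_key(path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    return (
        os.environ.get("ZYTE_SMARTPROXY_APIKEY")
        or load_env(path).get("ZYTE_SMARTPROXY_APIKEY")
    )
```

Checking the process environment first means the same lookup also works where the key is supplied as a setting rather than through a local file.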

Deploy to Scrapy Cloud

There's a shortcut in the Makefile: running make deploy will deploy the project to Scrapy Cloud, provided that you have set the project ID in scrapinghub.yml. Don't forget to add the following settings to your cloud project's settings:

setting                 description
ZYTE_SMARTPROXY_APIKEY  your Smart Proxy Manager API key

Also,

you could buy me a coffee if you wanted to. I'd really appreciate that.