Scrapy item exporters that allow exporting feeds in gzip-compressed form. Supported formats: CSV (csv.gz), XML (xml.gz), JSON (json.gz), and JSON lines (jsonl.gz, jsonlines.gz, jl.gz).
This package also adds a jsonl output format alias for the JSON lines feed.
Add the following line to your project's settings.py:
from scrapy_gzip_exporters import FEED_EXPORTERS
Set the desired output format:
FEED_FORMAT = 'csv.gz'
or specify it with the --output-format (-t) command-line option:
scrapy crawl myspider -o outfile.csv.gz -t csv.gz
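The compressed feed can be read back with the Python standard library alone. The following is a minimal sketch, not part of this package: the helper name read_gzip_csv and the sample columns (name, price) are hypothetical, and the sample file is written locally as a stand-in for outfile.csv.gz so the example runs without a crawl.

```python
import csv
import gzip
import os
import tempfile


def read_gzip_csv(path, encoding="utf-8"):
    """Return the rows of a gzip-compressed CSV file as a list of dicts."""
    # Mode "rt" decompresses and decodes in one step; newline="" lets the
    # csv module handle line endings itself.
    with gzip.open(path, mode="rt", encoding=encoding, newline="") as handle:
        return list(csv.DictReader(handle))


# Write a small stand-in for outfile.csv.gz (hypothetical columns and values).
sample_path = os.path.join(tempfile.mkdtemp(), "outfile.csv.gz")
with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as handle:
    writer = csv.DictWriter(handle, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerow({"name": "widget", "price": "9.99"})

rows = read_gzip_csv(sample_path)
print(rows)
```

If the feed was exported with a different encoding, pass it through the encoding argument of the helper.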
To get CSV encoded in the UTF-8 variant understood by Microsoft Excel (UTF-8 with a leading byte order mark), set:
FEED_EXPORT_ENCODING = 'utf-8-sig'
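To see what this encoding changes, here is a short standard-library sketch. It only illustrates Python's utf-8-sig codec and does not call Scrapy or this package; the sample text is made up for the example.

```python
import codecs

text = "name,price\n"  # hypothetical CSV header line

plain = text.encode("utf-8")
with_bom = text.encode("utf-8-sig")

# utf-8-sig output is the plain UTF-8 bytes prefixed with the 3-byte BOM.
has_bom_prefix = with_bom == codecs.BOM_UTF8 + plain

# Decoding with utf-8-sig strips the BOM again, so the text round-trips.
decoded = with_bom.decode("utf-8-sig")

print(has_bom_prefix, len(with_bom) - len(plain), decoded == text)
```

When reading such a file back in Python, open it with encoding "utf-8-sig" so the byte order mark does not end up in the first column name.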