GoSber is a versatile Go script for web scraping. It offers multiple scraping modes, the ability to specify search prompts, and the option to scrape specific links. Additionally, it can automatically connect to a PostgreSQL database using the DB_URL environment variable.
A parser and a promo duplicator will be added soon; the functionality will be combined into a single project, SberTool.
- Selection of different modes for customizable scraping
- Search prompts to target specific content
- Scraping of data from specific URLs
- Automatic connection to a PostgreSQL database using DB_URL
To get started with sber-scrape, follow the instructions below:
- Download the appropriate executable for your platform from the releases page.
- Run it, passing flags if needed:
  ./sber-scrape -mode <mode> -search <search-prompt> -table-name <table-name>
- Clone this repository:
  git clone https://github.com/malvere/GoSber.git
  cd GoSberScrape
- Build the project:
  go build
- Run:
  ./sber-scrape -mode <mode> -search <search-prompt>
3.1 Available Flags:
- -mode: Mode to run in. One mode makes HTTP requests and parses the HTML body, another searches for a local .html file, and a third uses API requests and parses the JSON body (the preferred method).
- -search: Searches with a specific prompt.
- -url: Parses using a predefined URL. You can set up your search prompt with filters on megamarket, then copy the URL and paste it into the scraper.
- -table-name: Name of the table in the database.
- -pages: How many pages to parse.
3.2 Usage:
If -search is passed, the scraper searches by your specified prompt. If -url is passed, the search is done according to the specified link.
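For example, a run that combines the flags above, searching for a prompt across several pages and writing to a named table, could look like the following. The mode placeholder and the search, page, and table values are illustrative, not defaults shipped with the tool:
./sber-scrape -mode <mode> -search "laptop" -pages 3 -table-name products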
If you have a PostgreSQL database, sber-scrape can connect to it by setting the DB_URL environment variable. The script will use it to establish a connection.
export DB_URL="postgres://username:password@localhost/database"
./sber-scrape
If DB_URL is not specified, a .csv file with the parsing results will be generated next to the executable.
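For reference, the sketch below shows one common way a Go program can establish such a connection from DB_URL, using the standard database/sql package with the lib/pq driver. It is an illustrative assumption, not necessarily the driver or code that sber-scrape uses internally:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"
	"os"

	_ "github.com/lib/pq" // PostgreSQL driver (assumption; the project may use a different one)
)

func main() {
	// e.g. postgres://username:password@localhost/database
	// depending on the server, appending ?sslmode=disable may be required
	dsn := os.Getenv("DB_URL")
	if dsn == "" {
		log.Println("DB_URL not set, falling back to CSV output")
		return
	}

	db, err := sql.Open("postgres", dsn)
	if err != nil {
		log.Fatalf("opening connection: %v", err)
	}
	defer db.Close()

	// Ping to verify the connection actually works.
	if err := db.Ping(); err != nil {
		log.Fatalf("pinging database: %v", err)
	}
	fmt.Println("connected to PostgreSQL")
}
```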
License: MIT
Contributions are welcome! Feel free to open issues and submit pull requests to help improve this project.