An interactive CLI tool for choosing CSS selectors for a web page. Designed for use as a library with BeautifulSoup and Scrapy.
This project uses the BeautifulSoup and rich libraries to create an interactive element-selecting experience. It can be run as a program or used as a library.
This project was made using Python 3.10.12 and pip 22.0.2. See requirements.txt for module information.
git clone https://github.com/Makaze/csschooser.git
cd csschooser
pip install -r requirements.txt
$ python3 csschooser.py
Example using the BeautifulSoup library to print the text from all matching elements:
import csschooser

soup = csschooser.get_soup("http://github.com/Makaze/csschooser")  # Example URL
selector = csschooser.interactive_select(soup)
for tag in soup.select(selector):
    print(tag.get_text().strip())
Takes in a string name and returns a BeautifulSoup instance based on the contents of the file or URL named name. Raises a FileNotFoundError if name is neither a valid URL nor a valid file name.
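The URL-or-file lookup described above can be sketched with the standard library alone. This is a minimal illustration, not the project's implementation: read_source is a hypothetical helper name, the check for an http/https scheme is an assumption about what counts as a valid URL, and the step of wrapping the returned text in a BeautifulSoup instance is left out so the sketch has no third-party dependencies.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def read_source(name: str) -> str:
    """Hypothetical helper: return the HTML text behind a URL or file name."""
    parsed = urlparse(name)
    # Assumption: only http(s) addresses with a host count as valid URLs.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        with urlopen(name) as response:
            return response.read().decode("utf-8", errors="replace")
    # Otherwise treat the name as a path on the local file system.
    path = Path(name)
    if path.is_file():
        return path.read_text(encoding="utf-8")
    raise FileNotFoundError(
        f"{name!r} is neither a valid URL nor a valid file name"
    )
```

The returned text could then be handed to BeautifulSoup, for example with BeautifulSoup(html, "html.parser").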
Takes in a string s and returns a Regular Expression pattern as a string for matching the outermost element in s. Returns s unchanged if it contains no elements.
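One possible reading of this behaviour is sketched below. The name outer_element_pattern and the exact shape of the generated pattern are assumptions made for illustration; the project's own function may build its pattern differently. The sketch finds the first opening tag in s and returns a pattern that spans from that tag to the last closing tag of the same name.

```python
import re


def outer_element_pattern(s: str) -> str:
    """Hypothetical helper: regex (as a string) for the outermost element in s."""
    opening = re.search(r"<([A-Za-z][\w-]*)\b[^>]*>", s)
    if opening is None:
        return s  # no elements found, so s is returned unchanged
    tag = re.escape(opening.group(1))
    # (?s) lets "." cross newlines; the greedy ".*" runs to the last closing tag.
    return rf"(?s)<{tag}\b[^>]*>.*</{tag}\s*>"


html = 'x <div class="a"><p>hi</p></div> y'
pattern = outer_element_pattern('<div class="a"><p>hi</p></div>')
match = re.search(pattern, html)
print(match.group(0) if match else None)
```

Matching HTML with regular expressions is fragile in general; the sketch only covers the simple case of a single well-formed outer element.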
Takes in soup as a BeautifulSoup instance and prompts the user to enter a CSS selector. Matching elements are highlighted in an auto-scrolling output window. Clears the terminal screen and returns the last chosen selector when the user follows the prompt to exit.
Takes in an int lines. If lines is >= 1, moves the cursor up and to the end of the line lines times and returns the resulting backtrack sequence as a string. Otherwise calls the system's clear terminal command, clearing the terminal screen, then returns False.
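The two branches described above can be sketched with ANSI escape sequences. This is an illustration under stated assumptions, not the project's code: the function name backtrack, the out parameter, and the specific escape codes are all hypothetical. In particular, "end of the line" is approximated here by moving the cursor far to the right, which terminals clamp at the right margin.

```python
import os
import sys

CURSOR_UP = "\x1b[1A"        # ANSI CUU: move the cursor up one line
CURSOR_RIGHT = "\x1b[999C"   # ANSI CUF: move right; stops at the right margin


def backtrack(lines: int, out=sys.stdout):
    """Hypothetical sketch: move the cursor back up, or clear the screen."""
    if lines >= 1:
        sequence = (CURSOR_UP + CURSOR_RIGHT) * lines
        out.write(sequence)
        out.flush()
        return sequence
    # Fewer than one line requested: fall back to the system clear command.
    os.system("cls" if os.name == "nt" else "clear")
    return False
```

Because the return type differs between the two branches (a string or False), a caller should check the result with "is False" rather than relying on truthiness alone.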
Takes in console as a rich.console.Console instance and pretty as a string, then passes pretty to the console and sends the rich string to the system's pager utility (less on Linux systems).