This repository provides an http
service to get bills, titles, bills related by title, and bills related by section similarity. It is built on the self-documenting FastAPI in Python and uses data from other related repositories. Initially, this data was stored in a sqlite
database, but was migrated to a postgres
database. The original version of the API, with SQLite data embedded, is available in self-contained docker containers, through 0.1.5 (docker run -d -t -i -p 8000:8000 ghcr.io/aih/billtitles:0.1.5
)
Run using Docker and docker-compose (see below)
Or:
-
Clone this repository with
git clone https://github.com/aih/billtitles-py.git
-
Install Python > 3.7 (preferably with pyenv)
-
Install dependencies, including FastAPI and the uvicorn server (pip install -r requirements.txt)
-
Run the PostgreSQL database server (using docker:
docker run -d -t -i -p 5432:5432 postgres
) -
Run the server with
uvicorn billtitles.main:app --reload
(default port8000
)
This will provide an API to query for bills and titles; query similar bills by bill number; (soon) to query similar bills by title; and to create, update, and delete bills and titles.
Note
|
The bill-to-bill data is only up to date through the date of this README. |
The API is self-documenting, thanks to FastAPI. It can be viewed and tested at http://localhost:8000/docs
.
The api includes paths to get bills, titles, and bills related by title.
Within this repository is a docker-compose.yml
file that can be used to run the API in a docker container. To run this compose file requires at least 20GB of disk space (for the PostgresSQL container) and at least 4GB of RAM (for the Elasticsearch container). I used a machine with the following settings:
` podman machine init --disk-size=20 --memory=5000`
The docker-compose.yml
refers to two custom docker images:
-
docker.io/arihersh/billtitles (the FastAPI application, contained in this repository and defined by the Dockerfile at the top level of the directory)
-
docker.io/arihersh/billsim-pgsql (the PostgreSQL database. The tables for this are defined in models.py. It is loaded with ~ 2Gb of data derived from processes in the
github.com/aih/billsim
repository. The docker container is built with the dockerfile at dockerpgsa/Dockerfile in this repository.)
To run the docker-compose file, copy the .env-sample
file to .env
in the directory with the docker-compose.yaml
file and use:
docker-compose up -d
(add a &
at the end to run in the background)
-
Unzip the
.json.gz
gzip -d elasticdump.billsim.json.gz
gzip -d elasticdump.bill_full.json.gz
Note
|
these files are available separately, and are generated from data created by the github.com/aih/billsim repository.
|
-
Restore data to Elasticsearch
elasticdump \
--input "elasticdump.billsim.json" \
--output=http://localhost:9200/billsim --limit=50 \
--ignore-errors true
elasticdump \
--input "elasticdump.bill_full.json" \
--output=http://localhost:9200/bill_full --limit=50 \
--ignore-errors true
The database container alone can be run with:
docker container run -d --rm -p 5432:5432 -e POSTGRES_PASSWORD=postgres -e POSTGRES_USER=postgres --name billsim-data docker.io/arihersh/billsim-pgsql:latest
Then the database can be accessed at psql postgresql://postgres:postgres@localhost:5432
Note
|
The data in the Docker image is not complete, especially for related bills. The title data should be up-to-date as of the commit of this README. However, it is not meant to be used in production as-is. |
Note
|
For MacOs users, it may be necessary to set port forwarding in Virtualbox on MacOs to forward to a host port. Set Guest port to 8000 and host port to whatever you want to use on your local machine (I also use 8000). To set the port forwarding, follow the instructions here: https://www.jhipster.tech/tips/020_tip_using_docker_containers_as_localhost_on_mac_and_windows.html |
See also the github.com/aih/billsim
and the github.com/aih/bills
repositories. The billsim
repository processes bill data into a PostgresSql database, while the bills
repository provides a Go module to calculate similarity scores between bills.