Scispacy has two components:
- The scispacy pip package
- The scispacy models
The scispacy pip package is published automatically by the .github/actions/publish.yml GitHub action, which runs whenever a release is published (with an associated tag) in the GitHub releases UI.
In order to create a new release, the following should happen:
Update the version in version.py.
The entire pipeline can be run using spacy project run all. This will train and package all the models.
The packages should then be uploaded to the https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/{VERSION} S3 bucket, and references to previous models (e.g., in the README and in the docs) should be updated. You can find all these places using git grep <previous version>.
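As a rough illustration of the two steps above, the sketch below fills in the {VERSION} placeholder of the release URL and scans a working tree for lines that still mention the previous version. The function names `release_url` and `find_stale_references`, the default file suffixes, and the version strings in the usage comments are hypothetical and not part of scispacy. The exact form of the version string (for example, with or without a leading "v") is not specified here, so it is passed through unchanged. The scan is a plain-Python stand-in for git grep, not a replacement for it.

```python
from pathlib import Path

# URL template quoted from the text above; {VERSION} is filled in per release.
RELEASE_URL_TEMPLATE = (
    "https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/{VERSION}"
)


def release_url(version: str) -> str:
    """Return the release location for a version string, passed through as-is."""
    return RELEASE_URL_TEMPLATE.format(VERSION=version)


def find_stale_references(root: Path, previous_version: str, suffixes=(".md", ".py")):
    """Yield (path, line_number, line) for each line mentioning previous_version.

    Hypothetical helper: unlike git grep, it walks the directory tree directly
    (tracked or not) and only reads files with the given suffixes.
    """
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for number, line in enumerate(text.splitlines(), start=1):
            if previous_version in line:
                yield path, number, line.strip()


# Example usage (version strings are placeholders):
#   print(release_url("1.2.3"))
#   for path, number, line in find_stale_references(Path("."), "1.2.2"):
#       print(f"{path}:{number}: {line}")
```

Running the scan after updating the references should yield nothing, which is a quick way to confirm no mention of the old version was missed in the file types it covers.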
The scripts install_local_packages.py, install_remote_packages.py, print_out_metrics.py, smoke_test.py, and uninstall_local_packages.py are useful for testing at each step of the process. Before uploading, install_local_packages.py and smoke_test.py can be used to make sure the packages are installable and do a quick check of output. print_out_metrics.py can then be used to easily get the metrics that need to be updated in the README. Once the packages have been uploaded, uninstall_local_packages.py, install_remote_packages.py, and smoke_test.py can be used to ensure everything was uploaded correctly.
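The order in which these scripts are used can be summarized as a small checklist. The sketch below only models that order. The script names come from the text, but `release_checklist` is a hypothetical helper, and the arguments and working directory each script expects are not specified here, so nothing is executed.

```python
from typing import List

# Script names are taken from the text above. How each script is invoked
# (arguments, working directory) is not specified there, so only the order
# of use is modeled.
PRE_UPLOAD: List[str] = [
    "install_local_packages.py",   # check the built packages install
    "smoke_test.py",               # quick check of model output
    "print_out_metrics.py",        # collect metrics for the README
]
POST_UPLOAD: List[str] = [
    "uninstall_local_packages.py",  # clear the locally built packages
    "install_remote_packages.py",   # install from the uploaded location
    "smoke_test.py",                # confirm the uploaded packages work
]


def release_checklist(uploaded: bool) -> List[str]:
    """Return the scripts to run, in order, before or after the upload."""
    return list(POST_UPLOAD if uploaded else PRE_UPLOAD)


# Example usage:
#   for step in release_checklist(uploaded=False):
#       print(step)
```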
Merge a PR with the above changes, and publish a release with a tag corresponding to the commit from the merged PR. This should trigger the publish GitHub action, which will create the scispacy package and publish it to PyPI.