This repository helps you set up a reproducibility proof for your project. It's quite easy, trust me.
How can scientists unmistakably know whether their results can be reproduced by other people? How can reviewers verify that a certain numerical experiment published in an article is correct? And how can collaborators quickly understand how to use the source code you have written for your project? As the age-old saying in the Swiss army goes:
Vertrauen ist gut; Kontrolle ist besser. - Trust is good; control is better.
Provable reproducibility is an initiative that aims to make results published in articles, theses, and software packages easier to reproduce and verify. No more ambiguity, data manipulation, cherry-picked parameters, or hand-crafted results. Every figure, plot, and table in a provably reproducible project can be unequivocally traced back to its origin.
To make your GitHub project provably reproducible, run the following command:
git pull https://github.com/FMatti/Re-Pro <preset> --allow-unrelated-histories
Currently, you may choose from the following presets:
Preset | Description |
---|---|
python-latex | Python scripts and LaTeX project |
python-latex-bibtex | Python scripts and LaTeX project with BibTeX bibliography component |
matlab-latex | (UNDER CONSTRUCTION) MATLAB scripts and LaTeX project |
julia-latex | (UNDER CONSTRUCTION) Julia scripts and LaTeX project |
In the .github/workflows directory, you may have to modify the reproduce.yml file as follows:
- Change the LATEX_PROJECT variable to the name/path of your LaTeX main file (without the .tex extension)
- Change the CODE_FILES variable to the location of the code file(s) you want to execute with the pipeline (glob patterns supported)
- Change the package imports to the packages you depend on in your code
- Change the LaTeX setup to the packages/compilers your project requires
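To get a feel for which files a glob pattern in CODE_FILES would select, the following sketch tries a few patterns against a throwaway directory layout using Python's glob module. The helper name expand_code_files and the file names are hypothetical, and the workflow itself may expand patterns differently (for instance through the shell, where `**` only recurses if the shell supports it), so treat this as an illustration rather than a description of the pipeline.

```python
import glob
import os
import tempfile


def expand_code_files(pattern, root="."):
    """Return the sorted paths below `root` that match a glob pattern."""
    # recursive=True lets "**" match any number of nested directories
    matches = glob.glob(os.path.join(root, pattern), recursive=True)
    return sorted(os.path.relpath(match, root) for match in matches)


# Build a throwaway project layout (hypothetical file names) to test patterns
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "src", "utils"))
    for name in ["plot.py", "src/analysis.py", "src/utils/helpers.py", "src/notes.txt"]:
        with open(os.path.join(root, *name.split("/")), "w"):
            pass

    top_level = expand_code_files("*.py", root)      # scripts in the root only
    one_level = expand_code_files("src/*.py", root)  # scripts directly in src/
    nested = expand_code_files("src/**/*.py", root)  # scripts anywhere below src/

print(top_level, one_level, nested)
```

A pattern such as `*.py` only picks up scripts in the repository root, so projects that keep their code in subdirectories will likely need a more specific pattern.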
Finally, commit and push the changes to GitHub:
git commit --all -m "add reproducibility proof"
git push
On your GitHub repository, every time you push to the main branch, a pipeline will be executed. If everything goes well, a green check mark will appear next to the commit message.
You can view all the GitHub actions in the Actions tab. Click on one to see all the details.
This is also where you can download a ZIP archive with the generated PDF in it.
Once you publish your GitHub repository, everyone can inspect your code and the steps used to generate your results. If an error occurs in your pipeline, a red cross appears and you can inspect the action as above to see what happened.
The Re-Pro badge is the certificate of reproducibility, which you can display in your document. It certifies that a document was indeed produced based on a given state of a source.
The badge is automatically generated. You can display it in your LaTeX documents with the following command:
\input{re-pro-badge.tex}
Doing this requires the tikz and hyperref packages, which you should include in your preamble using the \usepackage{...} command.
The main branch of this repository (which you are currently viewing) serves as an example for a provably reproducible project. It includes a
- Python script called plot.py which downloads and saves a matrix (matrix.npz) from the internet, analyzes its principal components, and visualizes them in a plot which is saved as plot.pgf.
- LaTeX project main.tex with a bibliography bibliography.bib which produces a PDF in which the generated plot plot.pgf is included.
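The analysis step in such a script could look roughly like the sketch below, which computes principal components through a singular value decomposition with NumPy. This is not the repository's actual plot.py: the function name principal_components is hypothetical, and a small synthetic matrix stands in for the downloaded matrix.npz.

```python
import numpy as np


def principal_components(matrix, n_components=2):
    """Return the leading principal directions and their explained-variance ratios."""
    # Center each column so the components describe variance, not the mean
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values**2
    ratios = variance / variance.sum()
    return vt[:n_components], ratios[:n_components]


# Synthetic stand-in for the downloaded matrix: points spread along (1, 1)
rng = np.random.default_rng(0)
t = rng.normal(size=200)
data = np.column_stack([t, t + 0.01 * rng.normal(size=200)])

components, ratios = principal_components(data)
print(components[0], ratios[0])
```

Fixing the random seed, as done here, is one small ingredient of reproducibility: the pipeline then produces the same numbers on every run.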
Unless you are using some extraordinary dependencies or features in your project, your repository should now be configured for provable reproducibility. Common problems when setting up a reproducibility proof are:
- If the branch you want to run the reproducibility proof on is not called main, you'll need to modify the branches: key at the top of the reproduce.yml file.
- If your code files or LaTeX project are located in subdirectories, relative imports may no longer work, so you'll need to manually specify the working directory by adding working-directory: [PATH] below the commands which run the code.
- In case you already use another GitHub action implemented in a file called reproduce.yml, you might have to resolve merge conflicts.
If you want to commit and push the files generated by the reproducibility proof to your repository, add write permissions to the build job:
jobs:
  build:
    permissions:
      contents: write
Subsequently you can add the following step to your action (make sure to replace [FILES] with the space-separated paths of the files you want to commit):
- name: Commit and push generated files to repository
  run: |
    git config --global user.name "github-actions[bot]"
    git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"
    git add [FILES]
    git commit -m "reproduce project"
    git push
Another advantage of tracking your code in a GitHub repository is that you can view and edit your project from Overleaf. The process for setting this up is described in the Overleaf guide on GitHub Synchronisation.
This example repository also serves as a demonstration of how matplotlib plots are to be exported and included in LaTeX projects. Any other way than using .pgf files for this purpose should be prosecuted as a criminal offence.
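A minimal sketch of such an export, assuming a recent matplotlib version and a working LaTeX installation for the actual save step, could look as follows. The function names make_figure and export_pgf and the plotted values are hypothetical and not taken from the repository's plot.py.

```python
from matplotlib.figure import Figure


def make_figure(values):
    """Build a small line plot without touching any global pyplot state."""
    fig = Figure(figsize=(4, 3))
    ax = fig.subplots()
    ax.plot(values, marker="o")
    ax.set_xlabel("index")
    ax.set_ylabel("value")
    return fig


def export_pgf(fig, path="plot.pgf"):
    """Write the figure as PGF; this step needs a working LaTeX installation."""
    fig.savefig(path, backend="pgf")


# Placeholder values for illustration only
fig = make_figure([0.5, 0.3, 0.15, 0.05])
# export_pgf(fig)  # uncomment where LaTeX is installed
```

On the LaTeX side, the resulting file can then be included with \input{plot.pgf}, which requires \usepackage{pgf} in the preamble. The text in the plot is typeset by LaTeX itself, so fonts and sizes match the surrounding document.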