This repository provides instructions for running two software packages: GeoCARET and RE-Emission from their respective Docker images.
GeoCARET is open-source software for geospatial analysis of reservoirs and catchments. RE-Emission is an open-source tool for estimating, reporting and visualizing reservoir emissions. These tools can be used sequentially to estimate emissions from multiple reservoirs. GeoCARET processes geospatial datasets to generate input data for emission models, which RE-Emission then uses to calculate, report and visualize emissions.
Both tools are written in Python and can be installed as Python packages. To simplify and speed up the installation process, we also provide their Docker images (see here for GeoCARET and here for RE-Emission) which enable running both tools via command-line in isolated environments.
This repository outlines how to run both packages as Docker containers and provides the folder structure required to facilitate data exchange between your local file system and the containerized applications.
Before running the software, you need a working Docker installation. If you are using Windows or macOS, you should install Docker Desktop, which provides an easy-to-use interface for managing Docker on desktop operating systems. Please visit Docker's official documentation and follow the appropriate instructions for installing Docker Desktop on your system. Linux users also have the option of installing Docker Engine and Docker Compose as an alternative.
Docker can be set to start automatically at system startup. If you choose not to enable this option, you'll need to launch it manually before use. To check if Docker is running, open your console or terminal and type docker -v.
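Note that docker -v reports the client version without contacting the Docker daemon. For a stricter check, the following is a minimal Python sketch that assumes the standard docker info command, which exits with a non-zero status when the daemon cannot be reached. The function name check_docker and its return values are illustrative and not part of either tool.

```python
import shutil
import subprocess


def check_docker(executable="docker"):
    """Return 'missing', 'daemon-unreachable', or 'running'."""
    # The client binary must be on PATH before anything else can work.
    if shutil.which(executable) is None:
        return "missing"
    try:
        # "docker info" queries the daemon and exits non-zero if it is unreachable.
        result = subprocess.run(
            [executable, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return "daemon-unreachable"
    return "running" if result.returncode == 0 else "daemon-unreachable"
```

Calling check_docker() before any of the commands below gives an early, readable failure instead of a Docker error part-way through a run.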
Additionally, to run GeoCARET, you will need to set up an account with Google Earth Engine / Google Cloud and request access to the private assets hosted in our project folder. Detailed instructions on how to do this are available in the following documentation pages:
It is recommended to fetch the Docker images directly from the GitHub Container Registry (GHCR) using the following commands:
docker pull ghcr.io/reservoir-research/geocaret:release
docker pull ghcr.io/tomjanus/reemission:release
Docker automatically names the pulled images based on their URLs. After you have pulled both images, the docker images command should produce output similar to the following:
❯ docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
ghcr.io/reservoir-research/geocaret release xxxxxxxxxxxx x weeks ago x.xxGB
ghcr.io/tomjanus/reemission release xxxxxxxxxxxx x weeks ago x.xxGB
Image names have a [REPOSITORY]:[TAG] format. In other words, if you need to refer to the images, use ghcr.io/reservoir-research/geocaret:release and ghcr.io/tomjanus/reemission:release.
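If you handle image names in scripts, the [REPOSITORY]:[TAG] convention can be split as in the sketch below. The helper split_image_ref is hypothetical, and it deliberately ignores digest references (those containing @sha256:), which this README does not use.

```python
def split_image_ref(ref):
    """Split 'repository:tag' into (repository, tag); tag is None if absent."""
    repository, sep, tag = ref.rpartition(":")
    # A colon followed by a "/" belongs to a registry port
    # (e.g. localhost:5000/app), not to a tag.
    if not sep or "/" in tag:
        return ref, None
    return repository, tag


print(split_image_ref("ghcr.io/reservoir-research/geocaret:release"))
print(split_image_ref("ghcr.io/tomjanus/reemission:release"))
```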
Alternatively, you can build the images from source by downloading the source code for each package from their respective GitHub repositories. Instructions for building GeoCARET can be found here, and for RE-Emission here.
Running both applications requires a specific folder structure that maps directories on your local file system to corresponding locations inside the Docker containers, enabling data exchange such as reading inputs and saving outputs. This mapping is defined in two separate compose.yaml files, one for each application.
The folder structure includes a demo.csv input file for a GeoCARET demonstration run and a test_input.json input file for testing RE-Emission. You can also create your own input files for both GeoCARET and RE-Emission, placing them in the appropriate folders as shown in the file tree below.
geocaret-reemission (root)
├── geocaret
│   ├── auth
│   ├── compose.yml
│   ├── data
│   │   ├── demo.csv
│   │   └── README.md
│   └── outputs
└── reemission
    ├── compose.yaml
    ├── examples
    │   └── test_input.json
    └── outputs
The geocaret/data and geocaret/outputs directories are used to store the input and output data for GeoCARET calculations, respectively. The geocaret/auth folder holds authentication credentials for Earth Engine and Google Cloud. Similarly, reemission/examples and reemission/outputs store the input and output data for RE-Emission.
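Before running either tool, it can help to confirm that the directories described above exist. The sketch below is one way to do that in Python. The list of directories is taken from the file tree in this README, and the function name missing_dirs is illustrative.

```python
from pathlib import Path

# Directories taken from the file tree in this README. Compose files are not
# checked here; this only covers the folders mapped into the containers.
EXPECTED_DIRS = [
    "geocaret/auth",
    "geocaret/data",
    "geocaret/outputs",
    "reemission/examples",
    "reemission/outputs",
]


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Run from the repository root, missing_dirs(".") returns an empty list when the structure is complete.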
To test that everything is working correctly, first run the following command from inside the GeoCARET workspace folder geocaret:
docker compose run --rm geocaret
Note
You should see the following message: You must specify a command to run. See https://Reservoir-Research.github.io/geocaret/running_geocaret/running_docker.html for details and GeoCARET will exit.
More information about running GeoCARET with Docker can be found in GeoCARET's documentation.
To test whether the RE-Emission Docker image is working, run RE-Emission using docker compose, similar to how you ran GeoCARET. Start from the reemission folder:
docker compose run --rm reemission
You should see a short usage guide, beginning with:
Usage: reemission [OPTIONS] COMMAND [ARGS]...
This will be followed by additional details about the available options and commands for the RE-Emission command-line interface (CLI). More information on running RE-Emission with Docker can be found here.
You can run a test analysis using the input data provided in ./examples/test_input.json. From within the reemission folder, enter the following command:
docker compose run --rm reemission reemission calculate examples/test_input.json -o outputs/test_output.json -o outputs/test_output.xlsx
The analysis should complete successfully, and you should find two output files, test_output.json and test_output.xlsx, in the outputs folder.
Attention
The two instances of "reemission" in the command are intentional. The first is the name of the Docker Compose service to run, while the second invokes the RE-Emission CLI inside the container.
The output files include the input, intermediate, and final output variables in both .json and Excel formats.
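To take a first look at the JSON output from a script, the sketch below loads a file and lists its top-level keys. It makes no assumption about RE-Emission's output schema, which is not described here. The function name describe_json is illustrative.

```python
import json


def describe_json(path):
    """Return the top-level type name and, for JSON objects, the sorted keys."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    keys = sorted(data) if isinstance(data, dict) else []
    return type(data).__name__, keys
```

For example, describe_json("outputs/test_output.json") can be called from within the reemission folder after the test analysis has finished.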
Note
RE-Emission also supports generating output reports in PDF format. However, this feature requires a working LaTeX installation, which is not currently included in the RE-Emission Docker image. To generate a PDF report, provided that LaTeX is installed in your OS, add the -o [output-filename].pdf flag to the reemission calculate command. You can view an example PDF report in the RE-Emission documentation here.
You can run a short demo that shows how RE-Emission processes tabular input data, along with reservoir and catchment delineations from GeoCARET, to calculate reservoir emissions and visualize them on an interactive map. More details about the demo can be found on the corresponding page in RE-Emission's documentation. To run the demo from within the reemission folder, use the following command:
docker compose run --rm reemission reemission run-demo examples
where examples is the directory where all input and output files will be stored.
First, pre-calculated outputs from GeoCARET will be fetched from an online source. This includes tabular outputs with input data for the emission model and geospatial data with delineations (reservoir, river, and catchment) in .shp format, one for each analyzed dam.
Once the analysis is complete, you will find several files and folders in the examples directory, including the fetched input data. The structure of the examples folder should look like this:
examples
├── demo_interactive_map
├── geocaret_outputs
├── reemission_demo_dam_db
├── reemission_demo_delineations
├── reemission_outputs
└── test_input.json
The contents of each folder are explained below:
reemission_demo_delineations: Contains geospatial outputs from GeoCARET, including reservoir and catchment delineations, impounded river segments, and tabular outputs with emission input parameters in output_parameters.csv. Each folder contains outputs from a single simulation run. This folder is fetched from a remote source.
reemission_demo_dam_db: Contains a shapefile of hydroelectric dams, represented as points with associated data about dam and turbine properties. This folder is also fetched from a remote source.
geocaret_outputs: Contains merged shapefiles from reemission_demo_delineations, updated with emission estimates calculated by RE-Emission. This folder also includes the reemission_inputs.json file, created by merging and converting multiple output_parameters.csv files.
reemission_outputs: Contains the output data generated by RE-Emission in both .json and .xlsx formats.
demo_interactive_map: Contains an interactive map that visualizes the delineated reservoirs and their estimated emissions.
An animated GIF demonstrating how to run the demo (albeit not from within the Docker container) is shown below. Running the demo using the Docker image should look the same, except for the initial command.
The interactive map can also be accessed at https://tomjanus.github.io/reemission_demo_map.
Here, we will demonstrate how to run GeoCARET and RE-Emission sequentially. We will begin with dam locations and dam heights/full supply levels, then use GeoCARET to delineate the respective reservoirs and catchments, and calculate reservoir- and catchment-level inputs for the emission model. Finally, we will calculate reservoir emissions using RE-Emission.
The demo illustrates the process for a batch of three reservoirs.
Note
To run the analysis, you need to be authenticated with GeoCARET, have a Google Cloud project folder set up, and have at least one assets directory within your Google Cloud project folder. For details on how to do this, please refer to the Account Setup section of the GeoCARET documentation. For this example, we will assume that the project folder is named geocaret-demo and that it contains an assets folder called emissions.
Attention
An internet connection needs to be present throughout the GeoCARET calculations. Short interruptions in connectivity are permissible.
The majority of the processing time is spent in GeoCARET, where reservoirs and catchments are delineated and their respective characteristics computed. The computational time varies depending on the size of the reservoirs and catchments, as well as the availability of Google’s servers and their current workload. In our experience, it took 19 minutes to analyze three dams during this GeoCARET demonstration run, using an Earth Engine account with unpaid usage. Please note that this time does not include the optional export of delineations to Google Drive, which took an additional 2.5 minutes. The run times for the subsequent steps are measured in seconds.
Here, we will outline the steps necessary to calculate the reservoir emissions associated with the construction of three dams. The process involves computing the reservoir and catchment characteristics for each dam and utilizing this data to estimate greenhouse gas (GHG) emissions.
Assuming your project is called geocaret-demo and you choose a standard simulation scenario (see Export Settings and Outputs in GeoCARET's documentation for options), you can run GeoCARET from within the geocaret folder using the following command:
docker compose run --rm geocaret python heet_cli.py data/demo.csv geocaret-demo demo standard
This run delineates the reservoirs and catchments for three dam locations and calculates the characteristics of each that are relevant for estimating reservoir GHG emissions with RE-Emission.
After the analysis completes, you should see a subfolder in the outputs folder named DEMO_[yyyymmdd]-[hhmm], where [yyyymmdd] is the date of the simulation (e.g. 20240924) and [hhmm] is the simulation start time (e.g. 1914). The file we are interested in is output_parameters.csv, which contains the input data for RE-Emission. You will need to copy this file to the examples folder within the reemission directory (here, into a demo subfolder), either manually or programmatically. In the Linux terminal, you can do the following (from within the geocaret folder):
mkdir ../reemission/examples/demo
cp outputs/DEMO_[yyyymmdd]-[hhmm]/output_parameters.csv ../reemission/examples/demo/output_parameters.csv
The equivalent commands in Windows Command Prompt (CMD) are:
mkdir ..\reemission\examples\demo
copy outputs\DEMO_[yyyymmdd]-[hhmm]\output_parameters.csv ..\reemission\examples\demo\output_parameters.csv
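As a cross-platform alternative, the copy can be done programmatically. The Python sketch below picks the most recent run folder and copies its output_parameters.csv. It assumes that folder names follow the DEMO_[yyyymmdd]-[hhmm] pattern, so that alphabetical order matches chronological order. The function name copy_latest_parameters is illustrative.

```python
import shutil
from pathlib import Path


def copy_latest_parameters(outputs_dir, dest_dir, prefix="DEMO_"):
    """Copy output_parameters.csv from the newest run folder into dest_dir."""
    # Names follow DEMO_[yyyymmdd]-[hhmm], so plain sort order is chronological.
    runs = sorted(p for p in Path(outputs_dir).glob(prefix + "*") if p.is_dir())
    if not runs:
        raise FileNotFoundError(f"no {prefix}* folders in {outputs_dir}")
    source = runs[-1] / "output_parameters.csv"
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, dest / source.name))
```

From within the geocaret folder, copy_latest_parameters("outputs", "../reemission/examples/demo") mirrors the shell commands above.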
The tabular output data from GeoCARET needs to be converted into the JSON file format required by RE-Emission. This involves using two conversion scripts in RE-Emission: the first script imputes any missing data that cannot be sourced from known geospatial datasets, and the second converts the tabular data into the appropriate JSON format.
Three input variables for RE-Emission cannot be derived from geospatial datasets and must be entered manually: the level of wastewater treatment in the catchment, the land-use intensity in the catchment, and the reservoir's purpose/type, represented by c_treatment_factor, c_landuse_intensity and type, respectively. Here, we will apply default values for these parameters (across all reservoirs) using a script in RE-Emission. You can adjust them later in output_parameters_upgraded.csv. To run the script, navigate to the reemission folder and execute the following command:
docker compose run --rm reemission reemission-geocaret process-tab-outputs \
-i examples/demo/output_parameters.csv \
-o examples/demo/output_parameters_upgraded.csv \
-cv 'c_treatment_factor' 'primary (mechanical)' \
-cv 'c_landuse_intensity' 'low intensity' \
-cv 'type' 'unknown'
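Conceptually, this step writes the same value into a given column for every reservoir. The Python sketch below illustrates that idea with the standard csv module. It is not RE-Emission's implementation, and the name column in the sample data is a hypothetical stand-in for the real GeoCARET output columns.

```python
import csv
import io


def add_constant_columns(csv_text, defaults):
    """Return csv_text with every column in defaults set to its constant value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep the original column order and append any columns that are new.
    fieldnames = list(reader.fieldnames or [])
    fieldnames += [name for name in defaults if name not in fieldnames]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row.update(defaults)
        writer.writerow(row)
    return out.getvalue()


defaults = {
    "c_treatment_factor": "primary (mechanical)",
    "c_landuse_intensity": "low intensity",
    "type": "unknown",
}
# "name" is a hypothetical column used only for this illustration.
print(add_constant_columns("name\ndam_a\ndam_b\n", defaults))
```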
The allowed values for each of the three variables are listed below.
| c_landuse_intensity | c_treatment_factor | type |
|---|---|---|
| low intensity | no treatment | hydroelectric |
| high intensity | primary (mechanical) | multipurpose |
| | secondary biological treatment | potable |
| | tertiary | irrigation |
| | | flood control |
| | | unknown |
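If you edit these values by hand, a quick check against the allowed values can catch typos before the conversion step. The sketch below encodes the values as read from the table above. The names ALLOWED_VALUES and invalid_choices are illustrative and not part of RE-Emission.

```python
# Allowed values as read from the table in this README.
ALLOWED_VALUES = {
    "c_landuse_intensity": {"low intensity", "high intensity"},
    "c_treatment_factor": {
        "no treatment",
        "primary (mechanical)",
        "secondary biological treatment",
        "tertiary",
    },
    "type": {
        "hydroelectric", "multipurpose", "potable",
        "irrigation", "flood control", "unknown",
    },
}


def invalid_choices(choices):
    """Return {variable: value} for every choice that is not an allowed value."""
    return {
        name: value
        for name, value in choices.items()
        if value not in ALLOWED_VALUES.get(name, set())
    }
```

An empty dictionary means every supplied value is allowed. Unknown variable names are reported as invalid.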
We then convert the tabular data, updated with the missing input variables, into a JSON representation that conforms to RE-Emission's input data specification.
docker compose run --rm reemission reemission-geocaret tab-to-json \
-i examples/demo/output_parameters_upgraded.csv \
-o examples/demo/input_data_demo.json
Finally, we estimate the reservoir emissions in RE-Emission.
docker compose run --rm reemission reemission calculate \
examples/demo/input_data_demo.json \
-o outputs/output_data_demo.json \
-o outputs/output_data_demo.xlsx
The generated files, outputs/output_data_demo.json and outputs/output_data_demo.xlsx, contain the inputs, the calculated intermediate values and emissions. Both files contain equivalent numerical information. JSON files are intended for data transfer and processing via scripts, e.g. in Python or JavaScript, while XLSX files are suitable for quick examination and manipulation using Microsoft Excel or LibreOffice Calc.
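The three RE-Emission steps of this walkthrough (imputing missing values, converting to JSON, and calculating emissions) can also be chained from a script. The Python sketch below only reuses the commands and flags shown in this README. The function names and the default paths passed as arguments are illustrative.

```python
import subprocess

COMPOSE_RUN = ["docker", "compose", "run", "--rm", "reemission"]


def pipeline_commands(folder="examples/demo", out_stem="outputs/output_data_demo"):
    """Return the three RE-Emission commands from this walkthrough, in order."""
    raw = f"{folder}/output_parameters.csv"
    upgraded = f"{folder}/output_parameters_upgraded.csv"
    json_input = f"{folder}/input_data_demo.json"
    return [
        COMPOSE_RUN + [
            "reemission-geocaret", "process-tab-outputs",
            "-i", raw, "-o", upgraded,
            "-cv", "c_treatment_factor", "primary (mechanical)",
            "-cv", "c_landuse_intensity", "low intensity",
            "-cv", "type", "unknown",
        ],
        COMPOSE_RUN + [
            "reemission-geocaret", "tab-to-json",
            "-i", upgraded, "-o", json_input,
        ],
        COMPOSE_RUN + [
            "reemission", "calculate", json_input,
            "-o", f"{out_stem}.json", "-o", f"{out_stem}.xlsx",
        ],
    ]


def run_pipeline(cwd="."):
    """Run each step from the reemission folder and stop at the first failure."""
    for command in pipeline_commands():
        subprocess.run(command, cwd=cwd, check=True)
```

Passing the arguments as a list avoids shell quoting issues with values such as primary (mechanical), which differ between Linux shells and Windows CMD.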
We have currently configured GeoCARET to output only the tabular values. However, you can also optionally export the other generated files, including reservoir and catchment delineations (polygons), dam locations (points), and impounded river sections (polylines) in shapefile format. These files will be exported to your Google Drive. To perform the file export, use the following command, where DEMO_[yyyymmdd]-[hhmm] corresponds to the output folder generated during the GeoCARET run and geocaret-demo is the project folder. The --drive-folder flag sets the path on your Google Drive where the results will be exported. All three arguments, --results-path, --drive-folder and --project, are required.
docker compose run --rm geocaret python heet_export_cli.py \
--results-path projects/geocaret-demo/assets/emissions/XHEET/DEMO_[yyyymmdd]-[hhmm] \
--drive-folder DEMO_[yyyymmdd]-[hhmm] \
--project geocaret-demo
You can download the folder manually from your Google Drive, either via a web browser or using a Google Drive client.
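The --results-path value in the export command combines the project name, the assets folder and the run folder. The sketch below builds it in Python, assuming the layout seen in the single example above (including the literal XHEET segment) holds in general. The function names are illustrative.

```python
def results_path(project, assets_folder, run_folder):
    """Build the --results-path value following the example in this README."""
    # "XHEET" is copied verbatim from the example command.
    return f"projects/{project}/assets/{assets_folder}/XHEET/{run_folder}"


def export_command(project, assets_folder, run_folder):
    """Return the export command as an argument list for subprocess.run."""
    return [
        "docker", "compose", "run", "--rm", "geocaret",
        "python", "heet_export_cli.py",
        "--results-path", results_path(project, assets_folder, run_folder),
        "--drive-folder", run_folder,
        "--project", project,
    ]


print(results_path("geocaret-demo", "emissions", "DEMO_20240924-1914"))
```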
You may encounter permission issues when writing to volumes (the created folder structure) if your Docker is configured to run in rootless mode, which is not the default option. While using Docker in rootless mode is recommended for critical server-side applications that require increased security, we discourage its use with our applications. However, if you choose to run Docker in rootless mode and encounter permission issues—such as being unable to write files into folders—you can refer to this article for potential fixes.
Tomasz Janus 💻 |
James Sinnott 💻 |
This project follows the all-contributors specification. Contributions of any kind are welcome!