Changes to Docker material #270

Merged
merged 23 commits into from
Nov 15, 2024
f526e98
Remove MAINTAINER label from Dockerfiles
fasterius Nov 12, 2024
99795d2
Add note on OCI Docker label best practices
fasterius Nov 12, 2024
31bff6c
Fix naming of Dockerfiles according to conventions
fasterius Nov 12, 2024
c215721
Harmonise Dockerfile formatting
fasterius Nov 12, 2024
19e9a0f
Fix naming of `linux/amd64` platform specification
fasterius Nov 12, 2024
c32405b
Add missing note title word
fasterius Nov 12, 2024
a58e880
Streamline `[slim/conda].Dockerfile` files
fasterius Nov 12, 2024
a5b23f3
Streamline final tutorial Dockerfile
fasterius Nov 12, 2024
d17a01a
Add extra material for multi-stage Docker builds
fasterius Nov 13, 2024
1682dd0
Fix broken Docker image
fasterius Nov 13, 2024
bcd97a5
Minor updates to conform to build checks
johnne Nov 13, 2024
461337b
Fix CMD line of Dockerfile
johnne Nov 14, 2024
c6cca74
Add SHELL instruction
johnne Nov 14, 2024
0994714
Fix embedding of resources in Quarto document
fasterius Nov 14, 2024
9324734
Set SHELL to not require JSON format for CMD
johnne Nov 14, 2024
e3e8b14
Fix embed-resources
johnne Nov 14, 2024
a1668fe
Move callout for ARM users
johnne Nov 14, 2024
d74046a
Merge branch 'docker-changes' of github.com:NBISweden/workshop-reprod…
johnne Nov 14, 2024
837c4c3
Remove section on jupyter notebook with Docker
johnne Nov 14, 2024
1fc4dad
Remove expose part from Dockerfile
johnne Nov 14, 2024
11805cf
Make supplementary table less redundant
johnne Nov 15, 2024
9365c3b
Fix minor formatting and spelling errors
johnne Nov 15, 2024
61d1b4a
Give explicit name for multi-stage Dockerfile
johnne Nov 15, 2024
408 changes: 281 additions & 127 deletions pages/containers.qmd

Large diffs are not rendered by default.

47 changes: 17 additions & 30 deletions tutorials/containers/Dockerfile
@@ -1,54 +1,41 @@
FROM --platform=amd64 condaforge/miniforge3
FROM condaforge/miniforge3:24.7.1-0

LABEL authors="John Sundh, [email protected]; Erik Fasterius, [email protected]"
LABEL description="Image for the NBIS reproducible research course."
LABEL author="John Sundh"
LABEL email="[email protected]"

# Use bash as shell
SHELL ["/bin/bash", "--login", "-c"]

# Set workdir
WORKDIR /course

# Set timezone
ENV TZ="Europe/Stockholm"
ENV DEBIAN_FRONTEND=noninteractive
# Use bash as shell
SHELL ["/bin/bash", "-c"]

# Install packages require for timezone and Quarto installation
RUN apt-get update \
&& apt-get install -y tzdata curl \
&& apt-get clean
# Install required packages
RUN apt-get update && \
apt-get install -y curl && \
apt-get clean

# Install Quarto
ARG QUARTO_VERSION="1.3.450"
RUN mkdir -p /opt/quarto/${QUARTO_VERSION} \
&& curl -o quarto.tar.gz -L "https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_VERSION}/quarto-${QUARTO_VERSION}-linux-amd64.tar.gz" \
&& tar -zxvf quarto.tar.gz -C "/opt/quarto/${QUARTO_VERSION}" --strip-components=1 \
&& rm quarto.tar.gz
ENV PATH /opt/quarto/${QUARTO_VERSION}/bin:${PATH}
RUN mkdir -p /opt/quarto/${QUARTO_VERSION} && \
curl -o quarto.tar.gz -L "https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_VERSION}/quarto-${QUARTO_VERSION}-linux-amd64.tar.gz" && \
tar -zxvf quarto.tar.gz -C "/opt/quarto/${QUARTO_VERSION}" --strip-components=1 && \
rm quarto.tar.gz
ENV PATH=/opt/quarto/${QUARTO_VERSION}/bin:${PATH}

# Configure Conda
RUN conda init bash && conda config --set channel_priority strict && \
conda config --append channels bioconda && \
conda config --append channels r && \
conda config --set subdir linux-64
RUN conda config --set channel_priority strict && \
conda config --append channels bioconda

# Install environment
COPY environment.yml ./
RUN conda env create -f environment.yml -n project_mrsa && \
conda clean -a

# Set mrsa-workflow environment as active at start-up
RUN echo "source activate project_mrsa" >> ~/.bashrc

# Add environment to PATH
ENV PATH /opt/conda/envs/project_mrsa/bin:${PATH}
ENV PATH=/opt/conda/envs/project_mrsa/bin:${PATH}

# Add project files
COPY Snakefile config.yml ./
COPY code ./code/

# Open up port 8888
EXPOSE 8888

CMD snakemake -p -c 1 --configfile config.yml
CMD snakemake --configfile config.yml -p -c 1
29 changes: 0 additions & 29 deletions tutorials/containers/Dockerfile_slim

This file was deleted.

29 changes: 17 additions & 12 deletions tutorials/containers/code/supplementary_material.qmd

I modified the code here to make the sample table less cluttered. There used to be two lines for each sample because of the info in `gsm$characteristics_ch1`. Now we extract the `treatment` and `growth phase` info and put it in separate columns, resulting in a condensed table.
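
To illustrate the idea in the comment above, here is a minimal, self-contained sketch of splitting `key: value` entries from `characteristics_ch1` into separate columns. It is not the code merged in this PR: the `gsm` list is a made-up stand-in for the result of `Meta(getGEO(GSM))`, and `extract_characteristic()` is a hypothetical helper introduced only for this example.

```r
# Hypothetical stand-in for one GEO sample's metadata; all values are made up.
gsm <- list(
  title = "sample_1",
  geo_accession = "GSM0000001",
  source_name_ch1 = "culture",
  characteristics_ch1 = c("growth phase: exponential", "treatment: none")
)

# Hypothetical helper: return the value of the "<key>: <value>" entry,
# or NA if no entry starts with that key.
extract_characteristic <- function(characteristics, key) {
  prefix <- paste0(key, ": ")
  hit <- characteristics[startsWith(characteristics, prefix)]
  if (length(hit) == 0) {
    return(NA_character_)
  }
  sub(prefix, "", hit[1], fixed = TRUE)
}

# One row per sample, with one column per characteristic.
current_meta <- data.frame(
  title = gsm$title,
  geo_accession = gsm$geo_accession,
  source_name_ch1 = gsm$source_name_ch1,
  growth_phase = extract_characteristic(gsm$characteristics_ch1, "growth phase"),
  treatment = extract_characteristic(gsm$characteristics_ch1, "treatment"),
  stringsAsFactors = FALSE
)
print(current_meta)
```

The sketch returns `NA` when a key is missing, so every sample still yields exactly one row. That is a design choice of this example, not something the PR does.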

@@ -3,7 +3,7 @@ title: "Supplementary Material"
format:
html:
echo: false
embed_resources: true
embed-resources: true
engine: knitr
params:
counts_file: "results/tables/counts.tsv"
@@ -54,22 +54,29 @@ colnames(counts_summary) <- gsub(".*(SRR[0-9]+)\\..*", "\\1",
meta <- data.frame()
for (GSM in gsm_ids) {
gsm <- Meta(getGEO(GSM))
current_meta <- as.data.frame(do.call(cbind, gsm))
current_meta <- as.data.frame(
list(
title=gsm$title, geo_accession=gsm$geo_accession, source_name_ch1=gsm$source_name_ch1,
growth_phase=gsub("growth phase: ", "", gsm$characteristics_ch1[grep("growth phase", gsm$characteristics_ch1)]),
treatment=gsub("treatment: ", "", gsm$characteristics_ch1[grep("treatment", gsm$characteristics_ch1)])
)
)
meta <- rbind(meta, current_meta)
}
meta <- meta[c("title", "geo_accession", "source_name_ch1", "characteristics_ch1")]
meta <- meta[c("title", "geo_accession", "source_name_ch1", "growth_phase", "treatment")]
gsm2srr <- data.frame(geo_accession = gsm_ids, SRR = srr_ids)
meta <- merge(meta, gsm2srr, by = "geo_accession")

# Read FastQC data and update column names
qc <- read.delim(multiqc_file)
patterns <- c(".+percent_duplicates.*",
".+percent_gc.*",
".+avg_sequence_length.*",
".+percent_fails.*",
".+total_sequences.*")
patterns <- c("*.+percent_duplicates.*",
"*.+percent_gc.*",
"*.+avg_sequence_length.*",
"*.+median_sequence_length.*",
"*.+percent_fails.*",
"*.+total_sequences.*")
subs <- c("Percent duplicates", "Percent GC", "Avg sequence length",
"Percent fails", "Total sequences")
"Median sequence length", "Percent fails", "Total sequences")
for (i in 1:length(patterns)) {
colnames(qc) <- gsub(patterns[i], subs[i], colnames(qc))
}
@@ -85,10 +92,8 @@ was aligned and counted.
# Supplementary Tables and Figures

```{r Sample info}
columns <- c("SRR", "geo_accession", "source_name_ch1", "characteristics_ch1")
columns <- c("SRR", "geo_accession", "source_name_ch1", "growth_phase", "treatment")
sample_info <- meta[, columns]
sample_info$characteristics_ch1 <- gsub("treatment: ", "", sample_info$characteristics_ch1)
sample_info$characteristics_ch1 <- gsub("growth phase: ", "", sample_info$characteristics_ch1)
sample_info
```

20 changes: 20 additions & 0 deletions tutorials/containers/slim.Dockerfile
@@ -0,0 +1,20 @@
FROM condaforge/miniforge3

LABEL authors="John Sundh, [email protected]; Erik Fasterius, [email protected]"
LABEL description="Minimal image for the NBIS reproducible research course."

WORKDIR /course

SHELL ["/bin/bash", "-c"]

# Install `curl` for downloading of FASTQ data later in the tutorial
RUN apt-get update && \
apt-get install -y curl && \
apt-get clean

# Configure Conda
RUN conda config --set channel_priority strict && \
conda config --append channels bioconda

# Start Bash by default
CMD /bin/bash
5 changes: 3 additions & 2 deletions tutorials/git/Dockerfile
@@ -1,10 +1,11 @@
FROM ubuntu:16.04

LABEL description = "Image for the NBIS reproducible research course."
MAINTAINER "John Sundh" [email protected]
LABEL authors="John Sundh, [email protected]"
LABEL description="Minimal image for the NBIS reproducible research course."

# Use bash as shell
SHELL ["/bin/bash", "-c"]

# Set workdir
WORKDIR /course
