A basic workflow for running Nick Hathaway's SeekDeep on Illumina data. This version splits the jobs into individual snakemake submissions.
- Install mamba: https://github.com/conda-forge/miniforge#install (remember to run conda init and, when the installer finishes, follow its instructions to log out and back in)
- Create a mamba environment and install snakemake and singularity there:
mamba create -c conda-forge -c bioconda -n snakemake snakemake
mamba activate snakemake
mamba install -c conda-forge singularity
- Change directory to a folder where you want to run the analysis
- Clone this repository with git clone web_address. You can copy the web_address from the green 'code' button.
- Download the elucidator.sif file into the same folder: https://seekdeep.brown.edu/programs/elucidator.sif
- Edit the seekdeep_illumina_general.yaml file using the instructions in the comments. Use a text editor that outputs unix line endings (e.g. vscode, notepad++, gedit, micro, emacs, vim, vi, etc.)
- If snakemake is not your active conda environment, activate snakemake with:
mamba activate snakemake
- If you are on a slurm system, edit the slurm/config.yaml file to match your system's sbatch job submission settings. If you are not on a slurm system, edit the non_slurm/config.yaml file instead. (If you already have a slurm or non_slurm profile saved under ~/.config/snakemake, e.g. ~/.config/snakemake/slurm, you can delete the slurm or non_slurm folder.)
- Run the three steps in order (this example uses the slurm profile):
snakemake -s setup_run.smk --profile slurm
snakemake -s run_extractor.smk --profile slurm
snakemake -s finish_process.smk --profile slurm
- Alternatively, edit run_all_steps.sh so that it uses the appropriate --profile name, then run all steps with:
bash run_all_steps.sh
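The line-ending check and the three snakemake steps above can be sketched as a small shell script. This is only an illustration and not the contents of run_all_steps.sh: the function names has_crlf and run_steps and the DRY_RUN switch are hypothetical, and the profile name slurm is taken from the example above.

```shell
# Sketch only: not the contents of run_all_steps.sh.

# Succeeds if the file contains a carriage return (Windows line endings).
has_crlf() {
    cr="$(printf '\r')"
    grep -q "$cr" "$1"
}

# Run the three workflow steps in order, stopping at the first failure.
# With DRY_RUN=1 (a hypothetical switch) the commands are printed, not run.
run_steps() {
    profile="$1"
    for step in setup_run.smk run_extractor.smk finish_process.smk; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "snakemake -s $step --profile $profile"
        elif ! snakemake -s "$step" --profile "$profile"; then
            echo "step failed: $step" >&2
            return 1
        fi
    done
}

# Warn if the edited yaml file was saved with Windows line endings.
config=seekdeep_illumina_general.yaml
if [ -f "$config" ] && has_crlf "$config"; then
    echo "warning: $config has Windows line endings" >&2
fi

# Preview the commands in a subshell; call run_steps slurm to run them.
(DRY_RUN=1; run_steps slurm)
```

Stopping at the first failed step keeps a later step from starting on incomplete output from an earlier one.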
You can read Nick Hathaway's manual here: https://seekdeep.brown.edu/
If you're in the folder where you downloaded the elucidator.sif file, you can get help on any SeekDeep command with:
singularity exec elucidator.sif SeekDeep [cmd] -h
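Because the help command only works from the folder that holds elucidator.sif and with singularity available, a small wrapper can check both before calling it. This is a minimal sketch; the function name seekdeep_help is hypothetical, and the underlying command is the one shown above.

```shell
# Show help for a SeekDeep command after checking the prerequisites.
# With no arguments it shows the top-level SeekDeep help.
seekdeep_help() {
    if [ ! -f elucidator.sif ]; then
        echo "elucidator.sif not found in $(pwd)" >&2
        return 1
    fi
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not on PATH (is the snakemake environment active?)" >&2
        return 1
    fi
    singularity exec elucidator.sif SeekDeep "$@" -h
}
```

For example, seekdeep_help setupTarAmpAnalysis runs the same command as typing it out in full.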
- The first command gets info about the genome (genTargetInfoFromGenomes).
- The second command sets up an analysis run (setupTarAmpAnalysis).
- The third command runs three SeekDeep programs (via runAnalysis.sh, which has no help text).
Here are some example help commands to learn more about these commands:
- singularity exec elucidator.sif SeekDeep -h
- singularity exec elucidator.sif SeekDeep genTargetInfoFromGenomes -h
- singularity exec elucidator.sif SeekDeep setupTarAmpAnalysis -h
Each of the three programs run by runAnalysis.sh can be tuned for sensitivity and specificity (via the extra_[step]_cmds entries at the bottom of the yaml file):
- The first command extracts amplicon reads (extractor)
- The second command clusters together similar reads (qluster)
- The third command processes clusters into haplotypes (processClusters)
Here are some example help commands to learn more about these programs:
- singularity exec elucidator.sif SeekDeep extractor -h
- singularity exec elucidator.sif SeekDeep qluster -h
- singularity exec elucidator.sif SeekDeep processClusters -h
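To keep the help text on hand while editing the yaml file, the help commands above can be looped over and saved to files. This is a sketch under assumptions: the function name save_help, the output file naming, and the SEEKDEEP override variable are all hypothetical, and it does not assume anything about where SeekDeep writes its help or what exit status it returns.

```shell
# Save the help text of each named SeekDeep program to <outdir>/<program>.help.txt.
# SEEKDEEP is a hypothetical override for the base command (useful for testing).
save_help() {
    outdir="$1"
    shift
    mkdir -p "$outdir"
    for prog in "$@"; do
        # Left unquoted on purpose so the base command splits into words.
        ${SEEKDEEP:-singularity exec elucidator.sif SeekDeep} "$prog" -h \
            > "$outdir/$prog.help.txt" 2>&1 ||
            echo "note: help for $prog exited non-zero" >&2
    done
}
```

Run from the folder containing elucidator.sif, save_help seekdeep_help_txt extractor qluster processClusters would write one file per program into a seekdeep_help_txt folder.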