A genome assembly evaluation pipeline.
GAEP is a pipeline for assessing the quality of genome assemblies.
git clone https://github.com/zy-optimistic/GAEP.git
cd GAEP
./gaep
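The usage line that follows calls `gaep` without the `./` prefix, which only works if the launcher is on your PATH. A minimal sketch, assuming you are still inside the cloned GAEP directory:

```shell
# Assumption: the current directory is the cloned GAEP directory.
# This affects the current shell session only; add the line to your
# shell startup file (for example ~/.bashrc) to make it permanent.
export PATH="$PWD:$PATH"
```

Alternatively, keep calling the script by its full path.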
gaep <command> [options]
pipe    (NGS,TGS,trans)  let GAEP determine which modules to run based on the input data
stat                     report basic genome information
macc    (NGS)            base accuracy based on read mapping
kacc    (NGS)            base accuracy based on k-mers
bkp     (TGS)            detect misassembly breakpoints
snvcov  (NGS,TGS)        SNV-coverage dot plot
busco                    run BUSCO v5
gaep pipe -r genome.fasta --lr TGS.fastq -x pb --sr1 NGS_1.fastq --sr2 NGS_2.fastq -t 3 -c config.txt
# You can list your data and dependencies in config.txt. A template of config.txt is in GAEP/config/.
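Before starting a long run it can help to confirm that every input file is present. The following is a sketch, not part of GAEP: `inputs_ready` is a hypothetical helper, the file names are the placeholders from the example command, and the `gaep pipe` flags are copied verbatim from that example.

```shell
# Placeholder file names taken from the example command.
genome=genome.fasta
long_reads=TGS.fastq
sr1=NGS_1.fastq
sr2=NGS_2.fastq
config=config.txt

# Hypothetical helper: return 0 only if every listed file exists
# and is non-empty; report each problem file on stderr.
inputs_ready() {
    missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Run the pipeline only when all inputs are in place.
if inputs_ready "$genome" "$long_reads" "$sr1" "$sr2" "$config"; then
    gaep pipe -r "$genome" --lr "$long_reads" -x pb \
        --sr1 "$sr1" --sr2 "$sr2" -t 3 -c "$config"
fi
```

Checking up front avoids discovering a mistyped path only after the first mapping step has already run.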
The specified versions were used for testing GAEP.
- Bio::DB::HTS (Perl module). Can be installed with conda: "conda install -c bioconda perl-bio-db-hts".
- minimap2 v2.17-r974-dirty
- samtools v1.9
- minimap2 v2.17-r974-dirty (for TGS reads mapping)
- bwa v0.7.17-r1198-dirty (for NGS reads mapping)
- bcftools v1.9 (for SNV calling)
- Rscript v4.0.3 (for plotting)
- bedtools
- ggplot2 (R package)
- ggExtra (R package)
- bwa v0.7.17-r1198-dirty
- samtools v1.9
- bcftools v1.9
- meryl v1.3
- merqury v1.3
- busco v5
- metaeuk (for eukaryotes)
- prodigal (for non-eukaryotes)
- hmmsearch
- bbtools
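To see which of the command-line tools listed above are already installed, a small check like the one below may be useful. It is a sketch, not part of GAEP: `missing_tools` is a hypothetical helper, and the executable names are assumptions based on the list (Merqury and BBTools are left out because they are normally invoked through scripts such as `merqury.sh` rather than a single binary).

```shell
# Hypothetical helper: print the name of every tool that is not on
# PATH and return non-zero if at least one is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Executable names are assumptions; adjust them to your installation.
missing_tools minimap2 samtools bwa bcftools Rscript bedtools meryl busco \
    metaeuk prodigal hmmsearch \
    || echo "install the tools printed above before running GAEP" >&2

# The Perl module and R packages need their own checks, for example:
#   perl -MBio::DB::HTS -e 1
#   Rscript -e 'library(ggplot2); library(ggExtra)'
```

The check only confirms that a tool is present; it does not compare versions against the ones used for testing.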