How to run this tool? #13

jolespin · 2021-08-11T02:58:24Z

I'm working on an institute-wide pipeline for JCVI and had some trouble running your tool.

Here's my version installed via pip:

 viral_verify --version
viral_verify, version 0.1.1

Here's my command:

viral_verify -i veba_output/binning/47-Drifterexpttime4punches_S40/tmp/unbinned.fasta -o veba_output/binning/47-Drifterexpttime4punches_S40/intermediate/viral_viralverify_output -H /usr/local/scratch/CORE/jespinoz/db/pfam/v33.1/Pfam-A.hmm -t 16

Edit: I had to decompress the PFAM database which was the error in the original post that I've edited since then.
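For anyone wiring this into a pipeline, a pre-flight check along the following lines might catch a still-compressed HMM file before the tool is launched. This is only a sketch: the helper names `is_gzipped` and `ensure_decompressed` are made up for illustration, and it assumes the database was compressed with gzip (as the Pfam-A.hmm.gz download is).

```python
import gzip
import shutil
from pathlib import Path

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def ensure_decompressed(hmm_path):
    """Return a path to an uncompressed copy of the HMM file.

    Hypothetical helper: if the file is already plain text it is returned
    unchanged; otherwise it is decompressed next to the original.
    """
    hmm_path = Path(hmm_path)
    if not is_gzipped(hmm_path):
        return hmm_path
    # Pfam-A.hmm.gz -> Pfam-A.hmm; anything else gets a ".decompressed" suffix.
    if hmm_path.suffix == ".gz":
        target = hmm_path.with_suffix("")
    else:
        target = hmm_path.with_name(hmm_path.name + ".decompressed")
    with gzip.open(hmm_path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

The returned path is what would then be passed to `-H` in the command above.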

Should I be using the PFAM database or the database from FigShare?

Can you update the Usage on your GitHub?

This is the results output:

veba_output/binning/47-Drifterexpttime4punches_S40/intermediate/viral_viralverify_output/
├── classified-fasta-output
│   ├── unbinned-chromosome.fasta
│   └── unbinned-unclassified.fasta
├── unbinned-circularized.fasta
├── unbinned-genes.fa
├── unbinned-hmmsearch.domtblout
├── unbinned-hmmsearch.output
├── unbinned-proteins-circularized.fa
├── unbinned-proteins.fa
└── unbinned-results.csv

I ran the version from GitHub on a different dataset and got the following output:

testing/viralverify_output/
├── oral_viruses_domtblout
├── oral_viruses_feature_table.txt
├── oral_viruses_genes.fa
├── oral_viruses_input_with_circ.fasta
├── oral_viruses_out_pfam
├── oral_viruses_prodigal.log
├── oral_viruses_proteins_circ.fa
├── oral_viruses_proteins.fa
├── oral_viruses_result_table.csv
├── Prediction_results_fasta
│   ├── oral_viruses_chromosome.fasta
│   ├── oral_viruses_plasmid.fasta
│   ├── oral_viruses_plasmid_uncertain.fasta
│   ├── oral_viruses_virus.fasta
│   └── oral_viruses_virus_uncertain.fasta
└── viralverify.log

1 directory, 15 files

How come the output is so different between the pip and GitHub versions?

mikeraiko · 2021-08-22T20:38:40Z

That's funny - someone else forked this repo about a year ago, refactored it, and submitted it to PyPI as viral_verify (https://github.com/peterk87/viral_verify). That's why the output is so different. Thanks for pointing that out!

Meanwhile, our current GitHub version is awaiting approval for the Bioconda channel. As soon as that happens, I'll update accordingly.
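Until the two packages are easier to tell apart, a pipeline could guess which one produced a given output directory from the sub-directory names in the two listings above. A minimal sketch, assuming those names stay as shown in this thread (the function name `detect_layout` and the labels are illustrative only):

```python
from pathlib import Path

# Sub-directory names copied from the two directory listings in this thread.
LAYOUT_MARKERS = {
    "classified-fasta-output": "pip fork (viral_verify)",
    "Prediction_results_fasta": "GitHub viralVerify",
}


def detect_layout(output_dir):
    """Guess which tool wrote `output_dir` from its FASTA sub-directory."""
    output_dir = Path(output_dir)
    for marker, label in LAYOUT_MARKERS.items():
        if (output_dir / marker).is_dir():
            return label
    return "unknown"
```

Checking for the directory rather than individual files keeps the test independent of the input file's prefix (`unbinned-` versus `oral_viruses_`).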

jolespin · 2021-08-22T21:11:35Z

That is so weird. They took the namespace too?

What's the process like for getting something on bioconda?

mikeraiko · 2021-08-22T21:25:33Z

That's open source, after all...
Bioconda submission turned out to be pretty straightforward. Create a recipe (yaml and build.sh files) with all metadata and dependencies, test it, and commit it to the bioconda recipes repository: https://bioconda.github.io/contributor/workflow.html
Then, after all CI tests pass, it needs to be reviewed by one of the bioconda members. No idea how long it takes :)
github.com/bioconda/bioconda-recipes/pull/30186

AndAvia · 2024-07-23T01:24:51Z

Hi there. In the end, which database did you use? Is one of them annotated more accurately?

jolespin · 2024-07-23T05:45:43Z

I use geNomad now.

AndAvia · 2024-07-23T05:50:19Z

I use geNomad now.

All right, thanks.

jolespin · 2024-07-23T17:07:37Z

Apologies @AndAvia , I wrote that from my phone but should have given more context. Here's the geNomad publication: https://www.nature.com/articles/s41587-023-01953-y and here's the GitHub: https://github.com/apcamargo/genomad

I developed a wrapper around geNomad for my "binning-viral" module (though it doesn't really bin; it identifies contigs that are viral) in my VEBA package. Here's the publication for VEBA (https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkae528/7697622) and here's the GitHub (https://github.com/jolespin/veba).

If you only want to perform viral analysis, I would recommend just using geNomad, because VEBA has a lot of functionality in other modules (e.g., assembly with SPAdes, rnaSPAdes, or Flye, eukaryotic binning/gene modeling, etc.) and requires more dependencies/databases.
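For reference, a standalone geNomad run can be scripted roughly as below. This is a sketch that only builds the command line; it assumes the `genomad end-to-end INPUT OUTPUT DATABASE` form described in the geNomad README, and the file and directory names are placeholders. Check `genomad end-to-end --help` for the options available in your installed version.

```python
import shlex


def genomad_command(fasta, output_dir, database_dir):
    """Build the argument list for a geNomad end-to-end run.

    Assumes the positional order INPUT OUTPUT DATABASE from the geNomad
    README; no optional flags are added here.
    """
    return ["genomad", "end-to-end", str(fasta), str(output_dir), str(database_dir)]


# Placeholder paths for illustration only.
command = genomad_command("contigs.fasta", "genomad_output", "genomad_db")
print(shlex.join(command))
# prints: genomad end-to-end contigs.fasta genomad_output genomad_db
```

The list form can be handed directly to `subprocess.run(command, check=True)` once geNomad and its database are installed.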

AndAvia · 2024-07-24T10:13:36Z

@jolespin Thank you so much for your patience in replying! I only need to do virus identification at the moment. Because there are so many virus identification tools, I'm going to use genomad, VIBRANT, virfinder, deepvirfinder, virsorter, virsorter2, and ViralVerify. But you said that the two ViralVerify databases give different results, and I don't know which database to choose. I had a chance to look at your VEBA, and I found it very impressive! Wishing you a wonderful day!

Dmitry-Antipov · 2024-08-12T18:31:56Z

Hi,
Since this tool was released in 2021 and hasn't received significant updates, I'd also recommend checking newer alternatives. If you are still interested in running viralVerify specifically, I'd definitely retrain the db with the updated Pfam-A and GenBank viral/plasmid/chromosomal sequences.

Dmitry-Antipov assigned Dmitry-Antipov and mikeraiko and unassigned Dmitry-Antipov Aug 19, 2021