Incremental update of the graph #16

Open
huguesrichard opened this issue Jan 12, 2021 · 8 comments
Labels
enhancement New feature or request

Comments

@huguesrichard

Hello,

Thank you very much for abPOA. It is a nice tool, and installation was fast and easy.

I would like to be able to incrementally update the POA graph. That would make it practical to use the graph as a compressed, aligned representation of the sequences and to add sequences to it as they accumulate over time.

A typical use case would be to first generate a graph (for instance in GFA format) and then add sequences to it with additional commands, for instance with an --increment option:

abpoa -r 3 seqs.fa > graph.gfa

abpoa --increment newseqs.fa graph.gfa > newgraph.gfa

Best regards,

Hugues

@yangao07
Owner

This is theoretically doable, I will give it a try.
I will post the updates here when it's ready.

Yan

@yangao07 yangao07 added the enhancement New feature or request label Jan 12, 2021
@huguesrichard
Author

Hello again,

As a complement to my previous request: the GFA graphs produced by abPOA are usually quite large, and it was easy to transform them into unitigs using Heng Li's gfatools, e.g.
gfatools asm -u graph.gfa > graph_unitig.gfa

It would be great if graph_unitig.gfa could be provided to abPOA as input. The graph could then be used in practice as a database for short sequences.

Hugues

@yangao07
Owner

yangao07 commented Feb 4, 2021

Thanks for the suggestion!
I will try to add this feature in the next version.

Yan

@yangao07
Owner

@huguesrichard Please try out the latest abPOA v1.1.0.
It can now incrementally align sequences to an existing GFA or MSA.
Let me know if this works for you.

Yan

@huguesrichard
Author

Hello @yangao07,

I tried adding sequences to a GFA produced by abPOA and it worked right away.
That's really a great feature, thank you!
I will try it out a little more over the next few days and let you know if I see anything strange in the resulting MSAs.

I also tried with a gfa simplified to unitigs (using Heng Li's gfatools) but in this case abPOA did not recognise the gfa file.

Also, it would be great to have a few informational messages printed to stderr as abPOA runs. I am running it on a few thousand sequences now, and I am never sure how far along it is in processing the files.

Anyway, thanks again for adding the feature.

@huguesrichard
Author

Also, I could not get access to the release, I guess it was not published yet.

@yangao07
Owner

yangao07 commented Feb 11, 2021

> Also, I could not get access to the release, I guess it was not published yet.

I haven't pushed it to the release yet.

> I also tried with a gfa simplified to unitigs (using Heng Li's gfatools) but in this case abPOA did not recognise the gfa file.

The unitig GFA produced by gfatools has no P lines, which are required for incremental graph alignment; that is why it is not supported.
On the other hand, I think it is not hard for abPOA to output GFA with unitigs. I will try to add this feature.
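
To tell in advance whether a given GFA file still carries the P lines mentioned above, one could count its records by type before passing it to abPOA. The following is only a minimal sketch and not part of abPOA or gfatools: the helper names `gfa_record_counts` and `has_path_lines` and the two tiny example graphs are made up for illustration. It relies only on the GFA convention that each record is a tab-separated line whose first field is the record type (H, S, L, P, ...).

```python
def gfa_record_counts(lines):
    """Count GFA records by type (the first tab-separated field of each line)."""
    counts = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        record_type = line.split("\t", 1)[0]
        counts[record_type] = counts.get(record_type, 0) + 1
    return counts


def has_path_lines(lines):
    """Return True if at least one P (path) record is present."""
    return gfa_record_counts(lines).get("P", 0) > 0


# Hypothetical miniature graphs, for illustration only.
graph_with_paths = (
    "H\tVN:Z:1.0\n"
    "S\t1\tACGT\n"
    "S\t2\tTT\n"
    "L\t1\t+\t2\t+\t0M\n"
    "P\tseq1\t1+,2+\t*\n"
)
graph_without_paths = (
    "H\tVN:Z:1.0\n"
    "S\tutg1\tACGTTT\n"
)

print(has_path_lines(graph_with_paths.splitlines()))     # True
print(has_path_lines(graph_without_paths.splitlines()))  # False
```

For a file on disk, the same check can be run as `has_path_lines(open("graph.gfa"))`, since a file object iterates over its lines.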

@huguesrichard
Author

A graph output with unitigs would be really, really helpful. In my small tests on viral genomes, I got around 50-fold compression when generating the unitig version.
