-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Help replicating results from another paper #453
Comments
Hi @NikoLichi , Thanks for reporting this to us. I had a look at the data, it seems that the bam file is too small to have novel transcripts that pass the default filters in bambu. If you would want to obtain expected results, we would recommend you to try with a larger bam file. If you really need to run it on a small bam file, you might want to relax a bit on the discovery threshold, for example, set NDR =1, min.readCount = 1, and fitReadClassModel = FALSE. Let us know if this helps. |
Dear @cying111, Thanks a lot for your reply and hints. I agree that the sample test is very small... However, we decided to go with this test, but see PS below. Continuing with your input, I ran 3 bambu tests, adding each one of the parameters you mentioned. Please see the attached figure in which these 15-22 novel transcripts are about 0.01% NIC (Novel In Catalog).
So this is better but still not comparable with the initial graph I sent you from FLAIR. Do you have any other suggestions for increasing the number of novelties? All the best, PS. I am running another test with a larger cohort (and larger BAM files ~1Millon reads per sample in 12 samples) re-training the bambu model. I hope these bambu options can help to increase the number of novelties and make a fully comparable test. |
Hi @NikoLichi , Thanks for getting back with these updates. For your question to get a higher number of novel transcripts, there are a few additional parameters you can adjust, including remove.subsetTx = FALSE, min.readFractionByGene = 0.01. If these still don't yield a sufficient number of transcripts, you could also consider relaxing min.exonOverlap. However, please note that using these relaxed parameters may work well for smaller samples, but for larger datasets, it could lead to less specificity and potentially become infeasible. I recommend to adjust these parameters accordingly. Thank you and looking forward to hearing back from you! |
Dear Bambu team,
First of all, I have to say that I really like this tool! and your guidance is very helpful!
Now, my issue: I am comparing bambu vs FLAIR using the published data in this article. https://www.nature.com/articles/s41467-024-48615-4#Fig1 . I am trying to reproduce the figure 1E using bambu and Squanti3.
A colleague ran the alignment part and FLAIR and I am running Bambu. We are not even close to replicating their results. Interestingly for Bambu, there are no novel transcripts/genes reported, not even forcing NDR, i.e., 0.3
. how is this possible?
For FLAIR
we are getting some novel transcripts but still need to get similar results.
How is it possible that bambu is not getting any novel transcript? could I do something to enhance my code? I am adding the bam file I am using for this. The reference we are using the human gencode.v38 files.
DRR465296.aligned.corrected_all_corrected_chr21.bam.gz
My code looks like:
Thanks for any help here!
Kind regards,
Niko
The text was updated successfully, but these errors were encountered: