multiple single cell samples #211
Comments
Dear @wanghlv, thanks for the feedback!
I think both ways are identical in terms of results, although using read tags may save memory, since in that case IsoQuant won't load the entire barcode table into memory. Unfortunately, the current version of IsoQuant can only group counts by one factor at a time: either the barcode or the sample. So if you want both, I guess you'll need to perform two runs.
I highly doubt ONT reads can have identical IDs.
Adding a new tag would require creating a new BAM file, so it's probably easier to create a new TSV table. P.S. The new version 3.4.2 should be more efficient in terms of RAM consumption, so it's better to update if possible. Best
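For anyone following the TSV route, here is a minimal sketch of how such a table might be built. It assumes you already have (read ID, barcode, sample) triples from your own barcode calling; the function names, the `_` separator, and the fourth "barcode_sample" column are illustrative choices, not something IsoQuant prescribes.

```python
def group_table_lines(records, sep="_"):
    """Yield TSV lines: read_id, barcode, sample, barcode<sep>sample."""
    for read_id, barcode, sample in records:
        # The fourth column is a single combined label, so one grouping
        # factor carries both the cell and the sample.
        yield "\t".join([read_id, barcode, sample, f"{barcode}{sep}{sample}"])


def write_group_table(records, out_path):
    """Write the table to out_path (hypothetical helper, plain text TSV)."""
    with open(out_path, "w") as handle:
        for line in group_table_lines(records):
            handle.write(line + "\n")


# Example rows taken from the table shown in this thread.
records = [
    ("12a5c9c3-2b73-49c0-a3fd-22d2c10832e2_0", "AATCAGGAGTGAACGA", "Sample1"),
    ("b6e8c102-e1e2-4155-bc28-7dbb5a34c857_0", "CCAGCTGCATGAGCAG", "Sample2"),
]
lines = list(group_table_lines(records))
```

With a table like this, the group column would be index 3, so following the `file:FILE:READ_COL:GROUP_COL` form used in the question, something like `--read_group file:READ_TO_GROUP.tsv:0:3` should apply (the file name is made up here; please check the column syntax against the IsoQuant manual for your version).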
Thank you for all the info and suggestions, and yes, 3.4.2 is so much better at using RAM! I'm wondering whether you would recommend an efficient cell barcode and UMI processing tool to run before using IsoQuant on single-cell nanopore data. Also, since my single-cell data has UMIs as well, how would you factor them into the quantification so that PCR duplicates are not double counted?
Currently, I'm using barcode calling and PCR de-duplication tools of my own (https://github.com/ablab/IsoQuant/tree/sc_v3). They are not released yet, but at some point they will become a part of IsoQuant too. If you are eager to test them, please contact me via email :) There are also some pipelines available, such as
Hope that helps. Best
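To illustrate the general idea behind UMI-based de-duplication (this is not the method used in the sc_v3 branch, just a naive sketch): reads that share the same cell barcode, the same assigned feature, and the same UMI are treated as PCR copies of one molecule and counted once. The function name and the example tuples below are hypothetical, and the sketch uses exact UMI matching only, which ignores sequencing errors in the UMI.

```python
from collections import defaultdict


def count_unique_umis(assignments):
    """Count distinct UMIs per (barcode, feature) pair.

    assignments: iterable of (barcode, feature_id, umi) tuples, one per read.
    """
    umis = defaultdict(set)
    for barcode, feature, umi in assignments:
        # A set keeps each UMI once, so PCR duplicates collapse to one count.
        umis[(barcode, feature)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


# Made-up example: two reads with the same UMI count as one molecule.
assignments = [
    ("AATCAGGAGTGAACGA", "geneA", "ACGTACGTAC"),
    ("AATCAGGAGTGAACGA", "geneA", "ACGTACGTAC"),  # PCR duplicate
    ("AATCAGGAGTGAACGA", "geneA", "TTGCATGCAA"),
    ("CCAGCTGCATGAGCAG", "geneA", "ACGTACGTAC"),
]
counts = count_unique_umis(assignments)
```

For noisy ONT reads, a real tool would typically also merge UMIs within a small edit distance, which this sketch does not attempt.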
Hello, I have a similar issue that I posted yesterday! In my case I have one BAM file that contains all the conditions. Could you elaborate on running IsoQuant twice with different tags? How can I keep both the barcode and the condition information?
Replied in #234
Hi, thanks for writing such a complete MAN page! I have a quick question: I have a total of 6 samples, all of them single-cell Nanopore libraries. I'd like both the transcript and gene quantification to be per cell (from the CB tag) and per sample.
Could I use --read_group file_name:tag:CB, or should I supply a file, like --read_group file:READ_TO_BARCODE_Samples.TSV:0:2?
READ_TO_BARCODE_Samples.TSV would look like the example below, so the first column is the read ID, the second is the cell barcode, and the third is the sample? However, I'm not sure whether the read IDs are unique across all 6 of my samples.
12a5c9c3-2b73-49c0-a3fd-22d2c10832e2_0 AATCAGGAGTGAACGA Sample1
b6e8c102-e1e2-4155-bc28-7dbb5a34c857_0 CCAGCTGCATGAGCAG Sample2
...
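On the uniqueness concern, one way to check is to collect the read IDs per sample and look for any ID that shows up in more than one sample. The sketch below is only an illustration: `find_shared_read_ids` is a hypothetical helper, and the "dup-read" ID is invented to show what a collision would look like.

```python
from collections import defaultdict


def find_shared_read_ids(ids_by_sample):
    """Return {read_id: sorted sample names} for IDs seen in 2+ samples."""
    seen = defaultdict(set)
    for sample, read_ids in ids_by_sample.items():
        for read_id in read_ids:
            seen[read_id].add(sample)
    return {rid: sorted(samples) for rid, samples in seen.items() if len(samples) > 1}


ids_by_sample = {
    "Sample1": ["12a5c9c3-2b73-49c0-a3fd-22d2c10832e2_0", "dup-read"],
    "Sample2": ["b6e8c102-e1e2-4155-bc28-7dbb5a34c857_0", "dup-read"],
}
shared = find_shared_read_ids(ids_by_sample)
```

An empty result would mean a single merged read-to-group table is unambiguous. Note that this holds all IDs in memory, so for a very large experiment it may need to be run on a subset or adapted.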
I'm currently running it as follows:
isoquant.py -d ont -r ${FA} --complete_genedb --genedb ${GTF} \
  --bam ${s1bam} ${s2bam} ${s3bam} ${s4bam} ${s5bam} ${s6bam} \
  -o IQ_all --prefix IQ_all -l s1 s2 s3 s4 s5 s6 \
  --sqanti_output --check_canonical --count_exons --bam_tags \
  -t 24 --genedb_output \
  --model_construction_strategy default_ont \
  --report_canonical auto --read_group tag:CB
Alternatively, I was thinking of adding a new tag to my BAM files that combines the cell barcode with an appended sample ID, like AATCAGGAGTGAACGAs1, CCAGCTGCATGAGCAGs2, ... However, I haven't found a good way to do that because I have a lot of reads in my entire experiment. Thank you so much for your suggestions.
Best, Hsiao-Lin