mgatk tenx endless runtime #98

AdrianParrilla · 2024-08-16T14:20:32Z

Hi,
I am trying to process a bam file from a scATAC-seq run, but after waiting for hours the process seems to be stuck at the first step (see image attached). The input bam file is 33 GB and has around 500,000,000 reads in total (246,780 reads per cell), with 11% of the reads covering the mtDNA.

The command I used was:

mgatk tenx -i /mnt/smb/TDA/scATAC_files/results/test_scARC_possorted_downsampled_bam.bam -n test_scARC -o test_scARC_mgatk -c 60 -bt CB -b /mnt/smb/TDA/scATAC_files/results/barcodes.tsv -g /mnt/smb/TDA/scATAC_files/masked_genome.fa --keep-temp-files

After 2 hours of running, the only output I got was empty folders:

Why is the process stuck at the first step? Is the bam file too big?

Thanks in advance for your help,

caleblareau · 2024-08-16T14:47:27Z

The issue is the fasta you supply to mgatk should only be the mito genome (or use one of the built in ones).
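For anyone who needs to build a mitochondria-only fasta from a whole-genome file, here is a minimal sketch. It assumes the mitochondrial contig is named chrM, which may not be true for your reference. `extract_contig` is a hypothetical helper, not part of mgatk. If the genome is indexed, `samtools faidx genome.fa chrM` should give an equivalent result.

```shell
# Hypothetical helper: print one record of a multi-record FASTA.
# $1 = path to the FASTA file, $2 = contig name (text after ">" up to the first space or tab).
extract_contig() {
  awk -v name="$2" '
    # On every header line, decide whether the following sequence lines belong to the wanted contig.
    /^>/ { split(substr($0, 2), parts, /[ \t]/); keep = (parts[1] == name) }
    # Print the header and sequence lines of the wanted contig only.
    keep { print }
  ' "$1"
}

# Example usage (file names and the contig name are assumptions):
# extract_contig masked_genome.fa chrM > mito_only.fa
```

The output keeps the original header line unchanged, so any description after the contig name is preserved.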

AdrianParrilla · 2024-08-22T08:35:15Z

Hi, thanks for your reply. I tried that, but I got an error saying: "User specified mitochondrial genome does NOT match .bam file; correctly specify reference genome or .fasta file". I also tried extracting the mitochondrial genome from our masked_genome.fa and passing it as a custom genome with --mito-genome, but I get the same error.
Does the sample bam file need to contain only the mitochondrial chromosome?

caleblareau · 2024-09-22T06:25:07Z

The bam file can have more than just the mitochondrial chromosome contig.

The issue is probably that the text following the > should match exactly the chromosome name (e.g., >chrM).

What is the contig name in the bam file, and what does cat <your fasta file> | grep ">" yield?
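A rough sketch of that comparison, assuming the bam header has first been dumped to a text file with `samtools view -H your.bam > header.sam`. The function names and file names are hypothetical, and this only compares contig names; it does not look at anything else that could differ between the fasta and the bam, such as sequence length.

```shell
# Hypothetical helper: list contig names in a FASTA
# (the text after ">" up to the first whitespace).
fasta_contigs() {
  grep '^>' "$1" | sed 's/^>//' | awk '{print $1}'
}

# Hypothetical helper: list contig names in SAM header text (SN: field of each @SQ line).
header_contigs() {
  awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' "$1"
}

# Report, for each FASTA contig, whether the exact same name appears in the header.
# $1 = FASTA file, $2 = SAM header text file.
check_contigs() {
  fasta_contigs "$1" | while read -r name; do
    if header_contigs "$2" | grep -Fxq -- "$name"; then
      echo "$name: found in BAM header"
    else
      echo "$name: NOT in BAM header"
    fi
  done
}

# Example usage (file names are assumptions):
# samtools view -H your.bam > header.sam
# check_contigs mito_only.fa header.sam
```

The match is exact and case-sensitive, so a fasta header of >MT against a bam contig named chrM would be reported as not found.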

AdrianParrilla added the bug label Aug 16, 2024

AdrianParrilla assigned caleblareau Aug 16, 2024
