OptiType fails sometimes with BAM not found #419

Closed
tavinathanson opened this issue Feb 9, 2017 · 26 comments

@tavinathanson

Trying with @armish's setup (since mine didn't work; see #418), I get several failures like the one below, which I believe are issues with OptiType itself:

### Kube-Job 5ae7ce4b-d459-5b3c-a30d-6cd9528d6291
### Freshness: Fresh
### Output:

User
biokepi
Host
5ae7ce4b-d459-5b3c-a30d-6cd9528d6291
Machine
Linux 5ae7ce4b-d459-5b3c-a30d-6cd9528d6291 3.16.0-4-amd64 #1 SMP Debian 3.16.36-1+deb8u2 (2016-10-19) x86_64 x86_64 x86_64 GNU/Linux
biokepi
biokepi
No export var
/tmp/_MEIq1H09H/matplotlib/font_manager.py:273: UserWarning: Matplotlib is building the font cache using fc-list. This may take a moment.
Killed
Killed

0:00:00.47 Mapping f003a1843a6c739551bcfd981af8afd7_checkpoint-trials_lung_tumor_bams_mafs_SN0110394_bams_mafs_old_samples_IN_MCC_00234_T1_bamIN_MCC_00234_T1-b2fq-PE_R1.fastq to GEN reference...

0:14:11.61 Mapping f003a1843a6c739551bcfd981af8afd7_checkpoint-trials_lung_tumor_bams_mafs_SN0110394_bams_mafs_old_samples_IN_MCC_00234_T1_bamIN_MCC_00234_T1-b2fq-PE_R2.fastq to GEN reference...

0:27:58.74 Generating binary hit matrix.
Traceback (most recent call last):
  File "<string>", line 267, in <module>
  File "hlatyper.py", line 177, in pysam_to_hdf
  File "pysam/calignmentfile.pyx", line 333, in pysam.calignmentfile.AlignmentFile.__cinit__ (pysam/calignmentfile.c:4808)
  File "pysam/calignmentfile.pyx", line 533, in pysam.calignmentfile.AlignmentFile._open (pysam/calignmentfile.c:7027)
IOError: file `tumor_dna_processing_IN_MCC_00234/2017_02_09_10_38_37/2017_02_09_10_38_37_1.bam` not found
OptiTypePipeline returned -1

Digging a little deeper, I noticed:

@tavinathanson

So I don't think it's related to getting BAMs as input, since it's not following that code path.

Rather, it appears to do this:

https://github.com/FRED-2/OptiType/blob/master/OptiTypePipeline.py#L286

Then:

https://github.com/FRED-2/OptiType/blob/master/OptiTypePipeline.py#L294

Then, I think it fails at:

https://github.com/FRED-2/OptiType/blob/master/OptiTypePipeline.py#L298

@tavinathanson

For whatever reason, it looks like https://github.com/FRED-2/OptiType/blob/master/OptiTypePipeline.py#L288 didn't result in a BAM being created?

I also see that the BAMs get removed when done, which explains why the other successes don't have BAMs there.

@tavinathanson

This is a dup. of what @armish hit in RCC: https://github.com/hammerlab/rcc-analyses/issues/104

Leaving it open since this is the more general repo.

@tavinathanson

Tried running this manually in the VM. Some more information:

Aborted (core dumped)

0:09:19.04 Mapping 0e8070747629c84c18f763603fea9545_checkpoint-trials_lung_tumor_bams_mafs_SN0109695_bams_new_samples_AG538184-7_bamAG538184-7-b2fq-PE_R2.fastq to GEN reference...
/nfs-pool/biokepi/toolkit/biopam-kit/opam_dir/opam-root-root-optitype.1.0.0/0.0.0/build/seqan.2.1.0/include/seqan/basic/basic_exception.h:363 FAILED!  (Uncaught exception of type std::bad_alloc: std::bad_alloc)

stack trace:
  0                      [0x72c93d]
  1                      [0x75a146]
  2                      [0x75a191]
  3                      [0x75b149]
  4                      [0x74f99c]
  5                      [0x4091c4]
  6                      [0x47846a]
  7                      [0x4d2bb2]
  8                      [0x72bc38]
  9                      [0x401b23]
 10                      [0x810dc6]
 11                      [0x810fba]
 12                      [0x404cd9]

Aborted (core dumped)

0:18:12.60 Generating binary hit matrix.
Traceback (most recent call last):
  File "<string>", line 267, in <module>
  File "hlatyper.py", line 177, in pysam_to_hdf
  File "pysam/calignmentfile.pyx", line 333, in pysam.calignmentfile.AlignmentFile.__cinit__ (pysam/calignmentfile.c:4808)
  File "pysam/calignmentfile.pyx", line 533, in pysam.calignmentfile.AlignmentFile._open (pysam/calignmentfile.c:7027)
IOError: file `tumor_dna_processing_AG538184-7/2017_02_10_16_29_59/2017_02_10_16_29_59_1.bam` not found
OptiTypePipeline returned -1
(/nfs-pool/biokepi//toolkit/biopam-kit/envs/optitype.1.0.0) opam@115e92c3b8c5:/nfs-pool-16/biokepi/work/results-b37decoy-tumor_dna_processing_AG538184-7/a8365d6f969a40ef3f6fa69c0a56ed62tumor_dna_processing_AG538184-7DNA0e8070747629c84c18f763603fea9545_checkpoint-trials_lung_tumor_bams_mafs_SN0109695_bams_new_samples_AG538184-7_bamAG538184-7-b2fq-PE_R1_fastqoptitype.d$

@tavinathanson

Seems like an OOM situation. Looks like the 10 that failed, at first glance, were relatively large FASTQs; this might be relevant: seqan/seqan#1276

@ihodes added the bug label Feb 10, 2017
@ihodes commented Feb 10, 2017

Could you try remaking the cluster with bigger nodes? Or is there an argument you can pass to OptiType that tells it to use all 52GB of the default nodes?

@tavinathanson

@ihodes I'm first trying manually on a beefed-up node; but if that works, how do I remake the cluster with bigger nodes?

@ihodes commented Feb 10, 2017

I'm not sure, to be honest; you might be able to change it from the GCloud GKE interface, or you could take down the cluster you have and start a new one with a different node type… @smondet do you know?

@smondet commented Feb 10, 2017

@ihodes I've never tried to change the machine type "live".

The machine type is an option of coclobas configure ..., and then each job requests some amount of CPU/memory; so the Biokepi.Machine.t also has to ask for more in its run_program (right now we use the defaults everywhere).

@tavinathanson

Confirmed that this is a memory issue: when running the same commands manually on 30GB memory vs. 120GB memory, it fails on the former and succeeds on the latter.

@ihodes commented Feb 14, 2017

Do we know if we can filter reads to the MHC locus and save a lot of space? If so, we should add this filtering step to the pipeline in Biokepi.

@tavinathanson

@ihodes see #423; I don't think that would address these memory issues, because that filtering would be via razerS3, which is also where the OOM is within OptiType.

@ihodes commented Feb 14, 2017

Fair enough; I wonder if we could use BWA-mem to do this filtering instead?

@tavinathanson

@ihodes probably, though it's not OptiType's recommendation:

You can use any read mapper to do this step, although we suggest you use RazerS3. Its only drawback is that due to the way RazerS3 was designed, it loads all reads into memory, which could be a problem on older, low-memory computing nodes.

@tavinathanson

Per @smondet's instructions, I ran on larger cluster nodes as follows:

# Ctrl-C in the Coclobas-server screen tab
coclobas cluster delete --root /coclo/_cocloroot/
coclobas configure --root _cocloroot/ --cluster-name $CLUSTER_NAME --cluster-zone $GCLOUD_ZONE --max-nodes $CLUSTER_MAX_NODES --machine-type n1-standard-32
screen -t Coclobas-server coclobas start-server --root _cocloroot/ --port 8082 # Don't use start-all; this will overwrite the coclobas configure command

Replaced my biokepi_machine.ml with his new one, which adds support for customizing CPU/memory limits: https://github.com/hammerlab/coclobas/blob/f690ab74f1ce88ccb75d047c87e7f4eb314f7ba7/tools/docker/biokepi_machine.ml

And then:

export KUBE_JOB_CPUS=32
export KUBE_JOB_MEMORY=118

Confirmed that my GCP instance group had the right node type. Then re-ran my jobs.

We'll see if that works!
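(As a hedged aside for anyone repeating this: one way to confirm the node type, assuming the gcloud CLI is configured for the project, is to list the cluster's instances and check the MACHINE_TYPE column. The name filter below is illustrative, since GKE node names are typically prefixed with gke-<cluster-name>.)

# Illustrative check of the instance group's machine type via the gcloud CLI.
gcloud compute instances list --filter="name~gke-${CLUSTER_NAME}"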

@tavinathanson

Success!

@tavinathanson

Spoke too soon. 1 out of the 9 remaining jobs still failed with the same error :(.

@tavinathanson reopened this Feb 15, 2017
@ihodes commented Feb 15, 2017 via email

@tavinathanson

@ihodes it's 81GB, which I didn't think was particularly larger than the others, but I could be misremembering.

@tavinathanson

@ihodes I was wrong; it is the largest one. Sigh. At least the problem is clear, but I'm becoming more convinced by your suggestion to filter with something other than razerS3.

@ihodes commented Feb 15, 2017

It may be the only way forward… or you could switch to 250+GB machines for extremely expensive runs. hammerlab/coclobas#19 will also help with degenerate cases like these in the future.

@ihodes closed this as completed Feb 15, 2017
@tavinathanson commented Feb 15, 2017

@ihodes yeah, I already kicked off a 208GB machine run. Let's see if that works.

@tavinathanson reopened this Feb 15, 2017
@tavinathanson

It worked!

@maryawood

I've been experiencing the same error that @tavinathanson described here with a set of files I'm working with, but it doesn't appear to be a memory issue: requesting a machine with increased memory doesn't eliminate the problem, and I've been able to run OptiType without error on larger fastq files from a different dataset. Further, when I try to run the razerS3 command on the command line, it doesn't return an error, but it still doesn't produce a BAM file.

I'm at a bit of a loss for what to do. Any ideas as to what the problem may be?

@armish commented Jul 24, 2017

@maryawood: unfortunately that still sounds like a memory issue or something related to it. Depending on the depth/coverage of your sequencing data, the memory requirements for razerS3 can go through the roof, and since this is down to the way razerS3 keeps the data in memory, there is very little you can do.

I have been experimenting with different approaches, and I found that using bwa mem to filter down the reads makes the pipeline run much faster and with a very small memory footprint; testing this approach over a largish cohort of patients (~100), I found that the bwa pre-filtering doesn't really bias or affect the results in any way.

Here is the modified pipeline:

  • Index the HLA reference file that comes with OptiType: e.g. bwa index $OPTITYPE_HOME/data/hla_reference_dna
  • Map your reads against this reference sequence and filter out all reads that do not map (-F 4). You can do this for each pair individually: e.g. bwa mem $OPTITYPE_HOME/data/hla_reference_dna your.pair1.fastq | samtools fastq -F 4 - > filtered.hla.pair1.fastq
  • Run the standard OptiType pipeline using these two new fastqs as input
  • Voilà! (A combined sketch for paired-end reads follows below.)
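Putting those steps together, a minimal sketch for a paired-end sample might look like the following. Assumptions not taken from the thread: bwa and samtools are on the PATH, $OPTITYPE_HOME points at the OptiType checkout, the FASTQ names and output directory are placeholders, and the exact reference filename and OptiTypePipeline.py flags should be checked against the installed OptiType version.

#!/usr/bin/env bash
# Hedged sketch of the bwa-mem pre-filtering approach described above.
# Placeholder inputs: tumor_R1.fastq / tumor_R2.fastq; output dir optitype_out/.
set -euo pipefail

REF="$OPTITYPE_HOME/data/hla_reference_dna"   # adjust if your copy uses a .fasta extension

# 1. Index the HLA reference that ships with OptiType (only needed once).
bwa index "$REF"

# 2. Keep only reads that map to the HLA reference (-F 4 drops unmapped reads),
#    filtering each mate file independently as suggested above.
bwa mem "$REF" tumor_R1.fastq | samtools fastq -F 4 - > filtered.hla.R1.fastq
bwa mem "$REF" tumor_R2.fastq | samtools fastq -F 4 - > filtered.hla.R2.fastq

# 3. Run the standard OptiType pipeline on the much smaller filtered FASTQs
#    (check the flags against the README of your OptiType version).
python "$OPTITYPE_HOME/OptiTypePipeline.py" \
    -i filtered.hla.R1.fastq filtered.hla.R2.fastq \
    --dna -v -o optitype_out/

This mirrors armish's suggestion above; the only changes from the one-liner are the explicit output redirection and running both mate files.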

@maryawood

@armish thanks so much for the suggestion! I will give this a try.
