Calling on HG00463 results in s/w crash. #2

Open
iamh2o opened this issue Jul 20, 2020 · 1 comment

iamh2o commented Jul 20, 2020

I downloaded the R1 and R2 fasta files for this sample referenced in the paper. I aligned the reads with Sentieon bwa mem and produced a valid BAM/BAI file. When I ran star_caller.py, it crashed with the following traceback.

(supersonic) jmajor@kahlo:/locus/data/external_data/research_experiments/investigations/CYP2D6/HG00463$ python ~/wgs_resources/bin/Cyrius/star_caller.py --reference ~/wgs_resources/data/reference/human/human_g1k_v37_modified.fasta/human_g1k_v37_modified.fasta --genome 37 --prefix CYP_ --outDir ./ --threads 88 --manifest manifest.txt
INFO:root:Processing sample HG00463.aligned.deduped.sort at 2020-07-20 05:36:41.476394
Traceback (most recent call last):
  File "/locus/home/jmajor/wgs_resources/bin/Cyrius/star_caller.py", line 580, in <module>
    main()
  File "/locus/home/jmajor/wgs_resources/bin/Cyrius/star_caller.py", line 548, in main
    bam_name, call_parameters, threads, count_file, reference_fasta
  File "/locus/home/jmajor/wgs_resources/bin/Cyrius/star_caller.py", line 339, in d6_star_caller
    raw_cn_call.spacer_cn,
  File "/locus/data/external_data/research_experiments/wgs_resources/bin/Cyrius/caller/cnv_hybrid.py", line 56, in get_cnvtag
    if exon9_intron4_sites_counter[0][1] >= EXON9_TO_INTRON4_SITES_MIN
IndexError: list index out of range

@xiao-chen-xc
Contributor

Hi @iamh2o, the error occurs because you have no callable sites across a large region of the gene, which suggests that something is wrong with the input. Did you align to the entire genome and use the entire BAM, i.e. without extracting just the CYP2D6 region? You are using the GRCh37 reference, right? Additionally, although this might not be the cause of the issue, does your BAM contain duplicate reads? We generally recommend keeping duplicate reads in the BAM, as doing so tends to be a little more accurate for depth assessment.
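
For readers hitting the same traceback, here is a minimal sketch of why indexing `[0][1]` fails when no sites are callable, and how a guard could turn it into a clearer message. It is not the Cyrius source: it assumes the counter is a ranked list of `(call, count)` pairs such as `collections.Counter.most_common()` returns, and `SITES_MIN`, `most_common_site_count` and `has_enough_support` are hypothetical names made up for illustration.

```python
from collections import Counter

# Hypothetical threshold; the real value of EXON9_TO_INTRON4_SITES_MIN
# is not shown anywhere in this issue.
SITES_MIN = 2


def most_common_site_count(site_calls):
    """Return (call, count) for the most frequent call, or None if empty."""
    ranked = Counter(site_calls).most_common()
    if not ranked:
        # With no callable sites the ranked list is empty, so ranked[0]
        # would raise "IndexError: list index out of range".
        return None
    return ranked[0]


def has_enough_support(site_calls, minimum=SITES_MIN):
    """Check whether the top call is supported by at least `minimum` sites."""
    top = most_common_site_count(site_calls)
    if top is None:
        raise ValueError(
            "no callable sites in this region; check that the BAM covers "
            "the whole genome and matches the reference build"
        )
    return top[1] >= minimum
```

Calling `has_enough_support([])` raises a `ValueError` that points at the input rather than a bare `IndexError`, while a non-empty list of calls is compared against the threshold as before.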
