Hi Lucas,
I wish to call specific sites and have found that using the `restrict-regions` option to point to an `intervals.bed` file specifying those sites works for this purpose (in combination with `HaplotypeCaller-extra: "--output-mode EMIT_ALL_CONFIDENT_SITES"`).
However, my reference genome includes some very small contigs, and to avoid an overwhelming number of jobs I have previously been using the `contig-group-size` option. Unfortunately, this is not yet possible in combination with `restrict-regions`.
Is there a way to specify these sites without generating an overwhelming number of jobs? Perhaps by adding a single job that creates empty VCF files for the contigs that do not occur in the `intervals.bed` file?
But of course, making `contig-group-size` and `restrict-regions` compatible with each other would be ideal...
Thanks
Just to update (for the benefit of others who may want to do something similar):
As expected, the combination of both `contig-group-size` and `restrict-regions` resulted in a large number of jobs. However, the jobs for contigs that do not overlap the `intervals.bed` file finished within seconds. In conclusion: spurious jobs and files, but fortunately not much of an increase in total computation time.
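For anyone who wants to estimate up front how many of these quick, empty jobs to expect, here is a minimal sketch that splits the reference contigs into those with and without entries in the BED file. It is not part of the pipeline; the function names are made up for illustration, and it assumes a `samtools faidx`-style `.fai` index (contig name in the first column) next to the reference.

```python
import io


def contigs_in_bed(bed_lines):
    """Return the set of contig names that have at least one BED interval."""
    contigs = set()
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and optional BED header lines.
        if not fields or fields[0].startswith("#") or fields[0] in ("track", "browser"):
            continue
        contigs.add(fields[0])
    return contigs


def contigs_in_fai(fai_lines):
    """Return contig names in reference order from a .fai index."""
    return [line.split("\t")[0] for line in fai_lines if line.strip()]


def split_contigs(fai_lines, bed_lines):
    """Split reference contigs into (with intervals, without intervals)."""
    covered = contigs_in_bed(bed_lines)
    all_contigs = contigs_in_fai(fai_lines)
    with_sites = [c for c in all_contigs if c in covered]
    without_sites = [c for c in all_contigs if c not in covered]
    return with_sites, without_sites


if __name__ == "__main__":
    # Tiny in-memory stand-ins; in practice, pass open file handles for the
    # reference's .fai index and the intervals.bed file instead.
    fai = io.StringIO(
        "chr1\t1000\t6\t60\t61\n"
        "scaffold_7\t120\t1030\t60\t61\n"
        "scaffold_9\t90\t1160\t60\t61\n"
    )
    bed = io.StringIO("chr1\t10\t11\nchr1\t500\t501\nscaffold_9\t5\t6\n")
    with_sites, without_sites = split_contigs(fai, bed)
    print(f"{len(with_sites)} contigs with sites, {len(without_sites)} without")
    print("no intervals on:", ", ".join(without_sites))
```

The number of contigs without intervals gives a rough idea of how many jobs will finish within seconds without doing any real work.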
Indeed, that is a combination of features that is a bit tricky to implement, and I will unfortunately not have the time for that in the foreseeable future. I hope it's okay that I cannot promise a timeline for implementing this.
Let's keep this issue open for now. And in the future, anyone else who wants to do something similar, please feel free to comment here to bump this up in priority in my implementation backlog :-)
Also, thank you for reporting your experience with just running it as-is. That sounds like it's generally workable, with some overhead in terms of job submissions and files created. That is good to know!