Skip deduplication for libraries with UMI #188

ChristopherBarrington · 2024-03-25T12:08:02Z

We are producing DNase Hi-C data that carries UMI information to identify PCR duplicates. My intention is to deduplicate the libraries at the FastQ level and then submit the duplicate-free FastQ files to distiller.
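For illustration only, FastQ-level UMI deduplication of this kind could look roughly like the sketch below. It is not taken from distiller or from the pipeline used here, and it rests on several assumptions: the UMI has already been moved into the read name after the final underscore (the layout that `umi_tools extract` writes), duplicates are defined as read pairs with an identical UMI and identical R1 and R2 sequences, and the data fit in memory. The function names are hypothetical.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FastQ stream."""
    while True:
        record = list(islice(handle, 4))
        if len(record) < 4:
            return
        header, seq, _plus, qual = (line.rstrip("\n") for line in record)
        yield header, seq, qual


def umi_from_header(header):
    """Assumed header layout: '@<read name>_<UMI> <optional comment>'."""
    name = header.split()[0]
    return name.rsplit("_", 1)[1]


def dedup_pairs(r1_handle, r2_handle):
    """Keep the first pair seen for each (UMI, R1 sequence, R2 sequence) key."""
    seen = set()
    for rec1, rec2 in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        key = (umi_from_header(rec1[0]), rec1[1], rec2[1])
        if key in seen:
            continue  # exact duplicate under the assumed definition
        seen.add(key)
        yield rec1, rec2


# Toy input: pairs a and b share a UMI and sequences, pair c has a different UMI.
r1 = io.StringIO(
    "@a_ACGT 1\nAAAA\n+\nIIII\n@b_ACGT 1\nAAAA\n+\nIIII\n@c_TTTT 1\nAAAA\n+\nIIII\n"
)
r2 = io.StringIO(
    "@a_ACGT 2\nCCCC\n+\nIIII\n@b_ACGT 2\nCCCC\n+\nIIII\n@c_TTTT 2\nCCCC\n+\nIIII\n"
)
kept = list(dedup_pairs(r1, r2))
```

Exact matching is deliberately conservative: it does not tolerate sequencing errors in the UMI or the read, so a real workflow may prefer a dedicated UMI-aware tool.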

Is there a method you would suggest for handling these data with distiller? There doesn't appear to be a straightforward option, but I had a look at the DSL1 Nextflow script and thought the following might work: duplicate the merge_split process in a form that skips the deduplication step and creates empty files in place of the expected duplicate-related outputs. The choice of process can then be controlled by --params.skip_dedup in a when directive.

I gave it a go and it seemed to work, but I am worried that I may have missed something.
