This repository has been archived by the owner on Mar 23, 2023. It is now read-only.

what configurations do I need to change? #40

Open
mfazel opened this issue Sep 26, 2022 · 5 comments

Comments

@mfazel

mfazel commented Sep 26, 2022

Hi,

I'm trying Pore-C data analysis and have a couple of questions that are probably trivial to you, but I could not figure them out from the README documentation here.
I installed Pore-C-Snakemake, followed the documentation, and ran the test, which finished successfully. My question now is how to run it on my own data. Specifically, which configurations (files, paths, etc.) do I have to change?
Also, is it possible to use, for example, hg19 instead of the hg38 used in the example test, and a different enzyme? What changes would I have to make? I modified basecalls.tsv to point to my data, but the run failed. I'm guessing there are more changes to make, but I'm not sure where.

Thanks

@wenluo711

I guess you'll have to add an hg19 reference file to the config/references.tsv file and modify basecalls.tsv accordingly?

@Oksanak22

Hello,
I also had trouble running the Pore-C pipeline.
What else do I have to install to be able to run at least `pore_c refgenome virtual-digest`?

Thank you.

@eharr
Collaborator

eharr commented Oct 2, 2022

@mfazel all of the configuration you need is in the files referenced in the README. There are comments in the header of each file that describe what each means. Let me know if you have any questions.

*  `config/config.yaml` - A yaml file containing settings for the pipeline. Input data is specified in the following tab-delimited files.
*  `config/basecalls.tsv` - Metadata and locations of the pore-c sequencing run fastqs.
*  `config/references.tsv` - Locations of the draft/scaffold/reference assemblies that the pore-c reads will be mapped to.
*  `config/phased_vcfs.tsv` - [Optional] The location of phased vcf files that can be used to haplotag poreC reads.
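
Since the header comments are where each setting is documented, a quick way to read them all in one place is a small script like the one below. This is only a sketch, not part of the pipeline: it assumes the files live in a `config/` directory as in the list above, that they end in `.yaml` or `.tsv`, and that header comments start with `#`. The function names are made up for illustration.

```python
from pathlib import Path


def header_comments(path, marker="#"):
    """Return the leading comment lines of a file, with the marker removed.

    Assumption: header comments start with '#'. Blank lines are skipped,
    and reading stops at the first line that is neither blank nor a comment.
    """
    comments = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            if not stripped:
                continue
            if not stripped.startswith(marker):
                break
            comments.append(stripped.lstrip(marker).strip())
    return comments


def report(config_dir="config"):
    """Print the header comments of every .yaml/.tsv file in config_dir."""
    directory = Path(config_dir)
    if not directory.is_dir():
        print(f"{config_dir}: directory not found")
        return
    for path in sorted(directory.iterdir()):
        if path.suffix not in {".yaml", ".tsv"}:
            continue
        print(path.name)
        for comment in header_comments(path):
            print(f"    {comment}")


if __name__ == "__main__":
    report()
```

Run it from the root of the cloned repository; it only reads the files and changes nothing.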

@eharr
Collaborator

eharr commented Oct 2, 2022

@Oksanak22 if you're having an issue installing the pipeline would you mind opening a separate issue with the error log?

@mfazel
Author

mfazel commented Feb 13, 2023

Hi Eoghan,
I have a couple of questions about Pore-C-Snakemake and hope they are not too basic; I'm using this pipeline for the first time.
I cloned the repo and it ran smoothly, without errors, on the toy example from chr21 and chr22, as described on GitHub.
Now I have some real whole-genome Pore-C sequencing data and want to run the pipeline on it. I have to say, some parts of the GitHub README.md are unclear (at least to a first-time user) or not comprehensive.

I modified the four files in the config folder as described on GitHub and ran the pipeline on a few of my files. After it finished, some parts of the results differed from the test example, i.e. the Juicebox, assembly (visualization formats), matrix and pairs folders are missing.
I also see a couple of files in the .test/resources folder whose role in the pipeline is unclear to me, as is how they should be generated if they are needed (read_ids.txt, GM12878.conf.bed, GM12878_NlaIII.sequencing_summary.txt).
I removed these files after the test run, but I'm not sure whether this caused the missing folders in the results.
I also did not understand what the fast5 folder in the resources folder is for (or where it comes from) when I run the pipeline on my own data.

Thanks,
M.
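
To pin down exactly which output folders differ between the test run and a run on your own data, a small comparison script can help. The sketch below lists the immediate subdirectories that exist in one results directory but not in the other. The function name and both directory arguments are hypothetical. It only reports what is missing; it does not explain why the pipeline skipped those outputs.

```python
import sys
from pathlib import Path


def missing_subdirs(reference_dir, own_dir):
    """Return sorted names of immediate subdirectories that exist in
    reference_dir but not in own_dir."""
    ref = {p.name for p in Path(reference_dir).iterdir() if p.is_dir()}
    own = {p.name for p in Path(own_dir).iterdir() if p.is_dir()}
    return sorted(ref - own)


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # First argument: results of the test run; second: results of your run.
        for name in missing_subdirs(sys.argv[1], sys.argv[2]):
            print(name)
    else:
        print("usage: python compare_results.py TEST_RESULTS_DIR OWN_RESULTS_DIR")
```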
