Requesting Clarification on User-Supplied LD Matrices #138

Open
dym22 opened this issue Mar 26, 2024 · 1 comment

dym22 commented Mar 26, 2024

Is your feature request related to a problem? Please describe.
The sample data I am working with is relatively homogeneous but not well characterized by any of the populations in 1000G. Additionally, as I have the genotype data anyway, I would like to provide my own LD matrices rather than relying on those from a reference panel. I have tried doing this a few ways: supplying a path to a text file that contains the full paths to LD matrices pre-computed with plink (both gzipped as .txt.gz and uncompressed, and both with and without SNP IDs as row/column names), supplying a list of .vcf.gz files, etc. The documentation does not provide much guidance on exactly what the argument "LD_reference" expects if it is not 1000G or UKBB. Some clarification on this would be appreciated.

Describe the solution you'd like
Ideal would be a worked example, but if that is not feasible, just more clarification would help.

Describe alternatives you've considered
The ideal would be an additional feature that accepts the path to a text file listing the bfile paths (something like GCTA's mbfile flag), perhaps along with filtering parameters, and takes care of everything on the back end, but understandably that would involve quite a lot of work. If I could just get further clarification on how the matrices need to be formatted and how to point the LD_reference argument to them, that would be great.

Additional context
Great package! Preliminary results (just using reference panels for LD matrices) look good, and the whole procedure is relatively hassle-free compared to most fine-mapping software out there.

Edit: wording

Member

bschilder commented May 10, 2024

So I realize now that I haven't documented this feature well, but certain finemap_loci args can take a list whose length equals the number of loci being fine-mapped, with one entry per locus.

In the case of the LD_reference arg, this then passes to echoLD::get_LD which infers how to import the respective LD file (or computes LD from a VCF file).

So this might look something like:

```r
loci <- c("A", "B", "C")
LD_reference <- list("./filepath.A.csv.gz", "./filepath.B.csv.gz", "./filepath.C.csv.gz")
echolocatoR::finemap_loci(
  ...,
  loci = loci,
  LD_reference = LD_reference
)
```
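With many loci, the per-locus list can be built programmatically instead of typed by hand. The following is a minimal base R sketch, not part of echolocatoR: the directory `ld_dir` and the `filepath.<locus>.csv.gz` naming pattern are assumptions that simply mirror the example above. It only constructs and checks the list and does not call `finemap_loci`.

```r
loci <- c("A", "B", "C")

# Hypothetical directory and file-naming pattern for the per-locus LD files
ld_dir <- "."
LD_reference <- as.list(file.path(ld_dir, paste0("filepath.", loci, ".csv.gz")))

# There must be exactly one entry per locus, in the same order as `loci`
stopifnot(length(LD_reference) == length(loci))

# Flag missing files up front rather than partway through a long run
not_found <- !file.exists(unlist(LD_reference))
if (any(not_found)) {
  warning("LD files not found for loci: ",
          paste(loci[not_found], collapse = ", "))
}
```

The resulting `LD_reference` has the same shape as the hand-written list above, so it can be passed to `finemap_loci` in the same way.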

Here are the different file types that LD_reference can accept:
[Screenshot of the accepted file types, taken 2024-05-10; image not available in this text copy]
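Since the example paths above end in `.csv.gz`, one way to produce such a file from a pre-computed LD matrix is sketched below in base R. This is an illustration under stated assumptions, not a confirmed input specification: the layout (a square matrix with SNP IDs as both row and column names) is a guess, the SNP IDs and values are made up, and this thread does not confirm that `echoLD::get_LD` expects exactly this format.

```r
# Toy symmetric LD (correlation) matrix with hypothetical SNP IDs
snps <- c("rs1", "rs2", "rs3")
ld <- matrix(c(1.0, 0.8, 0.1,
               0.8, 1.0, 0.3,
               0.1, 0.3, 1.0),
             nrow = 3, byrow = TRUE,
             dimnames = list(snps, snps))

# Write as gzipped CSV; the first column holds the row names (SNP IDs)
out <- file.path(tempdir(), "filepath.A.csv.gz")
con <- gzfile(out, open = "wt")
write.csv(ld, con)
close(con)

# Read it back to confirm the labels and symmetry survived the round trip
ld_back <- as.matrix(read.csv(gzfile(out), row.names = 1, check.names = FALSE))
stopifnot(identical(rownames(ld_back), colnames(ld_back)),
          isSymmetric(ld_back))
```

Reading the file back before fine-mapping is a cheap way to catch mislabeled or non-square matrices, whatever layout turns out to be required.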

Let me know if this helps.

bschilder self-assigned this May 10, 2024
bschilder added the question (Further information is requested) label May 10, 2024
bschilder moved this from Todo to In Progress in 🦇🦇 echoverse 🦇🦇 May 10, 2024