Metal ion binding dataset #2
Comments
Hi empyriumz, the metal ion binding dataset was collected from the PDB (https://www.rcsb.org/). If a protein has any metal ion binding site, we set its label to 1.
We wrote a crawler to collect the annotations of each PDB protein. Do you need the original dataset we collected?
By original dataset, do you mean all the PDB files? I guess that would be too large, so could you share the script used to search and annotate the PDB entries?
I am sorry, but the classmates who wrote the crawler are not on the author list and are unwilling to share it with us. They now have jobs and plan to release the relevant dataset themselves; I can notify you once their paper is out. In the meantime, I can give you a simple script that checks whether each page contains given keywords, which may help.
If the page does not contain 'metal ion binding', the code returns an empty list.
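A minimal sketch of such a keyword check, assuming the annotation page for an entry has already been downloaded as plain text. The function names `find_keywords` and `label_from_text` are hypothetical and this is not the authors' script; it only illustrates the behaviour described above (an empty list when the keyword is absent, label 1 when it is present).

```python
import re


def find_keywords(page_text, keywords=("metal ion binding",)):
    """Return the keywords that occur in page_text, ignoring case."""
    found = []
    for keyword in keywords:
        # Allow any run of whitespace between the words of a keyword,
        # since scraped pages often break phrases across lines.
        pattern = r"\s+".join(re.escape(word) for word in keyword.split())
        if re.search(pattern, page_text, flags=re.IGNORECASE):
            found.append(keyword)
    return found


def label_from_text(page_text):
    """Hypothetical labelling rule: 1 if any keyword is present, else 0."""
    return 1 if find_keywords(page_text) else 0
```

Fetching the page text is left out on purpose, because how the original crawler retrieved annotations is not stated in this thread.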
Hi, I tried to use your metal AlphaFold code to predict other protein features, but I found that your code takes a pkl file as input, so I would like to know how you generate the pkl files. Thanks!
Hi Violet969, this pkl file contains the MSA and template information.
For details of the pkl contents, see pages 8-9 of the supplementary information of the AlphaFold paper. I have already released the MSAs at https://drive.google.com/drive/folders/1iShEW8NcMIlWqxTRgsEaI_t5ahoHsixt?usp=share_link. To generate the pkl files, you will probably need to make some modifications to run_alphafold.py. I can upload this part of the preprocessing code later.
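To see what a given pkl file actually holds before adapting the preprocessing, a quick inspection can help. The sketch below assumes the file is a pickled dictionary of features; the helper name `summarize_pkl` and the demo keys are hypothetical stand-ins, not the real feature names. Only unpickle files from a source you trust, since loading a pickle can execute arbitrary code.

```python
import pickle
import tempfile
from pathlib import Path


def summarize_pkl(path):
    """Map each key of a pickled dict to its shape, or to its type name
    when the value has no `shape` attribute."""
    with open(path, "rb") as f:
        features = pickle.load(f)
    return {key: getattr(value, "shape", type(value).__name__)
            for key, value in features.items()}


# Demo on a small stand-in dictionary (hypothetical keys, not real features).
demo_path = Path(tempfile.mkdtemp()) / "demo_features.pkl"
with open(demo_path, "wb") as f:
    pickle.dump({"example_msa": [[0, 1], [2, 3]], "example_sequence": "MK"}, f)

demo_summary = summarize_pkl(demo_path)
for key, info in demo_summary.items():
    print(key, info)
```

For array-valued features the summary reports shapes, which makes it easier to compare your own pkl files against the ones the training code expects.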
I see, thanks for your sample code! I'll check whether the results match the ones I mentioned above.
Thanks for your answer. I have another question: I saw that you use Evoformer and ESM to predict protein SS, but I don't see this code. Will you share it?
Sure, I will upload this part of the code later.
Hi Violet969, |
Hi, |
Thanks for your answer. I would also like an example of how to run metal/alphafold/train.py. Can you share one?
You should now be able to run train.py directly with a few simple modifications. Please make sure you have configured the AlphaFold runtime environment. In addition, it seems that the current AlphaFold parameter format differs from the earlier one, so you can try to find the previously published parameter file.
Thanks for your reply. I tried to run 'train.py' on my server, but I always get an error like this.
I have 8 nodes with 12 GB GPUs and 125 GB of memory. Can you tell me how to solve it?
I tested this code on an A40 (48 GB) server and it works.
Thanks for the fast reply. Setting os.environ['XLA_PYTHON_CLIENT_MEM_FRACTION'] = '2' works.
Can you tell me how to solve it? |
Delete the './tmp' folder.
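For anyone hitting the same out-of-memory problem on smaller GPUs, the memory setting reported to work earlier in this thread can be applied as sketched below. The helper name `configure_xla_memory` is hypothetical. Setting TF_FORCE_UNIFIED_MEMORY is an assumption on my part and was not mentioned in the thread: as far as I know, a fraction above 1 only makes sense together with unified memory, which lets the GPU spill into host RAM. Both variables need to be set before JAX is imported.

```python
import os


def configure_xla_memory(fraction="2", unified=True):
    """Set JAX/XLA GPU memory environment variables.

    Call this before importing jax, otherwise the settings are ignored.
    """
    if unified:
        # Assumption: allow the GPU to use host RAM once its own memory is full.
        os.environ["TF_FORCE_UNIFIED_MEMORY"] = "1"
    # Value reported to work in this thread on 12 GB GPUs.
    os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = str(fraction)


configure_xla_memory("2")
# import jax  # import JAX only after the variables are set
```

Spilling into host memory is typically slower than staying within GPU memory, so expect longer training steps than on the 48 GB A40.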
Hi there,
Nice work!
I have a question about the metal ion binding dataset used in your paper.
Could you let me know where you got the original dataset?
Thanks!