PLAPT is a state-of-the-art tool for predicting protein-ligand binding affinity, a task central to accelerating drug discovery. The model applies transfer learning from pretrained transformers such as ProtBERT and ChemBERTa to achieve high accuracy with minimal computational resources.
- Efficient Processing: A lightweight prediction module enables high-throughput affinity prediction when embeddings are cached.
- Transfer Learning: Uses pretrained models to extract rich protein and molecule features.
- Versatile Usage: Takes only 1D protein sequences and ligand SMILES strings as input. Provides a command-line interface and a Python API for easy integration into various workflows.
- High Accuracy: Achieves top performance on benchmark datasets.
PLAPT uses a novel branching neural network architecture that integrates features from protein and ligand encoders to estimate binding affinities.
This architecture lets PLAPT process complex molecular information effectively and, when coupled with embedding caching, very efficiently.
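The benefit of caching can be illustrated with a minimal sketch. It assumes that the encoders are the expensive step and the prediction module is cheap, as described above. `CachedEncoder`, `toy_encoder`, and `toy_head` are hypothetical stand-ins written for this example; they are not part of PLAPT and do not produce real embeddings or affinities.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


class CachedEncoder:
    """Wraps an expensive encoder so each distinct input is encoded only once."""

    def __init__(self, encode: Callable[[str], Vector]) -> None:
        self._encode = encode
        self._cache: Dict[str, Vector] = {}
        self.misses = 0  # counts real encoder calls

    def __call__(self, text: str) -> Vector:
        if text not in self._cache:
            self._cache[text] = self._encode(text)
            self.misses += 1
        return self._cache[text]


def toy_encoder(text: str) -> Vector:
    """Hypothetical stand-in for a pretrained transformer; NOT a real embedding."""
    return (float(len(text)), float(sum(map(ord, text)) % 97))


def toy_head(protein_vec: Vector, ligand_vec: Vector) -> float:
    """Hypothetical stand-in for the lightweight prediction module."""
    return sum(protein_vec) - sum(ligand_vec)


def score_pairs(
    protein_enc: Callable[[str], Vector],
    ligand_enc: Callable[[str], Vector],
    proteins: Sequence[str],
    ligands: Sequence[str],
) -> List[float]:
    """Two branches (protein, ligand) are encoded separately, then combined."""
    return [toy_head(protein_enc(p), ligand_enc(m)) for p, m in zip(proteins, ligands)]


protein_enc = CachedEncoder(toy_encoder)
ligand_enc = CachedEncoder(toy_encoder)

# One protein screened against three ligands, one of them repeated.
proteins = ["MKTV"] * 3
ligands = ["CCO", "CCN", "CCO"]
scores = score_pairs(protein_enc, ligand_enc, proteins, ligands)

print(protein_enc.misses, ligand_enc.misses)
```

In this sketch the protein is encoded once even though it appears in every pair, so the cost of a large screen is dominated by the cheap combining step.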
- Clone the repository:
  git clone https://github.com/trrt-good/WELP-PLAPT.git
  cd WELP-PLAPT
- (Optional) Use a virtual environment:
  python3 -m venv env
  For macOS or Linux, run:
  source env/bin/activate
  For Windows, run:
  env\Scripts\activate
- Install dependencies:
  pip3 install -r requirements.txt
PLAPT can be used via command line or integrated into Python scripts.
Predict affinity for a single protein and multiple ligands:
python3 plapt_cli.py -p "SEQUENCE" -m "SMILES1" "SMILES2" "SMILES3"
Predict affinities for multiple protein-ligand pairs:
python3 plapt_cli.py -p "SEQUENCE1" "SEQUENCE2" -m "SMILES1" "SMILES2"
Use files for input:
python3 plapt_cli.py -p proteins.txt -m molecules.txt
Save results to a file:
python3 plapt_cli.py -p "SEQUENCE" -m "SMILES1" "SMILES2" -o results.json
from plapt import Plapt
plapt = Plapt()
# Predict affinity for a single protein and multiple ligands
protein = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"
molecules = ["CC1=CC=C(C=C1)C2=CC(=NN2C3=CC=C(C=C3)S(=O)(=O)N)C(F)(F)F",
"COC1=CC=C(C=C1)C2=CC(=NN2C3=CC=C(C=C3)S(=O)(=O)N)C(F)(F)F"]
results = plapt.score_candidates(protein, molecules)
print(results)
# Predict affinities for multiple protein-ligand pairs
proteins = ["SEQUENCE1", "SEQUENCE2"]
molecules = ["SMILES1", "SMILES2"]
results = plapt.predict_affinity(proteins, molecules)
print(results)
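The two calling patterns above differ in how proteins and molecules are matched. The following is a minimal sketch of that matching, assuming that multiple proteins are paired with molecules by position and that a single protein is reused for every molecule. `make_pairs` is a hypothetical helper written for illustration; it is not part of the PLAPT API, and PLAPT's own handling of mismatched lengths may differ.

```python
from typing import List, Sequence, Tuple


def make_pairs(proteins: Sequence[str], molecules: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair inputs by position; reuse a single protein for all molecules."""
    if not proteins or not molecules:
        raise ValueError("need at least one protein and one molecule")
    if len(proteins) == 1:
        # Single-protein screening: repeat the protein for each molecule.
        proteins = list(proteins) * len(molecules)
    if len(proteins) != len(molecules):
        raise ValueError(
            f"got {len(proteins)} proteins but {len(molecules)} molecules"
        )
    return list(zip(proteins, molecules))


# One protein against several ligands.
screen = make_pairs(["SEQUENCE"], ["SMILES1", "SMILES2"])

# Explicit protein-ligand pairs, matched by position.
pairs = make_pairs(["SEQUENCE1", "SEQUENCE2"], ["SMILES1", "SMILES2"])
```

Checking lengths up front like this can surface input mistakes before any time is spent on encoding.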
PLAPT has been used in the following research:
1. López-Cortés, A., Cabrera-Andrade, A., Echeverría-Garcés, G. et al. Unraveling druggable cancer-driving proteins and targeted drugs using artificial intelligence and multi-omics analyses. Sci Rep 14, 19359 (2024). https://doi.org/10.1038/s41598-024-68565-7
If you've used PLAPT in your research, please let us know!
If you use PLAPT in your research, please cite our paper:
@misc{rose2023plapt,
title={PLAPT: Protein-Ligand Binding Affinity Prediction Using Pretrained Transformers},
author={Tyler Rose and Nicolò Monti and Navvye Anand and Tianyu Shen},
journal={bioRxiv},
year={2023},
url={https://www.biorxiv.org/content/10.1101/2024.02.08.575577v3},
doi={10.1101/2024.02.08.575577}
}
This project is licensed under the MIT License - see the LICENSE file for details.