PDFAssist is an innovative tool designed to empower users from all walks of life—researchers, students, and everyday individuals—by providing them with a seamless way to interact with their PDF documents. Whether you're conducting in-depth research, studying for exams, or simply need quick access to information within a PDF, PDFAssist offers an intuitive platform where you can ask questions about the content and receive accurate, context-aware answers. By leveraging advanced technology, PDFAssist transforms static documents into dynamic, interactive resources, making it easier than ever to extract valuable insights from your PDFs.
- Langchain + LLM(Google Gemini Pro)
- Streamlit: UI
- Text Embeddings: Google Text Embeddings
- FAISS: Vector database
- Users can ask questions about the content within their PDFs and receive accurate, context-aware answers.
- Designed for all users, from researchers to students to the general public, making it easy to navigate and extract information.
- Ideal for various needs, including academic research, study assistance, and everyday information retrieval.
- app.py: The main Streamlit application script.
- requirements.txt: A list of required Python packages for the project.
- index.pkl: A pickle file to store the FAISS index.
- .env: Configuration file for storing your Google API key.
1.Clone this repository to your local machine using:
git clone https://github.com/harshd23/PDFAssist.git
2.Install the required dependencies using pip:
pip install -r requirements.txt
3.Set up your Google API key by creating a .env file in the project root and adding your API
GOOGLE_API_KEY = "put_your_google_api_key_here"
1.Run the Streamlit app by executing:
streamlit run app.py
2.The web app will open in your browser:
- On the left of the screen, you can upload your multiple PDFs directly.
- Initiate the data loading and processing by clicking "Submit & Process"
- Observe the system as it performs text splitting, generates embedding vectors, and efficiently indexes them using FAISS.
- The embeddings will be stored and indexed using FAISS, enhancing retrieval speed.
- The FAISS index will be saved in a local file path in pickle format for future use.
- One can now ask a question and get the answer from your own PDFs.