This project lets you input a YouTube video link and builds a comprehensive understanding of the video's content through audio transcription and image captioning. An LLM is used to combine the audio and visual context. Additionally, you can ask questions, and it will answer based on the video's content 🚀
👉 Video Understanding: The tool uses OpenAI's Whisper model (a Transformer) for audio transcription, converting spoken words into text. It also employs image captioning to extract information from frames within the video. Image embeddings are used to compare frames so that only unique ones are kept for extracting information (see the frame-deduplication sketch after this list). Video and audio are processed in parallel.
👉 Question & Answer: Users can ask questions about the video's content. The tool uses ChromaDB as a vector database to provide accurate, contextually relevant answers (see the retrieval sketch below).
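A minimal sketch of the frame-deduplication idea, assuming cosine similarity over image embeddings and an arbitrary 0.9 threshold (the actual embedding model and threshold in this repo may differ):

```python
import numpy as np


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def unique_frame_indices(frame_embeddings: list[np.ndarray], threshold: float = 0.9) -> list[int]:
    """Keep a frame only if its embedding differs enough from the last kept
    frame, so near-duplicate frames are not captioned twice."""
    kept: list[int] = []
    last = None
    for i, emb in enumerate(frame_embeddings):
        if last is None or cosine_similarity(emb, last) < threshold:
            kept.append(i)
            last = emb
    return kept


# Toy demo: random vectors stand in for real image embeddings.
rng = np.random.default_rng(0)
embeddings = [rng.standard_normal(512) for _ in range(8)]
print(unique_frame_indices(embeddings))
```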
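And a minimal sketch of the ChromaDB side, assuming transcript and caption chunks are stored as documents (the collection name and texts below are illustrative, not the repository's actual schema):

```python
import chromadb

client = chromadb.Client()  # in-memory client; the app may persist instead
collection = client.create_collection("video_context")

# Store transcript chunks and frame captions as documents.
collection.add(
    documents=[
        "Transcript: the speaker introduces the training pipeline.",
        "Frame caption: a diagram of a transformer architecture.",
    ],
    ids=["chunk-0", "chunk-1"],
)

# Retrieve the chunks most relevant to a user question.
results = collection.query(query_texts=["How is the model trained?"], n_results=2)
print(results["documents"])
```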
• Clone this repository: `git clone https://github.com/Dev-Khant/tell-what-a-video-does.git`
• Install the required dependencies: `pip install -r requirements.txt`
• Run the Streamlit app: `streamlit run app.py`
• Provide a YouTube video link along with your OpenAI, Hugging Face, and SerpApi tokens (a sketch of how the app might collect these follows this list).
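For reference, the app might collect these inputs roughly like this (widget labels and layout are assumptions, not necessarily what `app.py` does):

```python
import streamlit as st

st.title("Tell what a video does")

video_url = st.text_input("YouTube video link")
openai_token = st.sidebar.text_input("OpenAI token", type="password")
hf_token = st.sidebar.text_input("Hugging Face token", type="password")
serpapi_token = st.sidebar.text_input("SerpAPI token", type="password")

if st.button("Explain video") and video_url:
    st.write(f"Processing {video_url} ...")  # pipeline kicks off here
```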
• Hugging Face: Used to access the OpenAI Whisper model for audio transcription.
• SerpApi: Used to access the Google Lens API for image information (sketches of both calls follow this list).
• Streamlit: Used to create the interactive web interface for the project.
• ChromaDB: The vector database used for storing and retrieving context for Q&A.
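A sketch of the Hugging Face call for transcription, assuming the hosted Inference API with a Whisper checkpoint (the exact checkpoint and invocation in this repo may differ):

```python
import requests

# openai/whisper-large-v3 is an assumed checkpoint choice.
API_URL = "https://api-inference.huggingface.co/models/openai/whisper-large-v3"


def transcribe(audio_path: str, hf_token: str) -> str:
    """Send raw audio bytes to the Hugging Face Inference API and return the text."""
    with open(audio_path, "rb") as f:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {hf_token}"},
            data=f.read(),
        )
    response.raise_for_status()
    return response.json()["text"]
```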
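And a sketch of the SerpApi Google Lens call for a frame (uses the `google-search-results` package; which result fields the app actually reads is an assumption):

```python
from serpapi import GoogleSearch


def describe_image(image_url: str, serpapi_key: str) -> list[str]:
    """Query Google Lens via SerpApi and return titles of visual matches."""
    search = GoogleSearch({
        "engine": "google_lens",
        "url": image_url,      # the frame must be reachable at a public URL
        "api_key": serpapi_key,
    })
    results = search.get_dict()
    return [match["title"] for match in results.get("visual_matches", [])]
```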
- Add Weaviate and let the user select their vector database.
- Give the chatbot internet access.
- Add an option to upload videos directly.
- Store video explanations so they can be reused later.