This repository contains a Python implementation of the original transformer model described in the paper "Attention Is All You Need" by Vaswani et al.
The code closely follows the video "Pytorch Transformers from Scratch (Attention is all you need)" by Aladdin Persson.
Overview
The transformer model has been a significant breakthrough in machine learning, particularly in Natural Language Processing (NLP). It popularized the attention mechanism, which allows the model to focus on the most relevant parts of the input sequence when producing each output, improving its ability to handle long sequences.
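Concretely, the scaled dot-product attention defined in the paper is

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```

where Q, K, and V are the query, key, and value matrices and d_k is the key dimension; dividing by sqrt(d_k) keeps the dot products in a range where the softmax still has useful gradients.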
In this project, we implement the transformer model from scratch, providing a detailed understanding of its inner workings.
Code Structure
The main implementation of the self-attention mechanism, the core of the transformer model, lives in main.py. It defines a SelfAttention class that extends PyTorch's nn.Module.
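For reference, below is a minimal sketch of what such a multi-head SelfAttention module typically looks like, following the structure used in the video. The constructor arguments (embed_size, heads) and the exact layer layout are assumptions; consult main.py for the authoritative version.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Multi-head self-attention, sketched after the video's approach."""

    def __init__(self, embed_size: int, heads: int):
        super().__init__()
        assert embed_size % heads == 0, "embed_size must be divisible by heads"
        self.heads = heads
        self.head_dim = embed_size // heads

        # Per-head linear projections for values, keys, and queries
        self.values = nn.Linear(self.head_dim, self.head_dim, bias=False)
        self.keys = nn.Linear(self.head_dim, self.head_dim, bias=False)
        self.queries = nn.Linear(self.head_dim, self.head_dim, bias=False)
        # Final projection after concatenating the heads
        self.fc_out = nn.Linear(heads * self.head_dim, embed_size)

    def forward(self, values, keys, query, mask=None):
        N, query_len = query.shape[0], query.shape[1]
        value_len, key_len = values.shape[1], keys.shape[1]

        # Split the embedding dimension into `heads` separate pieces
        values = self.values(values.reshape(N, value_len, self.heads, self.head_dim))
        keys = self.keys(keys.reshape(N, key_len, self.heads, self.head_dim))
        queries = self.queries(query.reshape(N, query_len, self.heads, self.head_dim))

        # Attention scores for every (query, key) pair: (N, heads, query_len, key_len)
        energy = torch.einsum("nqhd,nkhd->nhqk", queries, keys)
        if mask is not None:
            energy = energy.masked_fill(mask == 0, float("-1e20"))

        # Scale by sqrt(d_k) as in the paper, then normalize over the key axis
        attention = torch.softmax(energy / (self.head_dim ** 0.5), dim=3)

        # Weighted sum of values, then concatenate the heads back together
        out = torch.einsum("nhqk,nkhd->nqhd", attention, values)
        out = out.reshape(N, query_len, self.heads * self.head_dim)
        return self.fc_out(out)
```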
How to Run
To run the code, make sure Python and PyTorch are installed on your machine, then run main.py with a Python interpreter (for example, python main.py).
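A quick smoke test, assuming main.py exposes a SelfAttention class like the sketch above (the shapes and hyperparameters here are illustrative):

```python
import torch
from main import SelfAttention  # assumes main.py defines the class at module level

attention = SelfAttention(embed_size=256, heads=8)
x = torch.rand(2, 10, 256)  # (batch, sequence length, embed_size)

# In pure self-attention, values, keys, and queries are all the same tensor
out = attention(x, x, x, mask=None)
print(out.shape)  # expected: torch.Size([2, 10, 256])
```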