This repository contains a Python implementation of the original transformer model described in the paper "Attention Is All You Need" by Vaswani et al.
The code closely follows the video "Pytorch Transformers from Scratch (Attention is all you need)" by Aladdin Persson.
Overview
The transformer model has been a significant breakthrough in machine learning, particularly in Natural Language Processing (NLP). It popularized the attention mechanism, which allows the model to focus on the most relevant parts of the input sequence when producing each output, improving its ability to handle long sequences.
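Concretely, the scaled dot-product attention defined in the paper is

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
```

where Q, K, and V are the query, key, and value matrices and d_k is the key dimension; dividing by sqrt(d_k) keeps the dot products in a range where the softmax still has useful gradients.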
In this project, we implement the transformer model from scratch, providing a detailed understanding of its inner workings.
Code Structure
The main implementation of the self-attention mechanism, the core of the transformer model, lives in main.py. It defines a SelfAttention class that extends PyTorch's nn.Module.
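For reference, below is a minimal sketch of what such a multi-head SelfAttention module typically looks like, following the structure used in the video. The constructor arguments (embed_size, heads) and the exact layer layout are assumptions; consult main.py for the authoritative version.

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Multi-head self-attention, sketched after the video's approach."""

    def __init__(self, embed_size: int, heads: int):
        super().__init__()
        assert embed_size % heads == 0, "embed_size must be divisible by heads"
        self.heads = heads
        self.head_dim = embed_size // heads

        # Per-head linear projections for values, keys, and queries
        self.values = nn.Linear(self.head_dim, self.head_dim, bias=False)
        self.keys = nn.Linear(self.head_dim, self.head_dim, bias=False)
        self.queries = nn.Linear(self.head_dim, self.head_dim, bias=False)
        # Final projection after concatenating the heads
        self.fc_out = nn.Linear(heads * self.head_dim, embed_size)

    def forward(self, values, keys, query, mask=None):
        N, query_len = query.shape[0], query.shape[1]
        value_len, key_len = values.shape[1], keys.shape[1]

        # Split the embedding dimension into `heads` separate pieces
        values = self.values(values.reshape(N, value_len, self.heads, self.head_dim))
        keys = self.keys(keys.reshape(N, key_len, self.heads, self.head_dim))
        queries = self.queries(query.reshape(N, query_len, self.heads, self.head_dim))

        # Attention scores for every (query, key) pair: (N, heads, query_len, key_len)
        energy = torch.einsum("nqhd,nkhd->nhqk", queries, keys)
        if mask is not None:
            energy = energy.masked_fill(mask == 0, float("-1e20"))

        # Scale by sqrt(d_k) as in the paper, then normalize over the key axis
        attention = torch.softmax(energy / (self.head_dim ** 0.5), dim=3)

        # Weighted sum of values, then concatenate the heads back together
        out = torch.einsum("nhqk,nkhd->nqhd", attention, values)
        out = out.reshape(N, query_len, self.heads * self.head_dim)
        return self.fc_out(out)
```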
How to Run
To run the code, make sure Python and PyTorch are installed on your machine, then run main.py with a Python interpreter (for example, python main.py).
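A quick smoke test, assuming main.py exposes a SelfAttention class like the sketch above (the shapes and hyperparameters here are illustrative):

```python
import torch
from main import SelfAttention  # assumes main.py defines the class at module level

attention = SelfAttention(embed_size=256, heads=8)
x = torch.rand(2, 10, 256)  # (batch, sequence length, embed_size)

# In pure self-attention, values, keys, and queries are all the same tensor
out = attention(x, x, x, mask=None)
print(out.shape)  # expected: torch.Size([2, 10, 256])
```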