Atto-GPT: A Minimal GPT from Scratch

Atto-GPT is a minimal implementation of the Generative Pre-trained Transformer (GPT) architecture, written from scratch and following the foundational concepts behind the GPT family of models. The primary goal of this project is to provide a simple, educational implementation of GPT, focusing on its core components and training process in PyTorch.

Atto-GPT is designed to demonstrate how a Transformer-based language model can be built from the ground up. It includes key components such as the self-attention mechanism, multi-head attention, and the autoregressive generation process, while adhering to the original design philosophy of the GPT model.
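
To make those components concrete, here is a minimal sketch of a masked (causal) multi-head self-attention module in PyTorch. It follows the conventions popularized in Karpathy's tutorial; the names (`CausalSelfAttention`, `n_embd`, `n_head`, `block_size`) are illustrative assumptions and may not match this repository's actual code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Masked multi-head self-attention (illustrative sketch, not the repo's exact code)."""

    def __init__(self, n_embd, n_head, block_size, dropout=0.1):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.head_dim = n_embd // n_head
        self.qkv = nn.Linear(n_embd, 3 * n_embd)   # joint Q, K, V projection
        self.proj = nn.Linear(n_embd, n_embd)      # output projection
        self.dropout = nn.Dropout(dropout)
        # lower-triangular mask: each token may attend only to itself and the past
        self.register_buffer("mask", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape each of q, k, v into (B, n_head, T, head_dim)
        q = q.view(B, T, self.n_head, self.head_dim).transpose(1, 2)
        k = k.view(B, T, self.n_head, self.head_dim).transpose(1, 2)
        v = v.view(B, T, self.n_head, self.head_dim).transpose(1, 2)
        # scaled dot-product attention with the causal mask applied
        att = (q @ k.transpose(-2, -1)) / (self.head_dim ** 0.5)
        att = att.masked_fill(self.mask[:T, :T] == 0, float("-inf"))
        att = F.softmax(att, dim=-1)
        out = att @ v                               # (B, n_head, T, head_dim)
        out = out.transpose(1, 2).contiguous().view(B, T, C)
        return self.dropout(self.proj(out))
```

The autoregressive generation process can be sketched the same way: tokens are sampled one at a time, appended to the context, and fed back into the model. This sketch assumes the model's forward pass returns logits of shape (B, T, vocab_size); the function and argument names are again illustrative:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, idx, max_new_tokens, block_size):
    # idx: (B, T) tensor of token ids forming the current context
    for _ in range(max_new_tokens):
        idx_cond = idx[:, -block_size:]          # crop to the model's context window
        logits = model(idx_cond)                 # (B, T, vocab_size)
        logits = logits[:, -1, :]                # keep only the last position
        probs = F.softmax(logits, dim=-1)        # convert logits to a distribution
        idx_next = torch.multinomial(probs, num_samples=1)  # sample the next token
        idx = torch.cat((idx, idx_next), dim=1)  # append and continue the loop
    return idx
```

The causal mask is what makes this loop valid: because no position ever attends to the future during training, the model's prediction for the last position can be reused directly at generation time.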


References:

Original paper: "Improving Language Understanding by Generative Pre-Training" (Radford et al., 2018)
Learning video: "Let's build GPT: from scratch, in code, spelled out." by Andrej Karpathy

About

A very small GPT: "atto" (symbol a) is the SI prefix for one quintillionth (10^-18).
