Unlocking the Power of Generative AI: In-Context Learning, Instruction Fine-Tuning and Reinforcement Learning Fine-Tuning.

Generative AI with Large Language Models

This repository contains resources and notebooks for working with large language models.

Setup

Virtual Environment

  1. Navigate to the project directory:

    cd <base>/large-language-models
    
  2. Create the conda environment:

    conda env create --file deploy/conda/linux_py312.yml
    
  3. Activate the environment:

    conda activate llm
    
  4. To update the environment file (if necessary):

    conda env export --name llm > deploy/conda/linux_py312.yml
    

Trained Model Downloads

  1. Install megacmd based on your operating system from https://mega.io/cmd.

  2. For example, on Ubuntu 24.04:

    wget https://mega.nz/linux/repo/xUbuntu_24.04/amd64/megacmd-xUbuntu_24.04_amd64.deb && sudo apt install "$PWD/megacmd-xUbuntu_24.04_amd64.deb"
    
  3. Download the trained models:

    mega-get https://mega.nz/folder/GNwjiCxR#bQtpQ8HMZ9jgoB1deKOTxA
    mega-get https://mega.nz/folder/nBAXVDaa#Iu-PvhWUDHSDd78HvEleTA
    mega-get https://mega.nz/folder/mUoGSTzR#7LQo8MLe_dz_zTG6nxdFTA
    mega-get https://mega.nz/folder/GVpXxITD#9YqNR_uhUyxqsDI-KUMr0w
    

Notebooks

In-context Learning

File: In-context-learning.ipynb

This notebook explores how the input text (prompt) influences model output. It focuses on prompt engineering techniques, comparing zero-shot, one-shot, and few-shot inference to improve large language model outputs.
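
A minimal sketch of the zero-shot versus few-shot setup, assuming the google/flan-t5-base checkpoint from Hugging Face and an illustrative dialogue (the notebook may use a different dataset and prompt template):

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Assumed checkpoint; the notebook may load a different FLAN-T5 variant.
    model_name = "google/flan-t5-base"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

    # Hypothetical dialogue used only for illustration.
    dialogue = (
        "#Person1#: I need to book a flight to Boston next Friday.\n"
        "#Person2#: Sure, I can arrange that and email you the itinerary."
    )

    # Zero-shot: instruction only, no worked examples.
    zero_shot = f"Summarize the following conversation.\n\n{dialogue}\n\nSummary:"

    # Few-shot: prepend one or more solved examples before the actual query.
    example = (
        "Dialogue:\n#Person1#: The printer is out of toner.\n"
        "#Person2#: I will order a new cartridge today.\n"
        "Summary: Person2 will order toner for the printer.\n\n"
    )
    few_shot = example + f"Dialogue:\n{dialogue}\nSummary:"

    for prompt in (zero_shot, few_shot):
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=50)
        print(tokenizer.decode(output_ids[0], skip_special_tokens=True))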

Instruction Fine-tuning

File: Instruction-fine-tuning.ipynb

This notebook demonstrates fine-tuning the FLAN-T5 model from Hugging Face for improved dialogue summarization. It covers:

  • Full fine-tuning
  • Evaluation using ROUGE metrics
  • Parameter-Efficient Fine-Tuning (PEFT) (a minimal sketch follows this list)
  • Comparison of performance metrics
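
A minimal PEFT sketch, assuming LoRA via the Hugging Face peft library on google/flan-t5-base; the rank, dropout, and target modules are illustrative and may differ from the notebook:

    from peft import LoraConfig, TaskType, get_peft_model
    from transformers import AutoModelForSeq2SeqLM

    base_model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

    # Illustrative LoRA hyperparameters; tune to match the notebook.
    lora_config = LoraConfig(
        task_type=TaskType.SEQ_2_SEQ_LM,
        r=32,                       # rank of the low-rank update matrices
        lora_alpha=32,              # scaling factor applied to the update
        lora_dropout=0.05,
        target_modules=["q", "v"],  # T5 attention query/value projections
    )

    peft_model = get_peft_model(base_model, lora_config)
    # Typically well under 1% of all parameters are trainable.
    peft_model.print_trainable_parameters()

The wrapped model can then be trained with the standard Hugging Face Trainer and evaluated with ROUGE, just like the fully fine-tuned model.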

Reinforcement Learning Fine-tuning

File: Reinforcement-learning-fine-tuning.ipynb

This notebook focuses on fine-tuning a FLAN-T5 model to generate less toxic content using:

  • Meta AI's hate speech reward model (a binary classifier predicting "not hate" or "hate")
  • Proximal Policy Optimization (PPO) to reduce model toxicity (a minimal sketch follows this list)
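
A minimal PPO sketch using the trl library (older releases where PPOTrainer exposes generate and step); the reward is the classifier's "nothate" probability, and the model names, hyperparameters, and example query are assumptions rather than the notebook's exact setup:

    import torch
    from transformers import AutoTokenizer, pipeline
    from trl import AutoModelForSeq2SeqLMWithValueHead, PPOConfig, PPOTrainer, create_reference_model

    model_name = "google/flan-t5-base"  # assumed base checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    ppo_model = AutoModelForSeq2SeqLMWithValueHead.from_pretrained(model_name)
    ref_model = create_reference_model(ppo_model)  # frozen copy for the KL penalty

    # Reward model: a hate-speech classifier ("nothate" / "hate" per the model card).
    toxicity_classifier = pipeline(
        "text-classification",
        model="facebook/roberta-hate-speech-dynabench-r4-target",
        top_k=None,
    )

    config = PPOConfig(model_name=model_name, learning_rate=1.41e-5, batch_size=1, mini_batch_size=1)
    ppo_trainer = PPOTrainer(config=config, model=ppo_model, ref_model=ref_model, tokenizer=tokenizer)

    # One illustrative PPO step; in practice this loops over a summarization dataset.
    query = "Summarize the following conversation.\n\n#Person1#: That movie was awful.\n\nSummary:"
    query_tensor = tokenizer(query, return_tensors="pt").input_ids[0]
    response_tensor = ppo_trainer.generate(query_tensor, max_new_tokens=40).squeeze()
    response_text = tokenizer.decode(response_tensor, skip_special_tokens=True)

    # Probability of the "nothate" class serves as the scalar reward.
    scores = toxicity_classifier([response_text])[0]  # list of {label, score} dicts
    reward = torch.tensor(next(s["score"] for s in scores if s["label"] == "nothate"))

    stats = ppo_trainer.step([query_tensor], [response_tensor], [reward])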

BERT vs. FLAN-T5

| Feature | BERT | FLAN-T5 |
| --- | --- | --- |
| Architecture | Encoder-only | Encoder-decoder |
| Pre-training | Masked Language Modeling and Next Sentence Prediction | Text-to-Text Transfer Transformer (T5) |
| Fine-tuning | Task-specific fine-tuning required | Instruction-tuned; can handle multiple tasks without task-specific fine-tuning |
| Input/Output | Fixed-length input, typically used for classification and token-level tasks | Variable-length input and output, suitable for a wide range of NLP tasks |
| Multilingual Support | Available in multilingual versions | Inherently supports multiple languages |
| Size | Various sizes, typically smaller than T5 models | Generally larger, with various sizes available |
| Instruction Following | Not designed for direct instruction following | Specifically trained to follow natural language instructions |

FLAN-T5 is an advancement over BERT, offering more flexibility in task handling and better performance on a wider range of NLP tasks without requiring task-specific fine-tuning.
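
A brief sketch of the usage difference, assuming the bert-base-uncased and google/flan-t5-base checkpoints: BERT produces encoder representations that need a task-specific head, while FLAN-T5 maps an instruction directly to output text:

    from transformers import AutoModel, AutoModelForSeq2SeqLM, AutoTokenizer

    # BERT: encoder-only; outputs contextual embeddings that a task-specific head
    # (e.g. a classification layer) must be fine-tuned on top of.
    bert_tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    bert = AutoModel.from_pretrained("bert-base-uncased")
    bert_inputs = bert_tokenizer("The service was excellent.", return_tensors="pt")
    hidden_states = bert(**bert_inputs).last_hidden_state  # shape: (1, seq_len, 768)

    # FLAN-T5: encoder-decoder; follows a natural-language instruction and
    # generates text directly, with no task-specific head.
    t5_tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
    flan_t5 = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
    t5_inputs = t5_tokenizer(
        "Classify the sentiment of this review as positive or negative: "
        "The service was excellent.",
        return_tensors="pt",
    )
    output_ids = flan_t5.generate(**t5_inputs, max_new_tokens=5)
    print(t5_tokenizer.decode(output_ids[0], skip_special_tokens=True))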

Infrastructure Decision-making

The table below summarizes the storage and training memory required for large language models, based on model size (parameter count):

| Aspect | Details |
| --- | --- |
| Model Size | Typically measured in number of parameters (e.g., 175B for GPT-3) |
| Parameters | Each parameter is usually a 32-bit float (4 bytes) |
| Storage | 1B parameters ≈ 4 GB of storage |
| Training Memory | Model parameters: 4 bytes per parameter; Adam optimizer states: 8 bytes per parameter; gradients: 4 bytes per parameter; activations/temporary memory: ~8 bytes per parameter; total: ~24 bytes per parameter |
| Example | 1B-parameter model: 4 GB to store, ~24 GB of GPU RAM to train |
| Quantization | FP16: 2 bytes per parameter; INT8: 1 byte per parameter; reduces storage and memory requirements |
| PEFT Methods | LoRA: train a small number of parameters (e.g., <1%), drastically reducing memory and storage needs |

The exact numbers can vary based on model architecture, training approach and optimizations used.
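
A quick back-of-the-envelope helper based on the rules of thumb in the table (4 bytes per parameter to store in FP32 and ~24 bytes per parameter for full fine-tuning with Adam); actual requirements also depend on sequence length, batch size, and optimizations:

    def estimate_memory_gb(num_params: float) -> dict:
        """Rough FP32 storage and full fine-tuning (Adam) memory, in GB."""
        storage_gb = num_params * 4 / 1e9    # 4 bytes per parameter
        training_gb = num_params * 24 / 1e9  # params + gradients + Adam states + activations
        return {"storage_gb": storage_gb, "training_gb": training_gb}

    print(estimate_memory_gb(1e9))    # 1B parameters -> ~4 GB to store, ~24 GB to train
    print(estimate_memory_gb(175e9))  # GPT-3 scale   -> ~700 GB to store, ~4.2 TB to train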

Example: Here's a table summarizing the storage and training memory requirements for FLAN-T5-base (250M parameters), which is used as the base model in the notebooks referenced above:

| Data Type | Model Size | Inference VRAM | Training VRAM (using Adam) |
| --- | --- | --- | --- |
| float32 | 850.31 MB | 94.12 MB | 3.32 GB |
| float16/bfloat16 | 425.15 MB | 47.06 MB | 1.66 GB |
| int8 | 212.58 MB | 23.53 MB | 850.31 MB |
| int4 | 106.29 MB | 11.77 MB | 425.15 MB |
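
The rows above scale predictably: each precision step halves the model size, and the Adam training estimate is roughly four times the model size at that precision. A small sketch reproducing that scaling from the float32 row (the inference VRAM column depends on the architecture and is not derived here):

    # Reproduce the model-size and training columns from the float32 row above.
    FP32_MODEL_MB = 850.31  # FLAN-T5-base in float32, from the table

    for dtype, divisor in [("float32", 1), ("float16/bfloat16", 2), ("int8", 4), ("int4", 8)]:
        model_mb = FP32_MODEL_MB / divisor
        training_gb = model_mb * 4 / 1024  # Adam training ~= 4x model size
        print(f"{dtype:>17}: model ~= {model_mb:7.2f} MB, training (Adam) ~= {training_gb:.2f} GB")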