
When Parts Are Greater Than Sums: Individual LLM Components Can Outperform Full Models (EMNLP 2024)

Ting-Yun Chang, Jesse Thomason, and Robin Jia

Paper: https://arxiv.org/abs/2406.13131
Blog: https://terarachang.github.io/projects/llm-decomp.html

Methods

Quick Start

export HF_TOKEN="YOUR TOKEN"
pip install -r requirements.txt

Component Reweighting

$ bash scripts/comp_rw.sh
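
Component reweighting learns a scalar weight for each model component (e.g., an attention head or MLP block) from the few-shot labeled examples and combines the components' direct contributions to the label logits. Below is a minimal sketch of the reweighting step only; the tensor shapes, function name, and optimizer settings are hypothetical and not taken from this repo (see scripts/comp_rw.sh and decompose.py for the actual pipeline).

```python
# Hypothetical sketch of component reweighting; not the repo's code.
# Assumes per-component contributions to the label logits were already
# cached (e.g., via the hooks in my_modeling_llama.py).
import torch

def train_component_weights(comp_logits, labels, epochs=100, lr=0.1):
    """comp_logits: (n_examples, n_components, n_classes) contributions of
    each component to the label logits; labels: (n_examples,) gold labels."""
    n_comp = comp_logits.shape[1]
    weights = torch.ones(n_comp, requires_grad=True)  # uniform init = full model
    opt = torch.optim.Adam([weights], lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        # weighted sum over components, then cross-entropy on the few-shot labels
        logits = torch.einsum("c,ncv->nv", weights, comp_logits)
        loss = torch.nn.functional.cross_entropy(logits, labels)
        loss.backward()
        opt.step()
    return weights.detach()
```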

Standard ICL

$ bash scripts/standard.sh
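
Standard ICL is ordinary few-shot prompting: concatenate the demonstrations with the test input and compare the label-word logits at the last position. The prompt format, label words, and model name in the sketch below are assumptions for illustration, not the repo's exact setup.

```python
# Minimal sketch of standard ICL classification (assumed prompt and labels).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Meta-Llama-3-8B"  # assumption: any supported model
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

demos = ("Review: great movie!\nSentiment: positive\n\n"
         "Review: boring plot.\nSentiment: negative\n\n")
test = "Review: I loved every minute.\nSentiment:"
label_words = [" positive", " negative"]

inputs = tok(demos + test, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # logits at the last position
# score each label by the logit of its first token
label_ids = [tok(w, add_special_tokens=False).input_ids[0] for w in label_words]
scores = torch.stack([logits[i] for i in label_ids])
pred = label_words[int(scores.argmax())]
```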

Calib+

$ bash scripts/calibration.sh
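
Calib+ is a calibration baseline. As a point of reference, the sketch below shows classic content-free calibration (Zhao et al., 2021), which divides the label probabilities by the probabilities the prompt assigns to a content-free input such as "N/A"; the repo's Calib+ may differ in its details, so treat this as an illustration rather than the repo's implementation.

```python
# Sketch of content-free calibration (Zhao et al., 2021); Calib+ may differ.
import torch

def calibrate(label_logits, content_free_logits):
    """label_logits: (n_examples, n_classes) raw label logits from the LLM;
    content_free_logits: (n_classes,) logits for a content-free input
    (e.g., "N/A") under the same few-shot prompt."""
    p_cf = torch.softmax(content_free_logits, dim=-1)  # prompt's label bias
    p = torch.softmax(label_logits, dim=-1)
    calibrated = p / p_cf                               # divide out the bias
    return calibrated.argmax(dim=-1)                    # calibrated predictions
```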

Adding New Models

  • Our repo supports LLMs in the Llama and Mistral families
  • To support a new model, add hooks to the model and follow the naming conventions of my_modeling_llama.py (see the sketch after this list)
  • If the new model also uses RMSNorm, decompose.py is directly applicable. Otherwise, please take care of the layer norms, which can greatly affect model performance!
  • We do not fully adopt TransformerLens, to avoid numerical issues with Llama-3 and to reduce computation overhead
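
As a starting point for hooking a new model, the sketch below caches the output of every attention block and MLP block in a Hugging Face Llama-style model with forward hooks. The module-name matching and cache layout are assumptions for illustration and do not reproduce my_modeling_llama.py.

```python
# Hypothetical sketch: cache per-component outputs with forward hooks.
# Module names "self_attn" and "mlp" follow the Llama/Mistral architecture.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
cache = {}

def save_output(name):
    def hook(module, inputs, output):
        # attention modules return a tuple; MLPs return a tensor
        out = output[0] if isinstance(output, tuple) else output
        cache[name] = out.detach()
    return hook

for name, module in model.named_modules():
    if name.endswith("self_attn") or name.endswith("mlp"):
        module.register_forward_hook(save_output(name))
```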
