Releases · huggingface/trl · GitHub

02 Mar 09:18

v0.3.1

What's Changed

Clarifications of acronyms and initialisms by @meg-huggingface in #185
Update detoxifying_a_lm.mdx by @younesbelkada in #186
Fix reference to example by @jordimas in #184

New Contributors

@meg-huggingface made their first contribution in #185
@jordimas made their first contribution in #184

Full Changelog: v0.3.0...v0.3.1

Contributors

jordimas, younesbelkada, and meg-huggingface


01 Mar 12:45

lvwerra

v0.3.0

What's Changed

fix style, typos, license by @natolambert in #103
fix re-added file by @natolambert in #116
add citation by @natolambert in #124
add manual seeding for RL experiments by @natolambert in #118
add set_seed to init.py by @lvwerra in #127
update docs with Seq2seq models, set_seed, and create_reference_model by @lvwerra in #128
[bug] Update gpt2-sentiment.py by @younesbelkada in #132
Fix Sentiment control notebook by @lvwerra in #126
realign values by @lvwerra in #137
Change unclear variables & fix typos by @natolambert in #134
Feat/reward summarization example by @TristanThrush in #115
[core] Small refactor of forward pass by @younesbelkada in #136
[tests] Add correct repo name by @younesbelkada in #138
fix forward batching for seq2seq and right padding models. by @lvwerra in #139
fix bug in batched_forward_pass by @ArvinZhuang in #144
[core] Add torch_dtype support by @younesbelkada in #147
[core] Fix dataloader issue by @younesbelkada in #154
[core] enable bf16 training by @younesbelkada in #156
[core] fix saving multi-gpu by @younesbelkada in #157
Added imports by @BirgerMoell in #159
Add CITATION.cff by @kashif in #169
[Doc] Add how to use Lion optimizer by @younesbelkada in #152
policy kl [old | new] by @kashif in #168
add minibatching by @lvwerra in #153
fix bugs in tutorial by @shizhediao in #175
[core] Add max_grad_norm support by @younesbelkada in #177
Add toxicity example by @younesbelkada in #162
[Docs] Fix barplot by @younesbelkada in #181

New Contributors

@natolambert made their first contribution in #103
@ArvinZhuang made their first contribution in #144
@BirgerMoell made their first contribution in #159
@kashif made their first contribution in #169
@shizhediao made their first contribution in #175

Full Changelog: v0.2.1...v0.3.0

Contributors

kashif, BirgerMoell, and 6 other contributors


25 Jan 16:09

lvwerra

v0.2.1

What's Changed

Update customization.mdx by @younesbelkada in #109
add datasets as a dependency by @lvwerra in #110
[Docs] Add hlinks to scripts & notebooks by @younesbelkada in #111
Fix Mapping in core for Python 3.10 by @lvwerra in #112

Full Changelog: v0.2.0...v0.2.1

Contributors

lvwerra and younesbelkada


25 Jan 14:04

lvwerra

v0.2.0

Highlights

General decoder model support in addition to GPT-2 in #53
Encoder-decoder model support (such as T5) in #93
New, shiny docs with the doc-builder in #59
push_to_hub with PPOTrainer in #68
Simple reference model creation with layer sharing in #61

What's Changed

Remove nbdev dependency by @younesbelkada in #52
Adds GitHub Actions and dummy test by @edbeeching in #55
Update README.md by @Keith-Hon in #51
Update README.md by @TristanThrush in #49
Adds Python highlighting to the code block by @JulesGM in #45
xxxForCausalLM support by @younesbelkada in #53
[VHead] Fix slow convergence issue by @younesbelkada in #60
add docbuilder skeleton by @lvwerra in #59
fix docs workflow by @lvwerra in #63
accelerate integration by @younesbelkada in #58
add create_reference_model by @lvwerra in #61
Improve Makefile and code quality by @lvwerra in #62
Relax requirements by @lvwerra in #66
modeling - change namings by @younesbelkada in #65
[PPOTrainer] make the reference model optional by @younesbelkada in #67
Improvements 1a by @edbeeching in #70
update GitHub actions to main by @lvwerra in #77
[core] refactor step method by @younesbelkada in #76
[PPOTrainer] Support generic optimizers by @younesbelkada in #78
Update sentiment_tuning.mdx by @eltociear in #69
Remove references to "listify_batch" by @xiaoyesoso in #81
Collater -> collator by @LysandreJik in #88
Model as kwarg in pipeline by @LysandreJik in #89
Small typo correction by @LysandreJik in #87
[API] Make dataset attribute optional by @younesbelkada in #85
[Doc] Improve docs by @younesbelkada in #91
[core] Push v_head when using AutoModelForCausalLMWithValueHead by @younesbelkada in #86
[core] remove wandb dependency by @younesbelkada in #92
add logo by @lvwerra in #95
Encoder-Decoder models support by @younesbelkada in #93
Fix docs hyperlinks by @lewtun in #98
[API] LR scheduler support by @younesbelkada in #96
Version should have dev0 unless it is a release version by @mishig25 in #99
[core] improve API by @younesbelkada in #97
Add push to Hub for PPOTrainer by @lewtun in #68
[core] Advise to use fbs=1 by @younesbelkada in #102
[Doc] New additions by @younesbelkada in #105
restructure examples by @lvwerra in #107
Fix nits & missing things by @younesbelkada in #108
Convert notebook 05 by @edbeeching in #80

New Contributors

@lvwerra made their first contribution in #2
@vblagoje made their first contribution in #16
@dependabot made their first contribution in #26
@younesbelkada made their first contribution in #52
@edbeeching made their first contribution in #55
@Keith-Hon made their first contribution in #51
@TristanThrush made their first contribution in #49
@JulesGM made their first contribution in #45
@eltociear made their first contribution in #69
@xiaoyesoso made their first contribution in #81
@LysandreJik made their first contribution in #88
@lewtun made their first contribution in #98
@mishig25 made their first contribution in #99

Full Changelog: https://github.com/lvwerra/trl/commits/v0.2.0

Contributors

vblagoje, JulesGM, and 11 other contributors
