Add Lion optimizer and a "skip step" version of it #43
Conversation
If I want to log the step_factor, do I lose all the benefits of avoiding host-device sync?
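Not part of the PR, just a sketch of the trade-off in question: reading a device-resident scalar (e.g. with .item()) forces the host to wait for all queued GPU work, so one common compromise is to only materialize step_factor on the iterations where it is actually logged. The names and logging cadence below are illustrative, not the PR's API.

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical stand-in for the optimizer's on-device step factor.
step_factor = torch.tensor(1.0, device=device)

log_every = 100
for step in range(1_000):
    # ... the skip-step optimizer would update step_factor on-device here ...
    if step % log_every == 0:
        # .item() copies to host and blocks until the GPU catches up,
        # so the sync cost is only paid on logging steps.
        print(f"step {step}: step_factor={step_factor.item():.3f}")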
if self._grad_norms:
    grad_norm_std = torch.std(torch.stack(self._grad_norms[:-1]))
return (
    self.latest_loss <= self.sigma_factor * loss_std
The Karpathy thing actually filters out steps where the loss/grad norm is sigma_factor standard deviations above the mean (this version filters when it is above 0, i.e. simply above the mean). A number that is s standard deviations above the mean is said to have a "z-score" of s.
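For reference, a minimal sketch of the filter described above (illustrative names, not the PR's code): skip the update whenever the latest loss or grad norm sits more than sigma_factor standard deviations above the mean of the recent history, i.e. its z-score exceeds sigma_factor.

import torch

def should_skip(history: torch.Tensor, latest: torch.Tensor, sigma_factor: float = 2.0) -> torch.Tensor:
    # z-score of the latest value relative to the recent history;
    # everything stays on-device, so no host-device sync is forced.
    z_score = (latest - history.mean()) / history.std()
    return z_score > sigma_factor

losses = torch.tensor([2.31, 2.28, 2.30, 2.27, 2.29])
print(should_skip(losses, torch.tensor(9.00)))  # tensor(True): clear outlier, skip the step
print(should_skip(losses, torch.tensor(2.30)))  # tensor(False): normal step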
Oh, duh, that's embarrassing after having spent 4 years in a stats PhD program.
Forgot the .abs(), but just pushed a fix for that.
I don't really intend to use the Lion optimizer; this is more meant as a proof of concept for two things: