-
Notifications
You must be signed in to change notification settings - Fork 473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend functionality of Wandb Config Diff script #687
Conversation
…nt compare wandb config script to also flatten list dicts
scripts/compare_wandb_configs.py
Outdated
} | ||
if len(keys_with_differences) > 0: | ||
for k in sorted(keys_with_differences): | ||
if isinstance(left_config[k], list) and isinstance(right_config[k], list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You also don't need this if you treat lists as Mapping[int, Any]
. And it will work right even if the list entries are complex. On the other hand, the output will look different / be less compact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you simplify this now that lists are properly flattened?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it seems this will never happen now?
dictionary (dict): The nested dictionary to be flattened. | ||
parent_key (str, optional): The parent key to be prepended to the keys of the flattened dictionary. Defaults to "". | ||
separator (str, optional): The separator to be used between the parent key and the keys of the flattened dictionary. Defaults to ".". | ||
include_lists (bool, optional): Whether to convert lists to dictionaries with integer keys. Defaults to False. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we ever want to turn this off?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not for now, but seems fine to extend this in case someone wants to add different logic for dealing w list-valued config params?
scripts/compare_wandb_configs.py
Outdated
} | ||
if len(keys_with_differences) > 0: | ||
for k in sorted(keys_with_differences): | ||
if isinstance(left_config[k], list) and isinstance(right_config[k], list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you simplify this now that lists are properly flattened?
scripts/compare_wandb_configs.py
Outdated
components = path.split("/") | ||
|
||
# Remove common suffixes like 'allenai' | ||
components = [c for c in components if c != "allenai"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't think that's confusing? For copy and paste?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
scripts/compare_wandb_configs.py
Outdated
|
||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return |
scripts/compare_wandb_configs.py
Outdated
} | ||
if len(keys_with_differences) > 0: | ||
for k in sorted(keys_with_differences): | ||
if isinstance(left_config[k], list) and isinstance(right_config[k], list): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it seems this will never happen now?
You let me know when this is ready for another look? |
flatten_dict()
in utilsflatten_dict()
to also flatten any dicts that exist in Listsflatten_dict()
Motivation is, while comparing configs, the current implementation doesn't perform comparison of some key aspects of the configs, namely config keys representing dataset paths (which are all List[str]) as well as keys like
config["evaluators.value"]
which areList[Dict]
.The current behavior looks something like this:
where we can see that the fields
data.value.paths
andevaluators.value
aren't easily comparable.The new behavior looks like this:
where it preserves behavior of original script under old keys, but performs a specialized diff just for
data.paths
andevaluators
.for
data.paths
, the idea is that we just need to know the names of the datasets from the paths + how many files used. Added new function that does that specifically.