Save output model to output_dir #1430

Open · wants to merge 8 commits into main

Conversation

xiaoyu-work (Contributor):

Describe your changes

Save output model to output_dir

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documentation if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.
  • Does this PR include example changes? If yes, please remember to update the example documentation in a follow-up PR.

(Optional) Issue link

@@ -321,17 +321,56 @@ def test_run_no_search(self, mock_local_system_init, tmp_path):
        assert output_node.model_config == onnx_model_config
        assert expected_metrics == output_node.metrics.value

        output_model_dir = output_dir / "output_model"
Contributor:
please update the tests in this too

workflow_output_dir = tmpdir / "output_model"


run_output_path = Path(config["output_dir"]) / "output_model"

Contributor Author:
curious why this test can pass

Contributor:

The test passes since base.py was not updated and we just mock the run output. We might have to update one of the CLI tests to not do the mock.
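
For context, a minimal sketch of the mocking pattern being discussed, assuming a pytest-style CLI test; the patch target and names here are illustrative assumptions, not the repository's actual test code:

```python
from unittest.mock import patch

# Hypothetical sketch: the patch target is an assumption for illustration.
# Because the workflow run is mocked, the test never exercises the engine's
# real save logic, so assertions about output locations only check path
# arithmetic, not what actually lands on disk.
@patch("olive.workflows.run")  # assumed target
def test_cli_output_dir(mock_run, tmp_path):
    mock_run.return_value = {}  # run output is mocked away
    workflow_output_dir = tmp_path / "output_model"
    assert not workflow_output_dir.exists()  # nothing was actually written
```

Dropping the mock (or asserting on files the real run writes) would catch layout changes like the one in this PR.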

@@ -422,3 +429,30 @@ def _plot_pareto_frontier(self, ranks=None, save_path=None, is_show=True, save_f

        if is_show:
            fig.show()


def get_best_candidate_node(
Contributor:

could you add some comments here to explain the logic?

Contributor Author:

will do
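
As an illustration of the kind of inline commentary being requested, here is a rough sketch; the signature, the footprint arguments, and the node attributes are assumptions (only output_node.metrics.value appears in the tests above), not the PR's actual implementation:

```python
def get_best_candidate_node(pareto_frontier_footprint, footprint):
    # Hypothetical sketch, not the PR's code.
    #
    # Assumed idea: when a search produced a Pareto frontier, pick the best
    # node from it; otherwise fall back to all nodes in the full footprint.
    candidates = pareto_frontier_footprint or footprint
    nodes = list(candidates.nodes.values())  # assumed: a footprint maps model_id -> node
    if not nodes:
        return None
    # Rank candidates by metric value and return the top one; the real function
    # would presumably respect the metrics' priority order and goals.
    return max(nodes, key=lambda node: node.metrics.value)
```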


B. Multiple pass flows:
-    output_dir/output_model/{pass_flow}/...: Same as A but for each pass flow
+    output_dir/{pass_flow}/...: Same as A but for each pass flow
Contributor:

since there is only one save_model call, I don't think this structure is correct?
Please update the search and no search output model structure in this docstring after running and validating all scenarios.

Contributor Author:

This case is the same for both search and no search. The engine can still get output_footprints, and only one model will be saved to the output folder. These are intermediate folders created by the engine. I can double-check this and paste the folder structure here later.

Contributor Author:

[screenshot of the output folder structure]
There is a metrics file in each pass-flow folder. I don't think those files are needed, though, so we can keep the output folder clean. We can remove those intermediate folders later if needed.
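
To make the folder discussion concrete, a rough sketch of how the per-pass-flow intermediate paths described above could be laid out; the flow names and the folder-naming rule are illustrative assumptions only:

```python
from pathlib import Path

output_dir = Path("output_dir")

# Hypothetical pass flows; the real names come from the workflow config.
pass_flows = [["conversion", "quantization"], ["conversion", "session_params_tuning"]]

for pass_flow in pass_flows:
    # One intermediate folder per pass flow directly under output_dir
    # (the old layout nested these under an extra "output_model" folder).
    flow_output_dir = output_dir / "-".join(pass_flow)
    # Per-flow evaluation results; as noted above, these may not be needed and
    # could be removed to keep the output folder clean.
    results_path = flow_output_dir / "metrics.json"
    print(flow_output_dir, results_path)
```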

@@ -284,7 +282,7 @@ def test_quantize_command(mock_repo_exists, mock_tempdir, mock_run, algorithm_na
    mock_tempdir.return_value = tmpdir.resolve()
    mock_run.return_value = {}

-    workflow_output_dir = tmpdir / "output_model" / algorithm_name
+    workflow_output_dir = tmpdir / algorithm_name
Contributor:

I don't think the algorithm name exists anymore? This was for the behavior where each pass flow saved its own output model.

jambayk (Contributor) commented on Nov 13, 2024:

@xiaoyu-work please take a look at this too

@@ -422,9 +429,6 @@ def run_no_search(
            json.dump(signal.to_json(), f, indent=4)
        logger.info("Saved evaluation results of output model to %s", results_path)

-        self.cache.save_model(model_id=model_ids[-1], output_dir=flow_output_dir, overwrite=True)
Contributor:

If this is not present, then I don't think this pass_flow nesting is present?
[screenshot of the current output folder structure]

Not sure if we should keep it or not.

Contributor Author:

See my previous comment: removing this means it no longer saves pass_flow models to this path, but we still have results_path = flow_output_dir / "metrics.json" under it. My expectation is that we could remove the pass_flow path as well.
