workspace: expose workspace choice to users #545

Open
tiborsimko opened this issue Aug 17, 2021 · 2 comments · May be fixed by reanahub/reana-workflow-controller#397 or reanahub/pytest-reana#88

@tiborsimko
Member

Now that we can use several different POSIX workspaces to run workflows, users should be able to configure where they would like to run a given workflow, e.g. one workflow in the default place, another workflow in their EOS home, etc.

This configuration should be done in reana.yaml.

Option 1: introduce new top-level section

We can introduce a new section in reana.yaml to express the concept of a workspace. Pros: instead of just writing the POSIX path, we could store more information there, should we need it in the future. Also, the concept of workspace will stand out clearly. Cons: we would need to amend the parsing and the REST API protocols to accommodate the new section.

An example of what this could look like:

version: 0.6.0
inputs:
  files:
    - code/gendata.C
    - code/fitdata.C
  parameters:
    events: 20000
    data: results/data.root
    plot: results/plot.png
workflow:
  type: serial
  specification:
    steps:
      - name: gendata
        environment: 'reanahub/reana-env-root6:6.18.04'
        commands:
        - mkdir -p results && root -b -q 'code/gendata.C(${events},"${data}")'
      - name: fitdata
        environment: 'reanahub/reana-env-root6:6.18.04'
        commands:
        - root -b -q 'code/fitdata.C("${data}","${plot}")'
workspace:
  type: posix
  workspace_root_dir: /eos/home-s/simko/myworkflows
outputs:
  files:
    - results/plot.png

A future option could be:

workspace:
  type: s3
  workspace_root_dir: s3://mybucket/myworkflows
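
To make the parsing cost of option 1 more concrete, here is a minimal sketch of how the new section could be validated once reana.yaml has been loaded into a dictionary. The function name, the set of known types, and the choice to default a missing type to posix are all assumptions made for illustration; this is not existing REANA code.

```python
from typing import Optional

# Hypothetical set of accepted workspace types (assumption for this sketch).
KNOWN_WORKSPACE_TYPES = {"posix", "s3"}


def parse_workspace_section(spec: dict) -> Optional[dict]:
    """Return the validated ``workspace`` section, or None if it is absent.

    ``spec`` is the already-loaded reana.yaml content as a plain dict.
    """
    section = spec.get("workspace")
    if section is None:
        # No section given: the caller would fall back to the cluster default.
        return None
    if not isinstance(section, dict):
        raise ValueError("'workspace' must be a mapping")

    # Assumption: a missing type means a POSIX workspace.
    workspace_type = section.get("type", "posix")
    if workspace_type not in KNOWN_WORKSPACE_TYPES:
        raise ValueError(f"unknown workspace type: {workspace_type!r}")

    root_dir = section.get("workspace_root_dir")
    if not root_dir:
        raise ValueError("'workspace_root_dir' is required")

    return {"type": workspace_type, "workspace_root_dir": root_dir}
```

Keeping the section as its own mapping leaves room for extra keys later without touching the other clauses.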

Option 2: use existing options clause

We have the option of not changing the structure of reana.yaml and simply using existing clauses, such as parameters or options. Parameters, such as temperature=20c and mass=10g, influence the research results, whilst options, such as cache=off, leave the physics results unchanged and only influence how the workflow is orchestrated. From this point of view, the choice of workspace is more an option than a parameter, since a good reproducible analysis should not depend on where it is run.
Hence we could choose options. Pros: we only add one more key, and the REST API could use the existing vehicle. Cons: conceptually, the notion of workspace would not stand out so clearly; the workspace configuration would be "hidden" amongst other options. Also, options can be set via CLI options (e.g. reana-client start -o foo=bar), but this cannot be done for the workspace, since it must be initialised before the workflow is started.

Example:

version: 0.6.0
inputs:
  files:
    - code/gendata.C
    - code/fitdata.C
  parameters:
    events: 20000
    data: results/data.root
    plot: results/plot.png
  options:
    workspace_root_prefix: /eos/home-s/simko/myworkflows
workflow:
  type: serial
  specification:
    steps:
      - name: gendata
        environment: 'reanahub/reana-env-root6:6.18.04'
        commands:
        - mkdir -p results && root -b -q 'code/gendata.C(${events},"${data}")'
      - name: fitdata
        environment: 'reanahub/reana-env-root6:6.18.04'
        commands:
        - root -b -q 'code/fitdata.C("${data}","${plot}")'
outputs:
  files:
    - results/plot.png

A future option could be:

  options:
    workspace_root_prefix: s3://mybucket/myworkflows

(The type would be inferred from the beginning of the value. Or, if need be, more keys could be added, such as workspace_type: s3. This is basically a "flattened" option 1 expressed via the options clause.)
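
As an illustration of inferring the type from the beginning of the value, a sketch could look as follows. The function name and the rule that an absolute path means posix are assumptions; only the s3:// prefix and the /eos/... path come from the examples above.

```python
def infer_workspace_type(root_prefix: str) -> str:
    """Guess the workspace type from the start of ``workspace_root_prefix``.

    Hypothetical helper: an ``s3://`` prefix maps to "s3", an absolute
    path maps to "posix", and anything else is rejected.
    """
    if root_prefix.startswith("s3://"):
        return "s3"
    if root_prefix.startswith("/"):
        return "posix"
    raise ValueError(f"cannot infer workspace type from {root_prefix!r}")
```

With this approach a value such as a relative path is ambiguous, which is one case where an explicit workspace_type key would help.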

Notes

Regardless of which option we choose, there should be a default that is used when the user does not set anything. This default will be set by the cluster administrator, but that will be part of another issue.
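
The fallback to the default could be sketched like this, covering both options at once. The function name and the default_root argument are hypothetical; how the administrator-provided default reaches this code is left open, as it belongs to the other issue.

```python
def resolve_workspace_root(spec: dict, default_root: str) -> str:
    """Pick the workspace root for a workflow, falling back to the default.

    ``spec`` is the loaded reana.yaml as a dict; ``default_root`` stands in
    for the value configured by the cluster administrator (assumption).
    """
    # Option 1: dedicated top-level ``workspace`` section.
    section = spec.get("workspace") or {}
    # Option 2: key stored under ``inputs.options``.
    options = (spec.get("inputs") or {}).get("options") or {}

    return (
        section.get("workspace_root_dir")
        or options.get("workspace_root_prefix")
        or default_root
    )
```

For example, resolve_workspace_root({"version": "0.6.0"}, "/default/root") returns the default, since the specification sets nothing.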

@tiborsimko
Member Author

P.S. In the above, we might read workspace_root_path, or whatever name we shall select 😉, in all the places.

@mvidalgarcia
Member

IMO option 1 looks cleaner. I think the workspace is relevant enough to have its own section.

Currently, we support some inputs.options, but those are closely tied to certain workflow languages, whereas the workspace would be universal. OTOH, it's true that the CACHE option is directly related to the storage... but still, I think it'd be harder for the end user to set it as an option.

marcdiazsan added a commit to marcdiazsan/reana-workflow-controller that referenced this issue Aug 18, 2021
marcdiazsan added a commit to marcdiazsan/pytest-reana that referenced this issue Aug 18, 2021
marcdiazsan added a commit to marcdiazsan/reana-server that referenced this issue Aug 18, 2021
…aml file

It passes a workspace_root_path as a parameter to rwc

Closes reanahub/reana-client#545
marcdiazsan added a commit to marcdiazsan/reana-server that referenced this issue Aug 25, 2021
…aml file

It passes a workspace_root_path as a parameter to rwc

Closes reanahub/reana-client#545
marcdiazsan added a commit to marcdiazsan/reana-workflow-controller that referenced this issue Aug 25, 2021
marcdiazsan added a commit to marcdiazsan/reana-server that referenced this issue Aug 26, 2021
marcdiazsan added a commit to marcdiazsan/reana-commons that referenced this issue Aug 26, 2021
marcdiazsan added a commit to marcdiazsan/reana-commons that referenced this issue Sep 5, 2021
marcdiazsan added a commit to marcdiazsan/reana-server that referenced this issue Sep 5, 2021
…aml file

It passes a workspace_root_path as a parameter to rwc

Closes reanahub/reana-client#545
marcdiazsan added a commit to marcdiazsan/reana-server that referenced this issue Sep 5, 2021