Return scheduler job ids from FlowProject.submit #543
Labels
cluster submission: Enhancements to the submission process
enhancement: New feature or request
good first issue: Good for newcomers
Feature description
Requested by user @salazardetroya: https://signac.slack.com/archives/CVC04S9TN/p1623794700095400
This would enable complex submission workflows through something like the following snippet:
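As a minimal sketch of such a workflow, assuming FlowProject.submit returned a list of job id strings (which it does not at the time of this issue), a later stage could be made to wait on an earlier one through SLURM's --dependency option. The helper name slurm_dependency_flag and the usage shown in the comments are hypothetical and only illustrate the idea:

```python
from typing import Sequence


def slurm_dependency_flag(job_ids: Sequence[str], kind: str = "afterok") -> str:
    """Build an sbatch --dependency flag from previously returned job ids.

    Hypothetical helper, not part of signac-flow.
    """
    if not job_ids:
        return ""  # nothing to wait for
    return f"--dependency={kind}:" + ":".join(job_ids)


# Hypothetical usage, assuming submit returned scheduler job ids:
#   first_ids = project.submit(...)            # e.g. ["1001", "1002"]
#   flag = slurm_dependency_flag(first_ids)
#   ... pass `flag` to the scheduler when submitting the next stage

print(slurm_dependency_flag(["1001", "1002"]))  # --dependency=afterok:1001:1002
```

How the flag would be handed to the second submission is left open here, since that depends on how the feature ends up being designed.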
Proposed solution
We used to (partially) support this kind of behavior for PBS/Torque clusters but we did not implement it for SLURM. If we choose to support this feature, we would need to implement it for all schedulers so that we have a consistent API. See here for the past implementation (removed in 0.12):
signac-flow/flow/scheduling/torque.py
Lines 149 to 155 in 29afbe3
One possible issue with this approach is that I'm not sure all clusters behave the same way. Some clusters might print other messages or info via stdout/stderr that would break the parsing.
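To illustrate the parsing concern, here is a rough sketch for SLURM only. It assumes sbatch reports success with a line starting with "Submitted batch job" followed by the id, and it searches for that line rather than assuming it is the only output. The function name parse_sbatch_job_id and the sample messages are made up for this example:

```python
import re
from typing import Optional

# Match the success line anywhere in the output, so that extra lines
# printed by a site-specific wrapper do not break the parsing.
_SBATCH_PATTERN = re.compile(r"^Submitted batch job (\d+)", re.MULTILINE)


def parse_sbatch_job_id(output: str) -> Optional[str]:
    """Return the job id as a string, or None if no success line is found."""
    match = _SBATCH_PATTERN.search(output)
    return match.group(1) if match else None


# Made-up outputs for demonstration.
noisy = "NOTE: maintenance scheduled for Friday\nSubmitted batch job 4242\n"
print(parse_sbatch_job_id(noisy))                               # 4242
print(parse_sbatch_job_id("sbatch: error: invalid partition"))  # None
```

Calling sbatch with --parsable, which prints only the job id (plus the cluster name, if present), might reduce the need for pattern matching on SLURM, but other schedulers would still need their own handling.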
The return value of the scheduler class (the part I linked above) would need to be forwarded through a series of calling functions to the return value of FlowProject.submit. I think it might be appropriate to return a list of job ids as strings, since FlowProject.submit can call sbatch (or a different scheduler command) multiple times.
To add this feature, here are the steps I would suggest:
1. Make _call_submit return the captured output. This applies to all schedulers.
   signac-flow/flow/scheduling/base.py, line 162 in 9d4f1b4
   signac-flow/flow/scheduling/slurm.py, line 150 in 9d4f1b4
2. Modify the ComputeEnvironment class to pass through the captured scheduler job id if submission occurs (instead of JobStatus.submitted, which could be inferred by the calling functions) and None if submission didn't run or failed.
   signac-flow/flow/environment.py, lines 215 to 217 in 9d4f1b4
3. Modify FlowProject._submit_operations to pass through scheduler job ids, just like in the previous step.
   signac-flow/flow/project.py, lines 3691 to 3693 in 9d4f1b4
4. Modify FlowProject.submit to return job ids (and continue to update the job/operation status on success, as interpreted by the result of the above method calls).
   signac-flow/flow/project.py, lines 3782 to 3791 in 9d4f1b4
5. The command line interface (python project.py submit) should print the ids returned by the FlowProject.submit method.
Additional context
Another alternative would be to just return the raw captured stdout and leave it to the user to parse that information. In that case, FlowProject.submit would return a list of strings, each containing the raw output of one call to sbatch (instead of a list of strings of parsed job ids).
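The difference between the two return conventions can be sketched as follows. This is not signac-flow code: submit_all, parsed_job_id, and fake_sbatch are hypothetical stand-ins, and the scheduler call is modeled as a function that returns captured stdout, or None if submission didn't run or failed.

```python
import re
from typing import Callable, Iterable, List, Optional

# Hypothetical stand-in for a scheduler call: takes a script and returns
# the captured stdout, or None if submission didn't run or failed.
SubmitCall = Callable[[str], Optional[str]]


def parsed_job_id(raw: str) -> Optional[str]:
    """Naive parser: take the first run of digits in the output."""
    match = re.search(r"\d+", raw)
    return match.group(0) if match else None


def submit_all(scripts: Iterable[str], call_submit: SubmitCall,
               parse: bool = True) -> List[str]:
    """Collect one string per successful scheduler call.

    With parse=True the strings are parsed job ids; with parse=False they
    are the raw outputs, left for the user to interpret.
    """
    results: List[str] = []
    for script in scripts:
        raw = call_submit(script)
        if raw is None:
            continue  # submission skipped or failed
        if not parse:
            results.append(raw)
            continue
        job_id = parsed_job_id(raw)
        if job_id is not None:
            results.append(job_id)
    return results


def fake_sbatch(script: str) -> Optional[str]:
    """Pretend scheduler for demonstration only."""
    if not script:
        return None
    return f"Submitted batch job {len(script)}\n"


print(submit_all(["aaa", "", "bbbbb"], fake_sbatch))  # ['3', '5']
print(submit_all(["aaa"], fake_sbatch, parse=False))  # ['Submitted batch job 3\n']
```

Returning raw output keeps the library free of scheduler-specific parsing, at the cost of pushing that work onto every user who wants to chain submissions.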