Use Ramble modifier to fill in allocation variables (#195)
* initial modifier
* partial work
* dont mess with locals()
* changed variable name
* Able to proceed with Ramble#452; that uncovered a str-to-int conversion issue
* remove debugging statements
* remove filled-in variable from experiment name
* intermediate work on also getting modifier to generate batch submissions
* finished up work that allows the modifier to define allocations as well
* style fix
* refactor away from context manager
* handle flux directives and timeout
* remove unused import
* add references for clarification
* n_threads is not special; also rename it to n_omp_threads_per_task
* intermediate work
* done with doing placeholder inference based on exceeding max-request limit
* env_var_modification needs mode; not sure 100 percent what that should be
* add n_cores_per_node (different than n_cores_per_rank)
* style edits
* there can now be one execute_experiment.tpl
* removal of all individual execute_experiment.tpl files
* update all system configs except Fugaku and Sierra
* update all experiments based on (a) new names and (b) logic that fills in variables
* style edit
* sierra batch/run cmd options implemented
* add fugaku batch opt generation logic
* replace variables for Sierra and Fugaku
* consolidate variable accessor logic into single class; add explanatory comment
* syntax error
* testing+fixing some issues for fugaku
* typos for sierra
* fix sierra reference errors etc. and recognition of 'queue' as variable
* style fix
* apply real values to sys_cpus_per_node/sys_gpus_per_node for LLNL systems
* the scheduler used for Sierra is called 'lsf', so use that name
* add basic alias substitution logic (omp_num_threads can be used instead of n_threads_per_proc)
* fix alias issue and add comments
* style fix
* set appropriate schedulers
* scheduler on Fugaku is called 'pjm'
* all experiments need to use the allocation modifier
* amg2023 benchmark should not be doing requesting any number of ranks/processes per node
* logic to set n_ranks based on n_gpus (if the latter is set and the former isnt)
* handle the most common case of gpu specification for Flux, not using jobspec
* add docstring
* syntax error
* style fix
* Fugaku system description
* LUMI system description
* add reference link
* Piz Daint system description
* add reference link
* partial description of Eiger/Alps
* proper detection of unset vars; fixed error w/ calculation of n_nodes from n_gpus
* Both flux and lsf want gpus_per_rank
* style fix
* more style fixes
* restore default nosite config
* missed converting input param name
* saxpy/raja-perf cuda/rocm experiments should just specify the number of gpus they want
* add CI checks to exercise the allocation modifier logic (use --dry-run to avoid doing concretization/install as part of ramble workspace setup)
* sys_cpus_per_node -> sys_cores_per_node
* intercept divide-by-zero error
* clarify we currently only support lrun and not jsrun
* style fix

---------

Co-authored-by: pearce8 <[email protected]>
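Several of the commit-message entries describe how unset allocation variables get filled in: alias substitution, setting n_ranks from n_gpus, computing n_nodes from n_gpus, and intercepting a divide-by-zero. The sketch below illustrates that kind of logic in Python. It is not the modifier's actual code: the function names, the `ALIASES` table layout, and the one-rank-per-GPU rule are assumptions made for illustration. Only the variable names (`n_ranks`, `n_nodes`, `n_gpus`, `sys_gpus_per_node`, `omp_num_threads`, `n_threads_per_proc`) come from the commit message.

```python
import math

# Hypothetical alias table. The commit message only states that
# omp_num_threads can be used instead of n_threads_per_proc.
ALIASES = {"omp_num_threads": "n_threads_per_proc"}


def resolve_aliases(variables):
    """Return a copy of variables with aliases renamed to canonical names."""
    resolved = dict(variables)
    for alias, canonical in ALIASES.items():
        if alias in resolved and canonical not in resolved:
            resolved[canonical] = resolved.pop(alias)
    return resolved


def fill_allocation(variables, sys_gpus_per_node):
    """Fill in n_ranks and n_nodes from n_gpus when they are not set.

    Config values arrive as strings, so they are converted with int()
    before any arithmetic (the commit mentions a str-to-int issue).
    """
    filled = resolve_aliases(variables)
    n_gpus = filled.get("n_gpus")
    if n_gpus is None:
        return filled  # nothing to infer from

    n_gpus = int(n_gpus)
    if "n_ranks" not in filled:
        # Assumption for this sketch: one rank per GPU.
        filled["n_ranks"] = n_gpus
    if "n_nodes" not in filled:
        gpus_per_node = int(sys_gpus_per_node)
        if gpus_per_node == 0:
            # Intercept the divide-by-zero instead of letting it propagate.
            raise ValueError("n_gpus was requested, but sys_gpus_per_node is 0")
        filled["n_nodes"] = math.ceil(n_gpus / gpus_per_node)
    return filled
```

For example, `fill_allocation({"n_gpus": "8", "omp_num_threads": "4"}, "4")` fills in `n_ranks` and `n_nodes` and renames the thread-count alias, while a system with `sys_gpus_per_node` of `"0"` raises a `ValueError` rather than a `ZeroDivisionError`.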
Showing 54 changed files with 671 additions and 422 deletions.
@@ -4,15 +4,16 @@
 # SPDX-License-Identifier: Apache-2.0

 variables:
-  batch_time: '02:00'
-  mpi_command: 'mpiexec'
-  batch_submit: 'pjsub {execute_experiment}'
-  batch_nodes: '#PJM -L "node={n_nodes}"'
-  batch_ranks: '#PJM --mpi proc={n_ranks}'
-  batch_timeout: '#PJM -L "elapse={batch_time}:00" -x PJM_LLIO_GFSCACHE="/vol0002:/vol0003:/vol0004:/vol0005:/vol0006"'
-  default_comp: '[email protected]'
-  #default_comp: '[email protected]'
-  #default_comp: '[email protected]'
-  fj_comp_version: '4.10.0'
-  sys_arch: 'arch=linux-rhel8-a64fx'
-
+  default_fj_version: '4.10.0'
+  default_llvm_version: '17.0.2'
+  default_gnu_version: '13.2.0'
+  timeout: "120"
+  scheduler: "pjm"
+  sys_cores_per_node: "48"
+  sys_mem_per_node: "32"
+  max_request: "1000" # n_ranks/n_nodes cannot exceed this
+  n_ranks: '1000001' # placeholder value
+  n_nodes: '1000001' # placeholder value
+  batch_submit: "placeholder"
+  mpi_command: "placeholder"
+  #sys_arch: 'arch=linux-rhel8-a64fx'
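The hunk above replaces hand-written `#PJM` directives with a `scheduler: "pjm"` entry, a `max_request` limit, and deliberately oversized placeholder values for `n_ranks` and `n_nodes`. A minimal Python sketch of how a modifier might treat these values follows. The function names are hypothetical. It also assumes that `timeout` is given in minutes, which is a guess based on the old `batch_time: '02:00'` matching the new `timeout: "120"`. The directive strings are copied from the old config lines, and the `-x PJM_LLIO_GFSCACHE=...` option is left out for brevity.

```python
def is_placeholder(value, max_request):
    """Treat a value as unset when it exceeds the system's max_request.

    This mirrors the config comment "n_ranks/n_nodes cannot exceed this":
    a value such as '1000001' can only be a placeholder when max_request
    is '1000'. Both arguments arrive as strings from the config.
    """
    return int(value) > int(max_request)


def pjm_directives(n_nodes, n_ranks, timeout):
    """Build PJM batch directives in the format of the old config lines.

    timeout is assumed to be in minutes and is rendered as HH:MM:00.
    """
    hours, minutes = divmod(int(timeout), 60)
    return [
        f'#PJM -L "node={n_nodes}"',
        f"#PJM --mpi proc={n_ranks}",
        f'#PJM -L "elapse={hours:02d}:{minutes:02d}:00"',
    ]
```

With the values shown in the hunk, `is_placeholder("1000001", "1000")` is true, so both `n_ranks` and `n_nodes` would be candidates for inference, and `pjm_directives(2, 96, "120")` yields an elapse directive of `02:00:00`, the same limit the old `batch_time` expressed.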