test_data_override sporadic failures in CI testing #1480

Closed
rem1776 opened this issue Mar 18, 2024 · 0 comments · Fixed by #1595 · May be fixed by #1579


rem1776 commented Mar 18, 2024

Describe the bug
Occasionally the data_override monotonically increasing/decreasing tests fail in CI but pass on subsequent runs:

expecting success of test_data_override2_mono.1 'test_data_override with monotonically increasing and decreasing data sets (r4)': 
    mpirun -n 6 ../test_data_override_ongrid_${KIND}
    
NOTE from PE     0: MPP_DOMAINS_SET_STACK_SIZE: stack size set to    32768.
NOTE from PE     0: MPP_DOMAINS_SET_STACK_SIZE: stack size set to 17280000.
 test_data_override_emc domain decomposition
whalo =    2, ehalo =    2, shalo =    2, nhalo =    2
  X-AXIS =  180 180

FATAL from PE     4: NetCDF: Unknown file format: netcdf_file_open:INPUT/grid_spec.nc

#0  0x7795ff7006dd in ???
#1  0x7795ffb4d34f in ???
#2  0x7795ffbace4f in ???
#3  0x7795ffbab916 in ???
#4  0x7795ffb5197f in ???
#5  0x7795ffc4cec9 in ???
#6  0x7795ffc4d212 in ???
#7  0x7795ffb35c24 in ???
#8  0x7795ffb33713 in ???
#9  0x7795ffdbd9c6 in ???
#10  0x7795ffdc33ad in ???
#11  0x4034be in ???
#12  0x4094fa in ???
#13  0x7795fdfd5eaf in ???
#14  0x7795fdfd5f5f in ???
#15  0x402324 in ???
#16  0xffffffffffffffff in ???
Abort(1) on node 4 (rank 4 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 4
error: last command exited with $?=1
  Y-AXIS =   60  60  60
not ok 1 - test_data_override with monotonically increasing and decreasing data sets (r4)
FAIL: test_data_override2_mono.sh 1 - test_data_override with monotonically increasing and decreasing data sets (r4)
#	
#	    mpirun -n 6 ../test_data_override_ongrid_${KIND}
#	    
FAIL: test_data_override2_mono.sh 1 - test_data_override with monotonically increasing and decreasing data sets (r4)

To Reproduce
TBD. I've only seen this pop up in CI testing every once in a while, and it always disappears on a rerun. It is most likely due to some kind of file system slowness on the GitHub-hosted runner; I've seen similar errors while running on the cloud.
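
"Unknown file format" is the message the NetCDF library reports when the leading bytes of a file do not match a signature it recognizes. That would fit a file that is still empty or only partly written when rank 4 opens it. Below is a minimal sketch of a pre-flight check, assuming that is the cause (not confirmed here): poll INPUT/grid_spec.nc until it begins with a NetCDF classic or HDF5 signature before calling mpirun. The helper names has_netcdf_magic and wait_for_netcdf, the retry count, and the delay are hypothetical and are not part of the FMS test scripts.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; not part of FMS.

# Succeeds if the file starts with a NetCDF classic signature
# ("CDF" followed by 0x01, 0x02, or 0x05) or the HDF5 signature
# used by netCDF-4 files (0x89 "HDF").
has_netcdf_magic() {
  # A missing or empty file cannot be a valid NetCDF file.
  [ -s "$1" ] || return 1
  # Read the first 4 bytes as hex with no offsets or spaces.
  magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
  case "$magic" in
    43444601|43444602|43444605) return 0 ;;  # CDF-1, CDF-2, CDF-5
    89484446) return 0 ;;                    # HDF5 (netCDF-4)
    *) return 1 ;;
  esac
}

# Polls the file until it has a recognizable signature.
# Arguments: file [tries (default 10)] [delay in seconds (default 1)]
wait_for_netcdf() {
  file=$1
  tries=${2:-10}
  delay=${3:-1}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if has_netcdf_magic "$file"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for a valid NetCDF signature in $file" >&2
  return 1
}
```

Usage might look like `wait_for_netcdf INPUT/grid_spec.nc 10 1 || exit 1` placed just before the mpirun line. This only checks the first four bytes, so it cannot prove the file is fully written, but a timeout here would at least turn the sporadic crash into a clearer diagnostic.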

Expected behavior
The test should pass consistently, without needing a rerun.

System Environment
CI image (gcc 12+ mpich)
