error running test on TUD cluster #5

Open
thawn opened this issue Nov 25, 2022 · 1 comment
thawn (Collaborator) commented Nov 25, 2022

When I run the test run_workflow_test, I get the following error and the process never returns:

Task exception was never retrieved
future: <Task finished name='Task-25' coro=<_wrap_awaitable() done, defined at /app/env/lib/python3.9/asyncio/tasks.py:681> exception=RuntimeError('Command exited with non-zero exit code.\nExit code: 1\nCommand:\nsbatch /tmp/tmpzecm7bc_.sh\nstdout:\n\nstderr:\nsbatch: error: Unable to open file /tmp/tmpzecm7bc_.sh\n\n')>
Traceback (most recent call last):
  File "/app/env/lib/python3.9/asyncio/tasks.py", line 688, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/app/env/lib/python3.9/site-packages/distributed/deploy/spec.py", line 64, in _
    await self.start()
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 411, in start
    out = await self._submit_job(fn)
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 394, in _submit_job
    return self._call(shlex.split(self.submit_command) + [script_filename])
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 489, in _call
    raise RuntimeError(
RuntimeError: Command exited with non-zero exit code.
Exit code: 1
Command:
sbatch /tmp/tmpzecm7bc_.sh
stdout:

stderr:
sbatch: error: Unable to open file /tmp/tmpzecm7bc_.sh


Task exception was never retrieved
future: <Task finished name='Task-24' coro=<_wrap_awaitable() done, defined at /app/env/lib/python3.9/asyncio/tasks.py:681> exception=RuntimeError('Command exited with non-zero exit code.\nExit code: 1\nCommand:\nsbatch /tmp/tmppc8asb0e.sh\nstdout:\n\nstderr:\nsbatch: error: Unable to open file /tmp/tmppc8asb0e.sh\n\n')>
Traceback (most recent call last):
  File "/app/env/lib/python3.9/asyncio/tasks.py", line 688, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/app/env/lib/python3.9/site-packages/distributed/deploy/spec.py", line 64, in _
    await self.start()
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 411, in start
    out = await self._submit_job(fn)
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 394, in _submit_job
    return self._call(shlex.split(self.submit_command) + [script_filename])
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 489, in _call
    raise RuntimeError(
RuntimeError: Command exited with non-zero exit code.
Exit code: 1
Command:
sbatch /tmp/tmppc8asb0e.sh
stdout:

stderr:
sbatch: error: Unable to open file /tmp/tmppc8asb0e.sh


2022-11-25 14:37:52,840 - tornado.application - ERROR - Exception in callback functools.partial(<bound method IOLoop._discard_future_result of <tornado.platform.asyncio.AsyncIOLoop object at 0x2b332b6bd3d0>>, <Task finished name='Task-23' coro=<SpecCluster._correct_state_internal() done, defined at /app/env/lib/python3.9/site-packages/distributed/deploy/spec.py:330> exception=RuntimeError('Command exited with non-zero exit code.\nExit code: 1\nCommand:\nsbatch /tmp/tmp0xywjdi1.sh\nstdout:\n\nstderr:\nsbatch: error: Unable to open file /tmp/tmp0xywjdi1.sh\n\n')>)
Traceback (most recent call last):
  File "/app/env/lib/python3.9/site-packages/tornado/ioloop.py", line 741, in _run_callback
    ret = callback()
  File "/app/env/lib/python3.9/site-packages/tornado/ioloop.py", line 765, in _discard_future_result
    future.result()
  File "/app/env/lib/python3.9/site-packages/distributed/deploy/spec.py", line 369, in _correct_state_internal
    await w  # for tornado gen.coroutine support
  File "/app/env/lib/python3.9/site-packages/distributed/deploy/spec.py", line 64, in _
    await self.start()
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 411, in start
    out = await self._submit_job(fn)
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 394, in _submit_job
    return self._call(shlex.split(self.submit_command) + [script_filename])
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 489, in _call
    raise RuntimeError(
RuntimeError: Command exited with non-zero exit code.
Exit code: 1
Command:
sbatch /tmp/tmp0xywjdi1.sh
stdout:

stderr:
sbatch: error: Unable to open file /tmp/tmp0xywjdi1.sh


Task exception was never retrieved
future: <Task finished name='Task-34' coro=<_wrap_awaitable() done, defined at /app/env/lib/python3.9/asyncio/tasks.py:681> exception=RuntimeError('Command exited with non-zero exit code.\nExit code: 1\nCommand:\nsbatch /tmp/tmp25nk0l9d.sh\nstdout:\n\nstderr:\nsbatch: error: Unable to open file /tmp/tmp25nk0l9d.sh\n\n')>
Traceback (most recent call last):
  File "/app/env/lib/python3.9/asyncio/tasks.py", line 688, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/app/env/lib/python3.9/site-packages/distributed/deploy/spec.py", line 64, in _
    await self.start()
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 411, in start
    out = await self._submit_job(fn)
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 394, in _submit_job
    return self._call(shlex.split(self.submit_command) + [script_filename])
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 489, in _call
    raise RuntimeError(
RuntimeError: Command exited with non-zero exit code.
Exit code: 1
Command:
sbatch /tmp/tmp25nk0l9d.sh
stdout:

stderr:
sbatch: error: Unable to open file /tmp/tmp25nk0l9d.sh


Task exception was never retrieved
future: <Task finished name='Task-33' coro=<_wrap_awaitable() done, defined at /app/env/lib/python3.9/asyncio/tasks.py:681> exception=RuntimeError('Command exited with non-zero exit code.\nExit code: 1\nCommand:\nsbatch /tmp/tmp9ptr0d6r.sh\nstdout:\n\nstderr:\nsbatch: error: Unable to open file /tmp/tmp9ptr0d6r.sh\n\n')>
Traceback (most recent call last):
  File "/app/env/lib/python3.9/asyncio/tasks.py", line 688, in _wrap_awaitable
    return (yield from awaitable.__await__())
  File "/app/env/lib/python3.9/site-packages/distributed/deploy/spec.py", line 64, in _
    await self.start()
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 411, in start
    out = await self._submit_job(fn)
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 394, in _submit_job
    return self._call(shlex.split(self.submit_command) + [script_filename])
  File "/app/env/lib/python3.9/site-packages/dask_jobqueue/core.py", line 489, in _call
    raise RuntimeError(
RuntimeError: Command exited with non-zero exit code.
Exit code: 1
Command:
sbatch /tmp/tmp9ptr0d6r.sh
stdout:

stderr:
sbatch: error: Unable to open file /tmp/tmp9ptr0d6r.sh


/app/env/lib/python3.9/site-packages/dask/core.py:119: FutureWarning: Providing the `multichannel` argument positionally to gaussian is deprecated. Use the `channel_axis` kwarg instead.
  return func(*(_execute_task(a, cache) for a in args))

JonathanCSmith (Collaborator) commented

I think the fix for this is to add the 'local_directory' parameter to the cluster params. I will update with this soon.
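
For anyone following along, here is a minimal sketch of what passing 'local_directory' in the cluster params could look like. It assumes the cluster is a dask_jobqueue SLURMCluster (the traceback shows dask_jobqueue calling sbatch). The helper names, the path, and the cores/memory values are hypothetical and not taken from this project. In dask_jobqueue, local_directory is documented as the Dask worker's local directory for file spilling; whether setting it also changes where the submitted job script (/tmp/tmp*.sh above) is written is not confirmed in this thread.

```python
import os


def build_cluster_params(shared_dir, cores=1, memory="4GB"):
    """Return keyword arguments for dask_jobqueue.SLURMCluster.

    `shared_dir` is a hypothetical path. It is assumed to live on a
    filesystem that the compute nodes can read and write.
    """
    return {
        "cores": cores,      # placeholder value, adjust to the real job
        "memory": memory,    # placeholder value, adjust to the real job
        # The parameter suggested in the comment above.
        "local_directory": os.fspath(shared_dir),
    }


def make_cluster(shared_dir):
    """Create the cluster object from the parameters built above."""
    # Imported lazily so build_cluster_params can be checked on a machine
    # that does not have dask_jobqueue installed.
    from dask_jobqueue import SLURMCluster

    return SLURMCluster(**build_cluster_params(shared_dir))
```

Usage would be something like make_cluster("/scratch/example/dask-tmp"), with the path replaced by a real directory on the cluster.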
