Hi Johannes,
yes, we are also seeing these messages in the log files, but we have
not been able to solve the issue so far.
Best
Marcus
On 5/3/19 10:49 AM, Johannes Sauer wrote:
Dear all,
from time to time I get errors similar to this one when
submitting jobs:
RuntimeError: Execution of 'sbatch -t 3600 --mem-per-cpu=10G
--account=rwth0333 --job-name
ChairliftRide_8192x4096_QP22_FTBE0to32 -o
log/ChairliftRide_8192x4096_QP22_FTBE0to32.queue_out.log -e
log/ChairliftRide_8192x4096_QP22_FTBE0to32.queue_out.log
rz_start_anysim.sh' exited with status != 0 (1): sbatch: error:
Batch job submission failed: Socket timed out on send/recv
operation
Is anyone else having this problem? Submitting the same job again
works fine. It looks like the controller cannot handle the load of
submissions?
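Since resubmitting succeeds, one possible stopgap is to retry the submission automatically when this specific message appears. The following is only a minimal sketch under stated assumptions, not a fix for the underlying controller problem: the function name submit_with_retry, the attempt count, and the backoff delays are hypothetical choices, and the only thing taken from the log above is the error text itself. Note that a timeout on the client side does not necessarily prove the job was rejected, so it may be worth checking squeue for duplicates after a retry.

```python
import subprocess
import time

# Error text copied from the sbatch failure quoted above.
TRANSIENT_MARKER = "Socket timed out on send/recv operation"


def submit_with_retry(cmd, max_attempts=5, base_delay=2.0,
                      run=subprocess.run, sleep=time.sleep):
    """Run a submission command, retrying only on the socket timeout error.

    cmd is an argument list such as ["sbatch", "-t", "3600", "job.sh"].
    max_attempts and base_delay are illustrative defaults (assumptions).
    run and sleep are injectable so the logic can be tried without a cluster.
    Returns the command's stdout on success.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout.strip()

        transient = TRANSIENT_MARKER in result.stderr
        if not transient or attempt == max_attempts:
            # Give up: either a different error, or retries are exhausted.
            raise RuntimeError(
                f"Execution of {' '.join(cmd)!r} exited with status "
                f"{result.returncode}: {result.stderr.strip()}"
            )

        # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

A call could look like submit_with_retry(["sbatch", "-t", "3600", "rz_start_anysim.sh"]). Backing off between attempts, rather than retrying immediately, is meant to avoid adding to the load if the controller really is overloaded.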
Best
Johannes
_______________________________________________
claix18-slurm-pilot mailing list -- claix18-slurm-pilot@lists.rwth-aachen.de
To unsubscribe send an email to claix18-slurm-pilot-leave@lists.rwth-aachen.de
--
Marcus Wagner, Dipl.-Inf.
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner@itc.rwth-aachen.de
www.itc.rwth-aachen.de