Hi Johannes,

Yes, we are also seeing these messages in the log files, but we have not been able to solve the issue so far.


Best
Marcus

On 5/3/19 10:49 AM, Johannes Sauer wrote:
Dear all,

From time to time I get errors similar to this one when submitting jobs:

RuntimeError: Execution of 'sbatch -t 3600 --mem-per-cpu=10G --account=rwth0333 --job-name ChairliftRide_8192x4096_QP22_FTBE0to32 -o log/ChairliftRide_8192x4096_QP22_FTBE0to32.queue_out.log -e log/ChairliftRide_8192x4096_QP22_FTBE0to32.queue_out.log rz_start_anysim.sh' exited with status != 0 (1): sbatch: error: Batch job submission failed: Socket timed out on send/recv operation

Is anyone else having this problem? Submitting the same job again works fine. Could it be that the controller cannot handle the load of submissions?
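
Since resubmitting works, a retry wrapper might serve as a stopgap. The following is only a minimal sketch in Python, not what our submission scripts actually do: the function name submit_with_retry, the number of attempts and the delays are assumptions chosen for illustration. It retries only when stderr contains the timeout message quoted above and re-raises any other failure immediately.

```python
import subprocess
import time

# Error text taken from the sbatch message quoted above.
TRANSIENT_MARKER = "Socket timed out on send/recv operation"


def submit_with_retry(cmd, max_attempts=5, base_delay=2.0,
                      run=subprocess.run, sleep=time.sleep):
    """Run a submission command, retrying only on the socket timeout.

    cmd is the full argument list, e.g. ["sbatch", "-t", "3600", "job.sh"].
    run and sleep are injectable so the logic can be tried without a cluster.
    Hypothetical helper: name, attempt count and delays are illustrative.
    """
    for attempt in range(1, max_attempts + 1):
        result = run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            # sbatch normally prints "Submitted batch job <id>" on success.
            return result.stdout.strip()

        transient = TRANSIENT_MARKER in result.stderr
        if not transient or attempt == max_attempts:
            raise RuntimeError(
                f"{cmd[0]} exited with status {result.returncode}: "
                f"{result.stderr.strip()}"
            )

        # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))


# Example call (needs a working sbatch, so it is left commented out):
# submit_with_retry(["sbatch", "-t", "3600", "--mem-per-cpu=10G",
#                    "rz_start_anysim.sh"])
```

One caveat: a submission that timed out on the client side may still have been accepted by the controller, so a blind retry could create duplicate jobs. Checking squeue for the job name before resubmitting would probably be safer.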

Best

Johannes


_______________________________________________
claix18-slurm-pilot mailing list -- claix18-slurm-pilot@lists.rwth-aachen.de
To unsubscribe send an email to claix18-slurm-pilot-leave@lists.rwth-aachen.de

-- 
Marcus Wagner, Dipl.-Inf.

IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner@itc.rwth-aachen.de
www.itc.rwth-aachen.de