Jobs occasionally failing

26 May 2019

      Hi all,

some of my jobs are failing. It happens very rarely and with no apparent 
reason. Log says it got SIGKILL, although sacct just says COMPLETED. I 
had a job this week with this problem and it ran without issue after 
restarting it. This is particularly annoying since my jobs usually take
...
1 day. I'm not exceeding my requested runtime or memory limits.
I had just another one like it. I restarted it and believe it will run 
through without issue. I attached what sacct reported. It failed on ncm0217.

Anyone had issues like this?

Best

Johannes

-- 
M.Sc. Johannes Sauer
Researcher

Institut fuer Nachrichtentechnik
RWTH Aachen University
Melatener Str. 23
52074 Aachen
Tel +49 241 80-27678
Fax +49 241 80-22196
sauer@ient.rwth-aachen.de
http://www.ient.rwth-aachen.de

Johannes Sauer

Marcus Wagner

tags

participants (2)