Hi all,

some of my jobs are failing. It happens very rarely and for no apparent reason. The log says the job got SIGKILL, although sacct just reports COMPLETED. I had a job with this problem this week, and it ran without issue after I restarted it. This is particularly annoying since my jobs usually take 1 day. I'm not exceeding my requested runtime or memory limits.
I just had another one like it. I restarted it and believe it will run through without issue. I have attached what sacct reported. It failed on ncm0217.

Has anyone had issues like this?

Best,
Johannes

--
M.Sc. Johannes Sauer
Researcher
Institut fuer Nachrichtentechnik
RWTH Aachen University
Melatener Str. 23
52074 Aachen
Tel +49 241 80-27678
Fax +49 241 80-22196
sauer@ient.rwth-aachen.de
http://www.ient.rwth-aachen.de