Hi Gereon,

if you worry about load balancing in scenario 1, you could use a central synchronization tool such as a database from which the submitted jobs each fetch one task atomically and execute it. Once there are no more tasks to fetch from the DB, the job ends. I'm not sure, though, which network requests the cluster's firewall allows, and it would be more effort to set up. A rough sketch of such a worker loop is at the very bottom of this mail, below the quoted message.

Greetings,
Eugen

On Mon, Feb 25, 2019 at 6:14 PM Gereon Kremer <gereon.kremer@cs.rwth-aachen.de> wrote:
Hello,
following the discussion at the end of today's workshop, I tried out how the scheduler behaves when issuing a larger number of jobs (Marcus essentially told me I could use approach 3 as detailed below). To frame my question, here is what I want to do and how I try to do it (numbers just to give the order of magnitude):
# Problem
10 binaries, 10k input files. Run every binary on every input file and collect all the results (= parse stdout).
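For concreteness, the task list that the approaches below work from could be generated roughly like this (a minimal sketch; the directory names binaries/ and inputs/ and the file name tasks.txt are placeholders):

    # Write one "./binary input-file" line per (binary, input) pair,
    # i.e. 10 * 10k = 100k lines in total.
    > tasks.txt
    for bin in binaries/*; do
        for input in inputs/*; do
            echo "$bin $input" >> tasks.txt
        done
    done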
It seems array jobs are the tool for that; however, the size of an array job is capped at 1000, apparently because larger array jobs make the scheduler slow.
# Approach 1
- Create one file with 10*10k lines (./binary input-file)
- Create one array job with 1000 array tasks
- Let ID be the id of the current array task
- Identify the slice (10*10k) / 1000 * ID .. (10*10k) / 1000 * (ID + 1)
- Execute all lines from the slice sequentially
- Pro: only one job, no scheduling hassle on the user side
- Con: weird script logic, 100 individual tasks bundled into each scheduled array task, sometimes bad load balancing (i.e. one array task takes way longer than the others)
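A minimal sketch of what the approach 1 job script could look like, assuming the combined task list sits in tasks.txt and using the slice arithmetic from above:

    #!/usr/bin/env bash
    #SBATCH --array=0-999
    # Each array task handles a contiguous slice of 100 lines of tasks.txt.
    TOTAL=100000
    CHUNK=$(( TOTAL / 1000 ))                      # 100 tasks per array task
    START=$(( SLURM_ARRAY_TASK_ID * CHUNK + 1 ))   # sed line numbers are 1-based
    END=$(( START + CHUNK - 1 ))
    sed -n "${START},${END}p" tasks.txt | while read -r cmd; do
        eval "$cmd"    # runs "./binary input-file"
    done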
# Approach 2
- Create (10*10k)/1000 files, each containing 1000 lines
- Create as many array jobs, one for each file
- Load the ID'th line from the respective file and execute it
- Push all these jobs to the scheduler
- Pro: easier logic in each script
- Con: multiple jobs, I have to take care of submitting them and waiting for the results in parallel
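A sketch of the per-file job script for approach 2 (script and chunk file names are placeholders; each chunk would be submitted as e.g. sbatch run-chunk.sh chunk-007.txt):

    #!/usr/bin/env bash
    #SBATCH --array=0-999
    # Each array task executes exactly one line of the chunk file passed as $1.
    CHUNK_FILE="$1"
    LINE_NO=$(( SLURM_ARRAY_TASK_ID + 1 ))   # sed line numbers are 1-based
    eval "$(sed -n "${LINE_NO}p" "$CHUNK_FILE")"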
# Approach 3
- Create 10*10k jobs, let the scheduler deal with it
- Every job executes one task (./binary input-file)
- Pro: very simple jobs and scripts
- Con: huge number of jobs, can the scheduler handle that?
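The submission side of approach 3 could be as simple as the following sketch (again assuming the combined task list in a file tasks.txt, as in approach 1):

    # One sbatch call per line of tasks.txt, i.e. 10*10k = 100k submissions.
    # --wrap avoids having to write a separate job script per task.
    while read -r cmd; do
        sbatch --wrap="$cmd"
    done < tasks.txt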
I'm already using approach 1 and it works reasonably well. That being said, the script logic is rather involved and the load balancing is not great: I routinely have a handful of jobs at the end that run 10 minutes or so longer than all the others, even though a single task is capped at one minute. This is pretty annoying. Also, we are exploring what the best practice should be here...
I just tried approach 2 and it did not go too well, even for only about 12k tasks. To test the scaling I made every array job 100 in size, so I tried to schedule about 120 jobs. It went well for about 75 jobs, but then sbatch started to come back with the following:
sbatch: error: Slurm temporarily unable to accept job, sleeping and retrying
and quickly afterwards:
sbatch: error: Batch job submission failed: Resource temporarily unavailable
I then tried to "relax" a bit and added a one-second delay between the calls to sbatch... but it does not change anything. Thus I don't have a lot of hope for approach 3...
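For reference, a throttled submission loop of that kind might look like this (reusing the placeholder names from the approach 2 sketch):

    # Submit one array job per chunk file, pausing between sbatch calls.
    for f in chunk-*.txt; do
        sbatch run-chunk.sh "$f"
        sleep 1
    done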
Any comments or ideas?
Best,
Gereon
--
Gereon Kremer
Lehr- und Forschungsgebiet Theorie Hybrider Systeme
RWTH Aachen
Tel: +49 241 80 21243
_______________________________________________
claix18-slurm-pilot mailing list -- claix18-slurm-pilot@lists.rwth-aachen.de
To unsubscribe send an email to claix18-slurm-pilot-leave@lists.rwth-aachen.de
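PS, the sketch mentioned at the top: a rough version of the DB-based worker loop, assuming a PostgreSQL instance (9.5 or newer, for SKIP LOCKED) reachable from the compute nodes and a hypothetical table tasks(id, cmd, claimed). Each job claims one unclaimed row at a time and exits once the table is drained:

    #!/usr/bin/env bash
    # Worker loop: atomically claim one task at a time from a central
    # PostgreSQL table and run it; stop when the queue is empty.
    DB="postgresql://user:password@dbhost/taskqueue"   # placeholder connection string
    while true; do
        CMD=$(psql "$DB" -At -c "
            UPDATE tasks SET claimed = true
            WHERE id = (SELECT id FROM tasks
                        WHERE NOT claimed
                        ORDER BY id
                        LIMIT 1
                        FOR UPDATE SKIP LOCKED)
            RETURNING cmd;")
        [ -z "$CMD" ] && break    # no unclaimed task left -> job ends
        eval "$CMD"               # e.g. "./binary input-file"
    done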