Starting large numbers of jobs
Hello,

following the discussion at the end of today's workshop I tried out how the scheduler behaves when submitting a larger number of jobs (Marcus essentially told me I could use approach 3 as detailed below). To frame my question, here is what I want to do and how I try to do it (the numbers are just to give the order of magnitude):

# Problem

10 binaries, 10k input files. Run every binary on every input file and collect all the results (= parse stdout). Array jobs seem to be the tool for that; however, the size of an array job is capped at 1000, apparently because larger jobs make the scheduler slow.

# Approach 1

- Create one file with 10*10k lines (./binary input-file)
- Create one job with 1000 array tasks
- Let ID be the index of the current array task
- Identify the slice (10*10k)/1000 * ID .. (10*10k)/1000 * (ID + 1)
- Execute all lines from that slice sequentially
- Pro: only one job, no scheduling hassle on the user side.
- Con: weird script logic, 100 individual tasks inside each scheduled array task, sometimes bad load balancing (i.e. one array task takes way longer than the others)

# Approach 2

- Create (10*10k)/1000 files, each containing 1000 lines
- Create as many jobs, one for each file
- Load the ID'th line from the respective file and execute it
- Push all these jobs to the scheduler
- Pro: easier logic in each script
- Con: multiple jobs, so I have to take care of submitting them all and waiting for all their results myself.

# Approach 3

- Create 10*10k jobs and let the scheduler deal with it
- Every job executes one task (./binary input-file)
- Pro: very simple jobs and scripts
- Con: huge number of jobs; can the scheduler handle that?

I'm using approach 1 already and it works somewhat fine. That being said, the script logic is rather involved (a simplified sketch is at the end of this mail) and the load balancing is not that great: I routinely have a handful of array tasks at the end that run for 10 minutes or so longer than all the others, even though a single task is capped at one minute. This is pretty annoying. Also, we are exploring what the best practice should be here...

I just tried approach 2 and it did not go too well, even for only about 12k tasks. To test the scaling I made every array job 100 tasks in size, so I ended up scheduling about 120 jobs (the submission loop is sketched at the end of this mail). While it went well for about 75 jobs, sbatch started to come back with the following afterwards:

sbatch: error: Slurm temporarily unable to accept job, sleeping and retrying

and quickly afterwards:

sbatch: error: Batch job submission failed: Resource temporarily unavailable

I then tried to "relax" a bit and added a one-second delay between the calls to sbatch... and it does not change anything. Thus I don't have a lot of hope for approach 3...

Any comments or ideas?

Best,
Gereon

--
Gereon Kremer
Lehr- und Forschungsgebiet Theorie Hybrider Systeme
RWTH Aachen
Tel: +49 241 80 21243
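
PS: For reference, a simplified sketch of the approach-1 array script (tasks.txt is a placeholder for the file with the 10*10k command lines; the arithmetic assumes a 1000-wide array, so every array task runs a 100-line slice sequentially):

  #!/bin/bash
  #SBATCH --array=0-999

  TASKS=tasks.txt
  TOTAL=$(wc -l < "$TASKS")
  PER_TASK=$(( (TOTAL + 999) / 1000 ))   # lines per array task, rounded up

  # this array task's slice of the task list (sed line numbers are 1-based)
  START=$(( SLURM_ARRAY_TASK_ID * PER_TASK + 1 ))
  END=$(( START + PER_TASK - 1 ))

  # read the slice into a bash array and execute the lines one after another
  mapfile -t SLICE < <(sed -n "${START},${END}p" "$TASKS")
  for cmd in "${SLICE[@]}"; do
      eval "$cmd"
  done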
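
PPS: The approach-2 submission loop was essentially this (chunks/chunk-*.txt and run-chunk.sh are placeholder names; run-chunk.sh just executes the line with index SLURM_ARRAY_TASK_ID from its chunk file):

  # submit one 100-wide array job per 100-line chunk file
  for chunk in chunks/chunk-*.txt; do
      sbatch --array=1-100 run-chunk.sh "$chunk"
      sleep 1   # added on the second attempt, did not help
  done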