Hello Marcus,

 

Unfortunately, I get essentially the same problem: srun spawns one instance of the CFX solver per core, and every one of them tries to use all cores.

Since they still try to communicate with the other node over ssh, the result is the same error as below.

 

Regards,

Thomas

 

 

From: Marcus Wagner [mailto:wagner@itc.rwth-aachen.de]
Sent: Tuesday, 12 February 2019 15:21
To: claix18-slurm-pilot@lists.rwth-aachen.de
Subject: [claix18-slurm-pilot] Re: Multi-Node ANSYS simulations

 

Dear Thomas,


Could you please test the following:

srun cfx5solve -batch -parallel -partition $SLURM_NTASKS -def job.def -par-dist "$CFXHOSTS" -start-method "Intel MPI Distributed Parallel"
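
The $CFXHOSTS variable used in the command above is not defined anywhere in this thread. Below is a minimal sketch of how such a host list could be built inside a Slurm job. It assumes the comma-separated "host*count" form for -par-dist and the same number of partitions on every node. The helper name build_cfx_hosts is made up for illustration and is not part of CFX or Slurm.

```shell
# Sketch only: turn hostnames on stdin into "host*count,host*count".
# build_cfx_hosts is a hypothetical helper, not a CFX or Slurm command.
build_cfx_hosts() (
    per_node="$1"                      # partitions to place on each node
    list=""
    while IFS= read -r host; do
        [ -n "$host" ] || continue     # skip empty lines
        # Append with a leading comma only when the list is non-empty.
        list="${list:+$list,}${host}*${per_node}"
    done
    printf '%s\n' "$list"
)

# Inside a job the hostnames could come from scontrol, for example
# (SLURM_NTASKS_PER_NODE is only set if --ntasks-per-node was requested):
#   CFXHOSTS=$(scontrol show hostnames "$SLURM_JOB_NODELIST" \
#              | build_cfx_hosts "$SLURM_NTASKS_PER_NODE")
```

Whether the site already exports a suitable $CFXHOSTS is something only the cluster documentation can confirm.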


Best
Marcus

On 2/12/19 11:10 AM, Gier, Thomas wrote:

Hello,
 
I'm having issues running ANSYS CFX calculations across multiple nodes. 
Single-node simulations run fine, but multi-node configurations crash because ssh connections are being denied:
 
" +--------------------------------------------------------------------+
 |                An error has occurred in cfx5solve:                 |
 |                                                                    |
 | Remote connection to ncm0791.hpc.itc.rwthaachen.de                 |
 | (ncm0791.hpc.itc.rwth-aachen.de) could not be started, or exited   |
 | with return code 255.  It gave the following output:               |
 |                                                                    |
 |    Permission denied (publickey,gssapi-keyex,gssapi-with-mic,pass- |
 | word,hostbased).                                                   |
 |                                                                    |
 | Check that you have typed the hostname correctly, and that you     |
 | have an account "tg084461" on the specified host with access       |
 | permission from this host.  You can use the following command to   |
 | check the connection to a UNIX machine:                            |
 |                                                                    |
 |   ssh ncm0791.hpc.itc.rwth-aachen.de uname                         |
+--------------------------------------------------------------------+"
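
The connection check suggested in the error box can be repeated for every node of an allocation. The following is a sketch under the assumption that it runs inside a Slurm job. The function name check_ssh_hosts and the SSH_CMD override are made up for illustration. BatchMode=yes makes ssh fail instead of prompting for a password, which is closer to what a non-interactive solver start would see.

```shell
# Sketch only: try "ssh <host> uname" for each hostname read from stdin.
# check_ssh_hosts and SSH_CMD are hypothetical names used for illustration.
check_ssh_hosts() (
    ssh_cmd="${SSH_CMD:-ssh}"
    failed=0
    while IFS= read -r host; do
        [ -n "$host" ] || continue
        # </dev/null keeps ssh from swallowing the remaining hostnames.
        if $ssh_cmd -o BatchMode=yes -o ConnectTimeout=5 "$host" uname \
               </dev/null >/dev/null 2>&1; then
            echo "ok      $host"
        else
            echo "DENIED  $host"
            failed=1
        fi
    done
    return "$failed"                   # non-zero if any host was unreachable
)

# Possible use inside a job script:
#   scontrol show hostnames "$SLURM_JOB_NODELIST" | check_ssh_hosts
```

A DENIED line for a node that belongs to the job would suggest that the problem lies in the ssh setup between compute nodes and not in the solver options.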
 
Am I missing something in my submission script, or is this a cluster config issue?
 
Regards,
Thomas Gier



_______________________________________________
claix18-slurm-pilot mailing list -- claix18-slurm-pilot@lists.rwth-aachen.de
To unsubscribe send an email to claix18-slurm-pilot-leave@lists.rwth-aachen.de



-- 
Marcus Wagner, Dipl.-Inf.
 
IT Center
Abteilung: Systeme und Betrieb
RWTH Aachen University
Seffenter Weg 23
52074 Aachen
Tel: +49 241 80-24383
Fax: +49 241 80-624383
wagner@itc.rwth-aachen.de
www.itc.rwth-aachen.de