$Id$ On Kabul, the code sometimes hangs after some 1000 or time steps (sometimes less, sometimes more). Here is the output of mpitask for a run on 5 CPUs (nprocy=nprocz=2): TASK (G/L) FUNCTION PEER|ROOT TAG COMM COUNT DATATYPE 0/0 LAM_MPI_Fortran_ Recv 3/3 86 WORLD* 1 REAL 1/1 LAM_MPI_Fortran_ Recv 0/0 15 WORLD* 1 REAL 2/2 LAM_MPI_Fortran_ Wait 0/0 5 WORLD 49152 REAL 3/3 LAM_MPI_Fortran_ Wait 1/1 5 WORLD 49152 REAL This basically means that processor 0 waits for data from proc 3 to receive, proc 1 waits for data from proc0, while procs2 and 3 have read 49152 real numbers each and are MPI_WAITing. And this is the output of mpimsg SRC (G/L) DEST (G/L) TAG COMM COUNT DATATYPE MSG 2/2 0/0 5 WORLD 49152 REAL n2,#93 2/2 0/0 6 WORLD 49152 REAL n2,#94 3/3 1/1 5 WORLD 49152 REAL n2,#99 3/3 1/1 6 WORLD 49152 REAL n2,#103 3/3 0/0 7 WORLD 4608 REAL n2,#95 3/3 0/0 8 WORLD 4608 REAL n2,#96 3/3 0/0 9 WORLD 4608 REAL n2,#97 3/3 0/0 10 WORLD 4608 REAL n2,#98 2/2 1/1 7 WORLD 4608 REAL n2,#100 2/2 1/1 8 WORLD 4608 REAL n2,#101 2/2 1/1 9 WORLD 4608 REAL n2,#102 2/2 1/1 10 WORLD 4608 REAL n2,#104 Tag 5 in WORLD is tolowz: integer :: tolowy=3,touppy=4,tolowz=5,touppz=6 ! msg. tags integer :: TOll=7,TOul=8,TOuu=9,TOlu=10 ! msg. tags for corners WORLD* seems to be an internal communicator used for reduction or broadcast operations. Thus everything hangs because: 3 waits for 1 (Wait; 5=tolowz) 1 waits for 0 (Recv; 15=??) 0 waits for 3 (Recv; 86=??) Possible explanation: Some reduction (or similar) operations are occurring between init_isendrecv and finalize_isendrecv and cause the mutual dependency loop depicted above. MPI_BARRIER in the right place would then help. wd, Wed Jul 24 14:23:03 CEST 2002