Intel® oneAPI Math Kernel Library

Code stops after a call to BLACS_GRIDEXIT

OP1
New Contributor III

The following code (built with ifx 2025.2.1 on Windows and run with 7 processes) silently crashes after the call to BLACS_GRIDEXIT on line 19: nothing past line 19 is executed.

The code is linked with the ILP64 MKL libraries for BLACS and ScaLAPACK.

Am I missing something obvious here?

PROGRAM TEST
IMPLICIT NONE (TYPE, EXTERNAL)
EXTERNAL BLACS_GRIDINIT, BLACS_GRIDINFO, BLACS_PINFO, BLACS_GET, BLACS_PCOORD, BLACS_GRIDEXIT
INTEGER(KIND = 8), EXTERNAL :: BLACS_PNUM
INTEGER(KIND = 8) :: ICTXT, MY_ID, N_PROCS, N_PROW, N_PCOL
INTEGER(KIND = 8) :: MY_ROW, MY_COL

CALL BLACS_PINFO(MY_ID, N_PROCS)
N_PROW = INT(SQRT(REAL(N_PROCS)))
N_PCOL = N_PROCS/N_PROW

CALL BLACS_GET(-1_8, 0_8, ICTXT)
CALL BLACS_GRIDINIT(ICTXT, 'C', N_PROW, N_PCOL)
CALL BLACS_GRIDINFO(ICTXT, N_PROW, N_PCOL, MY_ROW, MY_COL)
IF (MY_ID /= 6) THEN
    WRITE(*, *) 'C', MY_ID, BLACS_PNUM(ICTXT, MY_ROW, MY_COL), MY_ROW, MY_COL
END IF

CALL BLACS_GRIDEXIT(ICTXT)

CALL BLACS_GET(-1_8, 0_8, ICTXT)
CALL BLACS_GRIDINIT(ICTXT, 'R', N_PROW, N_PCOL)
CALL BLACS_GRIDINFO(ICTXT, N_PROW, N_PCOL, MY_ROW, MY_COL)
IF (MY_ID /= 6) THEN
    WRITE(*, *) 'R', MY_ID, BLACS_PNUM(ICTXT, MY_ROW, MY_COL), MY_ROW, MY_COL
    CALL BLACS_GRIDEXIT(ICTXT)
END IF

END PROGRAM TEST

 

OP1
New Contributor III

Line 26 in the code above should be removed (I should have done so before posting), but this does not change the outcome.

OP1
New Contributor III

Also, when did we lose the ability to edit posts?

Aleksandra_K
Moderator

Hi,


Does the same problem happen when running with 6 or 8 processes?

You create a grid of size 2×3 (6 positions), but you're running with 7 processes. Process number 6 has no valid grid position, which can cause MPI to abort.
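
To make the grid arithmetic concrete, here is a small sketch that repeats the posted formulas for N_PROW and N_PCOL for 6, 7, and 8 processes and prints how many processes are left without a grid position. It makes no BLACS or MPI calls; the program name GRID_SHAPE and the loop bounds are only illustrative.

```fortran
PROGRAM GRID_SHAPE
IMPLICIT NONE
INTEGER(KIND = 8) :: N_PROCS, N_PROW, N_PCOL

! Same formulas as in the posted code, evaluated for several process counts.
DO N_PROCS = 6_8, 8_8
    N_PROW = INT(SQRT(REAL(N_PROCS)), KIND = 8)
    N_PCOL = N_PROCS/N_PROW
    ! Columns: process count, grid rows, grid columns, processes outside the grid
    WRITE(*, *) N_PROCS, N_PROW, N_PCOL, N_PROCS - N_PROW*N_PCOL
END DO

END PROGRAM GRID_SHAPE
```

For 7 processes this yields a 2×3 grid with one process outside it, while 6 and 8 processes fill their grids (2×3 and 2×4) completely.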


Regards,

Alex


OP1
New Contributor III

I am afraid that I do not understand your reply at all. Are you saying that a program relying on BLACS cannot be run with an arbitrary number of processes, and that the number of processes must exactly match the size of the BLACS process grid? This is not how BLACS works!

I simplified the code a bit further:

PROGRAM TEST
IMPLICIT NONE (TYPE, EXTERNAL)
EXTERNAL BLACS_GRIDINIT, BLACS_PINFO, BLACS_GET, BLACS_GRIDEXIT
INTEGER(KIND = 8) :: ICTXT, MY_ID, N_PROCS, N_PROW, N_PCOL

CALL BLACS_PINFO(MY_ID, N_PROCS)
N_PROW = INT(SQRT(REAL(N_PROCS)))
N_PCOL = N_PROCS/N_PROW
CALL BLACS_GET(-1_8, 0_8, ICTXT)
CALL BLACS_GRIDINIT(ICTXT, 'C', N_PROW, N_PCOL)
CALL BLACS_GRIDEXIT(ICTXT)

WRITE(*, *) MY_ID

END PROGRAM TEST

Running the code with 7 processes, the output is:  

 1
 0
 3
 5
 2
 4
Press any key to continue . . .

Why isn't process 7 printing anything here? 

OP1
New Contributor III

[ignoring the fact that I should have written "process 6" in my latest message]

In fact, when I run this simplified example several times in a row, sometimes I get no output at all, and sometimes only a subset of processes 0 to 5 prints something. The behavior is random. Can you try to reproduce this on your side?

Aleksandra_K
Moderator

Could you share exactly how you are running the code, so that I can reproduce your issue precisely?


OP1
New Contributor III

Here is the BuildLog.htm file that is produced when building the last example above.

 

 

Aleksandra_K
Moderator

Hi, 


We investigated your issue and confirmed that the problem is in the gridexit call. The context of rank 6 is set to -1 during blacs_gridinit -> blacs_gridmap, because that rank is not part of the grid, and calling gridexit with this invalid context causes the error. You were right that it is fine to use only a subset of processes for computation. Nevertheless, this is not a bug, as this behavior of gridexit is consistent with the reference BLACS implementation (SCALAPACK: blacs_gridexit_).
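
One possible way to avoid the failing call is to release the grid only on processes that actually belong to it. The sketch below is an adaptation of the simplified example from this thread, not a verified fix. It assumes that BLACS_GRIDINFO returns -1 for the row and column coordinates of a process outside the grid, as described in the BLACS documentation. The names GRIDEXIT_GUARD, G_PROW, and G_PCOL are hypothetical, and the final BLACS_EXIT call is an addition that does not appear in the posted code.

```fortran
PROGRAM GRIDEXIT_GUARD
IMPLICIT NONE (TYPE, EXTERNAL)
EXTERNAL BLACS_PINFO, BLACS_GET, BLACS_GRIDINIT, BLACS_GRIDINFO
EXTERNAL BLACS_GRIDEXIT, BLACS_EXIT
INTEGER(KIND = 8) :: ICTXT, MY_ID, N_PROCS, N_PROW, N_PCOL
! Separate outputs for BLACS_GRIDINFO (hypothetical names), so that
! N_PROW and N_PCOL are not overwritten on processes outside the grid.
INTEGER(KIND = 8) :: G_PROW, G_PCOL, MY_ROW, MY_COL

CALL BLACS_PINFO(MY_ID, N_PROCS)
N_PROW = INT(SQRT(REAL(N_PROCS)), KIND = 8)
N_PCOL = N_PROCS/N_PROW

CALL BLACS_GET(-1_8, 0_8, ICTXT)
CALL BLACS_GRIDINIT(ICTXT, 'C', N_PROW, N_PCOL)
CALL BLACS_GRIDINFO(ICTXT, G_PROW, G_PCOL, MY_ROW, MY_COL)

! Assumption: MY_ROW is -1 on a process that is not part of the grid.
IF (MY_ROW >= 0) THEN
    ! Work on the grid would go here.
    CALL BLACS_GRIDEXIT(ICTXT)
END IF

WRITE(*, *) MY_ID, MY_ROW, MY_COL

! Addition to the posted code: shut down BLACS (and MPI) on every process.
CALL BLACS_EXIT(0_8)

END PROGRAM GRIDEXIT_GUARD
```

With 7 processes and a 2×3 grid, rank 6 would skip BLACS_GRIDEXIT under this assumption and still reach the WRITE statement. Keeping the BLACS_GRIDINFO outputs in separate variables also matters if a second grid is created afterwards, since N_PROW and N_PCOL are then still valid on every process.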


Regards, 

Alex


Aleksandra_K
Moderator

Hi,

Do you have any further questions on the topic?


Aleksandra_K
Moderator

Hi,


I hope that you found the above explanation useful. We'll monitor this thread for another 3 days for any follow-up questions. If there's no response within that time, this thread will no longer be actively supported by Intel.


Regards,

Alex


Aleksandra_K
Moderator

With no response from you, this issue will no longer be tracked by Intel. If you need any additional information, please post a new question, ideally in a new thread.


Regards,

Alex

