Intel® oneAPI Math Kernel Library

Should I place a barrier before calling pdpotri()?

Georgios_S_
New Contributor II

I am using pdpotrf() to perform the Cholesky factorization, and then I want to call pdpotri() to invert the matrix. The function is called from every process, right after pdpotrf(). Should I put a barrier there, so that I am sure all the processes are done with the Cholesky factorization before moving on to the inversion, or is it not needed?

I wrote some examples with tiny inputs, which suggest that it's not needed, but I want to be sure that I am not just getting (un)lucky and will face a problem with larger inputs.

VipinKumar_E_Intel

Could you please post your example/pseudocode showing the way you are calling these functions?

Georgios_S_
New Contributor II

Hi Vipin,

  of course, here is how I call them:

pdpotrf(type, &N, A_loc, &i_one, &i_one, descA, &info);
MPI_Barrier(MPI_COMM_WORLD); //do we need to set a barrier before inverting?
pdpotri(type, &N, A_loc, &i_one, &i_one, descA, &info);

  I just worry that the inversion may start before some other process has finished the Cholesky factorization. Does pdpotri() take care of this, i.e., does it check and wait if needed? Or does pdpotri() work only on the submatrix of its calling process (i.e., A_loc in my code)? If so, then no barrier is needed.

You can check the minimal example here, in post #4. The post's author is Ying, but it's pretty close to my code. Just imagine the code there with the lines I posted above appended at the end; that would make my new minimal example. To save the round trip, I've put a self-contained sketch of the whole thing below.
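Here is a minimal, self-contained sketch of what my program does (the sizes N and NB and the 2x2 grid are made up for illustration, and I've declared the BLACS/ScaLAPACK prototypes by hand, so adjust them to match your headers and linker setup, e.g. MKL's ScaLAPACK):

/* Minimal sketch: distributed Cholesky factorization followed by
 * inversion, with NO MPI_Barrier between the two calls.
 * N, NB, and the 2x2 grid are illustrative values, not requirements. */
#include <mpi.h>
#include <stdlib.h>

/* Hand-declared BLACS / ScaLAPACK prototypes (Fortran conventions). */
extern void Cblacs_get(int ctxt, int what, int *val);
extern void Cblacs_gridinit(int *ctxt, char *order, int nprow, int npcol);
extern void Cblacs_gridinfo(int ctxt, int *nprow, int *npcol, int *myrow, int *mycol);
extern void Cblacs_gridexit(int ctxt);
extern int  numroc_(int *n, int *nb, int *iproc, int *isrcproc, int *nprocs);
extern void descinit_(int *desc, int *m, int *n, int *mb, int *nb,
                      int *irsrc, int *icsrc, int *ctxt, int *lld, int *info);
extern void pdpotrf_(char *uplo, int *n, double *a, int *ia, int *ja,
                     int *desca, int *info);
extern void pdpotri_(char *uplo, int *n, double *a, int *ia, int *ja,
                     int *desca, int *info);

int main(int argc, char **argv)
{
    int N = 8, NB = 2, nprow = 2, npcol = 2;   /* illustrative sizes */
    int i_one = 1, i_zero = 0, ctxt, myrow, mycol, info;
    int descA[9];

    MPI_Init(&argc, &argv);
    Cblacs_get(-1, 0, &ctxt);                  /* default system context */
    Cblacs_gridinit(&ctxt, "Row", nprow, npcol);
    Cblacs_gridinfo(ctxt, &nprow, &npcol, &myrow, &mycol);

    /* local dimensions of the distributed matrix, and its descriptor */
    int locR = numroc_(&N, &NB, &myrow, &i_zero, &nprow);
    int locC = numroc_(&N, &NB, &mycol, &i_zero, &npcol);
    int lld  = locR > 1 ? locR : 1;
    double *A_loc = calloc((size_t)locR * locC, sizeof(double));
    descinit_(descA, &N, &N, &NB, &NB, &i_zero, &i_zero, &ctxt, &lld, &info);

    /* ... fill A_loc with the local part of an SPD matrix ... */

    /* factorize, then invert -- check info after each call */
    pdpotrf_("L", &N, A_loc, &i_one, &i_one, descA, &info);
    pdpotri_("L", &N, A_loc, &i_one, &i_one, descA, &info);

    free(A_loc);
    Cblacs_gridexit(ctxt);
    MPI_Finalize();
    return 0;
}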

Thanks for taking the time to look into my question,

George

VipinKumar_E_Intel

Hi Georgios,

   Since you are running these as part of a single rank's program flow, you do not need a barrier in between: the two calls run one after the other (sequentially) within each MPI process.

--Vipin

Georgios_S_
New Contributor II

Hi Vipin,

  excuse me, but I do not understand "as part of a single rank". What does it mean? Maybe I am confused by the term "rank".

Thanks,

George

Georgios_S_
New Contributor II

Just for the record, I had asked something similar on Stack Overflow, and the response was in agreement with Vipin's post. I got this comment:

Explicit barrier synchronisation is almost never needed in correct MPI applications that perform pure message passing as the synchronisation occurs naturally through the data dependency between the send and receive calls. Barriers are usually needed in order to synchronise with the completion of external events, e.g. concurrent data I/O, or when benchmarking in order to get the processes in sync.

and this answer:

While I have not looked into the details of pdpotri() and pdpotrf(), I see two cases:

1) There needs to be a barrier between the two functions. In this case, however, because pdpotri() must always come after pdpotrf(), it would make the most sense for an implicit barrier to be built into the beginning of pdpotri().

2) There does not need to be a barrier between the two functions.

In both cases, it should not be necessary for you to write your own explicit barrier using MPI_Barrier().
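To make the quoted point about data dependencies concrete, here is my own toy illustration of the general MPI principle (not of pdpotri()'s internals, which I have not inspected): a receive cannot complete before the matching send, so the ordering between ranks comes from the dependency itself, with no explicit barrier.

/* Toy illustration of synchronisation via data dependency:
 * rank 1 cannot use x before rank 0 has produced and sent it,
 * so no MPI_Barrier is required. Run with at least two ranks. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    double x = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        x = 42.0;                    /* stand-in for a computed result */
        MPI_Send(&x, 1, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* blocks until rank 0's send arrives: the dependency IS the sync */
        MPI_Recv(&x, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 got %g\n", x);
    }

    MPI_Finalize();
    return 0;
}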
