topic Should I place a barrier before calling pdpotri()? in Intel® oneAPI Math Kernel Library

Should I place a barrier before calling pdpotri()?

Georgios_S_ — Wed, 15 Jul 2015 20:08:24 GMT

I am using pdpotrf() to perform the Cholesky decomposition. Then I want to call pdpotri() to invert the matrix. The function is called from every process, right after pdpotrf(). Should I put a barrier there, so that I am sure all the processes are done with the Cholesky factorization before moving on to the inversion, or is it not needed?

I wrote some examples with tiny inputs, which suggest that it's not needed, but I want to be sure that I am not just lucky and won't hit a problem with larger inputs.

VipinKumar_E_Intel — Thu, 16 Jul 2015 03:43:39 GMT

Could you please post your example/pseudo code the way you are calling these functions?

Georgios_S_ — Thu, 16 Jul 2015 14:04:55 GMT

Hi Vipin,

of course, here is how I call them:

pdpotrf(type, &N, A_loc, &i_one, &i_one, descA, &info);
MPI_Barrier(MPI_COMM_WORLD); //do we need to set a barrier before inverting?
pdpotri(type, &N, A_loc, &i_one, &i_one, descA, &info);

I just worry that the inversion may start before some other process has finished the Cholesky decomposition. Does pdpotri() take care of this? I mean, does it check and wait if needed? Or does pdpotri() work only on the submatrix of its calling process (i.e. A_loc in my code)? If so, then no barrier is needed.

You can check the minimal example here, in post #4. The post's author is Ying, but it is pretty close to my code. Take the code there and append the lines I posted above, and you have my new minimal example.

Thanks for taking time on my question,

George

VipinKumar_E_Intel — Fri, 17 Jul 2015 05:26:31 GMT

Hi Georgios,

Since you are running these and part of a single rank, you do not need a barrier in between, as the two calls run one after the other (sequentially) within each MPI process.

--Vipin

Georgios_S_ — Fri, 17 Jul 2015 10:58:52 GMT

Hi Vipin,

excuse me, but I do not understand this "and part of a single rank". What does it mean? Maybe I am confused about the term rank.

Thanks,

George

Georgios_S_ — Fri, 17 Jul 2015 11:02:31 GMT

Just for the record, I had asked something similar on Stack Overflow and the response was in agreement with Vipin's post. I got this comment:

Explicit barrier synchronisation is almost never needed in correct MPI applications that perform pure message passing as the synchronisation occurs naturally through the data dependency between the send and receive calls. Barriers are usually needed in order to synchronise with the completion of external events, e.g. concurrent data I/O, or when benchmarking in order to get the processes in sync.

and this answer:

While I have not looked into the details of pdpotri() and pdpotrf(), I see two cases:

1) There needs to be a barrier between the two functions. In this case, however, because pdpotri() must always come after pdpotrf(), it would make the most sense for there to be an implicit boundary built into the beginning of pdpotri().

2) There does not need to be a barrier between the two functions.

In both cases, it should not be necessary for you to write your own explicit barrier using MPI_Barrier().