Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

pblas complex functions don't ask for complex variables

GTA
Beginner
445 Views

Hello,

I've been trying to use the parallel BLAS (PBLAS) functions in a Windows C++ environment and I've run into a problem with the complex functions.  I am using the 'z' variation of all the functions (pzgemv for example) to compute solutions to electromagnetic problems, but unfortunately the function declarations seem to expect only a double instead of the MKL_Complex16 that the non-parallel versions (zgemv for example) expect.  I've attached the mkl_pblas.h file that came with my purchase of MKL.  Am I including the wrong header file?  If so, please direct me to the solution.

Thank you,

-Gabe

8 Replies
Noah_C_Intel
Employee

Hi Gabe, thanks for your question. This is a good one that many can learn from.

These functions are a part of the time-tested standard. So we must stick with the original conventions so that people can re-use their old legacy codes, swap out the dependency, and presto, get better performance without rewriting the code.

That being said, your code should work even though the header may be telling you otherwise. So the recommendation would be to instruct your compiler to ignore the function declaration types (print a warning instead of raising a compile-time error).

This function accepts pointers, and you can indeed feed in (MKL_Complex16 *).
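
A minimal sketch of what feeding a complex buffer through a double* parameter looks like in C++. Note that C++ does not convert between unrelated pointer types implicitly, so an explicit cast is needed at the call site. Everything named here is a stand-in: Complex16 mimics MKL_Complex16 (assumed to be two doubles, real part first), and sum_doubles is a hypothetical routine declared with double*, not an MKL function.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for MKL_Complex16 (assumption: two doubles, real part first).
struct Complex16 {
    double real;
    double imag;
};

// The cast below is only meaningful if the struct has no padding.
static_assert(sizeof(Complex16) == 2 * sizeof(double),
              "Complex16 must be exactly two doubles");

// Hypothetical routine declared with double*, like the prototypes in the header.
// It treats its input as `count` interleaved complex values and sums all parts.
double sum_doubles(const double* data, std::size_t count) {
    assert(data != nullptr);
    double total = 0.0;
    for (std::size_t i = 0; i < 2 * count; ++i) {
        total += data[i];
    }
    return total;
}

// View a complex array as interleaved doubles: R0 I0 R1 I1 ...
// This relies on the layout checked by the static_assert above.
inline const double* as_doubles(const Complex16* z) {
    return reinterpret_cast<const double*>(z);
}
```

With this in place, a call such as sum_doubles(as_doubles(z), n) compiles without touching the declaration of sum_doubles.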

Let me know how it goes!

GTA
Beginner

Thank you very much for the clarification.  I do have another question though.  How do I instruct my compiler to ignore function declaration types? I've never heard of such a thing.  Also, would the function still work if I changed double to MKL_Complex16 in the header?

Noah_C_Intel
Employee

What compiler are you using? I can answer question #1 after that.

The P-version is the cluster version, which uses a more general interface to pass complex numbers: one can essentially use interleaved or non-interleaved complex number components. The ordinary (non-P) version (this is not P-BLAS!) uses the COMPLEX type from the C language, i.e., only interleaved complex number components are possible.

zgemv: This is a BLAS Level 2 (matrix-vector) function, and the interface is an industry standard.

pzgemv: This is the so-called P-BLAS version, with MPI support underneath (cluster).

pzgemv accepts pointers to double-precision numbers, but also accepts strides. This way one can pass RIRIRIRIRI… or RRRRR+IIIII, where R is the real component and I is the imaginary component of a complex number.
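
The two orderings named above can be pictured with a short sketch. It only illustrates the memory layouts using std::complex and plain vectors; it does not call MKL, and it does not verify which layouts a given PBLAS routine accepts. The function names are hypothetical.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cplx;

// Interleaved layout: R0 I0 R1 I1 ... (an array of complex values in memory).
std::vector<double> to_interleaved(const std::vector<cplx>& z) {
    std::vector<double> out;
    out.reserve(2 * z.size());
    for (std::size_t i = 0; i < z.size(); ++i) {
        out.push_back(z[i].real());
        out.push_back(z[i].imag());
    }
    return out;
}

// Split layout: R0 R1 ... followed by I0 I1 ...
std::vector<double> to_split(const std::vector<cplx>& z) {
    const std::size_t n = z.size();
    std::vector<double> out(2 * n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = z[i].real();
        out[n + i] = z[i].imag();
    }
    return out;
}

// Read element i back out of a split-layout buffer.
cplx from_split(const std::vector<double>& buf, std::size_t i) {
    assert(buf.size() % 2 == 0);
    const std::size_t n = buf.size() / 2;
    assert(i < n);
    return cplx(buf[i], buf[n + i]);
}
```

In the interleaved form the real and imaginary parts of one number are adjacent (stride of two doubles between consecutive numbers); in the split form they are n doubles apart.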

GTA
Beginner

I am using C++ in Visual Studio 2010 on Windows.

Okay, so suppose I have an n by n matrix A on a process grid that is nprocs by 1, where nprocs=n for the sake of this example. Should I pass pzgemv a local double array A_local of length 2*n and give the dimensions of A_local as row=ia=1 by column=ja=n, even though the actual length is 2*n (one for real and one for imag)?
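
One way to reason about the local buffer size in this question is to count complex entries first and double them only when allocating. The sketch below reimplements the usual block-cyclic counting rule (the rule the ScaLAPACK helper NUMROC applies) as a hypothetical function local_count. It is an illustration, not MKL code, and it assumes the first block is owned by process 0.

```cpp
#include <cassert>
#include <cstddef>

// Number of rows (or columns) of a block-cyclically distributed dimension of
// global size n that land on process iproc. Hypothetical helper; assumes the
// first block is owned by process 0.
int local_count(int n, int block, int iproc, int nprocs) {
    assert(n >= 0 && block > 0 && nprocs > 0 && iproc >= 0 && iproc < nprocs);
    const int full_blocks = n / block;           // complete blocks overall
    int count = (full_blocks / nprocs) * block;  // share every process receives
    const int extra = full_blocks % nprocs;      // leftover complete blocks
    if (iproc < extra) {
        count += block;                          // one more complete block
    } else if (iproc == extra) {
        count += n % block;                      // the trailing partial block
    }
    return count;
}

// Doubles needed to store a local piece of a complex matrix (interleaved).
std::size_t local_doubles(int local_rows, int local_cols) {
    assert(local_rows >= 0 && local_cols >= 0);
    return 2u * static_cast<std::size_t>(local_rows) *
           static_cast<std::size_t>(local_cols);
}
```

For the case described (n rows spread over an n by 1 grid with a block size of 1), each process holds one row of n complex entries, which is 2*n doubles of storage.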

Noah_C_Intel
Employee

I think we may be over-engineering the problem here. Could you try editing the mkl_pblas.h file to match the following? This may be the lowest-hanging fruit.

void    pzgemv( char *trans, MKL_INT *m, MKL_INT *n, MKL_Complex16 *alpha, MKL_Complex16 *a, MKL_INT *ia, MKL_INT *ja, MKL_INT *desca, MKL_Complex16 *x, MKL_INT *ix, MKL_INT *jx, MKL_INT *descx, MKL_INT *incx, MKL_Complex16 *beta, MKL_Complex16 *y, MKL_INT *iy, MKL_INT *jy, MKL_INT *descy, MKL_INT *incy );

Option #2 is figuring out how to ignore the function declaration type and just feeding in the MKL_Complex types anyway.

Option #3 is the RIRI or RRR+IIII idea. I like this one the least because it seems to be the least elegant.
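
A possible middle path between options #1 and #2 is to leave the shipped header untouched and add a thin typed overload that performs the cast in one place. The sketch below shows the pattern only. Everything in it is a stand-in: Complex16 mimics MKL_Complex16 (assumed to be two doubles, real part first), and scale_in_place is a hypothetical routine with a double*-style prototype, not an MKL function.

```cpp
#include <cassert>

// Stand-in for MKL_Complex16 (assumption: two doubles, real part first).
struct Complex16 {
    double real;
    double imag;
};
static_assert(sizeof(Complex16) == 2 * sizeof(double),
              "Complex16 must be exactly two doubles");

// Hypothetical legacy-style routine: complex data passed as double*, with
// Fortran-style pointer arguments. Multiplies n interleaved complex numbers
// in x by the complex scalar alpha (alpha[0] = real, alpha[1] = imaginary).
void scale_in_place(const int* n, const double* alpha, double* x) {
    assert(n != nullptr && alpha != nullptr && x != nullptr);
    for (int i = 0; i < *n; ++i) {
        const double re = x[2 * i];
        const double im = x[2 * i + 1];
        x[2 * i]     = alpha[0] * re - alpha[1] * im;
        x[2 * i + 1] = alpha[0] * im + alpha[1] * re;
    }
}

// Typed overload: callers work with Complex16, and the cast lives here only.
inline void scale_in_place(const int* n, const Complex16* alpha, Complex16* x) {
    scale_in_place(n,
                   reinterpret_cast<const double*>(alpha),
                   reinterpret_cast<double*>(x));
}
```

The same idea would apply to a wrapper around pzgemv, with the wrapper taking MKL_Complex16* for alpha, a, x, beta, and y and forwarding to the declaration from the header.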

Noah_C_Intel
Employee

By the way, I found a good resource on the ins and outs of using PBLAS in a C++ environment. It should address a number of different things and possible follow-ups!

http://andyspiros.wordpress.com/2011/07/08/an-example-of-blacs-with-c/

 

Noah_C_Intel
Employee

You should be able to suppress any diagnostic with the option “-diag-disable <n>” where n is the diagnostic number (usually this number is printed as part of the diagnostic info output by the compiler).

So I think all three options have a possible path to success. Please let me know which one works best for you, because I'd like to communicate this as a best-known method (BKM) for all future MKL PBLAS users.

Hans_P_Intel
Employee

Gabe,

>>> I've been trying to use the parallel blas, pblas, functions [...]
Spelling it "Parallel BLAS" suggests that the "other BLAS" (non-P) is not parallel (which is not true). First, I hope you are aware that PBLAS is about cluster support i.e., MPI (via BLACS) is used. The regular BLAS functions (industry-standard) are parallelized as well, and use multiple threads via OpenMP. To summarize, BLAS and PBLAS are different interfaces and each is expected to be conformant to its standard.

What I suggest is: just call zgemv in order to compute the solutions of your electromagnetic problems! I hope that this is easier than you expected, and that the little PBLAS tour let you know that you can even use a cluster of computers (if needed) with not much coding effort. Btw, the pzgemv function allows you to pass complex numbers in two different ways: (1) interleaved, and (2) non-interleaved. This is the reason that it seems to expect only double-precision values. For interleaved complex number components (MKL_Complex16), one would essentially pass the address of the real component of the first complex number, and the address of the imaginary component of the same first complex number, along with a stride of two.
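
For reference, the operation zgemv performs in the no-transpose case is y := alpha*A*x + beta*y. The plain loop below is a sketch of that arithmetic using std::complex, with column-major storage as in the Fortran BLAS convention. zgemv_reference is a hypothetical name, and this is not how MKL implements the routine; it is only meant as something to compare results against on small inputs.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> cplx;

// y := alpha * A * x + beta * y, where A is an m-by-n matrix stored
// column-major, so element (i, j) lives at a[i + j * m].
void zgemv_reference(std::size_t m, std::size_t n, cplx alpha,
                     const std::vector<cplx>& a, const std::vector<cplx>& x,
                     cplx beta, std::vector<cplx>& y) {
    assert(a.size() == m * n && x.size() == n && y.size() == m);
    for (std::size_t i = 0; i < m; ++i) {
        cplx acc(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            acc += a[i + j * m] * x[j];  // row i of A dotted with x
        }
        y[i] = alpha * acc + beta * y[i];
    }
}
```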

Allow me a last comment: employing multiple threads is kind of the default in Intel MKL (except when you link against the sequential library). You can select your options to link against Intel MKL easily (http://software.intel.com/sites/products/mkl/MKL_Link_Line_Advisor.html). Also, you have multiple handles in order to adjust the threading (global, per function-domain, etc.).

- Hans
