Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

gemm(A,A,A) like possible?

rudolfii
Beginner
659 Views

Hello,

not knowing the internals I don't know if it is possible to safely write matrix multiplication in the form like

gemm(A, A, A)

where (e.g.) all matrices are the same, or one has to use some temporary matrix as

Z=A

gemm(A,A,Z)

A=Z

(The code above should return A = A + A*A.)

All documentation usually mentions that in gemm(A,B,C) A, B are untouched and C is overwritten. But this really does not answer the question since it somehow expects A, B, C to be different (or at least that C is different from A and B).

Thanks!

Ruda

0 Kudos
14 Replies
TimP
Honored Contributor III
659 Views
I think the documentation does answer the question; you are considering forcing a violation of the specification on which operands are unchanged. Besides, the reference implementation is Fortran, and compilers were expected to implement at least all "legal" optimizations, including those which rely on the caller's use of arguments being in compliance with Fortran.
You could compile the reference source code with invocation of an option which supports aliased arguments, such as ifort /assume:dummy_aliases (which had to be put in place because of the IMSL practice of supporting over-write, in violation of Fortran standard). Then you would have to test all 4 combinations of the TRANS arguments, to check whether your proposed practice breaks it.
0 Kudos
rudolfii
Beginner
659 Views
Quoting - tim18
I think the documentation does answer the question; you are considering forcing a violation of the specification on which operands are unchanged. Besides, the reference implementation is Fortran, and compilers were expected to implement at least all "legal" optimizations, including those which rely on the caller's use of arguments being in compliance with Fortran.
You could compile the reference source code with invocation of an option which supports aliased arguments, such as ifort /assume:dummy_aliases (which had to be put in place because of the IMSL practice of supporting over-write, in violation of Fortran standard). Then you would have to test all 4 combinations of the TRANS arguments, to check whether your proposed practice breaks it.

I think the documentation does answer the question; you are considering forcing a violation of the specification on which operands are unchanged.

Well, may be, may be not.

The meaning ofoperands are unchanged has always only had the meaning to me that these are not changed by the procedure itself (as for example in the case the procedure only reads the operands).

Anyway, the behaviour I now see puzzles me.

I have tried and tested a gemm(A,B,B, beta=0.0) like call for 2x2 matrices. And it really returns zero in B (i.e. returns a wrong result).

Nonetheless, before I wrote anything of this thread, in one of my programs I had tried and written an expression along these lines, and it had worked. Only afterwards I started to hesitate and decided to ask here in the list. Now, though I have tried quite a few things, I don't know how it is possible. Any test I've done shows gemm(A,B,B, beta=0.0) doesn't work (for matrices 2x2), however, in the 'big' program it somehow works...

In the program, the actual call is

call gemm(W(:,:,i), G(:,:,1), G(:,:,1), beta = zz)

where zz denotes a complex double precision zero, G and W are complex double precision matrices, the sizes of the first two indices is 36x36, i is some integer index.

(I could understand if there were G(:,1,:) instead, but it is not...; the same for e.g. G(:,:,1) + 0.0 instead of the 2nd argument; again it's not.)

Can anything happen so that such behaviour would be possible?

(For the tests I used the same compiler, mkl and options all the time.)

Thanks

Ruda

0 Kudos
TimP
Honored Contributor III
659 Views
The compiler would likely copy the array sections into new local arrays, so you would not be over-writing an input.
The ifort option -check:arg_temp_created should give you a run-time message if that happens.
For G(:,1,:) this would be mandatory.
0 Kudos
yuriisig
Beginner
659 Views
Quoting - rudolfii

Hello,

not knowing the internals I don't know if it is possible to safely write matrix multiplication in the form like

gemm(A, A, A)

where (e.g.) all matrices are the same, or one has to use some temporary matrix as

Z=A

gemm(A,A,Z)

A=Z

(The code above should return A = A + A*A.)

All documentation usually mentions that in gemm(A,B,C) A, B are untouched and C is overwritten. But this really does not answer the question since it somehow expects A, B, C to be different (or at least that C is different from A and B).

Thanks!

Ruda


No.
0 Kudos
rudolfii
Beginner
659 Views
Quoting - tim18
The compiler would likely copy the array sections into new local arrays, so you would not be over-writing an input.
The ifort option -check:arg_temp_created should give you a run-time message if that happens.
For G(:,1,:) this would be mandatory.

Thanks.

I tried different -check options (-check all; -check arg_temp_created), but none of those gave me any runtime- or compile-time message...

I now have a program whose behaviour differs depending on how large the used matrices are. Until some size like A(7,7) I am obtaining zeros, for A(8,8) and bigger I am getting a good result of the calculation...

But there is no warning of any kind (I suppose it'd be written to stdout or stderr; thus on the terminal under usual conditions).

Ruda

0 Kudos
yuriisig
Beginner
659 Views
Quoting - rudolfii

Thanks.

I tried different -check options (-check all; -check arg_temp_created), but none of those gave me any runtime- or compile-time message...

I now have a program whose behaviour differs depending on how large the used matrices are. Until some size like A(7,7) I am obtaining zeros, for A(8,8) and bigger I am getting a good result of the calculation...

But there is no warning of any kind (I suppose it'd be written to stdout or stderr; thus on the terminal under usual conditions).

Ruda


You fool about. I understand multiplication of matrixes a little: http://www.thesa-store.com/products/
0 Kudos
rudolfii
Beginner
659 Views
Quoting - yuriisig

You fool about. I understand multiplication of matrixes a little: http://www.thesa-store.com/products/

Although you may, your contribution here is worth nothing...

I need to have at least some idea about what has been happening in order to decide whether some rusults of calculations carried out by a 'wrong' code can be believed. As it seems, they can, although I don't know why.

The very code has been meanwhile corrected as soon as I realized that gemm(A,B,B) is generally wrong.

Ruda

0 Kudos
Dmitry_B_Intel
Employee
659 Views

Ruda,

The Fortran specification in the section "Restrictions on entities associated with dummy arguments" states that if there is a partial overlap between the actual arguments associated with two different dummy arguments of the same procedure, theoverlapped portions cannot be defined during the execution of the procedure. These restrictions are intended to facilitate compiler optimizations in the procedure.

Since the last matrixargument in gemm is defined during execution of gemm, it shall not be overlapped with the first two matrices. In other words, calling gemm(A,A,A) is not conforming to Fortran specification.

Thanks
Dima
0 Kudos
rudolfii
Beginner
659 Views

Ruda,

The Fortran specification in the section "Restrictions on entities associated with dummy arguments" states that if there is a partial overlap between the actual arguments associated with two different dummy arguments of the same procedure, theoverlapped portions cannot be defined during the execution of the procedure. These restrictions are intended to facilitate compiler optimizations in the procedure.

Since the last matrixargument in gemm is defined during execution of gemm, it shall not be overlapped with the first two matrices. In other words, calling gemm(A,A,A) is not conforming to Fortran specification.

Thanks
Dima

Thanks Dima!
I understand what you say (hopefully it also means that gemm(A,A,B) where A and B do not overlap is ok).

The only thing that I'm still missing is why my code (having a gemm(A,B,B) call) worked. I understand that if, for any reason, a copy of the 2nd argument is made during the call, the result is ok (as in gemm(A,B+0.0,B)). But even when the checks are active, specifically arg_temp_created, there is no sign of making a temporary copy. And still it is correct...

Thanks
Ruda
0 Kudos
Dmitry_B_Intel
Employee
659 Views


Ruda,

That improper use produced correct result should not be surprising, because it might have produced any result. In general case, saylarge B, arbitrary beta, and copy-in/copy-out argument passing mechanism avoided, you will see that gemm(A,B,B) computes trash.

Thanks
Dima

0 Kudos
rudolfii
Beginner
659 Views


Ruda,

That improper use produced correct result should not be surprising, because it might have produced any result. In general case, saylarge B, arbitrary beta, and copy-in/copy-out argument passing mechanism avoided, you will see that gemm(A,B,B) computes trash.

Thanks
Dima


I know that generally gemm(A,B,B) computes trash. But if it gives out a good result some time, isn't it a sign that a temporary copy was used? But then if yes, compiler should have said it was (with -check all). But it does not say anything. That surprises me...

Thanks
Ruda
0 Kudos
Dmitry_B_Intel
Employee
659 Views

MKL functions may internally docopying that would improve performance. This should not be considered as a sign thata temporarycopy is always used. The functionwas not compiled with the compiler flags that would make it report of the copying.

Thanks
Dima
0 Kudos
rudolfii
Beginner
659 Views

MKL functions may internally docopying that would improve performance. This should not be considered as a sign thata temporarycopy is always used. The functionwas not compiled with the compiler flags that would make it report of the copying.

Thanks
Dima

Thanks
Ruda
0 Kudos
yuriisig
Beginner
659 Views
Quoting - rudolfii

Although you may...


Therefore I have written that you are engaged in nonsense. If there is no memory it is necessary to operate differently.
0 Kudos
Reply