I would like to use cblas_zgemmt() to calculate C=C-A*B. The final C is symmetrical.
The input matrices were verified with cblas_zgemm(), which gives the correct result. However, the output C is unchanged when cblas_zgemmt() is used.
Should the input matrices for cblas_zgemmt() differ from those for cblas_zgemm()?
Thanks a lot!
What do you mean by "However, the output C is unchanged when cblas_zgemmt() is used"? Do you mean that the output C is the same as what cblas_zgemm() gave, or that the output C is the same as the input C?
It is up to you to know in advance that the result C is symmetric. If that is true, the cblas_zgemmt() routine will give the same result as cblas_zgemm() in the triangle you select, but it will update only the upper or lower triangle of C, as specified by you; the other triangle is left as it was on input. Because only about half of the entries of C are computed, cblas_zgemmt() can be nearly twice as fast as cblas_zgemm().
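To make the "only one triangle is updated" point concrete, here is a minimal plain-C sketch of the semantics for the column-major, no-transpose case. It is not MKL code: ref_zgemmt_colmajor and its `upper` flag are hypothetical names made up for illustration, and the loop is a naive reference, not how MKL implements the routine. For C=C-A*B you would pass alpha = -1 and beta = 1.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of MKL: reference semantics of a
   gemmt-style update C := alpha*A*B + beta*C for column-major storage
   with no transposes.
   A is n x k (leading dimension lda), B is k x n (ldb), C is n x n (ldc).
   Only the selected triangle of C is written; the other one is untouched. */
void ref_zgemmt_colmajor(int upper, int n, int k,
                         double complex alpha,
                         const double complex *a, int lda,
                         const double complex *b, int ldb,
                         double complex beta,
                         double complex *c, int ldc)
{
    for (int j = 0; j < n; ++j) {
        /* Rows of column j that belong to the requested triangle. */
        int i_begin = upper ? 0 : j;
        int i_end   = upper ? j + 1 : n;
        for (int i = i_begin; i < i_end; ++i) {
            double complex sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += a[i + (size_t)p * lda] * b[p + (size_t)j * ldb];
            c[i + (size_t)j * ldc] = alpha * sum + beta * c[i + (size_t)j * ldc];
        }
    }
}
```

If you compare against a full cblas_zgemm() result, compare only the triangle you asked for; the opposite triangle of C still holds the input values.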
Thanks for the answer.
I fixed this problem. The developer reference for cblas_?gemmt has some problems: uplo should be CblasUpper instead of 'U'. Moreover, the explanation of lda and ldb is incorrect for the CblasColMajor case.
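For anyone who lands here with the same lda/ldb question, the sketch below shows the usual BLAS rule for column-major storage. It assumes cblas_?gemmt follows the same convention as cblas_?gemm, where op(A) is n x k and op(B) is k x n; please check this against the prototype in your MKL header. min_ld_colmajor is a hypothetical helper, not an MKL function.

```c
/* Hypothetical helper, NOT an MKL function: smallest legal leading
   dimension of a column-major operand X, where op(X) is
   op_rows x op_cols. If X is passed transposed, X itself is stored as
   op_cols x op_rows, so the leading dimension follows op_cols. */
int min_ld_colmajor(int op_rows, int op_cols, int transposed)
{
    int stored_rows = transposed ? op_cols : op_rows;
    /* BLAS requires a leading dimension of at least 1. */
    return stored_rows > 1 ? stored_rows : 1;
}
```

Under that assumption, lda must be at least min_ld_colmajor(n, k, transposed_a) and ldb at least min_ld_colmajor(k, n, transposed_b), where the last argument is nonzero when the corresponding trans parameter is not CblasNoTrans.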