Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Clarification on signed/unsigned input for cblas_gemm_s8u8s32

guillaumekln
Beginner

Hello,

I have a quick question on cblas_gemm_s8u8s32.

What is the reasoning behind requiring one side to be signed and the other unsigned?

The cuBLAS equivalent of this function, cublasGemmEx, expects both a and b to be signed, which seems simpler to work with, in my opinion.
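
For context, here is a minimal call sketch following the documented prototype (the wrapper function and values are only illustrative): the A operand is signed 8-bit (MKL_INT8), the B operand is unsigned 8-bit (MKL_UINT8), and C accumulates in 32-bit integers.

```c
#include <mkl.h>

/* Minimal illustrative call: A is signed 8-bit, B is unsigned 8-bit,
 * C is a 32-bit integer result.
 * C := alpha * (A + oa) * (B + ob) + beta * C + oc                    */
void int8_gemm_sketch(void)
{
    const MKL_INT m = 2, n = 2, k = 2;
    MKL_INT8        a[] = { 1, -2, 3, -4 };  /* m x k, row-major  */
    MKL_UINT8       b[] = { 5, 6, 7, 8 };    /* k x n, row-major  */
    MKL_INT32       c[4] = { 0 };            /* m x n             */
    const MKL_INT32 oc = 0;                  /* offset added to C */

    cblas_gemm_s8u8s32(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                       CblasFixOffset, m, n, k,
                       1.0f, a, k, 0,        /* A, lda, oa */
                       b, n, 0,              /* B, ldb, ob */
                       0.0f, c, n, &oc);     /* C, ldc, oc */
}
```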

Thanks,

Guillaume

Jing_Xu
Employee

This is because, in most image-processing cases, the weights are usually signed values and the image elements are usually unsigned values.

guillaumekln
Beginner

Thank you for the reply.

That's interesting. I'm working on a text application where all values are usually signed. Could we expect a fully signed interface in a future release?

Jing_Xu
Employee

I'll escalate this request to the engineering team. They will make the decision.

Jing_Xu
Employee

Hi,

Could you try to use gemm_s16s16s32?
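
For reference, a minimal sketch of that routine, assuming its documented prototype (the wrapper and values are only illustrative); the relevant difference from the 8-bit routine is that both input operands are signed 16-bit (MKL_INT16):

```c
#include <mkl.h>

/* Both A and B are signed 16-bit here; C is still a 32-bit integer result. */
void int16_gemm_sketch(void)
{
    const MKL_INT m = 2, n = 2, k = 2;
    MKL_INT16       a[] = { 1, -2, 3, -4 };  /* m x k, row-major */
    MKL_INT16       b[] = { -5, 6, -7, 8 };  /* k x n, row-major */
    MKL_INT32       c[4] = { 0 };            /* m x n            */
    const MKL_INT32 oc = 0;

    cblas_gemm_s16s16s32(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                         CblasFixOffset, m, n, k,
                         1.0f, a, k, 0, b, n, 0,
                         0.0f, c, n, &oc);
}
```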

guillaumekln
Beginner

We are already using gemm_s16s16s32 with success, but we are interested in going further in terms of model compression and speed (the application is neural machine translation, to be more precise).

If gemm_s8u8s32 is the only planned interface for 8-bit GEMM, that's acceptable; we will try to adapt and implement device-specific quantization schemes (one possible adaptation is sketched below).

(I also found out that google/gemmlowp requires both operands to be unsigned, so there does not seem to be a standard way to provide 8-bit quantization: that's 3 libraries mentioned in this thread and 3 different interfaces!)
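
For illustration, one way such an adaptation can work (a sketch of the arithmetic only, not an official recipe; the wrapper function, names, and layout are hypothetical): since the routine computes C := alpha*(A + oa)*(B + ob) + beta*C + oc, a signed 8-bit B can be shifted into the unsigned range by adding 128 and the shift undone through the ob offset.

```c
#include <mkl.h>
#include <stddef.h>

/* Sketch: reuse cblas_gemm_s8u8s32 with signed 8-bit B data by shifting B
 * into the unsigned range and undoing the shift through the ob offset.   */
void signed_b_gemm(const MKL_INT8 *a, const MKL_INT8 *b_signed,
                   MKL_INT32 *c, MKL_INT m, MKL_INT n, MKL_INT k,
                   MKL_UINT8 *b_shifted /* scratch buffer, k*n elements */)
{
    const MKL_INT32 oc = 0;

    /* b_shifted[i] = b_signed[i] + 128, now representable as unsigned 8-bit */
    for (size_t i = 0; i < (size_t)(k * n); ++i)
        b_shifted[i] = (MKL_UINT8)((int)b_signed[i] + 128);

    /* ob = -128, so the routine effectively multiplies by (b_shifted - 128),
     * i.e. by the original signed values.                                   */
    cblas_gemm_s8u8s32(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                       CblasFixOffset, m, n, k,
                       1.0f, a, k, 0,
                       b_shifted, n, -128,
                       0.0f, c, n, &oc);
}
```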

Jing_Xu
Employee

Hi,

For technical reasons, we only have s8u8s32 and s16s16s32 for integer GEMM at the moment.

guillaumekln
Beginner

For reference, a fully signed INT8 GEMM interface is available in MKL-DNN:

https://intel.github.io/mkl-dnn/group__c__api__blas.html#gac1869eab851b572350fb450c50c61626

But it looks like it does the computation in... double precision?

jianqian__zhou
Beginner

When I use the QuantizedMatMulWithBias quantized matmul, the mkldnn_verbose output is:

mkldnn_verbose,exec,inner_product,igemm_s8u8s32:blas,forward_inference,fsrc:nc fwei:io fbia:x fdst:nc,,mb768ic1024oc512,1.146

but the MKL-DNN dump binary is:

mkldnn_dump_gemm_x8s8s32x_inner_product_fwd_t::pp_kernel.0.bin

Why is the dump binary x8s8s32x and not s8u8s32? What is the difference between the two methods?

jingjing__wang
Beginner

Hello, when I use cblas_gemm_s8u8s32, I found the result is wrong when OP_B's values (column-major, unsigned int8) are over 128. Also, I tested the efficiency of int8 GEMM (using cblas_gemm_s8u8s32) against float GEMM (using cblas_sgemm) on my machine and found that the speed of int8 GEMM is close to float. Why? Do you have efficiency test results for the two interfaces?

qiang__zhang
Beginner

Dear sir,

Could you tell me why cblas_gemm_s8s8s32 is not supported? Is it because AVX2 does not support multiplying and adding vectors of the same type (either s8/s8 or u8/u8)?
