Hello,
I have a quick question on cblas_gemm_s8u8s32.
What is the reasoning behind requiring one side to be signed and the other unsigned?
The cuBLAS equivalent of this function, cublasGemmEx, expects both a and b to be signed, which seems simpler to work with in my opinion.
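For reference, here is roughly how I would call it today (a minimal sketch of my understanding of the interface; the exact prototype should be checked against mkl.h):

```c
#include <stdint.h>
#include <stdio.h>
#include "mkl.h"

int main(void)
{
    /* C (2x2, int32) = A (2x3, signed int8) * B (3x2, unsigned int8) */
    const MKL_INT m = 2, n = 2, k = 3;
    int8_t  a[6] = { 1, -2, 3,  4, -5, 6 };   /* signed operand   */
    uint8_t b[6] = { 7,  8, 9, 10, 11, 12 };  /* unsigned operand */
    MKL_INT32 c[4] = { 0 };
    MKL_INT32 co = 0;  /* single C offset, used with CblasFixOffset */

    cblas_gemm_s8u8s32(CblasRowMajor, CblasNoTrans, CblasNoTrans, CblasFixOffset,
                       m, n, k, 1.0f,
                       a, k, 0,    /* lda = k, ao = 0 */
                       b, n, 0,    /* ldb = n, bo = 0 */
                       0.0f, c, n, &co);

    /* expected: 22 24 / 49 54 */
    printf("%d %d\n%d %d\n", c[0], c[1], c[2], c[3]);
    return 0;
}
```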
Thanks,
Guillaume
It is because, in most image-processing cases, weights are usually signed values and image elements are usually unsigned values.
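For example (just a rough sketch of that typical split, not tied to any particular model): image bytes in [0, 255] can feed the unsigned operand directly, while floating-point weights are usually quantized symmetrically into signed int8:

```c
#include <math.h>
#include <stdint.h>

/* Rough sketch: pixels are already unsigned 8-bit values, so they can be used
 * as the u8 operand as-is; float weights are quantized symmetrically
 * (zero point 0) into the s8 operand. */
static int8_t quantize_weight(float w, float scale)
{
    float q = roundf(w / scale);   /* scale chosen so that max|w|/scale <= 127 */
    if (q >  127.0f) q =  127.0f;
    if (q < -128.0f) q = -128.0f;
    return (int8_t)q;
}
```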
Thank you for the reply.
That's interesting. I'm working on a text application where all values are usually signed. Could we expect a fully signed interface in future releases?
I'll escalate this request to the engineering team; they will make the decision.
Hi,
Could you try to use gemm_s16s16s32?
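Something along these lines (a minimal sketch; please check the exact prototype in mkl.h):

```c
#include "mkl.h"

/* Minimal sketch: both operands are signed 16-bit integers, accumulation is
 * 32-bit, and all offsets are set to 0. */
void int16_gemm_sketch(const MKL_INT16 *a, const MKL_INT16 *b, MKL_INT32 *c,
                       MKL_INT m, MKL_INT n, MKL_INT k)
{
    MKL_INT32 co = 0;  /* single offset added to every element of C (CblasFixOffset) */
    cblas_gemm_s16s16s32(CblasRowMajor, CblasNoTrans, CblasNoTrans, CblasFixOffset,
                         m, n, k, 1.0f,
                         a, k, 0,   /* lda = k, ao = 0 */
                         b, n, 0,   /* ldb = n, bo = 0 */
                         0.0f, c, n, &co);
}
```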
We are already using gemm_s16s16s32 successfully, but we are interested in going further in terms of model compression and speed (the application is neural machine translation, to be more precise).
If gemm_s8u8s32 is the only planned interface for 8-bit GEMM, that's acceptable: we will try to adapt and implement device-specific quantization schemes (one possible adaptation is sketched below).
(I also found out that google/gemmlowp requires both operands to be unsigned, so there does not seem to be a standard way to provide 8-bit quantization: that's three libraries mentioned in this thread and three different interfaces!)
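For example, one adaptation we may try (a rough sketch only, assuming the documented behavior that the scalar ao/bo offsets are added to the corresponding matrix elements; the exact prototype should be checked against mkl.h): shift the signed operand into the unsigned range and compensate with bo = -128:

```c
#include <stdint.h>
#include <stdlib.h>
#include "mkl.h"

/* Sketch: feed signed int8 activations B into cblas_gemm_s8u8s32 by shifting
 * them into unsigned range and compensating with the bo offset, assuming
 * C = alpha*(op(A)+ao)*(op(B)+bo) + beta*C (+ C offset). */
void signed_signed_via_s8u8s32(const int8_t *a,        /* m x k, signed weights      */
                               const int8_t *b_signed, /* k x n, signed activations  */
                               MKL_INT32 *c, MKL_INT m, MKL_INT n, MKL_INT k)
{
    uint8_t *b_shifted = (uint8_t *)malloc((size_t)k * (size_t)n);
    for (size_t i = 0; i < (size_t)k * (size_t)n; ++i)
        b_shifted[i] = (uint8_t)(b_signed[i] + 128);   /* now in [0, 255] */

    MKL_INT32 co = 0;
    /* bo = -128 undoes the shift inside the GEMM: (B + 128) + (-128) == B. */
    cblas_gemm_s8u8s32(CblasRowMajor, CblasNoTrans, CblasNoTrans, CblasFixOffset,
                       m, n, k, 1.0f,
                       a, k, 0,             /* ao = 0    */
                       b_shifted, n, -128,  /* bo = -128 */
                       0.0f, c, n, &co);
    free(b_shifted);
}
```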
Hi,
For technical reasons, we only have s8u8s32 and s16s16s32 for integer gemm now.
For reference, a fully signed INT8 GEMM interface is available in MKL-DNN:
https://intel.github.io/mkl-dnn/group__c__api__blas.html#gac1869eab851b572350fb450c50c61626
But it looks like it does the computation in... double precision?
When I use the QuantizedMatMulWithBias quantized matmul, the mkldnn_verbose output is:
mkldnn_verbose,exec,inner_product,igemm_s8u8s32:blas,forward_inference,fsrc:nc fwei:io fbia:x fdst:nc,,mb768ic1024oc512,1.146
but the mkldnn dump bin is:
mkldnn_dump_gemm_x8s8s32x_inner_product_fwd_t::pp_kernel.0.bin
Why is the dump bin x8s8s32x and not s8u8s32? What is the difference between the two?
Hello, when I use cblas_gemm_s8u8s32, I found that the result is wrong when the values of OP_B (column major, unsigned int8) are over 128. Also, I tested the efficiency of int8 GEMM (using cblas_gemm_s8u8s32) and float GEMM (using cblas_sgemm) on my machine and found that the speed of int8 GEMM is close to that of float. Why? Do you have efficiency test results for the two interfaces?
Dear sir,
Could you tell me why cblas_gemm_s8s8s32 is not supported? Is it because AVX2 does not support multiplying and adding vectors of the same type (either s8/s8 or u8/u8)?
