
OpenCL compiler generating mad instruction

krishnaraj
I have a compute-intensive kernel that performs lots of multiplies and adds.
I am using the OpenCL mad() function with float16, expecting the compiler to generate an AVX mad instruction.
But when I look at the ASM dump from the Intel offline compiler, it shows mul and add instructions on YMM registers and no mad at all.

dump:
    ..................
    vmulps    YMM4, YMM3, YMMWORD PTR [R15 + R10 + 32864]
    vaddps    YMM2, YMM4, YMM2
    vunpckhps    YMM4, YMM0, YMM0
    vpermilps    YMM4, YMM4, 0
    vperm2f128    YMM4, YMM4, YMM0, 0
    vmulps    YMM5, YMM4, YMMWORD PTR [R15 + R10 + 32928]
    vaddps    YMM2, YMM5, YMM2
    vshufps    YMM5, YMM0, YMM0, 3
    .................

I even tried -cl-mad-enable (which was already the default) while building, but nothing changed.
Am I missing something here?
1 Solution
Alexander_Heinecke
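Hi,

The current AVX implementation doesn't have a mad (fused multiply-add) instruction, which is why the compiler emits separate vmulps and vaddps instructions. FMA instructions will be introduced alongside AVX2.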

Alex

krishnaraj
Thanks for that. I saw FMA intrinsics on some Intel page and didn't notice that they belonged to AVX2.
Hope it's coming soon.