OpenCL compiler generating mad instruction

I have a compute-intensive kernel that performs lots of multiplies and adds.
I am using the OpenCL mad() function with float16 so that the compiler generates AVX mad instructions.
But when I look at the ASM dump from the Intel offline compiler, it shows mul and add instructions on YMM registers and no mad at all.

dump:
    ..................
    vmulps    YMM4, YMM3, YMMWORD PTR [R15 + R10 + 32864]
    vaddps    YMM2, YMM4, YMM2
    vunpckhps    YMM4, YMM0, YMM0
    vpermilps    YMM4, YMM4, 0
    vperm2f128    YMM4, YMM4, YMM0, 0
    vmulps    YMM5, YMM4, YMMWORD PTR [R15 + R10 + 32928]
    vaddps    YMM2, YMM5, YMM2
    vshufps    YMM5, YMM0, YMM0, 3
    .................

I even tried building with -cl-mad-enable (which was already the default), but there was no change.
Am I missing something here?

Accepted Solutions
Hi,


The current AVX implementation doesn't have a mad (fused multiply-add) instruction. It will be introduced with the FMA extension, which arrives alongside AVX2.

Alex

Thanks for that. I saw FMA intrinsics on some Intel page and didn't notice that they only arrive with AVX2.
Hope it's coming soon.