I have a compute-intensive kernel performing lots of multiplies and adds.
I am using the OpenCL mad() function and float16 so that the compiler generates AVX mad instructions.
But when I look at the ASM dump from the Intel offline compiler, it shows mul and add instructions on YMM registers but no mad at all.
dump:
..................
vmulps YMM4, YMM3, YMMWORD PTR [R15 + R10 + 32864]
vaddps YMM2, YMM4, YMM2
vunpckhps YMM4, YMM0, YMM0
vpermilps YMM4, YMM4, 0
vperm2f128 YMM4, YMM4, YMM0, 0
vmulps YMM5, YMM4, YMMWORD PTR [R15 + R10 + 32928]
vaddps YMM2, YMM5, YMM2
vshufps YMM5, YMM0, YMM0, 3
.................
I even tried -cl-mad-enable (which was the default) while building, but there was no change.
Am I missing something here?
1 Solution
Hi,
The current AVX implementation doesn't have a mad (fused multiply-add) instruction. That will be introduced with the FMA extension, which arrives alongside AVX2.
Alex
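For anyone who wants to check whether a given machine already has the FMA instructions, here is a minimal sketch in C. It assumes GCC or Clang on x86, where the `__builtin_cpu_supports` builtin is available; the function name `cpu_has_fma` is made up for this example, and on other compilers or architectures it simply reports "unknown".

```c
#include <stdio.h>

/* Hypothetical helper (not from the thread): reports whether the CPU
   advertises the FMA extension.
   Returns 1 if FMA is available, 0 if it is not, and -1 when this
   build cannot tell (non-x86 target or a compiler without the builtin). */
int cpu_has_fma(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                      /* initialise CPU feature data */
    return __builtin_cpu_supports("fma") ? 1 : 0;
#else
    return -1;                                 /* assumption: no portable check here */
#endif
}

/* Prints a one-line summary of the result. */
void report_fma_support(void)
{
    int has_fma = cpu_has_fma();
    if (has_fma == 1)
        printf("FMA available: fused multiply-add instructions can be emitted\n");
    else if (has_fma == 0)
        printf("No FMA: expect separate vmulps/vaddps pairs as in the dump above\n");
    else
        printf("FMA support unknown on this compiler/architecture\n");
}
```

Calling `report_fma_support()` from `main` on an AVX-only machine should take the "No FMA" branch, which matches the mul/add pairs seen in the dump.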
2 Replies
Thanks for that. I saw FMA intrinsics on some Intel page and didn't notice that they belonged to AVX2.
Hope it's coming soon.