Hello
This code does not compile with icc 19.0.4.243.
#include <immintrin.h>
extern "C" {
    void fct() {}
}
int main()
{
    __asm__ __volatile__("callq fct");
    __mmask8 a;
    __mmask8 b;
    __mmask8 r = _kand_mask8(a, b);
}
I use this command for the compilation:
icc -mavx512f -mavx512dq -mavx512cd -mavx512bw -mavx512vl -march=skylake-avx512 -O0 main.cpp
I get an error message:
/tmp/iccRFTW6tas_.s: Assembler messages:
/tmp/iccRFTW6tas_.s:56: Error: no such instruction: `vkmovb %eax,%k0'
/tmp/iccRFTW6tas_.s:57: Error: no such instruction: `vkmovb %edx,%k1'
/tmp/iccRFTW6tas_.s:59: Error: no such instruction: `vkmovb %k0,%eax'
If we remove the __asm__ line or if we replace "callq fct" by "nop", the code compiles with the same command:
icc -mavx512f -mavx512dq -mavx512cd -mavx512bw -mavx512vl -march=skylake-avx512 -O0 main.cpp
But, even without the __asm__ line, we get the same error with these commands:
icc -mavx512f -mavx512dq -mavx512cd -mavx512bw -mavx512vl -march=skylake-avx512 -O0 main.cpp -S
icc -mavx512f -mavx512dq -mavx512cd -mavx512bw -mavx512vl -march=skylake-avx512 -O0 main.s
When the compilation works, icc (and g++) generate(s) kmovb instructions instead of vkmovb.
I have the same issue on godbolt.org (with ./a.out activated). It looks like an icc bug?
- Tags:
- CC++
- Development Tools
- Intel® C++ Compiler
- Intel® Parallel Studio XE
- Intel® System Studio
- Optimization
- Parallel Computing
- Vectorization
I've reported this issue to our compiler team. The internal bug number is CMPLRIL0-31941.
Thank you. We contacted Intel Support one day after posting this topic:
https://supporttickets.intel.com/requestdetail?id=5000P00000njMqNQAU&lang=en-US
Sorry for the potential duplicate bug reports.
Hi,
@Viet_H_Intel do you have any feedback or further information to share regarding this bug report? I've encountered the very same problem with the latest icc from intel/oneapi-hpckit:
hostuser@docker-script-18481:/tmp$ icpc -v
icpc version 2021.2.0 (gcc version 7.5.0 compatibility)
hostuser@docker-script-18481:/tmp$ icpc -V
Intel(R) C++ Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.2.0 Build 20210228_000000
Copyright (C) 1985-2021 Intel Corporation. All rights reserved.
With this compiler, the very same test case still produces the faulty vkmovb instruction, and it also happens in my (unrelated) code.
I'd rather avoid marking every icc version in a multi-year range as buggy with respect to the AVX-512 code I'm developing. Is there a workaround? If I understand correctly, replacing the (nonexistent) vkmovb with kmovb in the generated assembly should work, but that doesn't play well with the existing toolchain, of course.
Best regards.
Can you try this workaround?
__asm__ __volatile__("callq *%0" :: "r"(fct));
Thanks,
@Viet_H_Intel wrote:
Can you try this workaround?
__asm__ __volatile__("callq *%0" :: "r"(fct));
Thanks,
Hi,
The problem is that icc generates the instruction vkmovb, which does not exist. There shouldn't be a single code path in the compiler that emits it, period.
The test case that was posted is a minimal reproducer, but it's not the real code I'm interested in. A workaround that is good for the test case only does very little for my actual problem.
I did, however, notice that the problem (whether in my code or in the reproducer) is sensitive to the presence of inline asm statements in the surrounding code. I had a huge block of inline asm in that compilation unit. When I delete this asm code, or some of it, the error goes away.
There is no single minimal inline assembly statement that triggers the failure. Beyond the callq example above, I was able to extract two arbitrary (and rather nonsensical, when out of context) inline asm statements which, when put in place of the inline asm in the sample code above, lead to the exact same failure.
// any of these lines triggers the same bug as in the original post,
// when put in place of the original inline asm statement.
// __asm__ __volatile__("cltd");
// unsigned int foo, bar, baz; __asm__ __volatile__("leal (%0,%2,4), %0" : "=r"(baz) : "r"(foo), "r"(bar));
As far as my code is concerned, I'm happy with disabling the inline asm code that triggers the compiler misbehaviour, since it's unrelated to the code I'm really interested in.
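For anyone who wants to keep the inline asm on unaffected compilers, one option is to guard it by compiler version. The following is only a minimal sketch, not the actual code discussed here: `SKIP_INLINE_ASM` and `base_plus_4x` are hypothetical names, the `leal` statement is a cleaned-up stand-in for the real asm block, and the version bounds are an assumption based on the versions named in this thread (failure seen with 19.0.4 and 2021.2.0, reported fixed in 2021.7.0). It also assumes that the classic compiler's `__INTEL_COMPILER` and `__INTEL_COMPILER_UPDATE` macros report 2021 and the update number for the 2021.x releases.

```cpp
#include <cassert>

// Assumption: treat every classic icc older than 2021.7 as affected.
// The exact lower bound of the affected range is not known from this thread.
#if defined(__INTEL_COMPILER) && \
    (__INTEL_COMPILER < 2021 || \
     (__INTEL_COMPILER == 2021 && __INTEL_COMPILER_UPDATE < 7))
#  define SKIP_INLINE_ASM 1   // hypothetical macro name
#else
#  define SKIP_INLINE_ASM 0
#endif

// Hypothetical helper: computes base + 4 * index, using inline asm only
// when the compiler is not in the suspected range and the target is x86.
static unsigned int base_plus_4x(unsigned int base, unsigned int index)
{
#if !SKIP_INLINE_ASM && (defined(__x86_64__) || defined(__i386__))
    unsigned int out;
    __asm__("leal (%1,%2,4), %0" : "=r"(out) : "r"(base), "r"(index));
    return out;
#else
    // Plain C++ fallback with the same result.
    return base + 4u * index;
#endif
}

// Small self-check comparing the helper against the plain expression.
static void self_check()
{
    assert(base_plus_4x(10u, 3u) == 10u + 4u * 3u);
}
```

Both branches return the same value, so callers do not need to know which one was compiled in.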
@jimdempseyatthecove no, I was not able to replace with equivalent instructions, the two problems seem unrelated as far as I can tell.
This appears to be (related to) the issue reported here.
And the solution listed at the bottom was:
replacing _kor_mask8 with _mm512_kor (and similar cases where kmovd was used to load an 8-bit mask).
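A minimal sketch of that substitution applied to the reproducer's `_kand_mask8` call, assuming the same approach carries over from `_kor_mask8`: the 8-bit masks are widened to `__mmask16` and combined with `_mm512_kand`. The helper names are hypothetical. The `target` attribute and the `__builtin_cpu_supports` check are GCC/Clang extensions, added here only so the sketch builds and runs without `-mavx512f`. Whether this avoids the `vkmovb` failure on a given icc version is not confirmed here.

```cpp
#include <immintrin.h>
#include <cassert>

// Hypothetical helper: AND two 8-bit masks through the 16-bit mask intrinsic
// (_mm512_kand, AVX512F) instead of _kand_mask8 (AVX512DQ).
__attribute__((target("avx512f")))
static __mmask8 kand8_via_mask16(__mmask8 a, __mmask8 b)
{
    const __mmask16 wide = _mm512_kand(static_cast<__mmask16>(a),
                                       static_cast<__mmask16>(b));
    // The upper 8 bits are zero because both inputs were zero-extended.
    return static_cast<__mmask8>(wide);
}

// Plain C++ fallback for CPUs without AVX512F.
static __mmask8 kand8_scalar(__mmask8 a, __mmask8 b)
{
    return static_cast<__mmask8>(a & b);
}

// Hypothetical dispatcher: picks the mask-register path only when the
// running CPU reports AVX512F support.
static __mmask8 kand8(__mmask8 a, __mmask8 b)
{
    if (__builtin_cpu_supports("avx512f")) {
        return kand8_via_mask16(a, b);
    }
    return kand8_scalar(a, b);
}

// Small self-check: both paths must agree with a plain bitwise AND.
static void self_check()
{
    assert(kand8(0xF0, 0x3C) == (0xF0 & 0x3C));
}
```

When building with the `-mavx512f` command line from the original post, the attribute and the runtime check can be dropped and `kand8_via_mask16` called directly.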
Jim Dempsey
I've reported this bug to the compiler developers.
Thanks,
This seems to be fixed in oneAPI 2022.3. Please upgrade to this version.
$ icpc -V
Intel(R) C++ Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.7.0 Build 20220726_000000
$ icpc -mavx512f -mavx512dq -mavx512cd -mavx512bw -mavx512vl -march=skylake-avx512 -O0 test.cpp
$ cat test.cpp
#include <immintrin.h>
extern "C" {
    void fct() {}
}
int main()
{
    __asm__ __volatile__("callq fct");
    __mmask8 a;
    __mmask8 b;
    __mmask8 r = _kand_mask8(a, b);
}
We are going to close this thread.
Thanks,
