Are the following intrinsics incorrect?
- _mm512_cmp_pd_mask
- _mm256_cmp_pd
- _mm256_cmp_ps
- _mm512_cmp_ps_mask
#include <immintrin.h>
#include <math.h>
#include <stdio.h>

int main() {
    double aaa[] = {-3.200000, 99.378500, 89.770000, 65.000000,
                    NAN, -88.654000, NAN, 0.000000};
    double bbb[] = {NAN, 15.600000, -6.200000, 2.000000,
                    41.200000, 14.000000, NAN, -88.654000};
    __m512d a = _mm512_loadu_pd(aaa);
    __m512d b = _mm512_loadu_pd(bbb);
    __mmask8 x = _mm512_cmp_pd_mask(a, b, _CMP_NLT_US);
    printf("%u\n", x);
}
It should print 223, but the SDE emulator prints 142.
(Compiled with the Intel icx compiler and run under Intel SDE emulating Tiger Lake:)
icx -march=tigerlake -o test_cmp test_cmp.c
sde64 -tgl -- ./test_cmp
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 20
On-line CPU(s) list: 0-19
Vendor ID: GenuineIntel
Model name: 12th Gen Intel(R) Core(TM) i7-12700H
CPU family: 6
Model: 154
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 1
Stepping: 3
BogoMIPS: 5376.00
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology tsc_reliable nonstop_tsc cpuid pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves umip waitpkg gfni vaes vpclmulqdq rdpid movdiri movdir64b fsrm serialize flush_l1d arch_capabilities
Virtualization features:
Virtualization: VT-x
Hypervisor vendor: Microsoft
Virtualization type: full
Caches (sum of all):
L1d: 480 KiB (10 instances)
L1i: 320 KiB (10 instances)
L2: 12.5 MiB (10 instances)
L3: 24 MiB (1 instance)
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
Srbds: Not affected
Tsx async abort: Not affected
1 Reply
I compiled your test case with GCC 10.1:
% gcc cmppd.c -o cmppd -march=tigerlake
I then ran it with Intel SDE version 9.7 (the latest), and it printed 223.