Here's the code:
#include <stdint.h>

union U1 {
    int f0;
    int f1;
    short f4;
};
int g = 1;
int f = 1;
void func_25(union U1 c) {
    int32_t *d = &g;
    c.f0 = f;
    if (c.f1 > (uint64_t)c.f4) // f4 must be smaller than f1 and be cast to 64 bits
        *d = 0;
}
When compiled with icc 2021.6.0 20220226 (-O1), the generated code looks like this:
0000000000401426 <func_25>:
401426: 8b 05 94 6c 00 00 mov 0x6c94(%rip),%eax # 4080c0 <g_166>
40142c: 0f be 15 8d 6c 00 00 movsbl 0x6c8d(%rip),%edx # 4080c0 <g_166>
401433: 3b c2 cmp %edx,%eax
401435: 7e 0a jle 401441 <func_25+0x1b>
401437: c7 05 7f d5 00 00 00 movl $0x0,0xd57f(%rip) # 40e9c0 <g>
40143e: 00 00 00
401441: c3 retq
We can see that after f is assigned to c, the code still loads f from memory whenever c.f1 and c.f4 are needed. Is this a problem? f may be a shared variable, or there may be a performance cost.
Hi,
Thank you for posting in Intel Communities.
>>"we can see after assigning f to c, it still load f when c.f1 and c.f4 are needed, is this a problem? f may be a shared variable, or there may exist a performance problem."
We couldn't understand your problem statement. Could you please elaborate on it?
Thanks & Regards,
Hemanth
Thanks for your attention!
Sorry for pasting the wrong asm code. It should be like this (g_166 should be f):
00000000004013f5 <func_25>:
4013f5: 8b 05 b1 6c 00 00 mov 0x6cb1(%rip),%eax # 4080ac <f>
4013fb: 0f bf 15 aa 6c 00 00 movswl 0x6caa(%rip),%edx # 4080ac <f>
401402: 3b c2 cmp %edx,%eax
401404: 76 0a jbe 401410 <func_25+0x1b>
401406: c7 05 98 6c 00 00 00 movl $0x0,0x6c98(%rip) # 4080a8 <g>
40140d: 00 00 00
401410: c3 retq
In the first case, where f is a shared variable: if f is modified between the first and the second instruction, the result of the cmp instruction will be wrong, because in the source f was copied into the local c only once.
If f is not shared, the comparison result could perhaps be inferred at compile time, because c was assigned just before the if-condition.
Thanks again for your guidance!
Hi,
A union is a special data type in C/C++ that stores members of different types in the same memory location. Updating any member of the union variable c therefore changes what every other member reads back. Here c.f1 and c.f4 both read the value most recently written through c.f0 (c.f4 sees only the bytes covered by a short), so with f = 1 they compare equal. Thus we get the expected results after running the code.
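The shared-storage behaviour described above can be sketched in C as follows. This is only an illustration, not code from the thread: the helper names read_f1_after_f0, read_f4_after_f0 and check_union_views are hypothetical, and the sketch assumes a 32-bit int and a 16-bit short. The test value 0x00010001 has the same bit pattern in both 16-bit halves, so the f4 result does not depend on byte order.

```c
#include <assert.h>

/* Same layout as the union in the original post. */
union U1 {
    int f0;
    int f1;
    short f4;
};

/* Hypothetical helper: write through f0, read the same bytes back through f1. */
int read_f1_after_f0(int value) {
    union U1 c;
    c.f0 = value;
    return c.f1;
}

/* Hypothetical helper: write through f0, read only the first
   sizeof(short) bytes back through f4. */
short read_f4_after_f0(int value) {
    union U1 c;
    c.f0 = value;
    return c.f4;
}

/* Assumes 32-bit int and 16-bit short. */
void check_union_views(void) {
    /* All members overlap, so the union is as large as its widest member. */
    assert(sizeof(union U1) == sizeof(int));
    /* f1 sees the whole value; f4 sees only one 16-bit half of it. */
    assert(read_f1_after_f0(0x00010001) == 0x00010001);
    assert(read_f4_after_f0(0x00010001) == 1);
}
```

Calling check_union_views() exercises both helpers. Note that f1 and f4 agree only when the stored value fits in a short.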
Thanks & Regards,
Hemanth
Thanks for your explanation! I understand the necessity of comparing c.f1 and c.f4. But there may still be a problem when the compiler chooses to replace them with the variable f (I guess this saves stack memory?).
When f is modified between
4013f5: 8b 05 b1 6c 00 00 mov 0x6cb1(%rip),%eax # 4080ac <f>
and
4013fb: 0f bf 15 aa 6c 00 00 movswl 0x6caa(%rip),%edx # 4080ac <f>
then %eax and %edx hold different versions of f, and the result of comparing them will be wrong. In the program, however, it is the fields of the local variable c that are compared, and that comparison has no such vulnerability. So could we treat this as a change in semantics?
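The concern can be modelled in plain C without any threads. This is a minimal sketch under stated assumptions, not the compiler's actual behaviour: cond_two_loads, cond_one_snapshot and check_torn_read are hypothetical names, the two parameters stand in for the values the two loads of f might observe, and a little-endian target such as x86 is assumed so that c.f4 covers the low 16 bits.

```c
#include <assert.h>
#include <stdint.h>

union U1 {
    int f0;
    int f1;
    short f4;
};

/* Hypothetical model of the generated code: the int operand comes from
   one load of f and the short operand from a second load of f. */
int cond_two_loads(int first_load, int second_load) {
    union U1 a, b;
    a.f0 = first_load;   /* corresponds to: mov    f, %eax */
    b.f0 = second_load;  /* corresponds to: movswl f, %edx */
    return (uint64_t)a.f1 > (uint64_t)b.f4;
}

/* The source-level view: c.f0 = f copies f once, so both operands
   come from the same snapshot. */
int cond_one_snapshot(int snapshot) {
    return cond_two_loads(snapshot, snapshot);
}

/* Assumes little-endian, 32-bit int, 16-bit short. */
void check_torn_read(void) {
    assert(cond_one_snapshot(1) == 0);  /* 1 > 1 is false */
    assert(cond_one_snapshot(2) == 0);  /* 2 > 2 is false */
    assert(cond_two_loads(2, 1) == 1);  /* 2 > 1 is true  */
}
```

In this model neither snapshot alone makes the condition true, but mixing the two does. Whether that counts as a semantic change is a separate question: in C11, modifying a non-atomic f from another thread without synchronization is a data race, so the language does not promise a single consistent snapshot in that situation.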
Hi,
The ICC compiler could re-use the value of "f" in the register %eax, and sign-extend it, instead of re-loading it. The newer clang-based compiler (icx) does make this optimization properly.
What is actually happening in the ICC compiler is a "half-completed" optimization. The compiler could keep the union entirely in registers, but it does not do so because of the short size of "c.f4". The field "c.f0" is kept in a register, but "c.f4" is kept in memory. Later, the compiler sees that the load of "c.f4" can be replaced with a load of "f".
If the value of "f" were changed, this would not cause any incorrect results. The 2nd load of "f" is done directly after the 1st load of "f".
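The "load once, then sign-extend" form mentioned above might look roughly like this at the source level. This is a sketch for comparison only, not the output of icx: cond_via_union, cond_single_load and check_equivalence are hypothetical names, a little-endian target with 32-bit int and 16-bit short is assumed, and the narrowing cast relies on the usual truncating behaviour, which is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

union U1 {
    int f0;
    int f1;
    short f4;
};

/* Source form: copy the value into the union, then compare two views of it. */
int cond_via_union(int value) {
    union U1 c;
    c.f0 = value;
    return c.f1 > (uint64_t)c.f4;
}

/* Register form: a single load, followed by sign-extending its low 16 bits.
   The cast to int16_t is implementation-defined for out-of-range values;
   mainstream compilers truncate. */
int cond_single_load(int value) {
    int32_t whole = value;         /* the one and only load of f        */
    int16_t low = (int16_t)whole;  /* low 16 bits, later sign-extended  */
    return (uint64_t)whole > (uint64_t)low;
}

/* Assumes little-endian, so that c.f4 aliases the low 16 bits of c.f0. */
void check_equivalence(void) {
    const int samples[] = { 0, 1, -1, 0x10000, 0x12348000, 0x7FFF, 0x8000 };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(cond_via_union(samples[i]) == cond_single_load(samples[i]));
}
```

Because both operands derive from one value, this form cannot observe two different versions of f.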
Thanks & Regards,
Hemanth
Hi,
We haven't heard back from you. Could you please provide an update on your issue?
Thanks & Regards,
Hemanth
Hi,
We assume that your issue is resolved. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.
Thanks & Regards,
Hemanth
