I ran VTune to collect the store-forwarding performance impact for my application. The source view showed the following source and assembly code with a high performance-impact value:
Address  Line  Source                                      MOB Loads Replay Retired
0x39219  100   if ( ((long)*s & 3L) == 0) {                1183
0x39219  100   CollectGarb+158: mov esi, DWORD PTR [ecx]   152
0x39219  100   mov edx, esi                                736
0x39219  100   and edx, 0x3h                               122
0x39219  100   jnz CollectGarb+3a7                         173
I have read about store forwarding in the Intel manual and in some articles, but given the above information from VTune, how do I tell what is wrong with this code? Does this code violate the store-forwarding restrictions? If so, how?
Thanks.
2 Replies
If you had recently stored into part (but not all) of the 32-bit quantity at [ecx] (presumably *s), you would be violating the hardware rules for efficient store forwarding. You would have to look further back in the code to find that store. Your source may cause an unnecessary move here, but the excessive performance impact would be a consequence of what happened earlier.
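To make the pattern concrete, here is a minimal sketch in C of the kind of code that can produce it. It is not taken from the application in question: the function names, the buffer, and the byte values are all hypothetical. The first function writes a 32-bit location one byte at a time and then reads it back as a whole, so no single pending store covers the load. The second assembles the value first and writes it with one 4-byte store, which is the shape that store forwarding on many x86 processors can handle. An optimizing compiler may merge the byte stores, so check the generated assembly before drawing conclusions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical example: four 1-byte stores followed by one 4-byte load of
   the same location. The load needs data from several pending stores, which
   is the case that typically cannot be forwarded. */
static uint32_t narrow_stores_then_wide_load(unsigned char *p,
                                             const unsigned char src[4])
{
    uint32_t v;
    p[0] = src[0];
    p[1] = src[1];
    p[2] = src[2];
    p[3] = src[3];
    memcpy(&v, p, sizeof v);   /* 4-byte load right after the byte stores */
    return v;
}

/* Hypothetical alternative: build the value in a temporary, then issue one
   4-byte store. The following 4-byte load matches that store in address
   and size. */
static uint32_t wide_store_then_wide_load(unsigned char *p,
                                          const unsigned char src[4])
{
    uint32_t w, v;
    memcpy(&w, src, sizeof w); /* assemble the value away from p */
    memcpy(p, &w, sizeof w);   /* one 4-byte store */
    memcpy(&v, p, sizeof v);   /* 4-byte load of the same 4 bytes */
    return v;
}

/* Both variants must read back the same value; only the store pattern
   differs. Returns 1 when they agree. */
static int results_match(void)
{
    unsigned char buf[4];
    const unsigned char bytes[4] = { 0x11, 0x22, 0x33, 0x44 };
    assert(sizeof(uint32_t) == 4);
    return narrow_stores_then_wide_load(buf, bytes) ==
           wide_store_then_wide_load(buf, bytes);
}
```

Both functions return the same value, so the difference shows up only in the profile. When hunting for the cause in a real disassembly, the thing to look for is a byte or word store to the same address shortly before the flagged DWORD load.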
Thanks for the reply.
I went back a few lines to the code that uses ecx:
mov ecx, DWORD PTR [0x5238d4h]
add ecx, -0x4
mov DWORD PTR [esp+058h], ecx
cmp ecx, eax
Would "mov ecx, DWORD PTR [0x5238d4h]" be the instruction you are talking about? Is it because of an alignment problem?