Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Should asm listing look like this?!?

levicki
Valued Contributor I
        movss     DWORD PTR [esp+140], xmm4                     ;351.16
        DB        141                                           ;352.16
        DB        116                                           ;352.16
        DB        38                                            ;352.16
        DB        0                                             ;352.16
        movss     xmm4, DWORD PTR _2il0floatpacket$17           ;352.16
JenniferJ
Moderator

Those "DB" directives are padding, inserted to get better performance. Taken together, the four DB bytes encode a single no-op instruction, "lea esi, [esi]".

One possible purpose is to place the load that follows the padding at a different offset modulo 256 from some other load in the loop, one for which instruction-pointer-based data prefetch looked important.

So there's nothing wrong with the asm.
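
To make the modulo-256 remark concrete, here is a minimal Python sketch of the arithmetic. Both addresses are invented for illustration and do not come from the listing. It shows only that four bytes of padding shift the address of everything after them, and with it the offset modulo 256.

```python
# Hypothetical addresses, chosen only to illustrate the arithmetic.
PAD_BYTES = 4  # the four DB bytes in the listing

def offset_mod_256(address):
    """Return the address modulo 256, i.e. its low 8 bits."""
    return address % 256

other_load = 0x00401310        # some other load in the loop (assumed)
load_without_pad = 0x00401410  # the load after the padding point, if unpadded (assumed)
load_with_pad = load_without_pad + PAD_BYTES

print(hex(offset_mod_256(other_load)))        # 0x10
print(hex(offset_mod_256(load_without_pad)))  # 0x10, collides with other_load
print(hex(offset_mod_256(load_with_pad)))     # 0x14, no longer collides
```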

jimdempseyatthecove
Honored Contributor III

Shouldn't the ASM generator/viewer show the text of the instruction sequence in lieu of the op-code byte sequence?

Jim

JenniferJ
Moderator

No, because what the compiler emits really is a sequence of DB directives.

There are a number of different possible ways to encode lea esi, [esi]. This method (using specific bytes) ensures the padding stays the same whether assembly or direct object generation is used.

jimdempseyatthecove
Honored Contributor III

OK,

But then your C++ -> ASM generator should insert a comment indicating

DB ... ;; pad with lea esi,[esi]

Otherwise you will continue to get questions as to what those DB...'s are doing in the code.
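
Until the compiler does this itself, a listing can be post-processed along these lines. This is only a sketch in Python, not anything the compiler provides: the function name annotate_db_runs and the KNOWN_PADS table are made up here, and the table holds just the one byte sequence from the listing above. It collapses consecutive single-byte DB lines into one DB line and adds a comment when the bytes match a known pad. The original line-number comments on the DB lines are dropped.

```python
import re

# Byte sequences recognised as no-op padding, mapped to a readable comment.
# Hypothetical table: only the 4-byte sequence from the listing is included.
KNOWN_PADS = {
    (0x8D, 0x74, 0x26, 0x00): "lea esi, [esi+0]",
}

# Matches lines of the form "DB <decimal byte>" as they appear in the listing.
DB_LINE = re.compile(r"^\s*DB\s+(\d+)\b")

def annotate_db_runs(lines):
    """Collapse runs of single-byte DB lines into one commented DB line."""
    out, run = [], []

    def flush():
        if run:
            text = "        DB        " + ", ".join(str(b) for b in run)
            pad = KNOWN_PADS.get(tuple(run))
            if pad:
                text += "  ; pad: " + pad
            out.append(text)
            run.clear()

    for line in lines:
        match = DB_LINE.match(line)
        if match:
            run.append(int(match.group(1)))
        else:
            flush()
            out.append(line)
    flush()
    return out

listing = [
    "        movss     DWORD PTR [esp+140], xmm4                     ;351.16",
    "        DB        141                                           ;352.16",
    "        DB        116                                           ;352.16",
    "        DB        38                                            ;352.16",
    "        DB        0                                             ;352.16",
    "        movss     xmm4, DWORD PTR _2il0floatpacket$17           ;352.16",
]
result = annotate_db_runs(listing)
print("\n".join(result))
```

Keeping the bytes in a DB line, rather than replacing them with an instruction mnemonic, preserves the exact encoding if the listing is assembled again.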

Jim

JenniferJ
Moderator
This seems like a nice feature request. I'll get it into our tracker. Thanks for the suggestion!
levicki
Valued Contributor I

Then why aren't they shown as instructions in disassembly with a comment on their purpose instead of making it look like the assembly code generator has gone haywire?

EDIT: Doh, I see Jim has beaten me to it :)

I would suggest showing the actual instruction though:

	lea	esi, [esi]	; PAD

Rationale: assembler listings can get rather long and hard to follow even without adding several lines each containing only one DB.

By the way, that is not LEA ESI, [ESI], it is LEA ESI, [ESI + 0]. LEA ESI, [ESI] is two bytes long (0x8D 0x36).

Moreover, I am not sure if this can really improve performance since it seems to affect decoding throughput.
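
The point about the two encodings can be checked by splitting the ModRM and SIB bytes into their fields. The Python sketch below is deliberately limited: describe_lea is a name made up here, and it handles only opcode 0x8D with mod 00 or 01 and an optional SIB byte without an index register. That is enough for the two byte sequences discussed in this thread. It is not a general disassembler.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def split_fields(byte):
    """Split a ModRM or SIB byte into its 2-bit, 3-bit and 3-bit fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7

def describe_lea(code):
    """Decode a simple 32-bit LEA; return (text, length_in_bytes)."""
    if code[0] != 0x8D:
        raise ValueError("not a LEA opcode")
    mod, reg, rm = split_fields(code[1])
    pos = 2
    if rm == 4:                      # rm == 100b: a SIB byte follows
        _scale, index, base = split_fields(code[pos])
        if index != 4:               # index == 100b means no index register
            raise ValueError("index registers are outside this sketch")
        pos += 1
    else:
        base = rm
    if mod == 0:
        if base == 5:
            raise ValueError("disp32-only forms are outside this sketch")
        disp = ""
    elif mod == 1:                   # one signed displacement byte follows
        disp = "%+d" % int.from_bytes(code[pos:pos + 1], "little", signed=True)
        pos += 1
    else:
        raise ValueError("mod 10 and 11 are outside this sketch")
    return "lea %s, [%s%s]" % (REG32[reg], REG32[base], disp), pos

print(describe_lea(bytes([141, 116, 38, 0])))  # the four DB bytes
print(describe_lea(bytes([0x8D, 0x36])))       # the two-byte form
```

The four DB bytes decode to a 4-byte lea esi, [esi+0] (ModRM 0x74 selects a SIB byte plus an 8-bit displacement), while 0x8D 0x36 decodes to the 2-byte lea esi, [esi].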

TimP
Honored Contributor III
Padding in the instruction sequence is used to get favorable alignment of the top of an inner loop body, and possibly of frequent jump targets within the loop. On early Core CPUs, to take advantage of the loop accelerator, the frequently executed instructions must fit within four 16-byte chunks of code (aligned, not necessarily contiguous). Penryn models accept a larger number of such chunks. In addition, hardware prefetch may be improved by making the top of the loop 32-byte aligned. The extended no-op above the loop is executed only before entering the loop, and no-ops at the beginning of an else segment would never be executed.
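
As a rough illustration of the chunk counting, here is a small Python sketch. The loop address and size are hypothetical, and it assumes a contiguous loop body, which is simpler than the "not necessarily contiguous" case mentioned above. It computes how much padding brings the loop top to a 16-byte boundary and how many aligned 16-byte chunks the loop touches before and after.

```python
def padding_to_align(address, alignment=16):
    """Bytes of padding needed to move `address` up to the next boundary."""
    return (-address) % alignment

def chunks_touched(start, size, chunk=16):
    """Number of aligned `chunk`-byte blocks covered by [start, start + size)."""
    first = start // chunk
    last = (start + size - 1) // chunk
    return last - first + 1

loop_top = 0x40100C   # hypothetical address of the loop's first instruction
loop_size = 60        # hypothetical size of the loop body in bytes

pad = padding_to_align(loop_top)
print(pad)                                        # 4
print(chunks_touched(loop_top, loop_size))        # 5
print(chunks_touched(loop_top + pad, loop_size))  # 4
```

With these made-up numbers, four bytes of padding above the loop reduce the footprint from five chunks to four, which is the kind of difference that decides whether the loop fits the limit described above.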
levicki
Valued Contributor I

I know about loop alignment, but what happens if this code itself is in an (outer) loop? How does the compiler judge the benefit of alignment vs. the size of the code of the innermost loop?
