Intel® C++ Compiler

Efficient use of enhanced rep movsb/stosb with an unaligned destination address

Hani_Deek
Beginner

This question is about assembly code.

Many people believe that the implementation of 'enhanced rep movsb/stosb' obtains an aligned address internally before it starts moving data in large chunks, which means that it is unnecessary for us to write code to align the destination address before passing it to the operation.

However, the wording of the Intel optimization manual, as well as some other things that I have read online, makes me unsure about this point. It could be more efficient to handle the unaligned leading bytes with separate moves and then pass a 16-byte aligned destination address to 'rep movsb/stosb'.

When using enhanced 'rep movsb/stosb' with an unaligned destination address, is it more efficient to rely completely on the hardware implementation, or should I write code to pass an aligned address to the operation?
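For concreteness, the "align it yourself" alternative could be sketched roughly as below. This is only an illustration of the idea, not a claim about which approach is faster. The names `rep_movsb` and `copy_aligned_dst` are hypothetical, the inline assembly assumes GCC or Clang on x86-64 (other targets fall back to `memcpy`), and the unaligned head is copied with a plain `memcpy` for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes with 'rep movsb'.
   Assumes GCC/Clang inline asm on x86-64; elsewhere it falls back to memcpy. */
static void rep_movsb(void *dst, const void *src, size_t n)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* RDI = destination, RSI = source, RCX = byte count.
       The direction flag is clear on function entry per the ABI. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
}

/* Hypothetical wrapper: copy the unaligned head separately, then pass a
   16-byte aligned destination to rep movsb. */
static void *copy_aligned_dst(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Bytes needed to reach the next 16-byte boundary of the destination. */
    size_t head = (size_t)((16u - ((uintptr_t)d & 15u)) & 15u);
    if (head > n)
        head = n;

    memcpy(d, s, head);          /* small unaligned head */
    d += head;
    s += head;
    n -= head;

    /* Whatever remains now starts at a 16-byte aligned destination. */
    assert(n == 0 || ((uintptr_t)d & 15u) == 0);
    rep_movsb(d, s, n);
    return dst;
}
```

The question is whether the extra head handling in `copy_aligned_dst` pays for itself compared with calling `rep_movsb` directly on the unaligned destination.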

 

0 Kudos
1 Solution
jimdempseyatthecove
Honored Contributor III

With (arbitrary) unaligned source and destination addresses, there is no assurance that the misalignment of the source matches the misalignment of the destination, so you cannot assume that you will ever be able to align both at the same time.
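A small sketch of the arithmetic behind this point, assuming 16-byte alignment as in the question (the function names are hypothetical, chosen only for illustration):

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of p within its 16-byte block (0 means aligned). */
static unsigned misalign16(const void *p)
{
    return (unsigned)((uintptr_t)p & 15u);
}

/* Source misalignment that remains after both pointers are advanced
   just far enough to make the destination 16-byte aligned. */
static unsigned src_misalign_after_dst_aligned(const void *dst, const void *src)
{
    unsigned head = (16u - misalign16(dst)) & 15u;  /* bytes to align dst */
    return (misalign16(src) + head) & 15u;
}
```

For example, with the destination at offset 3 and the source at offset 8 within their 16-byte blocks, 13 head bytes align the destination and leave the source at offset (8 + 13) mod 16 = 5. The source ends up aligned only when both pointers start with the same offset.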

The newer CPU designs are intended to take mismatched alignment issues into consideration (abstractly through pipelining where the input port and output port of the pipeline can be out of phase).

>>should I write code to pass an aligned address to the operation?

Probably yes, if you intend to use the copied data later with SIMD instructions.
Probably no, if you are simply moving text about (e.g. in a text editor).

Jim Dempsey

PrasanthD_intel
Moderator

Hi,

Thanks for reaching out to us.

Jim seems to have answered your question.

Do let us know if all your queries were resolved.


Thanks

Prasanth


0 Kudos
PrasanthD_intel
Moderator

Hi,


Thanks for marking the solution.

This issue has been resolved and we will no longer respond to this thread. Please start a new thread if you require additional assistance from Intel. Any further interaction in this thread will be considered community only.


Regards

Prasanth


0 Kudos
Reply