Efficient use of enhanced rep movsb/stosb with an unaligned destination address

This question is about assembly code.

Many people believe that the implementation of 'enhanced rep movsb/stosb' obtains an aligned address internally before it starts moving data in large chunks, which means that it is unnecessary for us to write code to align the destination address before passing it to the operation.

However, the language of the Intel optimization manual, as well as some other things that I have read online, makes me unsure about this point. It could be more efficient to write code that handles the unaligned bytes separately and then passes a 16-byte aligned address to 'rep movsb/stosb'.

When using enhanced 'rep movsb/stosb' with an unaligned destination address, is it more efficient to rely completely on the hardware implementation, or should I write code to pass an aligned address to the operation?
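
For illustration, the second option (aligning the destination by hand) might look like the sketch below. It is written in C with GCC/Clang inline assembly rather than pure assembly so that it is self-contained. The helper names `rep_movsb` and `copy_align_dst` are hypothetical, and the sketch assumes the direction flag is clear, as the usual calling conventions require. It only shows the shape of the prologue and says nothing about which approach is faster.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy n bytes with 'rep movsb' on x86 (GCC/Clang); fall back to memcpy
   elsewhere so the sketch still builds on other targets. */
static void rep_movsb(void *dst, const void *src, size_t n)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    /* RDI = destination, RSI = source, RCX = byte count. */
    __asm__ volatile("rep movsb"
                     : "+D"(dst), "+S"(src), "+c"(n)
                     :
                     : "memory");
#else
    memcpy(dst, src, n);
#endif
}

/* Hypothetical helper: copy the first few bytes one at a time until the
   destination is 16-byte aligned, then hand the remainder to rep movsb. */
static void *copy_align_dst(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Number of bytes needed to reach the next 16-byte boundary (0..15). */
    size_t head = (size_t)(-(uintptr_t)d & 15u);
    if (head > n)
        head = n;

    for (size_t i = 0; i < head; i++)
        d[i] = s[i];

    /* The destination is now aligned; the source may still be unaligned. */
    rep_movsb(d + head, s + head, n - head);
    return dst;
}
```

Whether this prologue pays off compared with passing the unaligned address straight to 'rep movsb' would have to be measured on the target CPU.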

 


Accepted Solutions

With (arbitrary) unaligned source and destination addresses, there is no assurance that the extent of the misalignment of the source matches that of the destination, so you cannot assume that source and destination can ever be aligned at the same time.
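
A small sketch of the arithmetic behind this point, in C. The helper names are hypothetical and 16-byte alignment is assumed only because that is the figure used in the question. It computes how far the source is from a 16-byte boundary once just enough bytes have been skipped to align the destination.

```c
#include <stdint.h>

/* Offset of an address within its 16-byte block (0 means aligned). */
static unsigned misalign16(uintptr_t addr)
{
    return (unsigned)(addr & 15u);
}

/* Hypothetical helper: source misalignment that remains after skipping
   just enough leading bytes to make the destination 16-byte aligned. */
static unsigned src_misalign_after_dst_aligned(uintptr_t src, uintptr_t dst)
{
    uintptr_t head = (0 - dst) & 15u;  /* bytes needed to align dst */
    return misalign16(src + head);
}
```

For example, with a source at 0x1003 and a destination at 0x2005, 11 bytes align the destination, which leaves the source at offset 14 within its 16-byte block. The result is 0 only when both addresses start out with the same offset modulo 16.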

The newer CPU designs are intended to take mismatched alignment issues into consideration (abstractly through pipelining where the input port and output port of the pipeline can be out of phase).

>>should I write code to pass an aligned address to the operation?

Probably yes, if you intend to use the copied data later with SIMD instructions.
Probably no, if you are simply moving text about (e.g. in a text editor).

Jim Dempsey

Moderator

Hi,

Thanks for reaching out to us.

Jim seems to have answered your question.

Do let us know if all your queries were resolved.


Thanks

Prasanth


Moderator

Hi,


Thanks for marking the solution.

This issue has been resolved and we will no longer respond to this thread. Please start a new thread if you require additional assistance from Intel. Any further interaction in this thread will be considered community only.


Regards

Prasanth

