I am trying to write a simple piece of inline assembly using AVX instructions, and I have run into a problem when adding large numbers. Here is the code:
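__asm__ __volatile__(
"vzeroall\n\t"                          /* clear ymm0-ymm15 */
"movq $0, %%r9\n\t"                     /* byte offset of x[0] */
"movq $4, %%r10\n\t"                    /* byte offset of x[1] */
"leaq (%%rax, %%r9, 1), %%rdx\n\t"      /* 64-bit addressing keeps the full pointer */
"vbroadcastss (%%rdx), %%ymm0\n\t"      /* ymm0 = x[0] in all 8 lanes */
"leaq (%%rax, %%r10, 1), %%rdx\n\t"
"vmovups (%%rdx), %%ymm1\n\t"           /* ymm1 = x[1..8] */
"vaddps %%ymm0, %%ymm1, %%ymm2\n\t"     /* single-precision floating-point add */
"vmovups %%ymm2, (%%rdx)"               /* store the sums back to x[1..8] */
: /* no outputs: the result is written through memory */
: "a"(x)
: "rdx", "r9", "r10", "memory",
  "xmm0", "xmm1", "xmm2", "xmm3", "xmm4", "xmm5", "xmm6", "xmm7",
  "xmm8", "xmm9", "xmm10", "xmm11", "xmm12", "xmm13", "xmm14", "xmm15");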
Let's say that the input array is an array of integers. I want to broadcast the first element of the array to all eight elements of register ymm0 and then load the next 8 integers of the input array into register ymm1. Finally, I want to add the two registers and store the result back into the input array. Assuming that the first element of the array is, for instance, 33445531 and all the other elements have the value 1, running the above code gives a very bizarre result:
33445531
33445531
33445531
33445531
33445531
33445531
33445531
33445531
33445531
It seems that vaddps doesn't run. If I use a smaller number for the first element, for instance 32768, the result is correct. Can anyone explain why this is happening and how I can make it work for large numbers?
The number 33445531 is too large to be stored without loss in a single-precision floating-point variable. Single precision has a 24-bit significand, so above 2^24 = 16777216 not every integer is representable, and 33445531 and 33445532 have the same rounded representation. You should use smaller numbers or a greater precision (i.e. "double" precision floating point).
See also http://www.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf and http://download.intel.com/products/processor/manual/325462.pdf (chapter 4.2.2).
Hi Angelos,
By switching to double-precision floating-point data you will be able to operate on 4-component vectors, since each 256-bit YMMx register holds four doubles instead of eight floats.
Thanks everyone for your answers. They were very thorough and detailed.
Angelos P. wrote:
Thanks everyone for your answers. They were very thorough and detailed.
You are welcome :)