Unexpectedly long time to multiply a buffer by a value.

ErikH
Beginner

Hi,

For a project I ran into an unexpectedly long time to multiply a buffer by a value, in the form of:

*d++ = *s++ * f;

The problem is that it takes 50 times longer to finish when d and s point to the same buffer. If I replace the multiplication with an addition, that difference disappears. If I add the result to the multiplication, the difference also disappears.

I've tested this using both gcc and clang with the same result. Testing it on another CPU architecture doesn't show this behavior.
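
For readers without access to the linked test file, the two call patterns can be sketched roughly as follows. This is a minimal illustration, not the contents of t.c: the names scale_buffer and time_passes, and the idea of timing repeated passes, are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Scale n floats from s into d. The two pointers may refer to the same
 * buffer, so no `restrict` qualifier is used. */
static void scale_buffer(float *d, const float *s, float f, size_t n)
{
    assert(d != NULL && s != NULL);
    while (n--) {
        *d++ = *s++ * f;
    }
}

/* Hypothetical timing helper: call scale_buffer() `passes` times and
 * return the elapsed CPU time in seconds. */
static double time_passes(float *d, const float *s, float f, size_t n,
                          int passes)
{
    clock_t start = clock();
    for (int i = 0; i < passes; i++) {
        scale_buffer(d, s, f, n);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_passes(buf, buf, f, n, passes) exercises the same-buffer case, and time_passes(dst, src, f, n, passes) the different-buffer case. Note that in the same-buffer case every pass scales the already scaled data again, whereas with separate buffers the source never changes.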

 

The test code can be found here:

https://www.adalin.com/downloads/t.c

I'm at a loss as to what could cause this, other than perhaps a CPU firmware bug.

 

Erik

Steven_Intel
Moderator

Hello ErikH,


Thank you for posting on the Intel® communities.


In order to better assist you, could you please confirm the model of the Intel processor where you found this behavior?


Best regards,


Steven G.

Intel Customer Support Technician.


ErikH
Beginner

Hi Steven,

 

I first encountered it on a Core i7-4790K, and it was then confirmed on a Core i3-370M.

 

I hope that helps.

 

Erik

ErikH
Beginner

And a friend of mine tested it on a Core i5-8350U with the same results.

 

Erik

Steven_Intel
Moderator

Thank you for your response.


I will look into this and let you know as soon as I have an update.


Best regards,


Steven G.

Intel Customer Support Technician.


Steven_Intel
Moderator

Hello ErikH,


After looking into this, our best recommendation is to visit our Developer Zone, where you can find documentation and further support. You may also create a new community post there.


Developer Zone: https://www.intel.com/content/www/us/en/developer/overview.html#gs.fqciyu


Developer Forums: https://community.intel.com/t5/Developer-Software-Forums/ct-p/developer-software-forums


Please keep in mind that this thread will no longer be monitored by Intel. Thank you for your understanding.  


Best regards,


Steven G.

Intel Customer Support Technician.


ErikH
Beginner

Could you explain to me how it is a compiler problem and not a hardware problem when calling the exact same function gives such different timings, once with both pointers pointing to the same buffer and once with the pointers pointing to different buffers? Especially as this happens across compilers and ARM does not suffer from the same issue. Unfortunately I don't have an AMD CPU at hand, but I bet it does not show the problem either.

 

Erik

ErikH
Beginner

I got a bit further: the problem goes away when compiling with gcc using the -ffast-math option.

For clang, this option does not make any difference.
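
One hypothesis that would fit the -ffast-math observation, though nothing in this thread confirms it: on x86, gcc's -ffast-math links startup code that sets the flush-to-zero (FTZ) and denormals-are-zero (DAZ) bits in the MXCSR register, and arithmetic on subnormal (denormal) floats is considerably slower than on normal floats on many x86 CPUs. If the test scales the same buffer repeatedly with |f| < 1 (an assumption about t.c), the in-place values would eventually become subnormal, while a separate source buffer would never change. The sketch below shows that decay and how the same flags can be set by hand. All helper names are hypothetical, and the flag-setting part is limited to x86-64.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

#if defined(__x86_64__)
#include <xmmintrin.h> /* _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h> /* _MM_SET_DENORMALS_ZERO_MODE */
#endif

/* Count the values in buf that are subnormal: nonzero, but smaller in
 * magnitude than FLT_MIN, the smallest normal float. Call this before
 * enabling DAZ, because DAZ makes comparisons treat subnormals as zero. */
static size_t count_subnormals(const float *buf, size_t n)
{
    assert(buf != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        float mag = buf[i] < 0.0f ? -buf[i] : buf[i];
        if (mag > 0.0f && mag < FLT_MIN) {
            count++;
        }
    }
    return count;
}

/* Scale buf in place `passes` times, as repeated same-buffer calls would. */
static void scale_in_place(float *buf, size_t n, float f, int passes)
{
    assert(buf != NULL);
    for (int p = 0; p < passes; p++) {
        for (size_t i = 0; i < n; i++) {
            buf[i] *= f;
        }
    }
}

/* Hypothetical helper: set FTZ and DAZ in MXCSR for the calling thread.
 * Returns 1 when the flags were set, 0 when not built for x86-64. */
static int enable_flush_to_zero(void)
{
#if defined(__x86_64__)
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
    return 1;
#else
    return 0;
#endif
}
```

If count_subnormals() reports a nonzero count for the buffer after the slow same-buffer run, and calling enable_flush_to_zero() at the start of the program removes the slowdown for both gcc and clang builds, that would support this explanation. If not, the cause lies elsewhere.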

 

This log file shows the results and the diff between the two assembler outputs (which is minimal).

 https://www.adalin.com/downloads/t.log

 

The difference is not really obvious to me, but it looks like it's only using different registers?

 

Here are the individual assembler outputs:

https://www.adalin.com/downloads/t-O3.asm

https://www.adalin.com/downloads/t-O3fast.asm

 

Erik