Poor memcpy performance for dual-socket E5-* Xeons

TBens1
Beginner

Hello,

This question is very similar to the following Stack Overflow question, which has many relevant details:

http://stackoverflow.com/questions/22793669/poor-memcpy-performance-on-linux (Poor memcpy Performance on Linux, Stack Overflow)

In short, dual-socket systems with E5-26* CPUs (Sandy Bridge and Ivy Bridge) exhibit poor memcpy performance. For this question, I will focus on results from CacheBench, part of LLCbench (http://icl.cs.utk.edu/projects/llcbench/index.htm); the Stack Overflow question includes other results as well. I have tested about half a dozen systems with varying CPU models (E5-2630, E5-2670, E5-2650 v2, X5650/X5660) and different versions of Linux. The older systems with X5660 Xeons yield around 10 GB/s for the larger test sizes in CacheBench, whereas the newer E5 models yield only around 6 GB/s.

I have also seen the issue show up in other optimized code. For example, I have several optimized versions of an out-of-place matrix transpose (a memory-bound copy, much like memcpy), and on these systems the cache-blocked versions are actually slower than the naive version. On essentially every other CPU model I have tested, the cache-blocked version is substantially faster than the naive one.

The tests take NUMA into account and are pinned to a single core so that all accesses go to local memory. Any thoughts on what could be causing the performance degradation on the newer dual-socket systems? (I have also tested many single-socket systems, but those have all behaved as expected.)
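
For readers unfamiliar with the comparison, here is a minimal sketch of what a naive and a cache-blocked out-of-place transpose typically look like. It is not the code used for the measurements above: the function names, the row-major `double` layout, and the tile size `BLOCK = 32` are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile edge: a 32 x 32 tile of doubles is 8 KiB per matrix.
   The best value is machine-dependent and would need tuning. */
enum { BLOCK = 32 };

/* Naive out-of-place transpose of a row-major rows x cols matrix.
   src is read with unit stride; dst is written with a stride of
   `rows` elements, so consecutive writes land far apart in memory. */
void transpose_naive(const double *src, double *dst,
                     size_t rows, size_t cols)
{
    assert(src != dst); /* out-of-place only */
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            dst[j * rows + i] = src[i * cols + j];
}

/* Cache-blocked variant: identical result, but the matrix is walked
   tile by tile so the strided writes stay inside a small working set. */
void transpose_blocked(const double *src, double *dst,
                       size_t rows, size_t cols)
{
    assert(src != dst); /* out-of-place only */
    for (size_t ii = 0; ii < rows; ii += BLOCK) {
        /* Clamp the tile at the matrix edge. */
        size_t i_end = (ii + BLOCK < rows) ? ii + BLOCK : rows;
        for (size_t jj = 0; jj < cols; jj += BLOCK) {
            size_t j_end = (jj + BLOCK < cols) ? jj + BLOCK : cols;
            for (size_t i = ii; i < i_end; i++)
                for (size_t j = jj; j < j_end; j++)
                    dst[j * rows + i] = src[i * cols + j];
        }
    }
}
```

Both functions move the same number of bytes, so any timing difference comes from the access pattern alone. To reproduce the comparison, each would be timed on matrices much larger than the last-level cache, with the process bound to one core and its local memory node, for example with `numactl --physcpubind=0 --membind=0` (one possible way to pin; the post does not say which tool was used).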

Thanks and regards,

Thomas

3 Replies
Allan_J_Intel1
Employee

Thanks for joining the community.

I understand you are experiencing some memory performance issues with some Intel Xeon processors.

I am currently researching this issue. As soon as I can, I will send you a message with my findings. Thank you for your patience and understanding.

Allan.

Allan_J_Intel1
Employee

Thank you for your patience.

I have contacted our experts and they have recommended getting in touch with our software people.

Please engage at http://software.intel.com/en-us/forum

In the meantime, our hardware department is investigating this matter. I will report back once I have more information.

Allan.

TBens1
Beginner

Allan,

Okay, thank you for the update. I will check with the software group, but I would appreciate any updates from the hardware side, as I have seen this behavior show up in several different ways across many different benchmarks.

Regards,

Thomas
