Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

BKM for zero init large float array

Zhu_W_Intel
Employee
447 Views
I need to periodically zero-initialize large float arrays (100+ MB). I currently use memset (which calls intel_new_memset), and it has become the top hotspot in VTune analysis. How can I optimize this part? Thanks.
5 Replies
jimdempseyatthecove
Honored Contributor III

Zhu Wang,

What this may indicate is that you are zero-initializing the large float arrays unnecessarily.
If that is not the case, then zeroing the array may well be the hottest spot,
but it will not benefit from further optimization.

Jim Dempsey

Zhu_W_Intel
Employee

Thanks for your response. I do have to zero-initialize the large arrays. I wonder whether I should split them into smaller arrays or use some other method. This is a sparse array. Are you saying that memset is the best I can do?

jimdempseyatthecove
Honored Contributor III

Are you sure you must zero-initialize the arrays?

Are your arrays allocated as full arrays, initialized as full arrays, then sparsely used?
(i.e. you are zero-initializing elements that will never be used.)

Many methods that require initial values of 0.0 are loops performing summations:

for(i=0; i<n; ++i) Array[i] += function(...);

This can easily be converted to:

if(firstTime) // or test the outer loop control variable's value
{
    firstTime = false;
    for(i=0; i<n; ++i) Array[i] = function(...);
}
else
{
    for(i=0; i<n; ++i) Array[i] += function(...);
}
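
As a rough, compilable sketch of the conversion above: the first pass assigns and later passes accumulate, so the array never needs a separate zeroing sweep. The names `contribution` and `accumulate_without_memset` are hypothetical stand-ins, and the per-element formula is made up purely for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-element contribution, standing in for function(...) above.
static float contribution(std::size_t i, int pass) {
    return static_cast<float>(i) * 0.5f + static_cast<float>(pass);
}

// Run `passes` rounds of accumulation into `out` without zeroing it first:
// pass 0 overwrites every element, every later pass adds to it.
void accumulate_without_memset(std::vector<float>& out, int passes) {
    for (int pass = 0; pass < passes; ++pass) {
        if (pass == 0) {
            // First pass: plain assignment replaces the memset.
            for (std::size_t i = 0; i < out.size(); ++i)
                out[i] = contribution(i, pass);
        } else {
            // Later passes: ordinary accumulation.
            for (std::size_t i = 0; i < out.size(); ++i)
                out[i] += contribution(i, pass);
        }
    }
}
```

This only helps when the first pass writes every element that is read later; if some elements are never written, they still hold stale data.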

Jim Dempsey

Zhu_W_Intel
Employee

Thank you for the example, but my case is not as simple as this. It is more like an FFT, which requires a valid initial value for every matrix element.

jeff_keasler
Beginner
#include <xmmintrin.h>
_mm_stream_ps() may be a good bet, since it bypasses the cache. Write-combining still occurs, so essentially a cache line at a time is zeroed in memory without polluting the cache. This may cut the overhead in half. I suspect you are operating on doubles, but since you are just zeroing memory, it's OK to pretend you're zeroing (twice as many) floats. I think the addresses of your arrays need to be 16-byte aligned to use this feature. You can use posix_memalign() in place of malloc() if you are allocating memory from the heap.
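
A minimal sketch of what this could look like, assuming an x86 target with SSE and a buffer that is already 16-byte aligned. The function name `stream_zero` is hypothetical; the intrinsics `_mm_setzero_ps`, `_mm_stream_ps`, and `_mm_sfence` are the standard ones from `<xmmintrin.h>`.

```cpp
#include <xmmintrin.h>  // _mm_setzero_ps, _mm_stream_ps, _mm_sfence
#include <cstddef>

// Zero `count` floats starting at `dst` using non-temporal (streaming) stores.
// Assumption: `dst` is 16-byte aligned, which _mm_stream_ps requires.
void stream_zero(float* dst, std::size_t count) {
    const __m128 zero = _mm_setzero_ps();
    std::size_t i = 0;
    // Main loop: four floats (16 bytes) per streaming store.
    for (; i + 4 <= count; i += 4)
        _mm_stream_ps(dst + i, zero);
    // Tail: fewer than four floats left, use ordinary stores.
    for (; i < count; ++i)
        dst[i] = 0.0f;
    // Order the streaming stores before any later stores.
    _mm_sfence();
}
```

Whether this beats memset is worth measuring rather than assuming: an optimized memset may already switch to non-temporal stores for large sizes, in which case the gain could be small.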

Good luck,
-Jeff