Hi Intel Engineers,
I am trying to replace IPP 7.0 with IPP 2018 Beta Update 1.
In the VTune Memory Access analysis, a brand-new function (called LGLOOPgas_2) shows up, and it consumes the most clockticks.
I never saw this function with IPP 7.0, and I could not find any information about it on the internet. Given its high clocktick count, high CPI Rate (10.966), and high Back-End Bound (95.0%), I am curious what this function is doing.
Could you tell me about this function? A pointer to any documentation would be deeply appreciated.
Attached is a screenshot from VTune Memory Access.
I can't open the attached PNG. Could you be more specific and list the public IPP APIs you are investigating under VTune? The LGLOOPgas_2 symbol you see is just one of the internal labels in the optimized asm code; it is not a function.
Thank you for your reply.
The previous attachment was automatically encrypted by my workstation, so I have uploaded it again. Could you check whether the new file can be opened? If so, you can see that our algorithm uses a lot of IPP APIs, such as ippsAddProduct_32fc, ippsCopy_32f, ippsMul_32f_I, ippsRealToCplx_32f, etc.
As you said, LGLOOPgas_2 is an internal label in the optimized asm code. Is there any way to find out which part of my code ends up in the code under the LGLOOPgas_2 label, and what exactly it is doing?
The compiler is gcc 4.4.5, and the platform is a Xeon Phi 7250 in Flat mode.
If you need any further information, please let me know.
In "Function/Memory Object/Allocation Stack" mode, I found the memory objects that LGLOOPgas_2 was working with.
It looks like it is doing some initialization work.
Thank you for your support.
This internal symbol is called from ippsn0, which is the MIC AVX512 optimization for ipps. Could you please show the call stack of this symbol so that we can see which ipps function invokes it? Thanks.
Thank you for your tips.
In Advanced Hotspots mode, I found the call stack:
LGLOOPgas_2 <-- n0_ownsCopy_8u <-- n0_ippsCopy_8u, so it is doing a memory copy.
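For anyone who lands on the same symbol: a bulk copy does almost no arithmetic per byte, so a high CPI and a high Back-End Bound figure are plausibly just the copy loop waiting on memory. Below is a minimal sketch for checking that in isolation. It is not IPP code: `copy_bytes` is a hypothetical stand-in that uses plain `memcpy` so the example builds without IPP, and `time_copy` is an illustrative helper name, not part of any library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-in for ippsCopy_8u: plain memcpy, so this builds without IPP. */
void copy_bytes(const unsigned char *src, unsigned char *dst, size_t len)
{
    memcpy(dst, src, len);
}

/*
 * Copies a len-byte buffer `repeats` times and returns the elapsed CPU time
 * in seconds. *ok is set to 1 if dst matches src afterwards, else 0.
 * Returns -1.0 if allocation fails.
 */
double time_copy(size_t len, int repeats, int *ok)
{
    assert(ok != NULL);
    *ok = 0;

    unsigned char *src = malloc(len);
    unsigned char *dst = malloc(len);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1.0;
    }

    memset(src, 0xA5, len);  /* write both buffers so the pages are touched */
    memset(dst, 0x00, len);

    clock_t start = clock();
    for (int i = 0; i < repeats; ++i) {
        copy_bytes(src, dst, len);
    }
    clock_t stop = clock();

    *ok = (memcmp(src, dst, len) == 0);
    free(src);
    free(dst);
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_copy` from a small `main` with a buffer much larger than the cache, and profiling that run under VTune, should show whether a bare copy on its own produces a similar profile. With IPP installed, the body of `copy_bytes` could be swapped for the real ippsCopy_8u call to compare the two.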