Bob_Kirnum
Beginner
119 Views

Getting stuck in e9_ownSearchOptimalPulsePos_M122_GSMAMR_16s_optSSE

One of our customers is reporting an issue which we have isolated to the Intel IPP GSMAMR processing. After forcing a core dump, we determined that a thread randomly gets stuck in e9_ownSearchOptimalPulsePos_M122_GSMAMR_16s_optSSE. We had been using IPP 8.2.1 on Linux and, because of issues we had previously observed on Windows, updated to IPP 8.2.3, but the problem persists. In addition to the IPP update, we changed the sample code to use the ippsAlgebraicCodebookSearchEX function, as was recommended for the Windows issue. We would greatly appreciate any suggestions to resolve or work around this issue.

Thanks - Bob / Dialogic

Backtrace from the forced core dump, taken while the thread was hung:

Thread 62 (Thread 0x7f58eb9fc700 (LWP 26864)):
#0  0x00007f598a730fe8 in e9_ownSearchOptimalPulsePos_M122_GSMAMR_16s_optSSE () from /usr/dialogic/data/ssp.mlm
#1  0x00007f598a54232f in e9_ownAlgebraicCodebookSearch_M122_GSMAMR_16s () from /usr/dialogic/data/ssp.mlm
#2  0x00007f598a541f0a in e9_ownsAlgebraicCodebookSearch_GSMAMR_16s () from /usr/dialogic/data/ssp.mlm
#3  0x00007f598a516ad0 in e9_ippsAlgebraicCodebookSearchEX_GSMAMR_16s () from /usr/dialogic/data/ssp.mlm
#4  0x00007f598a4ec7f5 in ownEncode_GSMAMR (encSt=0x7f5971e9dc18, rate=<value optimized out>, pAnaParam=0x7f58eb9fb5ce,
    pVad=<value optimized out>, pSynthVec=0x7f58eb9fb470)
    at /cm/vobs/3rdparty/components/intel/ipp-samples.7.1.1.013/sources/speech-codecs/codec/speech/gsmamr/src/encgsmamr.c:589
#5  0x00007f598a4ecefd in apiGSMAMREncode (encoderObj=0x7f5971e9dc00, src=<value optimized out>, rate=GSMAMR_RATE_12200,
    dst=0x7f589188ef10 "", pVad=0x7f58eb9fb7d4)
    at /cm/vobs/3rdparty/components/intel/ipp-samples.7.1.1.013/sources/speech-codecs/codec/speech/gsmamr/src/encgsmamr.c:313
#6  0x00007f598a068063 in GSMAMR_Encode (handle=0x7f58eb9fa8c0, src=0x2, rate=GSMAMR_RATE_DTX, dst=
    0xffff7e2f <Address 0xffff7e2f out of bounds>, pVad=0x7) at x86/gsmamrapi.c:154
#7  0x00007f598a2ae413 in GSMAMREncode (pCodec=0x7f589188ee88, pSrcData=0x2, ppCodedData=0x7f58eb9fbdb0,
    numSamples=<value optimized out>, idtmfFlag=<value optimized out>, silenceFlag=1207968416) at codec.c:1740

Environment details from the IPP debug output in our code:

DisplayIPPCPUFeatures: 0x4a : 0x60
ippCore 8.2.3 (r48108)
ippIP AVX2 (l9) 8.2.3 (r48108)
ippSP AVX2 (l9) 8.2.3 (r48108)
ippVC AVX2 (l9) 8.2.3 (r48108)
Processor supports Advanced Vector Extensions 2 instruction set
    4 cores on die
ippGetMaxCacheSizeB 8192 k
Available 0xefff Enabled 0xefff
MMX     A E
SSE     A E
SSE2    A E
SSE3    A E
SSSE3   A E
MOVBE   A E
SSE41   A E
SSE42   A E
AVX     A E
AVX(OS) A E
AES     A E
CLMUL   A E
ABR     X X
RDRRAND A E
F16C    A E
AVX2    A E
ADCOX     X X
RDSEED    X X
PREFETCHW X X
SHA       X X
KNC       X X

 

 

Ying_H_Intel
Employee

Hi Bob,

Thank you for reporting the issue. I saw your issue in premier.intel.com. We will investigate both together and get back to you later.

Please note that all of the speech codec functions are deprecated, so the related development and support work has been discontinued.

Regarding the e9_ownSearchOptimalPulsePos_M122_GSMAMR_16s_optSSE issue, I took an idea from another forum thread, 628141.

IPP dispatches optimized code according to the CPU type.

For example, see the table in https://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-understa...

and related article:   https://software.intel.com/en-us/articles/ipp-dispatcher-control-functions-ippinit-functions

e9 is the code for AVX (Sandy Bridge µarchitecture):

Platform            Architecture  SIMD Requirements                         Processor / µarchitecture     Notes
IA-32               px            C optimized for all IA-32 processors      i386+
                    w7            SSE2                                      P4, Xeon, Centrino
                    v8            Supplemental SSE3                         Core 2, Xeon® 5100, Atom
                    p8            SSE4.1, SSE4.2, AES-NI                    Penryn, Nehalem, Westmere     see notes below
                    g9            AVX                                       Sandy Bridge µarchitecture    new since IPP v.6.1
                    h9            AVX2                                      Haswell µarchitecture
Intel® 64 (EM64T)   mx            C-optimized for all Intel® 64 platforms   P4                            SSE2 minimum
                    m7            SSE3                                      Prescott
                    u8            Supplemental SSE3                         Core 2, Xeon® 5100, Atom
                    y8            SSE4.1, SSE4.2, AES-NI                    Penryn, Nehalem, Westmere     see notes below
                    e9            AVX                                       Sandy Bridge µarchitecture    new in 6.1
                    l9            AVX2                                      Haswell µarchitecture

From your output, the code in use should be the 64-bit l9 variant.

Could you please try calling ippInitCpu() with a CPU-type argument for y8 or a lower CPU type, for example:

ippCpuSSE = 0x40, /* Processor supports Pentium(R) III processor instruction set */
ippCpuSSE2, /* Processor supports Streaming SIMD Extensions 2 instruction set */
ippCpuSSE3, /* Processor supports Streaming SIMD Extensions 3 instruction set */
ippCpuSSSE3, /* Processor supports Supplemental Streaming SIMD Extensions 3 instruction set */

and see whether that works around the issue?

Please also print the library and CPU information at run time with the following calls:

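/* needs <stdio.h> and ipps.h */
const IppLibraryVersion *lib = ippsGetLibVersion();
/* prints: name, version string, then major.minor.majorBuild.build */
printf("%s %s %d.%d.%d.%d\n",
       lib->Name, lib->Version,
       lib->major, lib->minor, lib->majorBuild, lib->build);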

Best Regards,

Ying

Bob_Kirnum
Beginner

In parallel to posting here, we have been trying a number of things to replicate the issue our customer is reporting. We already had a means of changing the CPU type value using an environment variable. When we limit the CPU type to 0x45 (ippCpuSSE42), we see the following.

May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: APInit.c.164:DisplayIPPCPUFeatures: 0x46 : 0x60
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot:        532: APInit.c.179:DisplayIPPCPUFeatures: dsp_framework, ipp_cpu_limit: Limiting from 0x46 to 0x45
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: ippCore 8.2.3 (r48108)
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: ippIP SSE4.1/4.2 (y8)+ 8.2.3 (r48108)
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: ippSP SSE4.1/4.2 (y8)+ 8.2.3 (r48108)
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: ippVC SSE4.1/4.2 (y8)+ 8.2.3 (r48108)
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: Processor supports Advanced Vector Extensions instruction set
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot:     16 cores on die
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: ippGetMaxCacheSizeB 4096 k
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: Available 0xdf Enabled 0xdf
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: MMX       A E
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: SSE       A E
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: SSE2      A E
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: SSE3      A E
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: SSSE3     A E
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: MOVBE     X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: SSE41     A E
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: SSE42     A E
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: AVX       X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: AVX(OS)   X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: AES       X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: CLMUL     X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: ABR       X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: RDRRAND   X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: F16C      X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: AVX2      X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: ADCOX     X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: RDSEED    X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: PREFETCHW X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: SHA       X X
May  5 10:05:24 bl-108-vm01 ssp_x86Linux_boot: KNC       X X

Unfortunately this results in a segmentation fault rather quickly in our testing.  The back trace appears corrupted.

#0  0x00007f6075f8e570 in y8_ipps_cRadix4FwdNorm_32fc () from /usr/dialogic/data/ssp.mlm
#1  0x0000000000000000 in ?? ()
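
For readers following along, the limiting step that appears in the log above ("Limiting from 0x46 to 0x45") could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual dsp_framework code: the function names are hypothetical, and it assumes that CPU-type codes can be compared numerically, which the log line suggests but the thread does not confirm.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parses text such as "0x45" or "69".
   Returns 1 and stores the value in *out on success;
   returns 0 and leaves *out untouched otherwise. */
int parse_cpu_limit(const char *text, unsigned long *out)
{
    char *end = NULL;
    unsigned long value;

    assert(out != NULL);
    if (text == NULL || *text == '\0')
        return 0;                         /* no override configured */

    errno = 0;
    value = strtoul(text, &end, 0);       /* base 0 accepts hex or decimal */
    if (errno != 0 || end == text || *end != '\0')
        return 0;                         /* malformed or out of range */

    *out = value;
    return 1;
}

/* Hypothetical helper: returns the detected CPU-type code, lowered to the
   configured limit when one is present and smaller.
   Assumption: a numerically smaller code means an older instruction set. */
unsigned long limit_cpu_type(unsigned long detected, const char *limit_text)
{
    unsigned long limit;

    if (!parse_cpu_limit(limit_text, &limit))
        return detected;
    return limit < detected ? limit : detected;
}
```

A caller would pass the detected code and the text of whatever environment variable carries the override, for example `limit_cpu_type(0x46, getenv("IPP_CPU_LIMIT"))`, where the variable name is made up for this sketch. The result would then be handed to the CPU-selection call.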

 

Based on the error above, I assumed that the selected CPU type is not quite valid. Since the compiler (and the Intel documentation) reports that ippInitCpu is deprecated, we changed our code to use ippSetCpuFeatures, providing a mask value to override the 'available features mask'.

As far as I can tell, using the recommended CPU type value (ippCpuSSE, 0x40), I expect this to result in a feature mask of 0x1f, which in turn selects the u8 instruction set. Am I missing something? What feature mask value(s) should we try?
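
To make the mask arithmetic easier to check, here is a minimal sketch that decodes a feature mask. It rests on two assumptions that are inferred from this thread rather than taken from the IPP headers: that bit i of the mask corresponds to the i-th name in the feature listing printed in the logs (0xefff is every listed feature except ABR, and 0xdf is MMX through SSSE3 plus SSE41 and SSE42, which matches both logs), and that the Intel 64 code variant follows the highest SIMD level in the table quoted earlier. The real dispatcher may check additional bits, so `guess_variant` is only a rough guide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define FEATURE_COUNT 16

/* Bit positions, assumed to follow the order of the feature list in the logs. */
enum {
    BIT_MMX, BIT_SSE, BIT_SSE2, BIT_SSE3, BIT_SSSE3, BIT_MOVBE,
    BIT_SSE41, BIT_SSE42, BIT_AVX, BIT_AVX_OS, BIT_AES, BIT_CLMUL,
    BIT_ABR, BIT_RDRRAND, BIT_F16C, BIT_AVX2
};

/* Names spelled as they appear in the logs. */
const char *const feature_names[FEATURE_COUNT] = {
    "MMX", "SSE", "SSE2", "SSE3", "SSSE3", "MOVBE", "SSE41", "SSE42",
    "AVX", "AVX(OS)", "AES", "CLMUL", "ABR", "RDRRAND", "F16C", "AVX2"
};

/* Returns 1 if the given bit is set in the mask, 0 otherwise. */
int feature_enabled(uint64_t mask, int index)
{
    assert(index >= 0 && index < FEATURE_COUNT);
    return (int)((mask >> index) & 1u);
}

/* Rough guess at the Intel 64 code variant for a mask, based only on the
   dispatch table quoted in this thread (an assumption, not IPP's logic). */
const char *guess_variant(uint64_t mask)
{
    if (feature_enabled(mask, BIT_AVX2))  return "l9";
    if (feature_enabled(mask, BIT_AVX))   return "e9";
    if (feature_enabled(mask, BIT_SSE41) &&
        feature_enabled(mask, BIT_SSE42)) return "y8";
    if (feature_enabled(mask, BIT_SSSE3)) return "u8";
    if (feature_enabled(mask, BIT_SSE3))  return "m7";
    return "mx";
}

/* Prints one line per feature: 'A' if the bit is set, 'X' if it is not. */
void print_feature_table(uint64_t mask)
{
    for (int i = 0; i < FEATURE_COUNT; ++i)
        printf("%-8s %c\n", feature_names[i],
               feature_enabled(mask, i) ? 'A' : 'X');
}
```

Under these assumptions, a mask of 0x1f covers MMX through SSSE3 and maps to u8, 0xdf maps to y8, and 0xefff maps to l9, which agrees with the library names printed in the two logs.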

 
