Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

G729 Codec and Performance

harald85
Beginner
1,462 Views
Hi
I am using Intel IPP for the GSM and G729 codecs in kernel mode. The GSM codec works perfectly, but the G729 codec causes a small problem. It encodes and decodes, but when I stop encoding and decoding it consumes a lot of CPU time. When I ran a VTune analysis I saw that the function v8_ownFixedCodebookSearchVec_V8 takes all of the CPU time, but I could not find this function in the IPP documentation. So my question is: what does this function do, and how can I prevent it from consuming so much CPU time?

Harald
0 Kudos
7 Replies
Vyacheslav_Baranniko
New Contributor II
1,462 Views
Hi Harald,
The function v8_ownFixedCodebookSearchVec_V8 is one of several internal functions used in the implementation of the ippsFixedCodebookSearch_G729_32s16s function, which in total accounts for ~25% of G729 encode time. ippsFixedCodebookSearch_G729_32s16s is documented in the IPP manual. In short, it performs the fixed codebook search, which is a mandatory part of the G729 encode algorithm, so it cannot be "avoided".
Vyacheslav, IPP speech
0 Kudos
harald85
Beginner
1,462 Views
Hi

Thanks for this information. But I don't understand how it is possible that, when I am not using the codecs, the driver is in this function and spends 90% of the CPU time there. Is there any blocking code in this function that could hold up the system? My driver has two or more threads that work with RTP sessions, and every session has its own USCCodec for G729. Is it a problem when these threads encode simultaneously for the different sessions, or is this a possible reason for the crash? When G729 was using 90% of the CPU, I started VTune to see which function takes all the time, and it showed that, of my driver's code, only the ippsFixedCodebookSearch_G729_32s16s function is active, which is OK because I had stopped all my RTP sessions beforehand.

Harald
0 Kudos
harald85
Beginner
1,462 Views
Hi
I looked at the source code in the file encg729.c and saw the function ippsFixedCodebookSearch_G729_32s16s. Is it normal that the parameter pSearchTimes is never set anywhere in the application?

Harald
0 Kudos
Vyacheslav_Baranniko
New Contributor II
1,462 Views
Hi
Firstly, according to the G729 spec, the extraTime parameter of the ippsFixedCodebookSearch_G729_32s16s function limits the number of codebook search loops. The function sets it to 105 for every first subframe and increments it by 75 for every second subframe. Outside the function, one can see how fast each search was performed.

Secondly, IPP speech codecs are MT-safe, so multiple instances can be used simultaneously. I see no problem with your driver's usage model of the G729 integer codec. Please note that when you use floating-point calculations in a driver, or the floating-point IPP codec (IPP_G729_FP), you must bracket each portion of FP code with the save/restore FPU state functions. See, for example, http://msdn.microsoft.com/en-us/library/aa489566.aspx.
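A minimal sketch of that bracketing, assuming a 32-bit x86 Windows driver where the WDK routines KeSaveFloatingPointState and KeRestoreFloatingPointState are available. The wrapper run_with_fpu_saved, the callback type fp_work_fn, and the example callback scale_by_half are hypothetical names used only for illustration. The user-mode stand-ins in the #else branch are not part of the WDK; they exist only so the sketch compiles outside a driver build.

```c
#ifdef _KERNEL_MODE
#include <ntddk.h>
#else
/* User-mode stand-ins (assumption): let the sketch compile without the WDK. */
typedef long NTSTATUS;
typedef struct { int saved; } KFLOATING_SAVE;
#define STATUS_SUCCESS ((NTSTATUS)0)
#define NT_SUCCESS(s) ((s) >= 0)
static NTSTATUS KeSaveFloatingPointState(KFLOATING_SAVE *s)
{
    s->saved = 1;
    return STATUS_SUCCESS;
}
static NTSTATUS KeRestoreFloatingPointState(KFLOATING_SAVE *s)
{
    s->saved = 0;
    return STATUS_SUCCESS;
}
#endif

/* Hypothetical callback type: any routine that touches the FPU. */
typedef void (*fp_work_fn)(void *ctx);

/* Save the FPU state, run the floating-point work, then restore the state.
   If the save fails, the work is not run and the failure status is returned. */
static NTSTATUS run_with_fpu_saved(fp_work_fn work, void *ctx)
{
    KFLOATING_SAVE save;
    NTSTATUS status = KeSaveFloatingPointState(&save);
    if (!NT_SUCCESS(status))
        return status;

    work(ctx);

    return KeRestoreFloatingPointState(&save);
}

/* Hypothetical FP work standing in for a floating-point codec call. */
static void scale_by_half(void *ctx)
{
    float *sample = (float *)ctx;
    *sample *= 0.5f;
}
```

In a driver, the callback would wrap the floating-point encode or decode call for one frame, so that every entry into FP code is paired with exactly one restore.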

And, last but not least, there can of course be a bug in the IPP function. So please provide some information about the IPP version, processor and machine types, and OS. Please try to use the ippStaticInitCpu or ippInitCpu functions to dispatch to a different code branch, for example to the non-optimized IPP code PX or to other optimized code: W7, T7, or P8 if your machine supports SSE4.1. Right now you are using the V8 code. This may help us narrow down the issue in case it is processor dependent.
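To illustrate why the choice of branch has to match the processor, here is a small sketch of a branch-selection check. It is not IPP's dispatcher. The feature flags, pick_branch, and branch_is_safe are hypothetical names, and the mapping of W7, T7, and V8 to SSE2, SSE3, and SSSE3 is an assumption that should be checked against the IPP documentation. Only PX as the non-optimized code and P8 as requiring SSE4.1 are stated above.

```c
/* Hypothetical feature flags; a real driver would derive them from CPUID. */
enum cpu_feature {
    FEAT_SSE2  = 1 << 0,
    FEAT_SSE3  = 1 << 1,
    FEAT_SSSE3 = 1 << 2,
    FEAT_SSE41 = 1 << 3
};

/* Code branches named in this thread, ordered from generic to most specific. */
enum ipp_branch { BRANCH_PX, BRANCH_W7, BRANCH_T7, BRANCH_V8, BRANCH_P8 };

/* Pick the most specific branch whose assumed instruction sets are all present.
   Assumption: each branch also requires everything the previous one requires. */
static enum ipp_branch pick_branch(unsigned features)
{
    if (!(features & FEAT_SSE2))  return BRANCH_PX;
    if (!(features & FEAT_SSE3))  return BRANCH_W7;
    if (!(features & FEAT_SSSE3)) return BRANCH_T7;
    if (!(features & FEAT_SSE41)) return BRANCH_V8;
    return BRANCH_P8;
}

/* A statically chosen branch is safe only if it does not exceed what the CPU supports. */
static int branch_is_safe(enum ipp_branch requested, unsigned features)
{
    return requested <= pick_branch(features);
}
```

Under these assumptions, requesting P8 on a processor without SSE4.1 fails the check, while PX passes on any processor.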

Thanks
Vyacheslav
0 Kudos
harald85
Beginner
1,462 Views
Hi

I use IPP version 6.1.1.035 and I do static dispatching for the different CPUs. So I use the V8 code for the Core 2 Duo CPU in the test server. I will now test this problem on a dual-core Xeon CPU with the P8 code, because in my first test with a quad-core Xeon I have not had the problem so far. But on the quad-core, CPU usage was only 50%, not 90%. The save and restore for the floating-point calculations is implemented.

I have now tested the driver on a second server with Windows Server 2003 Enterprise Edition and two dual-core Xeon CPUs, and I got a bluescreen in the encode function of the GSM and G729 codecs. On the other test server, Windows Server 2003 Web Edition is installed, and there GSM works fine. Could the Enterprise Edition be the problem, or the two physical processors? --> I solved this problem. I used the P8-optimized code, but the machine has a Xeon with NetBurst technology. When I used the generic code, it worked. But I have another question about the optimized code. At the moment I include the required header file (for example ipp_p8.h). Is it possible to use a dynamic dispatcher in kernel mode that decides by itself which optimized code it needs?

Harald
0 Kudos
harald85
Beginner
1,462 Views
Hi

The crash in ippsFixedCodebookSearch_G729_32s16s occurs only when I use the processor-optimized code (W7 or V8). When I use the generic code (PX), there is no crash.

Harald
0 Kudos
Vyacheslav_Baranniko
New Contributor II
1,462 Views
As I wrote, the ippStaticInitCpu or ippInitCpu functions can also be used in kernel mode to dispatch to a different code branch when linking against the IPP merged libs. By default, they dispatch to the code that best suits the specific processor.

Vyacheslav
0 Kudos