<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic ippSetNumThreads fails to create threads in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippSetNumThreads-fails-to-create-threads/m-p/1096645#M25058</link>
    <description>ippSetNumThreads fails to create threads</description>
    <pubDate>Fri, 04 Dec 2015 21:28:38 GMT</pubDate>
    <dc:creator>Shashi_K_</dc:creator>
    <dc:date>2015-12-04T21:28:38Z</dc:date>
    <item>
      <title>ippSetNumThreads fails to create threads</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippSetNumThreads-fails-to-create-threads/m-p/1096645#M25058</link>
      <description>&lt;P&gt;We are using IPP 7.0.&lt;/P&gt;

&lt;P&gt;I have a small console application that calls the FFT example from this article:&lt;/P&gt;

&lt;P&gt;&lt;A href="https://software.intel.com/en-us/articles/how-to-use-intel-ipp-s-1d-fourier-transform-functions" target="_blank"&gt;https://software.intel.com/en-us/articles/how-to-use-intel-ipp-s-1d-fourier-transform-functions&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;When I call ippSetNumThreads(10) and then read the thread count back with ippGetNumThreads(&amp;amp;origTh), it returns 10.&lt;/P&gt;

&lt;P&gt;However, when I monitor the process in Task Manager while the FFT is running, I see only one thread.&lt;/P&gt;

&lt;P&gt;I don't understand why. Please help.&lt;/P&gt;

&lt;P&gt;Shashi&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;#include &amp;lt;conio.h&amp;gt;
#include &amp;lt;math.h&amp;gt;
#include &amp;lt;stdio.h&amp;gt;
#include &amp;lt;tchar.h&amp;gt;
#include &amp;lt;ipp.h&amp;gt;

void IPPFFT()
{
    // Set the size: N = 2^order (avoids rounding errors from log(N)/log(2))
    const int order=7;
    const int N=1&amp;lt;&amp;lt;order;

    // Spec and working buffers
    IppsFFTSpec_C_32fc *pFFTSpec=0;
    Ipp8u *pFFTSpecBuf, *pFFTInitBuf, *pFFTWorkBuf;

    // Allocate complex buffers
    Ipp32fc *pSrc=ippsMalloc_32fc(N);
    Ipp32fc *pDst=ippsMalloc_32fc(N);

    // Fill the input with a simple test signal (it was left uninitialized before)
    for (int i=0;i&amp;lt;N;i++) {
        pSrc[i].re=(Ipp32f)i;
        pSrc[i].im=0.0f;
    }

    // Query to get buffer sizes
    int sizeFFTSpec,sizeFFTInitBuf,sizeFFTWorkBuf;
    ippsFFTGetSize_C_32fc(order, IPP_FFT_NODIV_BY_ANY,
        ippAlgHintAccurate, &amp;amp;sizeFFTSpec, &amp;amp;sizeFFTInitBuf, &amp;amp;sizeFFTWorkBuf);

    // Alloc FFT buffers
    pFFTSpecBuf = ippsMalloc_8u(sizeFFTSpec);
    pFFTInitBuf = ippsMalloc_8u(sizeFFTInitBuf);
    pFFTWorkBuf = ippsMalloc_8u(sizeFFTWorkBuf);

    // Initialize FFT; the init buffer is only needed during this call
    ippsFFTInit_C_32fc(&amp;amp;pFFTSpec, order, IPP_FFT_NODIV_BY_ANY,
        ippAlgHintAccurate, pFFTSpecBuf, pFFTInitBuf);
    if (pFFTInitBuf) ippsFree(pFFTInitBuf);

    // Do the FFT
    ippsFFTFwd_CToC_32fc(pSrc,pDst,pFFTSpec,pFFTWorkBuf);

    // Check results: inverse transform in place, then undo the missing 1/N scaling
    ippsFFTInv_CToC_32fc(pDst,pDst,pFFTSpec,pFFTWorkBuf);
    int OK=1;
    for (int i=0;i&amp;lt;N;i++) {
        pDst[i].re/=(Ipp32f)N;
        pDst[i].im/=(Ipp32f)N;
        // fabs, not abs: the integer abs would truncate the difference
        if ((fabs(pSrc[i].re-pDst[i].re)&amp;gt;.001) ||
            (fabs(pSrc[i].im-pDst[i].im)&amp;gt;.001) )
        {
            OK=0;break;
        }
    }
    puts(OK==1?"FFT OK":"FFT Fail");

    // Memory from ippsMalloc_* is released with ippsFree
    if (pFFTWorkBuf) ippsFree(pFFTWorkBuf);
    if (pFFTSpecBuf) ippsFree(pFFTSpecBuf);

    ippsFree(pSrc);
    ippsFree(pDst);
}

int _tmain(int argc, _TCHAR* argv[])
{
    // Thread count before changing it
    int origTh = 0;
    ippGetNumThreads(&amp;amp;origTh);
    printf("NumThreads = %d\n", origTh);
    _getch();

    ippSetNumThreads(10);

    // Thread count after the request for 10 threads
    ippGetNumThreads(&amp;amp;origTh);
    printf("NumThreads = %d\n", origTh);
    _getch();

    IPPFFT();

    _getch();

    return 0;
}&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 04 Dec 2015 21:28:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippSetNumThreads-fails-to-create-threads/m-p/1096645#M25058</guid>
      <dc:creator>Shashi_K_</dc:creator>
      <dc:date>2015-12-04T21:28:38Z</dc:date>
    </item>
    <item>
      <title>Hi Shashi,</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippSetNumThreads-fails-to-create-threads/m-p/1096646#M25059</link>
      <description>&lt;P&gt;Hi Shashi,&lt;BR /&gt;
	&lt;BR /&gt;
	Threading brings little performance improvement for a 1D FFT, and a length of 128 (order 7) is too small to benefit from it.&lt;/P&gt;

&lt;P&gt;These two threads discuss 1D FFT threading performance:&lt;BR /&gt;
	&lt;A href="https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/283657"&gt;https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/283657&lt;/A&gt;&lt;BR /&gt;
	&lt;A href="https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/385354"&gt;https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/385354&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Basically, the FFT is threaded only for sizes that fit into the shared L2 cache. For small orders, the OpenMP overhead is greater than the benefit. For large (out-of-cache) orders, memory effects hurt performance, so the customer's investigation is right: there is no threading for order 19 and above.&lt;/P&gt;
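
&lt;P&gt;To put rough numbers on the cache argument above, here is a minimal sketch that estimates how much memory an out-of-place Ipp32fc FFT touches for a given order and compares it with a cache size. The helper names (fftWorkingSetBytes, fitsInCache, printFftFootprints) and the cache size passed in are assumptions made for illustration. They are not IPP functions, the estimate ignores the spec and work buffers, and it is not the rule IPP uses internally to decide whether to thread.&lt;/P&gt;

<![CDATA[
```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper, not an IPP function: bytes touched by an out-of-place
// complex single-precision FFT of length 2^order (source plus destination).
// Each Ipp32fc element holds two 32-bit floats, i.e. 8 bytes.
std::size_t fftWorkingSetBytes(int order)
{
    // Keep the result representable even when size_t is only 32 bits wide.
    assert(order >= 0 && order <= 26);
    const std::size_t n = static_cast<std::size_t>(1) << order;
    return 2 * n * 8;
}

// Hypothetical check: does that working set fit into a cache of cacheBytes?
bool fitsInCache(int order, std::size_t cacheBytes)
{
    return fftWorkingSetBytes(order) <= cacheBytes;
}

// Print the estimate for a few orders against an assumed cache size.
void printFftFootprints(std::size_t cacheBytes)
{
    const int orders[] = {7, 12, 16, 19};
    for (int order : orders) {
        std::printf("order %2d: %8zu bytes, fits: %s\n", order,
                    fftWorkingSetBytes(order),
                    fitsInCache(order, cacheBytes) ? "yes" : "no");
    }
}
```
]]>

&lt;P&gt;For example, printFftFootprints(6u * 1024 * 1024) models a hypothetical 6 MB shared cache: order 7 (N=128) touches only 2048 bytes, while order 19 needs 8 MB for the two buffers alone and no longer fits.&lt;/P&gt;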

&lt;P&gt;Thanks,&lt;BR /&gt;
	Chao&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 07 Dec 2015 03:55:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippSetNumThreads-fails-to-create-threads/m-p/1096646#M25059</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2015-12-07T03:55:45Z</dc:date>
    </item>
  </channel>
</rss>

