Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

high context switching when multi threading VC1 encoder

vanista
Beginner
I have an issue with the VC1 encoder from IPP sample code. I don't think the issue is directly related to this codec though.

If, within a single process, I run more encoding threads than the host has CPU cores, the number of context switches jumps to over 1 million per second (compared to about 3000 per second under normal maximum load), and CPU usage shows about 50% system overhead.

I'm using IPP v5.3.1.062

I was able to isolate some parts of the code that trigger the context-switch bomb; for one, I identified the ippiAddC function as a cause.

From what the doc says, this function is thread-optimized, so I wonder whether there could be a conflict between its internal threading and Linux kernel thread handling.


vanista
Beginner
As additional info, exactly two function calls in intra MB encoding trigger the problem (ippiSubC_16s_C1IRSfs and ippiAddC_16s_C1IRSfs):

umc_vc1_enc_picture_sm.cpp:1362

    // only intra blocks:
    for (blk = 0; blk < 6; blk++)
    {
        roiSize.height = 8;
        roiSize.width = 8;

        IPP_STAT_START_TIME(m_IppStat->IppStartTime);
        _own_Diff8x8C_16s(128, pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk], roiSize, 0);
        IPP_STAT_END_TIME(m_IppStat->IppStartTime, m_IppStat->IppEndTime, m_IppStat->IppTotalTime);

        STATISTICS_START_TIME(m_TStat->FwdQT_StartTime);
        IntraTransformQuantACFunction(pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk],
                                      DCQuant, doubleQuant);
        STATISTICS_END_TIME(m_TStat->FwdQT_StartTime, m_TStat->FwdQT_EndTime, m_TStat->FwdQT_TotalTime);
    }



umc_vc1_enc_picture_sm.cpp:1449

    for (blk = 0; blk < 6; blk++)
    {
        STATISTICS_START_TIME(m_TStat->InvQT_StartTime);
        IntraInvTransformQuantACFunction(pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk],
                                         TempBlock.m_pBlock[blk], TempBlock.m_uiBlockStep[blk],
                                         DCQuant, doubleQuant);
        STATISTICS_END_TIME(m_TStat->InvQT_StartTime, m_TStat->InvQT_EndTime, m_TStat->InvQT_TotalTime);

        IPP_STAT_START_TIME(m_IppStat->IppStartTime);
        _own_Add8x8C_16s(128, TempBlock.m_pBlock[blk], TempBlock.m_uiBlockStep[blk], roiSize, 0);
        IPP_STAT_END_TIME(m_IppStat->IppStartTime, m_IppStat->IppEndTime, m_IppStat->IppTotalTime);
    }

Meanwhile, I've worked around the issue by writing custom, unoptimized replacement functions for those two calls.
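
For illustration only, a plain scalar replacement for those two calls could look roughly like the sketch below. This is not the code actually used in the workaround, which was not posted. The names RoiSize, saturate16, sub_const_16s and add_const_16s are hypothetical. The sketch assumes the step is given in bytes (the usual IPP convention), that the scale factor is always 0 as in the two calls above, and that results saturate to the signed 16-bit range.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for IppiSize so the sketch builds without IPP headers.
struct RoiSize {
    int width;
    int height;
};

// Clamp a 32-bit intermediate result to the signed 16-bit range.
static inline int16_t saturate16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// Return a pointer to row y of a block whose rows are stepBytes apart
// (assumption: the step is a byte count, not an element count).
static inline int16_t* row_ptr(int16_t* block, int stepBytes, int y) {
    unsigned char* base = reinterpret_cast<unsigned char*>(block);
    return reinterpret_cast<int16_t*>(base + static_cast<std::ptrdiff_t>(y) * stepBytes);
}

// In-place block[y][x] -= value, single-threaded, scale factor assumed 0.
static void sub_const_16s(int16_t value, int16_t* block, int stepBytes, RoiSize roi) {
    for (int y = 0; y < roi.height; ++y) {
        int16_t* row = row_ptr(block, stepBytes, y);
        for (int x = 0; x < roi.width; ++x)
            row[x] = saturate16(static_cast<int32_t>(row[x]) - value);
    }
}

// In-place block[y][x] += value, single-threaded, scale factor assumed 0.
static void add_const_16s(int16_t value, int16_t* block, int stepBytes, RoiSize roi) {
    for (int y = 0; y < roi.height; ++y) {
        int16_t* row = row_ptr(block, stepBytes, y);
        for (int x = 0; x < roi.width; ++x)
            row[x] = saturate16(static_cast<int32_t>(row[x]) + value);
    }
}
```

Because these loops never hand work to a thread pool, calling them from many encoder threads adds no threads beyond the ones the application created itself.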

Chao_Y_Intel
Moderator
Hello vanista,

How are you linking with the Intel IPP libraries? Internal threading within IPP functions (such as ippiAddC) is only enabled when you link with the IPP dynamic libraries or the threaded static libraries. If you are using the static library, it will include non-threaded functions.

Thanks,
Chao
