Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

High context switching when multithreading the VC1 encoder

vanista
Beginner
I have an issue with the VC1 encoder from IPP sample code. I don't think the issue is directly related to this codec though.

If, within a single process, I run more encoding threads than there are CPU cores on the host, the number of context switches jumps to over 1 million per second (compared to about 3000 per second under normal maximum load), and roughly 50% of the CPU usage becomes system overhead.
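For anyone trying to reproduce the measurement from inside the process, here is a minimal sketch, assuming Linux and `getrusage`. The `CtxSwitches` struct and `read_ctx_switches` helper are hypothetical names, not part of IPP or the sample code. Sampling twice around a workload and taking the difference gives the number of switches the process incurred in that interval.

```cpp
#include <sys/resource.h>

// Hypothetical helper type: cumulative context-switch counters for this process.
struct CtxSwitches {
    long voluntary;    // thread gave up the CPU (e.g. blocked or yielded)
    long involuntary;  // thread was preempted by the scheduler
};

// Reads the counters the kernel keeps for the calling process (all threads).
inline CtxSwitches read_ctx_switches()
{
    rusage ru{};
    getrusage(RUSAGE_SELF, &ru);
    return CtxSwitches{ru.ru_nvcsw, ru.ru_nivcsw};
}
```

Calling `read_ctx_switches()` before and after one second of encoding and subtracting the two samples should show whether the jump is mostly voluntary switches (threads blocking or yielding) or involuntary ones (preemption).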

I'm using IPP v5.3.1.062

I was able to isolate some parts of the code that trigger the c-switch bomb; for one, I identified the ippAddC function as a cause.

According to the documentation, this function is thread-optimized, so I wonder whether there could be a conflict between its internal threading and the Linux kernel's thread handling.


2 Replies
vanista
Beginner
As additional info, exactly two function calls in the intra MB encoding trigger the problem (ippiSubC_16s_C1IRSfs and ippiAddC_16s_C1IRSfs).

umc_vc1_enc_picture_sm.cpp:1362

//only intra blocks:
for (blk = 0; blk<6; blk++)
{
    roiSize.height = 8;
    roiSize.width = 8;
    IPP_STAT_START_TIME(m_IppStat->IppStartTime);
    _own_Diff8x8C_16s(128, pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk], roiSize, 0);
    IPP_STAT_END_TIME(m_IppStat->IppStartTime, m_IppStat->IppEndTime, m_IppStat->IppTotalTime);
    STATISTICS_START_TIME(m_TStat->FwdQT_StartTime);
    IntraTransformQuantACFunction(pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk],
                                  DCQuant, doubleQuant);
    STATISTICS_END_TIME(m_TStat->FwdQT_StartTime, m_TStat->FwdQT_EndTime, m_TStat->FwdQT_TotalTime);
}



umc_vc1_enc_picture_sm.cpp:1449

for (blk=0;blk<6; blk++)
{
    STATISTICS_START_TIME(m_TStat->InvQT_StartTime);
    IntraInvTransformQuantACFunction(pCurMBData->m_pBlock[blk],pCurMBData->m_uiBlockStep[blk],
                                     TempBlock.m_pBlock[blk],TempBlock.m_uiBlockStep[blk],
                                     DCQuant, doubleQuant);
    STATISTICS_END_TIME(m_TStat->InvQT_StartTime, m_TStat->InvQT_EndTime, m_TStat->InvQT_TotalTime);
    IPP_STAT_START_TIME(m_IppStat->IppStartTime);
    _own_Add8x8C_16s(128, TempBlock.m_pBlock[blk],TempBlock.m_uiBlockStep[blk],roiSize,0);
    IPP_STAT_END_TIME(m_IppStat->IppStartTime, m_IppStat->IppEndTime, m_IppStat->IppTotalTime);
}

Meanwhile, I've worked around my issue by writing custom unoptimized replacement functions for those two cases.
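The replacement code was not posted, so the following is only a rough sketch of what such plain-loop replacements could look like, not the code actually used. It assumes an 8x8 block of 16-bit samples, a row step given in bytes (as is conventional for IPP image functions), a scale factor of 0, and saturation to the 16-bit range. The names `add_const_8x8_16s` and `sub_const_8x8_16s` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical plain-loop stand-in for an in-place "add constant" on an 8x8
// block of 16-bit samples. No internal threading, no scaling (scale factor 0).
// step_bytes is the distance between consecutive rows in bytes (assumption).
inline void add_const_8x8_16s(int value, std::int16_t* block, int step_bytes)
{
    unsigned char* base = reinterpret_cast<unsigned char*>(block);
    for (int y = 0; y < 8; ++y) {
        std::int16_t* row = reinterpret_cast<std::int16_t*>(
            base + static_cast<std::ptrdiff_t>(y) * step_bytes);
        for (int x = 0; x < 8; ++x) {
            int sum = row[x] + value;
            // Saturate to the signed 16-bit range instead of wrapping around.
            sum = std::max(-32768, std::min(32767, sum));
            row[x] = static_cast<std::int16_t>(sum);
        }
    }
}

// Hypothetical counterpart for "subtract constant": add the negated value.
inline void sub_const_8x8_16s(int value, std::int16_t* block, int step_bytes)
{
    add_const_8x8_16s(-value, block, step_bytes);
}
```

With helpers like these, the two wrappers in the snippets above would call the plain loops with the value 128 instead of the IPP primitives, so no threading runtime is involved for these tiny blocks.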

Chao_Y_Intel
Moderator
Hello vanista,

How are you linking with the Intel IPP libraries? Internal threading within IPP functions (such as ippAddC) is only enabled when you link with the IPP dynamic libraries or the threaded static libraries. If you link with the non-threaded static libraries, you get the non-threaded versions of the functions.

Thanks,
Chao
