Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

high context switching when multi threading VC1 encoder

vanista
Beginner
191 Views
I have an issue with the VC1 encoder from IPP sample code. I don't think the issue is directly related to this codec though.

If I run more encoding threads in a single process than the host has CPU cores, the number of context switches jumps to over 1 million per second (compared with about 3000 per second under normal maximum load), and CPU usage shows about 50% system overhead.

I'm using IPP v5.3.1.062

I was able to isolate some parts of the code that trigger the context-switch bomb. In particular, I identified the ippAddC function as one cause.

According to the documentation, this function is optimized for threading, so I wonder whether there could be a conflict between this and Linux kernel thread handling.
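For anyone trying to reproduce the numbers above, here is a minimal sketch of counting context switches from inside the process on Linux, using the POSIX getrusage call. The helper names (processContextSwitches, contextSwitchesDuring) are hypothetical and not part of IPP or the sample code. Dividing the result by elapsed wall-clock time gives a rate.

```cpp
#include <sys/resource.h>  // getrusage (POSIX)
#include <cassert>
#include <utility>

// Voluntary + involuntary context switches charged to this process so far.
// Returns -1 if getrusage fails.
long processContextSwitches() {
    struct rusage ru {};
    if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;
    return ru.ru_nvcsw + ru.ru_nivcsw;
}

// Hypothetical helper: runs `work` and returns how many context switches the
// process accumulated while it ran, or -1 if the counters could not be read.
template <typename Work>
long contextSwitchesDuring(Work&& work) {
    const long before = processContextSwitches();
    std::forward<Work>(work)();
    const long after = processContextSwitches();
    if (before < 0 || after < 0) return -1;
    assert(after >= before);  // the counters only ever grow
    return after - before;
}
```

The callable passed to contextSwitchesDuring would start and join the encoding threads. Running it once with as many threads as cores and once with more should make the difference visible.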


vanista
Beginner
As additional info, exactly two function calls in intra MB encoding trigger the problem (ippiSubC_16s_C1IRSfs and ippiAddC_16s_C1IRSfs):

umc_vc1_enc_picture_sm.cpp:1362

//only intra blocks:
for (blk = 0; blk<6; blk++)
{
    roiSize.height = 8;
    roiSize.width = 8;

    IPP_STAT_START_TIME(m_IppStat->IppStartTime);
    _own_Diff8x8C_16s(128, pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk], roiSize, 0);
    IPP_STAT_END_TIME(m_IppStat->IppStartTime, m_IppStat->IppEndTime, m_IppStat->IppTotalTime);

    STATISTICS_START_TIME(m_TStat->FwdQT_StartTime);
    IntraTransformQuantACFunction(pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk],
                                  DCQuant, doubleQuant);
    STATISTICS_END_TIME(m_TStat->FwdQT_StartTime, m_TStat->FwdQT_EndTime, m_TStat->FwdQT_TotalTime);
}



umc_vc1_enc_picture_sm.cpp:1449

for (blk=0;blk<6; blk++)
{
    STATISTICS_START_TIME(m_TStat->InvQT_StartTime);
    IntraInvTransformQuantACFunction(pCurMBData->m_pBlock[blk],pCurMBData->m_uiBlockStep[blk],
                                     TempBlock.m_pBlock[blk],TempBlock.m_uiBlockStep[blk],
                                     DCQuant, doubleQuant);
    STATISTICS_END_TIME(m_TStat->InvQT_StartTime, m_TStat->InvQT_EndTime, m_TStat->InvQT_TotalTime);

    IPP_STAT_START_TIME(m_IppStat->IppStartTime);
    _own_Add8x8C_16s(128, TempBlock.m_pBlock[blk],TempBlock.m_uiBlockStep[blk],roiSize,0);
    IPP_STAT_END_TIME(m_IppStat->IppStartTime, m_IppStat->IppEndTime, m_IppStat->IppTotalTime);
}

Meanwhile, I've fixed my issue by writing custom unoptimized replacement functions for those two cases.
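The replacement code itself is not shown in this thread, so the following is only a sketch of what such unoptimized functions might look like. It assumes a scale factor of 0 (the last argument in both calls above), a row step given in bytes, and results saturated to the 16-bit range. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Clamp a 32-bit intermediate result to the int16_t range.
static inline int16_t saturate16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// Hypothetical stand-in: adds `value` in place to every sample of a
// width x height block. `stepBytes` is the distance between rows in bytes.
// Assumes scale factor 0, so no scaling is applied.
void addConstInPlace16s(int16_t value, int16_t* block, int stepBytes,
                        int width, int height) {
    assert(block != nullptr);
    assert(stepBytes % static_cast<int>(sizeof(int16_t)) == 0);
    const int stride = stepBytes / static_cast<int>(sizeof(int16_t));
    for (int y = 0; y < height; ++y) {
        int16_t* row = block + y * stride;
        for (int x = 0; x < width; ++x)
            row[x] = saturate16(static_cast<int32_t>(row[x]) + value);
    }
}

// Hypothetical stand-in: subtracts `value` in place, same conventions.
void subConstInPlace16s(int16_t value, int16_t* block, int stepBytes,
                        int width, int height) {
    assert(block != nullptr);
    assert(stepBytes % static_cast<int>(sizeof(int16_t)) == 0);
    const int stride = stepBytes / static_cast<int>(sizeof(int16_t));
    for (int y = 0; y < height; ++y) {
        int16_t* row = block + y * stride;
        for (int x = 0; x < width; ++x)
            row[x] = saturate16(static_cast<int32_t>(row[x]) - value);
    }
}
```

For the 8x8 intra blocks above, the calls would pass 128 as the value and 8 for both width and height. Plain loops like these spawn no worker threads, so they cannot contribute to oversubscription.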

Chao_Y_Intel
Employee
Hello vanista,

How are you linking with the Intel IPP libraries? Internal threading within IPP functions (such as ippAddC) is only enabled when you link with the IPP dynamic libraries or the threaded static libraries. If you are using the static library, it will contain non-threaded functions.

Thanks,
Chao
