I have an issue with the VC1 encoder from the IPP sample code, although I don't think the issue is specific to this codec.
If I run more encoding threads in the same process than there are CPU cores on the host, the number of context switches jumps to over 1 million per second (compared with about 3000 per second under normal maximum load) and CPU usage shows about 50% system overhead.
I'm using IPP v5.3.1.062.
I was able to isolate some parts of the code that trigger the context-switch storm; in particular, I identified the ippAddC function as one cause.
According to the documentation this function is thread-optimized, so I wonder whether its internal threading conflicts with Linux kernel thread handling.
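For anyone trying to reproduce the numbers above, here is a minimal sketch (assuming Linux and the POSIX getrusage call) of how the process-wide context-switch counters can be sampled from inside the encoder process. The struct and function names are hypothetical and are not part of the IPP samples.

```cpp
#include <sys/resource.h>  // getrusage, RUSAGE_SELF

// Hypothetical helper type: context-switch counters for the whole process.
struct CtxSwitches {
    long voluntary;    // thread gave up the CPU (e.g. blocked on a lock or wait)
    long involuntary;  // thread was preempted by the scheduler
};

// Reads the cumulative counters for the calling process (all of its threads).
// Returns {-1, -1} if getrusage fails.
CtxSwitches read_ctx_switches() {
    struct rusage ru = {};
    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        return CtxSwitches{-1, -1};
    }
    return CtxSwitches{ru.ru_nvcsw, ru.ru_nivcsw};
}

// Difference between two samples; divide by the elapsed seconds to get a rate.
CtxSwitches ctx_switch_delta(const CtxSwitches& before, const CtxSwitches& after) {
    return CtxSwitches{after.voluntary - before.voluntary,
                       after.involuntary - before.involuntary};
}
```

Sampling once before and once after a fixed encoding interval, with and without the suspect calls, should show whether the voluntary count is the one that explodes.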
2 Replies
As additional info, exactly two function calls in intra MB encoding trigger the problem (ippiSubC_16s_C1IRSfs and ippiAddC_16s_C1IRSfs):
umc_vc1_enc_picture_sm.cpp:1362
//only intra blocks:
for (blk = 0; blk < 6; blk++)
{
    roiSize.height = 8;
    roiSize.width = 8;
    IPP_STAT_START_TIME(m_IppStat->IppStartTime);
    _own_Diff8x8C_16s(128, pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk], roiSize, 0);
    IPP_STAT_END_TIME(m_IppStat->IppStartTime, m_IppStat->IppEndTime, m_IppStat->IppTotalTime);
    STATISTICS_START_TIME(m_TStat->FwdQT_StartTime);
    IntraTransformQuantACFunction(pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk],
                                  DCQuant, doubleQuant);
    STATISTICS_END_TIME(m_TStat->FwdQT_StartTime, m_TStat->FwdQT_EndTime, m_TStat->FwdQT_TotalTime);
}
umc_vc1_enc_picture_sm.cpp:1449
for (blk = 0; blk < 6; blk++)
{
    STATISTICS_START_TIME(m_TStat->InvQT_StartTime);
    IntraInvTransformQuantACFunction(pCurMBData->m_pBlock[blk], pCurMBData->m_uiBlockStep[blk],
                                     TempBlock.m_pBlock[blk], TempBlock.m_uiBlockStep[blk],
                                     DCQuant, doubleQuant);
    STATISTICS_END_TIME(m_TStat->InvQT_StartTime, m_TStat->InvQT_EndTime, m_TStat->InvQT_TotalTime);
    IPP_STAT_START_TIME(m_IppStat->IppStartTime);
    _own_Add8x8C_16s(128, TempBlock.m_pBlock[blk], TempBlock.m_uiBlockStep[blk], roiSize, 0);
    IPP_STAT_END_TIME(m_IppStat->IppStartTime, m_IppStat->IppEndTime, m_IppStat->IppTotalTime);
}
Meanwhile, I've worked around the issue by writing custom, unoptimized replacement functions for those two calls.
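The replacement functions themselves were not posted. As a rough illustration only, a plain scalar version could look like the sketch below. It assumes the row step is given in bytes, a scale factor of 0 (as in both calls above), and saturation to the 16-bit range. All names here are hypothetical stand-ins, not the actual _own_Add8x8C_16s / _own_Diff8x8C_16s code.

```cpp
#include <cstdint>

// Hypothetical stand-in for IppiSize, so the sketch compiles without IPP headers.
struct RoiSize {
    int width;
    int height;
};

// Clamp a 32-bit intermediate result to the signed 16-bit range.
static inline int16_t saturate16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return static_cast<int16_t>(v);
}

// Adds `value` to every element of the block in place.
// stepBytes is the distance between consecutive rows, in bytes (assumption).
void add_const_16s_inplace(int16_t value, int16_t* pSrcDst, int stepBytes, RoiSize roi) {
    unsigned char* row = reinterpret_cast<unsigned char*>(pSrcDst);
    for (int y = 0; y < roi.height; ++y, row += stepBytes) {
        int16_t* p = reinterpret_cast<int16_t*>(row);
        for (int x = 0; x < roi.width; ++x) {
            p[x] = saturate16(static_cast<int32_t>(p[x]) + value);
        }
    }
}

// Subtracts `value` from every element of the block in place.
void sub_const_16s_inplace(int16_t value, int16_t* pSrcDst, int stepBytes, RoiSize roi) {
    unsigned char* row = reinterpret_cast<unsigned char*>(pSrcDst);
    for (int y = 0; y < roi.height; ++y, row += stepBytes) {
        int16_t* p = reinterpret_cast<int16_t*>(row);
        for (int x = 0; x < roi.width; ++x) {
            p[x] = saturate16(static_cast<int32_t>(p[x]) - value);
        }
    }
}
```

Because these loops run entirely in the calling thread, they cannot start any internal threading, which is the point of the workaround for 8x8 blocks.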
Hello vanista,
How are you linking with the Intel IPP libraries? Internal threading within IPP functions (such as ippAddC) is only enabled when you link with the IPP dynamic libraries or the threaded static libraries. If you are using the static library, it will include non-threaded functions.
Thanks,
Chao