Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

condition_variable wait_for is broken

aksenofo
Beginner
629 Views
tbb ver: tbb40_20111130oss

condition_variable class , method wait_for in linux broken.
wrong recalculation of time leeds to exeption generating

It take place because the result
line 403 : req.tv_nsec += static_cast( (sec - static_cast(sec))*1e9 );
may exeed 1e9 nanosec
what makes
line 421: pthread_cond_timedwait( &my_cv, lock.mutex()->native_handle(), &req ) .....
return errno 22 ( Invalid param ) and than throw an exeption

fixing patch is:

--- condition_variable 2012-01-25 09:25:13.050000085 +0300
+++ condition_variable.ORIG 2012-01-25 09:23:50.705000086 +0300
@@ -411,9 +411,6 @@
req.tv_nsec = tv.tv_usec*1000 + static_cast( (sec - static_cast(sec))*1e9 );
#endif /*(choice of OS) */

- req.tv_sec += req.tv_nsec / 1000000000;
- req.tv_nsec = req.tv_nsec % 1000000000;
-
int ec;
cv_status rc = no_timeout;
__TBB_ASSERT( lock.owns, NULL );

Good luck

0 Kudos
10 Replies
Wooyoung_K_Intel
Employee
629 Views
Thank you very much for reporting the bug.
We will incorporate the fix into the next TBB release.

0 Kudos
dulantha_f
Beginner
629 Views

I'm not sure if this is related, but I noticed that the wait_for call waits for a less time than I ask for. Following is the test code. I use 200ms for wait_for

tbb::tick_count beforeWait = tbb::tick_count::now();
m_CondVar.wait_for( lock,tbb::tick_count::interval_t(200/1000.0) );// interval_t takes time in seconds
tbb::tick_count::interval_t delayedTime = tbb::tick_count::now() - beforeWait;
std::cout<<"sleep duration:"<<delayedTime.seconds()*1000<<std::endl;

Following are 5 runs of this code:

sleep duration:198.521

sleep duration:198.866

sleep duration:198.846

sleep duration:198.771

sleep duration:198.824

Any ideas guys? And yes, I made sure notify_* methods are not called.

 

Thank you.

0 Kudos
RafSchietekat
Valued Contributor III
629 Views

How about writing a test program directly in terms of pthread_cond_timedwait()? You'll find that it isn't as accurate as the use of timespec might suggest, although it is a bit disappointing that it returns early, contrary to literal interpretation of the specification.

(2013-12-21 Added) How important is that anyway? Important enough for TBB to attempt a workaround?

0 Kudos
RafSchietekat
Valued Contributor III
629 Views

While I was looking at the source code, I noticed that CLOCK_REALTIME is used instead of CLOCK_MONOTONIC for Linux, whereas C++11 requires the use of steady_clock to implement an interval-based condition_variable, not system_clock. How about correcting that?

(Added) Also, this is not specific to Linux.

0 Kudos
dulantha_f
Beginner
629 Views

Sorry for the late reply. Was away from work and Happy new Year!!

I'm actually working on Windows. So I can play around with Windows API and see.

The program I'm working on is a Timer. Currently I'm using this_thread::sleep() which functions correctly but I need to be able to wake up the thread if someone resets the timer while the thread is sleeping. Currently I create the new thread and let the old thread die out. But with the condition variable I can wake up the thread while it's sleeping so I don't have the overhead of creating a new thread.

0 Kudos
RafSchietekat
Valued Contributor III
629 Views

So now you say (this thread started with a problem on Linux)… Maybe you could ask your vendor why you're seeing consistently early returns instead of spread evenly around the target time? I guess the appropriate workaround depends on your exact requirements: either a fixed or a self-calibrating correction, and perhaps followed by a busy-wait loop.

0 Kudos
dulantha_f
Beginner
629 Views

I tested with CONDITION_VARIABLE and CRITICAL_SECTION on Windows 7 and the results were correct. Following is the code I used:

CONDITION_VARIABLE condVar;
CRITICAL_SECTION condVarLock;

InitializeConditionVariable (&condVar);
InitializeCriticalSection (&condVarLock);
EnterCriticalSection(&condVarLock);
tbb::tick_count beforeWait = tbb::tick_count::now();
SleepConditionVariableCS(&condVar,&condVarLock,200);
tbb::tick_count::interval_t delayedTime = tbb::tick_count::now() - beforeWait;
std::cout<<"sleep duration:"<<delayedTime.seconds()*1000<<std::endl;

The code I inserted in the main() method. I didn't initialize a new thread or anything. Whereas for my previous test it was on a "thread created by TBB."

The results of running 5 times were:

199.813, 199.942, 199.92, 199.982, 199.963

Still under 200 but the difference is not as significant as using TBB.

 

0 Kudos
RafSchietekat
Valued Contributor III
629 Views

At first glance TBB uses the same functionality for its implementation (CONDITION_VARIABLE, SleepConditionVariableCS), but the timings you are getting now seem about 1 ms closer to target, which leads me to suspect a rounding issue in internal_condition_variable_wait() in src/tbb/condition_variable.cpp, where a seconds() value is multiplied with 1000 to get milliseconds and then simply cast to an integral value, which rounds toward zero instead of to the nearest integral value. The scenario seems to be something like "typedef uint32_t DWORD /*only for investigation on non-Windows environment*/; double sec = 200 / 1000.0; long long ticks_per_second = 50000/*example*/; long long value = static_cast<long long>(sec * ticks_per_second); double resolution = 1.0 / ticks_per_second; double seconds = value * resolution; DWORD duration = DWORD(seconds * 1000);". This happens to yield 200, but with 3000 substituted for 50000 I do get 199.

My suggestion is to check the value returned by QueryPerformanceFrequency() (which will be the actual ticks_per_second instead of the example value above), the scenario value on your machine, and, if you use the source distribution, the actual "duration" in internal_condition_variable_wait(): if it's not 200, perhaps you could apply roundf() or ceil() before the cast to DWORD happens. Let us know!

(2014-01-04 Added) The example value 50000 above was mentioned in documentation by Microsoft. In a Dr. Dobb's article from 2003, the author Matthew Wilson reports having observed a value of 3579545. While that particular value still produces a duration of 200, it would seem to indicate that my analysis and proposed remedy are at least plausible.

0 Kudos
RafSchietekat
Valued Contributor III
629 Views

Could somebody please follow up?

0 Kudos
dulantha_f
Beginner
629 Views

Hey Raf, I got pulled into something else at work. I'll post something as soon as I get time to play around with this.

0 Kudos
Reply