Error with offloads from multiple threads causing crash

Corey_P_
Beginner

I have found an error when performing offloads from multiple threads running on the host, which leads to a crash. I have created some test code to reproduce it. In this code, many threads are created on the host, and each of them makes offload calls repeatedly, waiting 100 milliseconds between calls. This does not cause a problem until the number of threads reaches around 140-150, at which point the program crashes with the following error (sometimes printed 2-3 times):

[plain]offload error: cannot create pipeline on the device 0 (error code 16)[/plain]

If I split the offload transfer into two calls, first one with length(1) alloc_if(1) free_if(0) and then, after a wait, one with length(0) alloc_if(0) free_if(0), it still fails. If instead there is NO memory allocation in the offload, then neither the error nor the resulting crash occurs.

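[cpp]#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

static void *offload_thread ( void *data )
{
    float *test_data = (float *) malloc ( 1*sizeof(float) );
    while ( 1 )
    {
        /* Allocate, transfer, and free one float on the coprocessor each iteration. */
        #pragma offload_transfer target(mic:0) in ( test_data : length(1) alloc_if(1) free_if(1) )
        usleep ( 100000 ); /* wait 100 ms before the next offload */
    }
    return NULL;
}

void thread_offload_test ( void )
{
    int NTHREADS = 200;
    int i;
    pthread_t off_thread[NTHREADS];
    for ( i = 0; i < NTHREADS; i++ )
        pthread_create ( &off_thread[i], NULL, offload_thread, NULL ); /* one handle per thread */
    while ( 1 ); /* keep the process alive while the threads run */
}[/cpp]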

Corey

Kevin_D_Intel
Employee

I reproduced this error and reported it to Development (internal tracking ID noted below) for further investigation. I will update this thread as I learn more.

(Internal tracking id: DPD200245751)

Kevin_D_Intel
Employee

The underlying issue is related to the maximum number of pipelines allowed, per the COI_PIPELINE_MAX_PIPELINES setting of 128 in /usr/include/intel-coi/source/COIPipeline_source.h. With NTHREADS=200, the test exceeded this limit, which resulted in the offload run-time error.

This setting was recently increased to 512 in the MPSS 3.1 release.
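
On releases that still have the lower limit, one possible workaround is to funnel the offloads through a fixed pool of host threads that stays at or below the limit. The following is only a minimal sketch, not a tested fix. It assumes that each offloading host thread accounts for one pipeline, which this thread suggests but does not confirm. MAX_OFFLOAD_THREADS, worker_arg, offload_worker and run_offload_tasks are hypothetical names for illustration, not COI or offload run-time identifiers. The offload pragma is shown as a comment so the sketch also builds with a compiler that does not support offload.

```cpp
#include <pthread.h>
#include <stdlib.h>

/* Assumed cap, taken from the limit of 128 quoted above.
   Hypothetical name, not a COI identifier. */
#define MAX_OFFLOAD_THREADS 128

typedef struct {
    int first_task;   /* first task index handled by this worker */
    int stride;       /* number of workers in the pool */
    int total_tasks;  /* total number of logical tasks */
    int *done;        /* per-task completion flags */
} worker_arg;

static void *offload_worker ( void *p )
{
    worker_arg *w = (worker_arg *) p;
    float *test_data = (float *) malloc ( sizeof(float) );
    int t;
    if ( test_data == NULL )
        return NULL;
    /* Each worker handles tasks first_task, first_task + stride, ... */
    for ( t = w->first_task; t < w->total_tasks; t += w->stride )
    {
        /* With the Intel compiler, the offload from the original post
           would go here:
           #pragma offload_transfer target(mic:0) in ( test_data : length(1) alloc_if(1) free_if(1) ) */
        w->done[t] = 1;
    }
    free ( test_data );
    return NULL;
}

/* Runs total_tasks logical tasks on at most MAX_OFFLOAD_THREADS threads.
   Returns the number of worker threads used, or -1 on failure. */
static int run_offload_tasks ( int total_tasks, int *done )
{
    int nworkers, started = 0, i;
    pthread_t *threads;
    worker_arg *args;

    if ( total_tasks <= 0 )
        return 0;
    nworkers = total_tasks < MAX_OFFLOAD_THREADS ? total_tasks : MAX_OFFLOAD_THREADS;
    threads = (pthread_t *) malloc ( (size_t) nworkers * sizeof(pthread_t) );
    args = (worker_arg *) malloc ( (size_t) nworkers * sizeof(worker_arg) );
    if ( threads == NULL || args == NULL )
    {
        free ( threads );
        free ( args );
        return -1;
    }
    for ( i = 0; i < nworkers; i++ )
    {
        args[i].first_task = i;
        args[i].stride = nworkers;
        args[i].total_tasks = total_tasks;
        args[i].done = done;
        if ( pthread_create ( &threads[i], NULL, offload_worker, &args[i] ) != 0 )
            break;
        started++;
    }
    /* Join only the threads that were actually created. */
    for ( i = 0; i < started; i++ )
        pthread_join ( threads[i], NULL );
    free ( threads );
    free ( args );
    return started == nworkers ? nworkers : -1;
}
```

With 200 logical tasks, as in the test code above, run_offload_tasks(200, done) would use 128 worker threads instead of 200. The cap would need to be adjusted to whatever COI_PIPELINE_MAX_PIPELINES is on the installed MPSS release, and it may need to be lower if other parts of the program also create pipelines.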
