OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.
Announcements
This forum covers OpenCL* for CPU only. OpenCL* for GPU questions can be asked in the GPU Compute Software forum. Intel® FPGA SDK for OpenCL™ questions can be asked in the FPGA Intel® High Level Design forum.

Excessively slow binary load times...

janez-makovsek
New Contributor I
1,381 Views
Hi!

I have checked the recent update of the OpenCL drivers (v1.5), and the loading times of compiled binaries have not improved. It takes almost 20 seconds for my application to start when loading a binary program with the Intel drivers. It starts almost instantly with the Nvidia and AMD drivers. With Intel OpenCL, there is only a small difference between the time required to compile the code and the time needed to load already precompiled binaries.

Are there any plans to improve on this?

Thanks!
Atmapuri


7 Replies
Eli_Bendersky__Intel
Hi Atmapuri,

Could you please clarify your situation in more detail? What is the compilation and execution flow you're implementing?

Can you reproduce this problem for a simple application and attach it?
Thanks in advance
janez-makovsek
New Contributor I
Dear Eli,

My application works like this:

1.) Check if compiled binaries exist.
2.) If not, load the source code, compile it, and save the compiled binaries to disk.
3.) If compiled binaries do exist, load the binaries and continue execution.

It is point #3 which takes 20 seconds, as measured. The time is spent within the following call:

Status = clBuildProgram(clProgram, 1, DeviceList, cFlags, NULL, NULL);

The clProgram is created with a call to clCreateProgramWithBinary, which returns immediately. The 20-second delay does not happen with other vendors' drivers. My code may also be unusual in terms of kernel count: it has about 500 kernels.
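For illustration, the cache check in steps 1 to 3 might be sketched roughly as follows. This is only a sketch under assumptions: the helper name load_cached_binary and the cache file path are hypothetical, and the OpenCL calls themselves are left out so that the snippet depends on nothing but the C standard library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of OpenCL): reads a cached program binary
 * from disk. Returns a malloc'd buffer that the caller must free and stores
 * its length in *size. Returns NULL if the file is missing, empty, or
 * unreadable, in which case the caller falls back to compiling from source. */
static unsigned char *load_cached_binary(const char *path, size_t *size)
{
    FILE *f;
    long len;
    unsigned char *buf;

    assert(path != NULL && size != NULL);

    f = fopen(path, "rb");
    if (f == NULL)
        return NULL;                      /* step 1: no cached binary yet */

    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    len = ftell(f);
    if (len <= 0) { fclose(f); return NULL; }
    rewind(f);

    buf = malloc((size_t)len);
    if (buf != NULL && fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);                        /* short read: treat as a cache miss */
        buf = NULL;
    }
    fclose(f);

    if (buf != NULL)
        *size = (size_t)len;
    return buf;                           /* step 3: pass this buffer to
                                             clCreateProgramWithBinary, then
                                             call clBuildProgram */
}
```

When the helper returns NULL, the application takes the step 2 path (compile from source and save the binaries). Otherwise the buffer goes to clCreateProgramWithBinary, and the clBuildProgram call above is where the 20 seconds are spent.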

Thanks!
Atmapuri
Eli_Bendersky__Intel
Atmapuri,
Thanks. We'll check this issue and come back to you when we have more information.
Eli
Boaz_O_Intel
Employee

Hi Atmapuri,

The binaries which are returned are not executables but rather an intermediate form. This means that when you build the program from these binaries, we have to recompile them all the way to device executables.

To validate my "theory", and to make sure there isn't another issue which needs further investigation, I would like to kindly ask you to take another measurement. The measurement should cover the time it takes you to compile the sources initially (step 2 in your scenario, where the binaries don't exist yet). Make sure you measure only the program build and not the I/O of saving to disk.
If I am correct, the result should be >= 20 seconds.
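One way to isolate that measurement, sketched under assumptions: wrap only the build step in a wall-clock timer and keep the file I/O outside it. The names time_call and fake_build below are hypothetical. fake_build is just a stand-in for a function that would call clBuildProgram, and timespec_get requires a C11 standard library.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: returns the wall-clock seconds spent inside fn(arg).
 * Uses C11 timespec_get; TIME_UTC is not a monotonic clock, which is
 * acceptable for measurements on the order of seconds. */
static double time_call(void (*fn)(void *), void *arg)
{
    struct timespec t0, t1;

    assert(fn != NULL);
    timespec_get(&t0, TIME_UTC);
    fn(arg);                              /* only this call is timed */
    timespec_get(&t1, TIME_UTC);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

/* Stand-in for the real build step. In the application this would call
 * clBuildProgram and nothing else; here it only counts invocations. */
static void fake_build(void *arg)
{
    int *calls = arg;
    ++*calls;
}
```

Saving the binaries to disk would happen after time_call returns, so that the reported figure covers the build alone.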

Please let me know what the results are so that we can proceed with the investigation.

Thanks,
Boaz

janez-makovsek
New Contributor I
Dear Boaz,

Here are some compile times for my sources:

1.) Nvidia OpenCL: 1 second
2.) AMD HD5770: 20 seconds
3.) AMD CPU: 35 seconds
4.) Intel CPU: 60 seconds

Binary load times:

1.) Nvidia OpenCL: 1 second
2.) AMD HD5770: 1 second
3.) AMD CPU: 1 second
4.) Intel CPU: 20 seconds

So you are correct that the binaries are actually loaded and used, but the Intel binary load times are by far the worst in the industry (as are the compile times). If kernels are independent of each other, it would also be possible to run the compilation in parallel on all available cores. Currently, with Intel I see two full cores being used during compilation, and only one with the rest of the group.

Thanks!
Atmapuri
Boaz_O_Intel
Employee
Hi Atmapuri,

Thanks for the feedback. We will need to work on this and improve our compilation times.
One more question: would improving our binary load times to 1 second resolve your issue?


Thanks,
Boaz
janez-makovsek
New Contributor I
By all means : )

Thanks!
Atmapuri