OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.

mic_server dies after consuming a lot of memory

moises_v_
Beginner
625 Views

Hi, I have an OpenCL simulation program whose main loop launches 4 kernels per iteration. A run can last hours.

I've run this same application on Nvidia Fermi, ATI Radeon HD, Intel CPU X5650, Intel CPU E5... Now I'm running it on Xeon Phi.

The problem is: I run the application on the Xeon Phi node, using the Xeon Phi as an OpenCL device (ACCELERATOR device type). About two minutes into the run, the mic_server process starts to consume more and more memory (RES memory in the Linux top command), and when this memory reaches 1 GB the mic_server process dies. The compiler is ICPC 13.1.1 and the Intel OpenCL version is 1.2-3.2.1.16712.
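For anyone who wants to log this growth instead of watching top, here is a minimal sketch, assuming a Linux system where /proc/<pid>/status is readable for the mic_server process. The function names (parse_vm_rss_kb, read_vm_rss_kb, growth_bytes_per_iteration) are hypothetical helpers written for illustration, not part of any Intel tool, and the PID has to be looked up separately (for example with pgrep).

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Parse the "VmRSS:" line from the text of /proc/<pid>/status.
// Returns the resident set size in kB (the value top shows as RES),
// or -1 if the line is missing or malformed.
long parse_vm_rss_kb(const std::string& status_text) {
    std::istringstream in(status_text);
    std::string line;
    while (std::getline(in, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;
            return fields ? kb : -1;
        }
    }
    return -1;
}

// Read the resident set size of a process by PID (Linux only).
// The PID is an assumption supplied by the caller.
long read_vm_rss_kb(int pid) {
    std::ifstream f("/proc/" + std::to_string(pid) + "/status");
    if (!f) return -1;
    std::stringstream buf;
    buf << f.rdbuf();
    return parse_vm_rss_kb(buf.str());
}

// Average growth in bytes per loop iteration between two RSS samples.
double growth_bytes_per_iteration(long rss_before_kb, long rss_after_kb,
                                  long iterations) {
    assert(iterations > 0);
    return (rss_after_kb - rss_before_kb) * 1024.0 / iterations;
}
```

Calling read_vm_rss_kb every few thousand iterations and passing two samples to growth_bytes_per_iteration gives a rough per-iteration growth figure, which can help tell a steady leak from a one-time allocation.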

Has anyone had this same problem? I'd appreciate any help.

Thank you in advance

 

4 Replies
Yuri_K_Intel
Employee

Hi,

What version of MPSS are you using? Please note that the officially supported version for this release is MPSS 3.1.1.

If this doesn't help (or you're already using exactly this version), I'd like to ask for a reproducer for this issue. It could be either your entire application or a minimal, stripped-down version that exposes the problem.

Thanks, Yuri

moises_v_
Beginner

Thank you Yuri,

The MPSS version is 3.1.1. I forgot to mention that for the smallest problem size there are 160K iterations (4 kernels per iteration), and the problem appears after about 80K iterations. There are no mallocs or any other memory allocations inside the iterations.

I will try to reproduce the problem, but I don't have time today :-/ Probably at the weekend.

Thank you again :)

moises_v_
Beginner

Hi Yuri,

I've attached an example similar to my simulator. It uses the C++ OpenCL wrapper. I've simplified the program a lot: now, in each iteration of the loop, I only launch one kernel and transfer 4 bytes from device to host. The code includes a makefile and a dummy kernel that does nothing.

Currently, the execution of this code aborts at 544K iterations. The cause is that mic_server consumes more and more memory until it dies.

Could it be that the driver fails to free temporary memory that is probably used by clEnqueueReadBuffer?

https://www.dropbox.com/s/ropi10clyhgg48i/moises_break.tar.gz

Thank you for the help,

 

Moisés Viñas Buceta

http://gac.udc.es/~moises/index_en.html

 

 

Yuri_K_Intel
Employee
Hi Moises,

This is just to inform you that the issue has been fixed and the fix will be available in our next release (14.2) later this year.

Thanks, Yuri