I call the DGEMM function inside an offload region to perform matrix multiplication:
#pragma offload target(mic) in(tempH, tempW, idx, buf : ALLOC FREE) \
        in(TransA, TransB, K, alpha, beta : REUSE RETAIN) \
        in(startA : length(0) REUSE RETAIN) \
        in(dB : length(0) REUSE RETAIN) \
        in(startC : length(0) REUSE RETAIN)
dgemm(&TransA, &TransB,
      &tempH, &tempW, &K, &alpha, startA, &tempH,
      dB, &K, &beta, startC, &tempH);
startA, dB and startC are three arrays that were previously transferred to the MIC with offload_transfer (REUSE RETAIN).
Although the buffers are already resident on the card, the offload above appears to reload them and allocate new memory for them on the MIC.
I wrote a simple matrix multiplication function and replaced dgemm with it. With this function, the memory reallocation problem did not occur.
I would appreciate it if someone could explain why running the offload with DGEMM results in memory reallocation and how the problem can be resolved.
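The replacement function itself is not shown in the post. For anyone trying to reproduce the comparison, here is a minimal sketch of what such a stand-in might look like, assuming column-major storage and the no-transpose case only. The name naive_dgemm and its by-value argument list are hypothetical; they mirror the order of the dgemm arguments above (m, n, k, alpha, A, lda, B, ldb, beta, C, ldc) but are not the BLAS interface.

```c
#include <stddef.h>

/* Hypothetical stand-in for dgemm (not the BLAS routine):
 * computes C = alpha * A * B + beta * C for column-major matrices,
 * no-transpose case only.
 * A is m x k with leading dimension lda,
 * B is k x n with leading dimension ldb,
 * C is m x n with leading dimension ldc. */
void naive_dgemm(int m, int n, int k, double alpha,
                 const double *a, int lda,
                 const double *b, int ldb,
                 double beta, double *c, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += a[i + (size_t)p * lda] * b[p + (size_t)j * ldb];

            double *cij = &c[i + (size_t)j * ldc];
            /* When beta is zero, C is overwritten without being read. */
            *cij = (beta == 0.0) ? alpha * sum : alpha * sum + beta * (*cij);
        }
    }
}
```

With the variables from the offload above, the call would look like naive_dgemm(tempH, tempW, K, alpha, startA, tempH, dB, K, beta, startC, tempH). A plain triple loop like this allocates nothing internally, so it is only a baseline for the memory comparison, not a performance substitute for DGEMM.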
Rajiv Deodhar (Intel) wrote:
Could you explain how you concluded that startA, dB and startC have new memory allocated for them?
Hi Mr. Rajiv Deodhar,
I found that it is the DGEMM kernel itself that is consuming the Xeon Phi's memory. I set the OFFLOAD_REPORT environment variable and found that the program does not allocate new memory for the buffers. However, the DGEMM kernel uses more and more memory each time it is called.
I cannot explain the strange behavior of dgemm. I would appreciate it if you could help me to understand why it happens.
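One way to put numbers on the growth described above is to log the resident memory of the process around each DGEMM call. The following is a minimal sketch, assuming a Linux /proc filesystem is available where the code runs. The helper names parse_vm_rss_line and current_rss_kb are hypothetical and not part of any Intel library.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of /proc/self/status.
 * Returns the VmRSS value in kB, or -1 if this is not a VmRSS line. */
long parse_vm_rss_line(const char *line)
{
    long kb;
    if (strncmp(line, "VmRSS:", 6) != 0)
        return -1;
    if (sscanf(line + 6, "%ld", &kb) != 1)
        return -1;
    return kb;
}

/* Return the resident set size of the calling process in kB,
 * or -1 if /proc/self/status cannot be read or has no VmRSS line. */
long current_rss_kb(void)
{
    char line[256];
    long kb = -1;
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return -1;
    while (fgets(line, sizeof line, f) != NULL) {
        kb = parse_vm_rss_line(line);
        if (kb >= 0)
            break;
    }
    fclose(f);
    return kb;
}
```

To measure the coprocessor side rather than the host, the helpers would have to be compiled for the card (for example by marking them with the compiler's target(mic) attribute) and called inside the offload region before and after dgemm. Printing the difference for each iteration would show whether the growth is steady or levels off after the first few calls.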