Intel® oneAPI DL Framework Developer Toolkit
Gain insights from peers and Intel experts to develop new deep learning frameworks or to customize a framework utilizing common APIs.

oneDNN is forcibly terminated when inference is executed

k_higashi
Beginner
4,053 Views

Hi.

I modified the sample code cnn_inference_f32.cpp so that the network
contains only a single convolution.

I also moved the part that creates the primitives into its own function.

 

<Problem>

The following issues occur:

If I specify a CPU device, it will run to the end.

C:\Test>.\_build\cnn-inference-f32-cpp.exe cpu
execute,start,end
Use time: 206 ms per iteration.
Example passed on CPU.

 

However, if I specify a GPU device, the process is terminated partway through.

C:\Test>.\_build\cnn-inference-f32-cpp.exe gpu
execute,start,

My investigation shows that it terminates at the following line:

net.at(i).execute(s, net_args.at(i));

 

What am I doing wrong? Or is this a library bug?
Please advise me on the cause and a solution.

I have attached the code that can reproduce the problem.

 

<Information>
OS: Windows 10 Pro (21H1)
Toolkit:
Intel oneAPI 2021.3
cmake: ver.3.19.2
ninja: ver.1.8.2

CPU: Intel Core i7-1065G7 1.3GHz
Accelerator: Iris Xe Graphics
driver ver.: 27.20.100.9664

Best regards.

0 Kudos
1 Solution
Jianyu_Z_Intel
Employee
3,551 Views

Hi,

My answers to your questions are inline below:

Hello,

I was able to run it on the GPU using the code I got from you.

Thank you!

I want a safe implementation, so I have a question.

Question:

Do I need to keep only the dnnl::memory variables?

[Intel] No. The updated sample code also keeps the engine, stream, net, and net_args.

dnnl::memory is the key variable that led to the crash in the original code.

For example, at L136:

net.push_back(convolution_forward(conv1_prim_desc));

The local variable conv1_prim_desc is used as an argument in the push_back() call.

Is it safe for conv1_prim_desc to be a local variable?

[Intel] Yes.

My first reply yesterday included wrong information, and I have removed that post.

convolution_forward(conv1_prim_desc) does not include data allocated on the device (GPU), so it does not need to be a global variable. net.push_back() will copy it into the new element, as you said.

But dnnl::memory includes data on the device (GPU), so it must be global.

Best regards.

 


0 Kudos
9 Replies
AthiraM_Intel
Moderator
4,027 Views

Hi,


Thanks for reaching out to us.


We are trying to reproduce the issue and will get back to you soon with updates.


Thanks.


0 Kudos
AthiraM_Intel
Moderator
3,987 Views

Hi,


We were able to reproduce the issue using your code and are checking on it internally.



Thanks


0 Kudos
k_higashi
Beginner
3,933 Views

Hello.

I am still waiting for your response regarding this issue.
I would like to know the cause of this issue and how to deal with it.

Best regards.

0 Kudos
k_higashi
Beginner
3,792 Views

Hello.

Please let us know the current status regarding this issue.

0 Kudos
Jianyu_Z_Intel
Employee
3,698 Views

Hi k_higashi,

I found the root cause of the crash: the original code uses the API incorrectly.

In the function create_net(), net.push(xxx) is called, but xxx is a local variable of create_net().

When execution leaves create_net(), xxx is released.

After that, calling net.at(i).execute() crashes when it accesses xxx, which is no longer available.

However, the crash may not be triggered when running on the CPU or on some GPUs.

In some cases the memory that held xxx is not overwritten after create_net() returns, so net.at(i).execute() still reads correct data at xxx's address.

That is only luck.

If the system is busy, or on other hardware, the memory of xxx is overwritten soon and the crash appears frequently.

Solution 1:

  Change the code so that net.push() and net.execute() are called in the same function. The local variable is then only used while it is still alive, so there is no crash.

Solution 2:

  Define xxx as a global variable.
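
The lifetime issue described above can be illustrated without oneDNN. The following is only a minimal C++ sketch under stated assumptions, not the attached sample: Net, Step, and this version of create_net() and execute() are hypothetical stand-ins, and a plain std::vector<float> plays the role of the data the network reads. It shows one alternative to global variables: everything that execute() touches is owned by an object that outlives create_net().

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

// Hypothetical stand-in, NOT a oneDNN type: a step is just a callable.
using Step = std::function<float()>;

// Owns everything the steps touch, so the data outlives create_net().
struct Net {
    std::vector<float> input;  // buffer the steps read from
    std::vector<Step> steps;   // plays the role of `net` in the thread
};

// Builds the pipeline once. The step captures a pointer to the heap-allocated
// Net (whose address stays stable), not to a local variable of this function.
std::unique_ptr<Net> create_net(std::size_t n) {
    auto net = std::make_unique<Net>();
    net->input.assign(n, 1.0f);
    Net* self = net.get();
    net->steps.push_back([self] {
        float sum = 0.0f;
        for (float v : self->input) sum += v;
        return sum;
    });
    return net;
}

// Runs every step and returns the result of the last one.
float execute(const Net& net) {
    assert(!net.steps.empty());
    float last = 0.0f;
    for (const Step& s : net.steps) last = s();
    return last;
}
```

A caller would keep the returned std::unique_ptr<Net> alive for as long as it calls execute(*net). If the step had instead captured a pointer to a buffer declared locally inside create_net(), that pointer would dangle after the function returned, which is the kind of failure described in this post.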

I remember making the same mistake when I was learning oneDNN.

It took me a while to recall that mistake.

 Thank you!

 

0 Kudos
Jianyu_Z_Intel
Employee
3,613 Views

Hi,

Your feedback did not appear in the community thread, so I have only just seen it.

Question 1:

>In function create_net(), net.push(xxx) are called. But the variable xxx is local variable of create_net().

Does "net.push(xxx)" mean exactly "net.push_back(xxx)"?

[Intel] Yes, it is pseudocode. In your code, it is net.push_back(xxx).

Question 2:

> Solution 2:

>   Define the xxx as global variable.

Are the following measures correct?

For example

net.push_back(convolution_forward(conv1_prim_desc));

-> I define "conv1_prim_desc" as a global variable.

For example

net.push_back(reorder(conv1_dst_memory, user_dst_memory));

-> I define "conv1_dst_memory" and "user_dst_memory" as global variables.

[Intel] Yes

---------------

Question 3:

I have doubts about the cause of the crash.

Regarding the specification of std::vector's push_back(),

I think "net.push_back(xxx)" reallocates memory for net and copies the value of xxx to the end of net.

I think the "net" variable retains its value even after the original local variable xxx is released.

Therefore, I think execute uses net, and the address of xxx is not referenced directly.

What am I doing wrong?

I hope for good advice.

[Intel] If net.push_back() called a deep-copy method of xxx, that could avoid this issue on the CPU. But I guess xxx does not provide a deep copy in most cases.

In the GPU case, xxx would include member variables that refer to GPU (device) memory, so implementing a deep copy of xxx is somewhat complex. Nobody likes to implement a deep copy on the GPU; it wastes memory and time.

It is hard to rely on std::vector to copy the variable in push_back().

Global variables, static variables, or members of a long-lived instance are good ways to keep the data alive.
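
As background for the push_back() discussion above, here is a minimal sketch of what std::vector::push_back does with a handle-like type. Handle is a hypothetical struct invented for this illustration, not a oneDNN class; whether a particular oneDNN object behaves like this should be checked against the oneDNN documentation. In the sketch, push_back() copies the handle (a shallow copy), and because the handle shares ownership through std::shared_ptr, the copy stored in the vector keeps the underlying data alive after the local variable is destroyed.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical handle type, NOT a oneDNN class: copies share one resource.
struct Handle {
    std::shared_ptr<std::vector<float>> data;
};

std::vector<Handle> make_handles() {
    std::vector<Handle> out;
    Handle local{std::make_shared<std::vector<float>>(
        std::vector<float>{2.0f, 2.0f, 2.0f})};
    out.push_back(local);  // copies the handle, not the floats
    assert(local.data.use_count() == 2);  // `local` and the vector element
    return out;  // `local` is destroyed; the vector's copy still owns the data
}
```

If Handle held a raw, non-owning pointer instead of a std::shared_ptr, the same push_back() would copy only the pointer, and the vector element would dangle once the pointed-to object went out of scope.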

--- What I want to achieve ---

I want the inference execution time to be as short as possible.

The [Create network] step takes a considerable amount of time even when the oneDNN cache is used.

So I wondered whether I could run [Create network] in advance as a create_net function.

I want inference to run only the [Execute model] step.

[Intel] Yes, I understand your goal.

I have updated the sample code to meet your requirement; please refer to it.

I will publish it in the next post.

Thank you!
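
To check that only the [Execute model] step is being measured once the network is built in advance, a small timing helper can be used. This is a minimal sketch with standard C++ only; time_per_iteration_ms is a hypothetical helper name, and the callable passed to it is assumed to wrap whatever runs the already-created network.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Returns the mean wall-clock time in milliseconds of one call to `run`.
// Setup work (building the network) is assumed to happen before this call,
// so it is excluded from the measurement.
double time_per_iteration_ms(const std::function<void()>& run, int iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;
    const auto begin = clock::now();
    for (int i = 0; i < iterations; ++i) run();
    const auto end = clock::now();
    const std::chrono::duration<double, std::milli> total = end - begin;
    return total.count() / iterations;
}
```

For asynchronous devices, the callable would also need to wait for the work to finish before returning, otherwise only the submission time is measured.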


0 Kudos
Jianyu_Z_Intel
Employee
3,612 Views

Here is the sample code, updated to meet your requirement. Please refer to it.

0 Kudos
Jianyu_Z_Intel
Employee
3,467 Views

Hi,

Thanks for accepting our solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.


0 Kudos