idata
Community Manager

mvNCProfile misunderstand PRelu in Caffe prototxt

Hi,

I'm trying to compile Sphereface into a graph with NCAPI v1, but the compiled graph is wrong. mvNCCheck reports that the results are NaNs, and mvNCProfile shows a model structure that differs from the prototxt: the Eltwise layer should combine the two PReLU-activated conv outputs, but after compilation it sums the two conv layers directly, skipping the PReLU layers. You can see the difference in the images below.

However, ReLU works fine in the Caffe versions of AlexNet and GoogLeNet. Is there something special about the PReLU layer? Could anyone tell me how to fix this? The result is always NaN, and I suspect it's because the Eltwise layers always add the two conv outputs without the PReLU in between.

Thank you!

Part of the prototxt is below; the full file is at https://github.com/wy1iu/sphereface/blob/master/train/code/sphereface_deploy.prototxt

 

name: "SpherefaceNet-20"
input: "data"
input_shape { dim: 1 dim: 3 dim: 112 dim: 96 }

######## CNN Architecture
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 2
    pad: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_1"
  type: "PReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "conv1_3"
  type: "Convolution"
  bottom: "conv1_2"
  top: "conv1_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_3"
  type: "PReLU"
  bottom: "conv1_3"
  top: "conv1_3"
}
layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "conv1_1"
  bottom: "conv1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}
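To see numerically why summing the raw conv outputs instead of the activated ones changes the result, here is a minimal pure-Python sketch. The slope (0.25) and the tensor values are made up for illustration; they are not the trained Sphereface weights.

```python
def prelu(x, slope=0.25):
    """PReLU: identity for non-negative inputs, slope * x for negative ones."""
    return [v if v >= 0 else slope * v for v in x]

conv1_1 = [0.8, -1.2, 0.3, -0.5]   # stand-in for the conv1_1 output blob
conv1_3 = [-0.4, 0.9, -1.1, 0.2]   # stand-in for the conv1_3 output blob

# What the prototxt intends: Eltwise SUM of the two activated blobs.
intended = [a + b for a, b in zip(prelu(conv1_1), prelu(conv1_3))]

# What the compiled graph apparently does: sum the raw conv outputs.
compiled = [a + b for a, b in zip(conv1_1, conv1_3)]

print(intended == compiled)  # False: they differ wherever an input is negative
```

A single wrong sum would only give wrong values, not NaNs, but plausibly feeding unactivated sums through twenty residual blocks lets the magnitudes grow until they overflow to inf/NaN.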
idata
Community Manager

I just changed the relu layer from

layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}

to

layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "relu1_2"
}

and changed the eltwise layer from

layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "conv1_1"
  bottom: "conv1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}

to

layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "relu1_1"
  bottom: "relu1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}

It works, at least there are no NaNs now, though the accuracy is much lower. I'd still like to know whether this happens only with PReLU, since ReLU works well in GoogLeNet.
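For anyone applying the same workaround, the renaming can be done mechanically. Below is a toy Python sketch: the layers are modelled as plain dicts rather than real Caffe protobuf messages, and the helper name `break_inplace_prelu` is my own invention. It gives every in-place PReLU its own top name and repoints every later bottom at the activation blob.

```python
def break_inplace_prelu(layers):
    """Give each in-place PReLU its own top name and repoint later bottoms."""
    renamed = {}  # overwritten blob name -> activation blob name
    fixed = []
    for layer in layers:
        layer = dict(layer)  # keep the input list untouched
        # Any bottom that referred to a blob a PReLU overwrote in place
        # now refers to the activation blob instead.
        layer["bottom"] = [renamed.get(b, b) for b in layer["bottom"]]
        if layer["type"] == "PReLU" and layer["top"] in layer["bottom"]:
            renamed[layer["top"]] = layer["name"]  # e.g. conv1_2 -> relu1_2
            layer["top"] = layer["name"]
        fixed.append(layer)
    return fixed

layers = [
    {"name": "conv1_1", "type": "Convolution", "bottom": ["data"],    "top": "conv1_1"},
    {"name": "relu1_1", "type": "PReLU",       "bottom": ["conv1_1"], "top": "conv1_1"},
    {"name": "conv1_2", "type": "Convolution", "bottom": ["conv1_1"], "top": "conv1_2"},
    {"name": "relu1_2", "type": "PReLU",       "bottom": ["conv1_2"], "top": "conv1_2"},
    {"name": "conv1_3", "type": "Convolution", "bottom": ["conv1_2"], "top": "conv1_3"},
    {"name": "relu1_3", "type": "PReLU",       "bottom": ["conv1_3"], "top": "conv1_3"},
    {"name": "res1_3",  "type": "Eltwise",     "bottom": ["conv1_1", "conv1_3"], "top": "res1_3"},
]

print(break_inplace_prelu(layers)[-1]["bottom"])  # ['relu1_1', 'relu1_3']
```

Note that, unlike the hand edit shown above, this also repoints the bottoms of conv1_2 and conv1_3 at the renamed activation blobs, which is needed to keep the whole graph consistent once the PReLUs are no longer in place.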
