idata
Community Manager

mvNCProfile misunderstand PRelu in Caffe prototxt

Hi,

I'm trying to compile Sphereface into a graph with NCAPI v1, but the compiled graph is wrong. mvNCCheck reports that the results are NaNs, and mvNCProfile shows a model structure that differs from the prototxt: the Eltwise layer should combine the two PReLU-activated conv outputs, but after compilation it sums the two conv layers directly, skipping the PReLU layers. You can see the difference in the images below.

However, ReLU works fine in the Caffe versions of AlexNet and GoogLeNet. Is there something special about the PReLU layer? Could anyone tell me how to fix this? The result is always NaN, and I suspect it's because the Eltwise layers always add the two conv outputs without the PReLU in between.

Thank you!

Part of the prototxt is below; the full file is at https://github.com/wy1iu/sphereface/blob/master/train/code/sphereface_deploy.prototxt

 

name: "SpherefaceNet-20"
input: "data"
input_shape { dim: 1 dim: 3 dim: 112 dim: 96 }

######## CNN Architecture
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 2
    pad: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_1"
  type: "PReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "conv1_3"
  type: "Convolution"
  bottom: "conv1_2"
  top: "conv1_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_3"
  type: "PReLU"
  bottom: "conv1_3"
  top: "conv1_3"
}
layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "conv1_1"
  bottom: "conv1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}
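To see numerically why summing the raw conv outputs instead of the activated ones changes the result, here is a minimal pure-Python sketch. The slope (0.25) and the tensor values are made up for illustration; they are not the trained Sphereface weights.

```python
def prelu(x, slope=0.25):
    """PReLU: identity for non-negative inputs, slope * x for negative ones."""
    return [v if v >= 0 else slope * v for v in x]

conv1_1 = [0.8, -1.2, 0.3, -0.5]   # stand-in for the conv1_1 output blob
conv1_3 = [-0.4, 0.9, -1.1, 0.2]   # stand-in for the conv1_3 output blob

# What the prototxt intends: Eltwise SUM of the two activated blobs.
intended = [a + b for a, b in zip(prelu(conv1_1), prelu(conv1_3))]

# What the compiled graph apparently does: sum the raw conv outputs.
compiled = [a + b for a, b in zip(conv1_1, conv1_3)]

print(intended == compiled)  # False: they differ wherever an input is negative
```

A single wrong sum would only give wrong values, not NaNs, but plausibly feeding unactivated sums through twenty residual blocks lets the magnitudes grow until they overflow to inf/NaN.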
idata
Community Manager

I just changed the relu layer from

layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}

to

layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "relu1_2"
}

and changed the eltwise layer from

layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "conv1_1"
  bottom: "conv1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}

to

layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "relu1_1"
  bottom: "relu1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}

It works, at least there are no NaNs now, though the accuracy is much lower. I'd still like to know whether this happens only with PReLU, since ReLU works well in GoogLeNet.
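For anyone applying the same workaround, the renaming can be done mechanically. Below is a toy Python sketch: the layers are modelled as plain dicts rather than real Caffe protobuf messages, and the helper name `break_inplace_prelu` is my own invention. It gives every in-place PReLU its own top name and repoints every later bottom at the activation blob.

```python
def break_inplace_prelu(layers):
    """Give each in-place PReLU its own top name and repoint later bottoms."""
    renamed = {}  # overwritten blob name -> activation blob name
    fixed = []
    for layer in layers:
        layer = dict(layer)  # keep the input list untouched
        # Any bottom that referred to a blob a PReLU overwrote in place
        # now refers to the activation blob instead.
        layer["bottom"] = [renamed.get(b, b) for b in layer["bottom"]]
        if layer["type"] == "PReLU" and layer["top"] in layer["bottom"]:
            renamed[layer["top"]] = layer["name"]  # e.g. conv1_2 -> relu1_2
            layer["top"] = layer["name"]
        fixed.append(layer)
    return fixed

layers = [
    {"name": "conv1_1", "type": "Convolution", "bottom": ["data"],    "top": "conv1_1"},
    {"name": "relu1_1", "type": "PReLU",       "bottom": ["conv1_1"], "top": "conv1_1"},
    {"name": "conv1_2", "type": "Convolution", "bottom": ["conv1_1"], "top": "conv1_2"},
    {"name": "relu1_2", "type": "PReLU",       "bottom": ["conv1_2"], "top": "conv1_2"},
    {"name": "conv1_3", "type": "Convolution", "bottom": ["conv1_2"], "top": "conv1_3"},
    {"name": "relu1_3", "type": "PReLU",       "bottom": ["conv1_3"], "top": "conv1_3"},
    {"name": "res1_3",  "type": "Eltwise",     "bottom": ["conv1_1", "conv1_3"], "top": "res1_3"},
]

print(break_inplace_prelu(layers)[-1]["bottom"])  # ['relu1_1', 'relu1_3']
```

Note that, unlike the hand edit shown above, this also repoints the bottoms of conv1_2 and conv1_3 at the renamed activation blobs, which is needed to keep the whole graph consistent once the PReLUs are no longer in place.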
