Hi,
I'm trying to compile SphereFace into a graph with NCAPI v1, but something goes wrong during compilation. mvNCCheck reports NaN results, and mvNCProfile shows a model structure that does not match the intended one: each Eltwise layer should combine two PReLU-activated conv outputs, but after compilation it sums the two conv layers directly, skipping the PReLU layers. You can see the difference in the images below.
ReLU works fine in the Caffe versions of AlexNet and GoogLeNet, so is there something special about PReLU? Could anyone tell me how to fix this? The result is always NaN, which I suspect is because the Eltwise layers add the two conv outputs without the PReLU.
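To make the suspected failure concrete, here is a toy numeric sketch (made-up values, and 0.25 is just a hypothetical PReLU slope) of the difference between summing the PReLU-activated tensors and summing the raw conv outputs:

```python
def prelu(x, a=0.25):
    # PReLU: f(x) = x for x > 0, a * x otherwise (a is a learned slope)
    return x if x > 0 else a * x

conv1_1 = [1.0, -2.0]   # pretend conv1_1 outputs
conv1_3 = [-0.5, 3.0]   # pretend conv1_3 outputs

# What the Eltwise SUM should see: the PReLU-activated tensors
expected = [prelu(p) + prelu(q) for p, q in zip(conv1_1, conv1_3)]

# What the compiled graph appears to compute: raw conv outputs, PReLU skipped
actual = [p + q for p, q in zip(conv1_1, conv1_3)]

print(expected)  # [0.875, 2.5]
print(actual)    # [0.5, 1.0]
```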
Thank you!
Part of the prototxt is below; the full file is at https://github.com/wy1iu/sphereface/blob/master/train/code/sphereface_deploy.prototxt
name: "SpherefaceNet-20"
input: "data"
input_shape {
  dim: 1
  dim: 3
  dim: 112
  dim: 96
}
######## CNN Architecture
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 2
    pad: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_1"
  type: "PReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "conv1_3"
  type: "Convolution"
  bottom: "conv1_2"
  top: "conv1_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_3"
  type: "PReLU"
  bottom: "conv1_3"
  top: "conv1_3"
}
layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "conv1_1"
  bottom: "conv1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}
I changed the PReLU layers so that they no longer write in place, from

layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}

to

layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "relu1_2"
}
and changed the Eltwise layer from

layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "conv1_1"
  bottom: "conv1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}

to

layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "relu1_1"
  bottom: "relu1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}
That works, at least there are no more NaNs, but the accuracy is much lower. I suspect the drop is because the conv layers that followed each in-place PReLU still take the raw conv outputs as bottoms unless they are rewired as well. I still want to know whether this only happens with PReLU, since ReLU works fine in GoogLeNet.
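For reference, here is a sketch (my guess at the full rewiring, not a verified fix) of how the first residual block might look when every layer writes to a unique top, with the downstream conv bottoms rewired as well, not just the Eltwise:

```
layer {
  name: "relu1_1"
  type: "PReLU"
  bottom: "conv1_1"
  top: "relu1_1"      # unique top instead of in-place
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "relu1_1"   # rewired to the PReLU output
  top: "conv1_2"
  # ... params as in the original ...
}
layer {
  name: "relu1_2"
  type: "PReLU"
  bottom: "conv1_2"
  top: "relu1_2"
}
layer {
  name: "conv1_3"
  type: "Convolution"
  bottom: "relu1_2"   # rewired to the PReLU output
  top: "conv1_3"
  # ... params as in the original ...
}
layer {
  name: "relu1_3"
  type: "PReLU"
  bottom: "conv1_3"
  top: "relu1_3"
}
layer {
  name: "res1_3"
  type: "Eltwise"
  bottom: "relu1_1"
  bottom: "relu1_3"
  top: "res1_3"
  eltwise_param {
    operation: 1
  }
}
```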