GNA Plugin error while loading model with single affine layer

Karol_D_Intel · 09-13-2019

Hi All,

I'm trying to run inference on a very simple model using the OpenVINO GNA plugin. The model has only one affine layer, defined in Keras as a Dense layer holding an 8x8 matrix of weights. All I want to do is multiply the 1x8 input vector by this 2D matrix of weights.

I was able to convert this model to IR. However, when I try to load it into the GNA plugin, I get the following error:

Exception: [GNAPlugin] in function void GNAPluginNS::GNAPlugin::LoadNetwork(InferenceEngine::ICNNNetwork &): The plugin does not support networks with MIXED format. Supported network precisions are FP32, FP16

I'm a little confused, because everything in my Keras model is defined as float32:

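    # Imports assumed here (tf.keras); the original post did not show them.
    import numpy as np
    import tensorflow as tf
    from tensorflow.keras.layers import Dense, Input
    from tensorflow.keras.models import Model

    # Fill the 8x8 weight matrix with the values 1..64, row by row.
    matrix = np.ones(shape=[8, 8], dtype=np.float32)
    a = 1
    for i in range(8):
        for j in range(8):
            matrix[i][j] = a
            a += 1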

    bias = np.zeros(8, dtype=np.float32)

    input_y_float = Input(shape=(1, 8, ), dtype=tf.float32, name="input_y_float")

    matmul_weights_out = Dense(units=8, activation='linear', weights=(matrix, bias.reshape(8, )),
                               trainable=False)(input_y_float)

    model = Model(inputs=input_y_float, outputs=matmul_weights_out)
    model.compile(optimizer='Adam', loss='mean_squared_error')
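As a sanity reference for what the Dense layer above is meant to compute, here is a minimal NumPy sketch of the same affine operation (input times weights plus bias). The input vector of ones and the helper name `affine` are assumptions made for illustration only; they are not part of the original model.

```python
import numpy as np

# 8x8 weight matrix holding 1..64 row by row, mirroring the Keras snippet above.
weights = np.arange(1, 65, dtype=np.float32).reshape(8, 8)
bias = np.zeros(8, dtype=np.float32)


def affine(x, w=weights, b=bias):
    """Return x @ w + b, i.e. what a linear Dense layer computes."""
    return x @ w + b


# Hypothetical test input: a 1x8 row vector of ones.
x = np.ones((1, 8), dtype=np.float32)
y = affine(x)
print(y.shape)
print(y)
```

With an all-ones input, each output element is simply the sum of one column of the weight matrix, which makes it easy to compare against whatever the plugin returns once the model loads.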

Then I simply dump the model to .h5 format, convert it to .pb and run Model Optimizer like this:

    python mo_tf.py --input_model matrix_mul_weights.pb --input "input_y_float" --input_shape (1,1,8)

Is there any way I can force MO to generate an IR with consistent precision across the whole model?
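One way to check which precisions actually ended up in the IR is to scan the generated .xml for precision attributes. The sketch below assumes the IR records precision in an XML attribute named `precision`; the sample fragment is made up and heavily trimmed, and it is not the IR from this thread. The helper name `count_precisions` is likewise hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_precisions(xml_text):
    """Count every `precision` attribute found anywhere in an IR-like XML string."""
    root = ET.fromstring(xml_text.strip())
    return Counter(
        elem.attrib["precision"]
        for elem in root.iter()
        if "precision" in elem.attrib
    )


# Hypothetical IR fragment, for illustration only.
SAMPLE_IR = """
<net name="matrix_mul_weights" version="6">
  <layers>
    <layer id="0" name="input_y_float" precision="FP32" type="Input"/>
    <layer id="1" name="shape_const" precision="I32" type="Const"/>
    <layer id="2" name="dense" precision="FP32" type="FullyConnected"/>
  </layers>
</net>
"""

precisions = count_precisions(SAMPLE_IR)
print(dict(precisions))
```

To run it against a real file, read the .xml into a string first (for example with `open(path).read()`) and pass that to `count_precisions`. More than one distinct key in the result would point to where the precision differs.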

I'm attaching the .pb and resulting .xml & .bin files if anyone wants to try it out on their side.

Thanks in advance for any help.

Regards,

Karol

Kenneth_C_Intel · 09-13-2019

Hi, pass the flag

--data_type

to Model Optimizer with either FP16 or FP32 as the argument. That should convert the weights and biases in the IR to the chosen precision.

Try that and let me know if it fixes your issue.
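For reference, the Model Optimizer call from the first post with --data_type added could be assembled like this. It is only a sketch: the helper `build_mo_command` is hypothetical, the script path `mo_tf.py` is assumed to be reachable from the working directory, and the thread does not confirm whether the flag resolves the MIXED precision error.

```python
import shlex
import sys


def build_mo_command(input_model, input_name, input_shape,
                     data_type="FP32", mo_script="mo_tf.py"):
    """Build the argument list for a Model Optimizer run (hypothetical helper)."""
    if data_type not in ("FP16", "FP32"):
        raise ValueError("data_type must be 'FP16' or 'FP32'")
    return [
        sys.executable, mo_script,
        "--input_model", input_model,
        "--input", input_name,
        "--input_shape", input_shape,
        "--data_type", data_type,
    ]


cmd = build_mo_command("matrix_mul_weights.pb", "input_y_float", "(1,1,8)")

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))

# To actually run the conversion, uncomment the following lines:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single shell string means the parentheses in the shape value never reach a shell, so they need no quoting.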