I'm trying to run inference on a very simple model using the OpenVINO GNA plugin. The model has only one affine layer, defined in Keras as a Dense layer holding an 8x8 matrix of weights. All I want to do is multiply the 1x8 input vector by this 2D matrix of weights.
I was able to convert this model to IR representation. However, when I try to load it into the GNA plugin, I get the following error:
Exception: [GNAPlugin] in function void GNAPluginNS::GNAPlugin::LoadNetwork(InferenceEngine::ICNNNetwork &): The plugin does not support networks with MIXED format. Supported network precisions are FP32, FP16
I'm a little bit confused, because I'm defining the model in Keras like this:
import numpy as np
import tensorflow as tf
# Keras import paths are assumed here
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Fill the 8x8 weight matrix with the values 1..64
matrix = np.ones(shape=[8, 8], dtype=np.float32)
a = 1
for i in range(8):
    for j in range(8):
        matrix[i][j] = a
        a += 1
bias = np.zeros(8, dtype=np.float32)

input_y_float = Input(shape=(1, 8, ), dtype=tf.float32, name="input_y_float")
matmul_weights_out = Dense(units=8, activation='linear', weights=(matrix, bias.reshape(8, )), trainable=False)(input_y_float)
model = Model(inputs=input_y_float, outputs=matmul_weights_out)
model.compile(optimizer='Adam', loss='mean_squared_error')
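As a reference for the expected result, the computation this layer performs (y = x·W + b with a linear activation) can be sketched in plain NumPy. The all-ones input vector and the helper name `dense_reference` are illustrative assumptions, not part of my actual test:

```python
import numpy as np

# Same weights as in the Keras snippet: the values 1..64 in row-major order.
matrix = np.arange(1, 65, dtype=np.float32).reshape(8, 8)
bias = np.zeros(8, dtype=np.float32)


def dense_reference(x, weights, bias):
    """Linear Dense layer: y = x @ W + b (hypothetical helper, for checking only)."""
    return x @ weights + bias


# Illustrative input only (an assumption, not data from the real test).
x = np.ones((1, 8), dtype=np.float32)
y = dense_reference(x, matrix, bias)
print(y)
```

With an all-ones input, each output element is simply the sum of the corresponding column of the weight matrix, which makes it easy to compare against whatever the plugin returns.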
Then I simply dump the model to .h5 format, convert it to .pb and run Model Optimizer like this:
python mo_tf.py --input_model matrix_mul_weights.pb --input "input_y_float" --input_shape (1,1,8)
Is there any way I can force MO to generate an IR with consistent precision across the whole model?
I'm attaching the .pb and resulting .xml & .bin files if anyone wants to try it out on their side.
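For anyone inspecting the .xml, a rough way to see which precisions appear in it is to count the `precision` attributes. This is only a sketch: it assumes the IR stores precision as a `precision` attribute on layer or port elements, and the sample XML below is made up for illustration, not my actual IR.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def collect_precisions(xml_text):
    """Count every `precision` attribute found anywhere in the XML text."""
    root = ET.fromstring(xml_text)
    return Counter(
        el.attrib["precision"] for el in root.iter() if "precision" in el.attrib
    )


# Made-up stand-in for an IR file (hypothetical content, for illustration only).
SAMPLE_IR = """<net name="demo" version="6">
  <layers>
    <layer id="0" name="input_y_float" precision="FP32" type="Input"/>
    <layer id="1" name="dense_const" precision="I32" type="Const"/>
  </layers>
</net>"""

counts = collect_precisions(SAMPLE_IR)
print(dict(counts))
```

To run it on a real file, read the .xml into a string and pass it to `collect_precisions`; more than one key in the result would point to the layers that make the network look mixed.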
Thanks in advance for any help.