Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Error when loading network to GNA inference engine

Karol_D_Intel
Employee

Hi All, 

I'm trying to perform inference using a very simple model, defined as a TF frozen graph (attached) and converted to IR format using Model Optimizer (attached). Unfortunately, when executing my code I get the following error:

Exception: [GNAPlugin] in function void GNAPluginNS::GNAPlugin::LoadNetwork(InferenceEngine::ICNNNetwork &): The plugin does not support layer: matrix_multiplication_explicit/MatMul:Gemm

AFAIU, this means that my model includes a Gemm layer that is not supported by the GNA plugin. Is there any way I can force Model Optimizer NOT to use Gemm in the IR representation and instead use something that the GNA plugin supports?
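
For reference, my loading code is essentially the standard Inference Engine flow; a minimal sketch (file names are placeholders, and IECore.read_network assumes a recent OpenVINO release):

from openvino.inference_engine import IECore

ie = IECore()
# Read the IR produced by Model Optimizer.
net = ie.read_network(model="matrix_mul_explicit.xml",
                      weights="matrix_mul_explicit.bin")
# LoadNetwork fails here because the IR contains a Gemm layer,
# which the GNA plugin does not support.
exec_net = ie.load_network(network=net, device_name="GNA")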

BTW, I'm currently using the following command to convert the .pb file to IR:

python mo_tf.py --input_model matrix_mul_explicit.pb --input input_x_float,input_y_float --input_shape "(1,8,8),(1,8,1)"

Thanks in advance for any help.

Regards, 

Karol

 

3 Replies
Shubha_R_Intel
Employee

Dear Karol Duzinkiewicz 

According to the Supported Framework Layers document for TensorFlow, MatMul should be converted to FullyConnected. So yes, it's true that according to the Supported Devices document, Gemm is not supported by GNA, but the bigger question is: why is MatMul not getting converted to FullyConnected? FullyConnected is supported by the GNA plugin.
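
As a quick way to see which layers a device accepts for a given network, IECore.query_network can be used; a sketch, assuming your IR file names and the Python Inference Engine API:

from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="matrix_mul_explicit.xml",
                      weights="matrix_mul_explicit.bin")
# query_network returns the layers the device can run; anything
# missing from the result is unsupported on that device.
supported = ie.query_network(network=net, device_name="GNA")
unsupported = [name for name in net.layers if name not in supported]
print("Unsupported layers:", unsupported)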

I think it's a bug. 

Let me reproduce your error and file a bug on your behalf.

Shubha

Shubha_R_Intel
Employee

Dear Karol Duzinkiewicz,

I later found that this is in fact not a bug, but you're correct: GNA does not support Gemm at the moment.

The BatchMatMul operation in the attached model has dynamic inputs. You have to print your .pb model to text to see the BatchMatMul; you will not see it in the MO-generated IR.
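
One way to dump the frozen graph to text is sketched below (assuming TensorFlow and the protobuf package are installed; in TF 1.x you can use tf.GraphDef directly):

import tensorflow as tf
from google.protobuf import text_format

# Parse the frozen graph and print it in human-readable text form.
graph_def = tf.compat.v1.GraphDef()
with open("matrix_mul_explicit.pb", "rb") as f:
    graph_def.ParseFromString(f.read())
print(text_format.MessageToString(graph_def))

In the dumped text, the BatchMatMul node looks like this: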

node {
  name: "prefix/matrix_multiplication_explicit/MatMul"
  op: "BatchMatMul"
  input: "prefix/input_x_float"
  input: "prefix/input_y_float"
  attr {
    key: "T"
    value {
      type: DT_DOUBLE
    }
  }
}

BatchMatMul is listed in the Supported Framework Layers document and mapped to Gemm, but it appears under ONNX rather than TensorFlow; that is actually a documentation bug.

FullyConnected - you can find the FullyConnected definition here. It takes a 2D or 4D input blob, and it is properly represented in the aforementioned document and correctly mapped back to MatMul.

Gemm - a pure matrix multiplication operation with no restrictions on its inputs. Unfortunately, Gemm is supported by only a few plugins, and GNA is not among them.

Hope this helps, and I apologize for the confusion.

Shubha

Karol_D_Intel
Employee

OK, thanks - now I get it.
