The model shows up in three different places:
1. The DLA architecture generator generates xxx.prototxt.
2. The file used from the Linux command line is xxx.aocx, which is then programmed into the PAC FPGA.
3. In the x86 C++ code of the plugin class, there is no mention of the FPGA bitstream.
What is the relationship between these three places?
The .aocx file is linked to the prototxt file, which OpenVINO refers to in order to understand what the FPGA supports.
The plugin class only offloads execution to the FPGA according to the type of design programmed into it, which it determines by referring to the prototxt. The OpenVINO tools do this automatically.
Thanks for your response @JohnT_Intel
I still have some questions.
The prototxt for the FPGA only provides information on what is available in the FPGA; what actually runs there depends on the model you are running. If a function can be performed in the FPGA, it will run in the FPGA; if it cannot, it will run on the CPU. The FPGA will not have any architecture, only the functions that can perform convolution and other features, depending on which bitstream is programmed.
The OpenCL API only contains the driver that interfaces with the FPGA. OpenVINO communicates with the driver and sends data to the FPGA when an operation needs to be performed there.
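The FPGA-or-CPU fallback described above can be illustrated with a small standalone sketch. It does not use the OpenVINO API: `Device`, `SupportedTypes`, and `assignLayers` are hypothetical names, and the set of supported layer types stands in for whatever the tools learn from the FPGA description. It only shows the decision rule, assuming support is decided per layer type.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical device labels for this sketch; not OpenVINO identifiers.
enum class Device { FPGA, CPU };

// Hypothetical: layer types the programmed bitstream can execute,
// standing in for the information the tools read about the FPGA.
using SupportedTypes = std::unordered_set<std::string>;

// Place each layer on the FPGA when its type is supported,
// otherwise fall back to the CPU.
std::vector<Device> assignLayers(const std::vector<std::string>& layerTypes,
                                 const SupportedTypes& fpgaSupported) {
    std::vector<Device> placement;
    placement.reserve(layerTypes.size());
    for (const auto& type : layerTypes) {
        placement.push_back(fpgaSupported.count(type) > 0 ? Device::FPGA
                                                          : Device::CPU);
    }
    // Every layer must have received exactly one placement.
    assert(placement.size() == layerTypes.size());
    return placement;
}
```

With a supported set of, say, "Convolution" and "ReLU", a model whose layers are Convolution, ReLU, SoftMax would get the first two placed on the FPGA and the last one on the CPU.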
As I understand it, the prototxt contains the description of this figure. If there is a custom primitive, more description is added according to the manual <dlas-compile-and-customize.pdf>.
Is it right?
Thank you very much.
One more question. I read the OpenVINO source code and found that clDNN already supports about fifty primitives.
(1) Are all of those primitives implemented on the FPGA?
(2) If some primitives in a DNN can be merged in the FPGA, that is, the first primitive's output becomes the input of the second primitive, how do I write the source code and see the compiler result?
It depends on which model you are running. If a layer can be executed in the FPGA, it will be performed in the FPGA.
You can run the "Per Layer performance" report to see where each layer is executed. You will need to modify your host code to direct the model to run on a specific device.
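As a rough illustration of how a per-layer report could be read, the sketch below sums execution time by device. It is self-contained and does not call OpenVINO: `LayerProfile` and `totalTimeByDevice` are hypothetical, and the `device` field is an assumption about what you would extract from the real per-layer output in your host code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical record for one line of a per-layer performance report.
struct LayerProfile {
    std::string layerName;    // e.g. "conv1"
    std::string device;       // assumed label, e.g. "FPGA" or "CPU"
    std::int64_t realTimeUs;  // measured execution time in microseconds
};

// Sum the measured time per device to see where the model spends its time.
std::map<std::string, std::int64_t> totalTimeByDevice(
        const std::vector<LayerProfile>& profiles) {
    std::map<std::string, std::int64_t> totals;
    for (const auto& p : profiles) {
        assert(p.realTimeUs >= 0);  // a negative time would indicate bad input
        totals[p.device] += p.realTimeUs;
    }
    return totals;
}
```

A large CPU total for layers you expected on the FPGA suggests those layer types are not supported by the programmed bitstream.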