Inference results on FPGA differ from those on CPU

dai__sijie · ‎03-20-2019

I've recently been using OpenVINO to deploy a CNN model on an FPGA.

I have succeeded in running it on a CPU, and the inference results match the original TensorFlow ones (identical labels, and a difference of less than 1e-5 in probabilities).
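The kind of check described above can be sketched roughly as follows. This is only an illustration, not the script actually used in this thread: the function names `argmax` and `compare_outputs` are made up, and it assumes each backend's output is available as a list of per-image probability rows.

```python
def argmax(row):
    """Index of the largest probability in one output row."""
    return max(range(len(row)), key=lambda i: row[i])


def compare_outputs(ref_probs, test_probs, tol=1e-5):
    """Compare two sets of per-image probability rows from different backends.

    ref_probs / test_probs: sequences of rows, one row per image
    (hypothetical layout; adapt to however the outputs are stored).
    """
    mismatches = 0
    max_abs_diff = 0.0
    for ref_row, test_row in zip(ref_probs, test_probs):
        # Count images whose predicted label changed between backends.
        if argmax(ref_row) != argmax(test_row):
            mismatches += 1
        # Track the largest element-wise probability difference.
        for r, t in zip(ref_row, test_row):
            max_abs_diff = max(max_abs_diff, abs(r - t))
    return {
        "label_mismatch_rate": mismatches / len(ref_probs),
        "max_abs_diff": max_abs_diff,
        "within_tol": max_abs_diff < tol,
    }
```

Running the same comparison with the FPGA output as `test_probs` gives a mismatch rate and a worst-case probability difference that can be reported side by side with the CPU numbers.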

But when I run inference on an FPGA with the same converted model, more than 1% of the labels are incorrect. I also tried ImageNet data with Inception V3 on both CPU and FPGA: the top-1 and top-5 accuracies drop from 82.7% and 96.5% on CPU to 68.9% and 89.0% on FPGA, respectively.
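For reference, top-1 and top-5 accuracy can be computed along these lines. This is a minimal sketch under assumptions: `top_k_accuracy` is a made-up helper, `probs` is assumed to hold one probability row per image, and `labels` the ground-truth class indices. It is not the evaluation code used for the numbers above.

```python
def top_k_accuracy(probs, labels, k):
    """Fraction of images whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(probs, labels):
        # Class indices sorted by descending probability, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)
```

Calling it with `k=1` and `k=5` on the CPU and FPGA outputs for the same images yields the four figures being compared.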

The inference time on the FPGA is also longer, approximately twice that of my CPU.

Has anyone else encountered this problem?

Shubha_R_Intel · ‎03-22-2019

Dear sijie:

It's difficult to know exactly what is happening.

May I know the answers to these questions?

Which bitstream are you using?
What command are you running?
What does your graph/topology look like?
What’s the actual throughput you are getting on CPU vs. FPGA?

For instance, the FP11 bitstream should not cause more than a 1% degradation in accuracy on FPGA versus CPU.

It is possible that you selected an FP11 bitstream, but the topology doesn't support it.

Can you kindly provide more details?

Thanks!

Shubha

dai__sijie · ‎03-24-2019

Dear Shubha,

Thanks for the reply.

The board I'm using is a third-party board from Speed Clouds, based on the A10 1150 FPGA.

The bitstream was compiled by an Intel co-worker. Both fp16 and fp11 versions were provided, but neither gives accurate probabilities.

The command is "python3 classification_sample_async.py -m frozen_inception_v3_b64_fp16.xml -i tfrecord_fnlist -d HETERO:FPGA,CPU -pc". The Python script comes from "/opt/intel/computer_vision_sdk/deployment_tools/inference_engine/samples/python_samples/" and is slightly modified to read matrices directly.

The topology is GoogLeNet Inception V3.

I haven't measured the throughput of the CPU and the FPGA yet; I can report it later.

Is there anywhere I can find instructions for building my own bitstream? The demo page only provides pre-built ones.

Best regards,

Sijie

dai__sijie · ‎03-25-2019

The time per batch (64 images) is 545 ms on the FPGA and 134 ms on the CPU.
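Converted to images per second, these batch times work out as in the short calculation below. The helper name `images_per_second` is made up for illustration; only the batch size and the two timings come from the measurements above.

```python
def images_per_second(batch_size, batch_time_ms):
    """Convert a per-batch latency in milliseconds to throughput in images/s."""
    return batch_size / (batch_time_ms / 1000.0)


fpga_ips = images_per_second(64, 545)  # roughly 117 images/s
cpu_ips = images_per_second(64, 134)   # roughly 478 images/s
slowdown = 545 / 134                   # FPGA batch takes roughly 4.07x as long

print(f"FPGA: {fpga_ips:.1f} img/s, CPU: {cpu_ips:.1f} img/s, ratio: {slowdown:.2f}x")
```

By these figures the FPGA run is about four times slower per batch than the CPU run.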

Shubha_R_Intel · ‎03-26-2019

Dear dai, sijie:

Are you an Intel employee? I'm confused by the statement below, and I couldn't find you in the Intel employee directory either:

>The bitstream is compiled by an Intel co-worker, both fp16 and fp11 are provided, but neither can give precise probabilities.

I'm afraid that this forum is not the appropriate place for an Intel person to help you, especially since you're not using an officially supported board. That said, perhaps someone in the community out there can help.

Best of luck and thank you for using OpenVINO!

Shubha

Shubha_R_Intel · ‎03-26-2019

By the way, dai, sijie, the officially supported hardware is listed here:

https://software.intel.com/en-us/openvino-toolkit/documentation/system-requirements

Thanks for using OpenVINO!

Shubha

dai__sijie · ‎03-26-2019

Thanks for the reply.

>Are you an intel employee ? I'm confused because of this statement, but I couldn't find you in the Intel employee directory either :

I'm not an Intel employee, but I'm collaborating with one.

I could ask him, but I didn't want to bother him too much.