Hello Liang Heng,
What are the GPU models in the systems you are testing?
In most cases, GPU inference will be faster on GPUs with more Execution Units (EUs) and/or a higher GPU clock, but many other factors matter as well.
For details on GPU EU counts and clock speeds, please refer to https://ark.intel.com/content/www/us/en/ark.html
Dear Liang Heng,
In your experiments, I assume you are using the same image(s) each time. What nikos said is correct, but also note that, depending on the model you use and the image sizes you pass in, the GPU plugin optimizes models for certain ideal kernel sizes. It is best to feed in an image size that is optimal for the model.
You can also use the benchmark_app to perform experiments.
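For example, a typical benchmark_app invocation looks like the sketch below. The model path (model.xml) is a placeholder for your own IR file; the -d, -niter, and -b flags select the target device, iteration count, and batch size.

```shell
# Benchmark a model on the GPU for 100 iterations with batch size 1.
# Replace model.xml with the path to your own OpenVINO IR model.
benchmark_app -m model.xml -d GPU -niter 100 -b 1
```

Comparing the reported throughput/latency between CPU (-d CPU) and GPU (-d GPU) runs on each system should make the performance differences easy to quantify.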
Hope this helps,