I have a model that takes only 30 ms per inference on CPU, but 57 ms on GPU. These times are averages over 1000 runs.
How can I reduce the execution time on GPU?
There are five "Interp" layers in the model. Is this the main reason that inference on the GPU device costs so much more time?
Does anyone know the reason?
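
One way to narrow this down is a per-layer profile on the GPU, which would show directly whether the Interp layers dominate the 57 ms. Below is a minimal sketch assuming the model is an OpenVINO IR run through the classic Inference Engine Python API; the post does not say which framework or files are used, so the API choice, the model.xml/model.bin names, and the dummy input are all assumptions:

```python
# Minimal sketch (assumptions: OpenVINO IR model, classic IECore Python API,
# placeholder file names and random input data).
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")

# PERF_COUNT enables per-layer profiling so we can see how much GPU time
# the "Interp" layers actually consume.
exec_net = ie.load_network(network=net, device_name="GPU",
                           config={"PERF_COUNT": "YES"}, num_requests=1)

input_name = next(iter(net.input_info))
shape = net.input_info[input_name].input_data.shape
dummy = np.random.rand(*shape).astype(np.float32)

# Warm-up run: the first GPU inference includes kernel compilation and data
# upload overhead, which should not be mixed into the 1000-run average.
exec_net.infer({input_name: dummy})

# Timed run, then read back the per-layer counters (microseconds).
exec_net.infer({input_name: dummy})
perf = exec_net.requests[0].get_perf_counts()

# Print the ten slowest layers by measured device time.
for name, stats in sorted(perf.items(),
                          key=lambda kv: kv[1]["real_time"],
                          reverse=True)[:10]:
    print(f'{name:40s} {stats["layer_type"]:12s} {stats["real_time"]:>8d} us')
```

If OpenVINO is indeed the framework in use, the bundled benchmark_app tool with its -pc option should print similar per-layer counters without writing any code, which makes it easy to check whether the Interp layers (or, for example, the initial data transfer) account for the extra GPU time.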