Why is OpenVX application slower using GPU than CPU?

I'm trying to accelerate processing using OpenVX on the GPU.

First, I compared the processing speed on the CPU and the GPU using the OpenVX sample "video_stabilization".

According to the results below, the GPU is slower. Why is that, and how can I improve it?

【CPU】

./video_stabilization --input toy_flower.mp4 --hetero-config hetero.config.cpu-all.txt 
Video frame size: 1280x720
[ INFO ] Number of supported targets: 3
[ INFO ]     Target[0] name: intel.cpu
[ INFO ]     Target[1] name: intel.gpu
[ INFO ]     Target[2] name: intel.ipu
[ INFO ] Node vxColorConvertNode is assigned to intel.cpu
[ INFO ] Node vxChannelExtractNode is assigned to intel.cpu
[ INFO ] Node vxHarrisCornersNode is assigned to intel.cpu
[ INFO ] Node vxGaussianPyramidNode is assigned to intel.cpu
[ INFO ] Node vxOpticalFlowPyrLKNode is assigned to intel.cpu
[ INFO ] Node vxEstimateTransformNode not assigned to any specific target (missed in config)
[ INFO ] Node vxChannelExtractNode(R) is assigned to intel.cpu
[ INFO ] Node vxChannelExtractNode(G) is assigned to intel.cpu
[ INFO ] Node vxChannelExtractNode(B) is assigned to intel.cpu
[ INFO ] Node vxWarpAffineNode(R) is assigned to intel.cpu
[ INFO ] Node vxWarpAffineNode(G) is assigned to intel.cpu
[ INFO ] Node vxWarpAffineNode(B) is assigned to intel.cpu
[ INFO ] Node vxChannelCombineNode(warp) is assigned to intel.cpu
Reached end of video file
Processed 166 iterations
0.158 ms by EstimateTransformKernel averaged by 166 samples
Sample was finished successfully
4.327 ms by ProcessFrame averaged by 166 samples
1.922 ms by ReadFrame averaged by 167 samples
9.374 ms by Frame averaged by 166 samples
62.150 ms by vxVerifyGraph averaged by 1 samples
4.289 ms by vxProcessGraph averaged by 166 samples
0.153 ms by estimateTransform_lib averaged by 166 samples

 

【GPU】

./video_stabilization --input toy_flower.mp4 --hetero-config hetero.config.gpu-all.txt 
Video frame size: 1280x720
[ INFO ] Number of supported targets: 3
[ INFO ]     Target[0] name: intel.cpu
[ INFO ]     Target[1] name: intel.gpu
[ INFO ]     Target[2] name: intel.ipu
[ INFO ] Node vxColorConvertNode is assigned to intel.gpu
[ INFO ] Node vxChannelExtractNode is assigned to intel.gpu
[ INFO ] Node vxHarrisCornersNode is assigned to intel.gpu
[ INFO ] Node vxGaussianPyramidNode is assigned to intel.gpu
[ INFO ] Node vxOpticalFlowPyrLKNode is assigned to intel.gpu
[ INFO ] Node vxEstimateTransformNode not assigned to any specific target (missed in config)
[ INFO ] Node vxChannelExtractNode(R) is assigned to intel.gpu
[ INFO ] Node vxChannelExtractNode(G) is assigned to intel.gpu
[ INFO ] Node vxChannelExtractNode(B) is assigned to intel.gpu
[ INFO ] Node vxWarpAffineNode(R) is assigned to intel.gpu
[ INFO ] Node vxWarpAffineNode(G) is assigned to intel.gpu
[ INFO ] Node vxWarpAffineNode(B) is assigned to intel.gpu
[ INFO ] Node vxChannelCombineNode(warp) is assigned to intel.gpu
Reached end of video file
Processed 166 iterations
0.144 ms by EstimateTransformKernel averaged by 166 samples
Sample was finished successfully
10.054 ms by ProcessFrame averaged by 166 samples
1.899 ms by ReadFrame averaged by 167 samples
15.358 ms by Frame averaged by 166 samples
2786.578 ms by vxVerifyGraph averaged by 1 samples
10.005 ms by vxProcessGraph averaged by 166 samples
0.141 ms by estimateTransform_lib averaged by 166 samples

Kenneth_C_Intel
Employee

Hi, I suspect this may be caused by data transfer. Do you see the same issue when using a longer video of the same resolution?
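
To see what a longer video could and could not change, here is a rough Python sketch that spreads the one-time vxVerifyGraph cost over the number of frames, using only the timings posted in the question. It assumes vxVerifyGraph runs once and vxProcessGraph runs once per frame, which is what the "averaged by 1 samples" and "averaged by 166 samples" lines suggest. The function name `amortized_ms` and the extra frame counts are made up for illustration.

```python
# Timings copied from the two logs in the question (milliseconds).
RUNS = {
    "cpu": {"verify_ms": 62.150, "process_ms": 4.289},
    "gpu": {"verify_ms": 2786.578, "process_ms": 10.005},
}
FRAMES_IN_POST = 166  # "Processed 166 iterations"


def amortized_ms(verify_ms, process_ms, frames):
    """Per-frame cost when the one-time verify cost is spread over `frames`.

    Assumption: vxVerifyGraph is paid once, vxProcessGraph once per frame.
    """
    if frames <= 0:
        raise ValueError("frames must be positive")
    return verify_ms / frames + process_ms


# 1_000 and 10_000 are hypothetical video lengths, not measured runs.
for frames in (FRAMES_IN_POST, 1_000, 10_000):
    cpu = amortized_ms(frames=frames, **RUNS["cpu"])
    gpu = amortized_ms(frames=frames, **RUNS["gpu"])
    print(f"{frames:>6} frames: cpu {cpu:.3f} ms, gpu {gpu:.3f} ms, "
          f"gpu/cpu {gpu / cpu:.2f}x")
```

Under that assumption, a longer video shrinks the share of the 2786.578 ms verify step, but the per-frame vxProcessGraph time (10.005 ms on the GPU versus 4.289 ms on the CPU) stays in the total. So a longer run would mainly show how much of the gap is setup and how much is per-frame cost such as data transfer.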

 
