Does anybody know of a 2D convolution function that is well optimized for the Xeon Phi? We have 2-megapixel images and a 26x26 nonseparable kernel. Also, how much speedup should we expect compared to a 6-core i7 or Xeon? My baseline is ippiFilter_8u_C4R on a CPU or nppiFilter_8u_C4R on a GPU.
My understanding is that most computer vision convolutions don't use floating point, and so might not be best suited for the coprocessor. Currently, there are no MIC implementations of ippiFilter_8u_C4R available. Still, if you can make a case for its optimization, we'll forward it to the IPP development team. The more well-justified requests we have for MIC-optimized IPP functionality, the more likely it is to be recognized as a priority.
Currently, NVidia's nppiFilter is much faster on a mid-range GPU than ippiFilter is on a mid-range CPU when used with moderately large kernels. I would think Intel would want to do even better with the Phi. Evidently, NVidia thinks it is important enough to continue shipping updates to NPP, which I find useful. Despite the lack of extreme integer support (which GeForce also lacks), the Phi should do well, especially on high-definition images, because it can run tiled execution over its well-equipped memory hierarchy.
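To make the tiling idea concrete, here is a rough sketch in plain C of a direct (nonseparable) filter over an interleaved 8-bit, 4-channel image, processed one output tile at a time. This is not the IPP or NPP implementation. The function names, the float kernel, the tile sizes, and the "valid region only" border handling are all assumptions made for illustration, and the kernel is applied without flipping (correlation form). For scale, assuming exactly 2,000,000 pixels, a 26x26 kernel over 4 channels works out to 2,000,000 x 676 x 4, or about 5.4 billion multiply-adds per image.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CHANNELS = 4 };  /* interleaved C4 layout, as in the _8u_C4R variants */

/* Round to nearest and clamp to the 8-bit range. */
static uint8_t saturate_u8(float v)
{
    if (v <= 0.0f) return 0;
    if (v >= 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Hypothetical helper: filter the output pixels in [x0,x1) x [y0,y1).
 * Output pixel (x, y) reads source pixels (x..x+kw-1, y..y+kh-1), so the
 * source must be at least (kw-1, kh-1) larger than the output region. */
static void filter_tile_8u_c4(const uint8_t *src, size_t src_stride,
                              uint8_t *dst, size_t dst_stride,
                              size_t x0, size_t y0, size_t x1, size_t y1,
                              const float *kernel, size_t kw, size_t kh)
{
    for (size_t y = y0; y < y1; ++y) {
        for (size_t x = x0; x < x1; ++x) {
            float acc[CHANNELS] = {0};
            for (size_t ky = 0; ky < kh; ++ky) {
                const uint8_t *row = src + (y + ky) * src_stride + x * CHANNELS;
                const float *krow = kernel + ky * kw;
                for (size_t kx = 0; kx < kw; ++kx)
                    for (int c = 0; c < CHANNELS; ++c)
                        acc[c] += krow[kx] * (float)row[kx * CHANNELS + c];
            }
            uint8_t *out = dst + y * dst_stride + x * CHANNELS;
            for (int c = 0; c < CHANNELS; ++c)
                out[c] = saturate_u8(acc[c]);
        }
    }
}

/* Hypothetical driver: walk the valid output region tile by tile so each
 * tile's source footprint can stay in cache. Strides are in bytes. Tiles
 * are independent, so the outer loop could be split across cores. */
void filter_tiled_8u_c4(const uint8_t *src, size_t src_w, size_t src_h,
                        size_t src_stride, uint8_t *dst, size_t dst_stride,
                        const float *kernel, size_t kw, size_t kh,
                        size_t tile_w, size_t tile_h)
{
    assert(tile_w > 0 && tile_h > 0);
    assert(kw > 0 && kh > 0 && src_w >= kw && src_h >= kh);

    size_t out_w = src_w - kw + 1;
    size_t out_h = src_h - kh + 1;

    for (size_t ty = 0; ty < out_h; ty += tile_h) {
        size_t y1 = (ty + tile_h < out_h) ? ty + tile_h : out_h;
        for (size_t tx = 0; tx < out_w; tx += tile_w) {
            size_t x1 = (tx + tile_w < out_w) ? tx + tile_w : out_w;
            filter_tile_8u_c4(src, src_stride, dst, dst_stride,
                              tx, ty, x1, y1, kernel, kw, kh);
        }
    }
}
```

A tile size would normally be chosen so that the tile's source footprint, (tile_w + kw - 1) x (tile_h + kh - 1) pixels at 4 bytes each, fits in a core's cache. Whether that actually beats the CPU or GPU baselines on a given machine would have to be measured.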
