Hi, I have a code optimization project for a signal processing task (audio echo cancellation). My client has a Core i5-7500 system with Intel HD 630 graphics, while I have a 9th-gen Core i7 system with an NVIDIA RTX 2070. The client's requirement is that the code process each frame of 192 PCM samples in under 20 ms, instead of the current 100 ms. The work has to be done in Visual Studio 2017 using C++, OpenCL, and assembly language (if needed). I have two questions:
1) Which combination of OpenCL SDK + runtime + libraries should I use?
2) Is there a cross-platform profiling tool, so that I can be sure that code optimized on MY machine will run in at most 20 ms on my client's machine?
The deliverable is a Visual Studio 2017 Project.
Regarding 1), it is not clear to me whether an MSVC project that uses NVIDIA's OpenCL libraries and headers will compile and run on my client's machine. There are several OpenCL examples on NVIDIA's website that are complete MSVC projects and illustrate everything from multithreaded.
As a last question, can I use the OpenCL runtime for Intel CPUs together with the OpenCL libraries and headers provided by NVIDIA?
For NVIDIA GPU profiling (provided your hotspot(s) are executed by the GPU), you may use nvprof:
I'm not aware of any cross-platform (GPU) profiler.
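Without a vendor-neutral profiler, one fallback is to time the frame inside the code itself and run that same build on the client's machine, since timings taken on an RTX 2070 box say little about an HD 630. Below is a minimal sketch using only `std::chrono`. The names `time_frames`, `worst_case_ms`, and `meets_budget` are made up for illustration, and `process_frame` stands for whatever callable wraps your echo-cancellation step for one 192-sample frame. If you go the OpenCL route, that callable should presumably include the buffer transfers and a `clFinish`, so the measurement covers the whole round trip.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls process_frame() `runs` times and returns the
// wall-clock duration of each call in milliseconds.
template <typename F>
std::vector<double> time_frames(F&& process_frame, std::size_t runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;  // monotonic, unaffected by clock changes
    std::vector<double> ms;
    ms.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t0 = clock::now();
        process_frame();
        const auto t1 = clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    return ms;
}

// Slowest measured frame. For a real-time budget the maximum matters more
// than the mean, because a single late frame is audible.
inline double worst_case_ms(const std::vector<double>& ms) {
    assert(!ms.empty());
    return *std::max_element(ms.begin(), ms.end());
}

// True if every measured frame finished within budget_ms.
inline bool meets_budget(const std::vector<double>& ms, double budget_ms) {
    return worst_case_ms(ms) <= budget_ms;
}
```

A typical use would be to call `time_frames` with a lambda around the real per-frame routine for a few thousand frames, then check `meets_budget(result, 20.0)` on the client's i5-7500. The first few runs usually include one-time costs (kernel compilation, cache warm-up), so you may want to discard them before judging the result.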