Intel® Graphics Performance Analyzers (Intel® GPA)
Improve your game's performance by quickly specifying problem areas

Checking GPU performance with Intel GPA

Mikhail_Smirnov
Hi,
I need to check the performance of my application on Intel HD4000, nVidia and AMD GPU architectures.
The code is OpenCL only.
Is it possible to run such tests with Intel GPA, or should I use different tools for each platform?
Thank you.
Regards,
Mikhail 
Neal_Pierman
Hello,

Could you please provide some more information about what you are trying to do?

Even though you mention "checking GPU performance", it's pretty hard to separate out the GPU without understanding what the CPU is doing as well.

So when you mention trying to check the performance of your application, are you trying to determine system performance, detailed frame analysis, or "platform" analysis (that is, operation of your app across all CPU cores)? Intel GPA has three main tools, one each for the different analysis operations you want (Intel GPA System Analyzer, Intel GPA Frame Analyzer, and Intel GPA Platform Analyzer), and it's not clear what kind of performance measurement you want.

You also mention "OpenCL code only" -- can you help me understand what you mean by this? Does this mean you only want to analyze the OpenCL threads, and you don't care about the rest of the system performance? Also, is this for the CPU, GPU, or both? Again, "only GPU" may hide some important aspects of your overall performance, so please help me understand what you want to do here.

Thanks!

Neal
Mikhail_Smirnov
Hi Neal,
Sorry if this is a very common request.
We have well-parallelized software (at least we think so) running on the CPU. It performs real-time image processing, and at least 80% of the code works on different parts of the images, so there are no intersections in data or control flow. We currently use an Intel Core i7-3770 with TBB for parallel computing.
Our current task is to port all of the software (or the major part of it) to the GPU. Our target GPU is the HD4000 for now (we want to fit the ultrabook platform). But I'm worried about HD4000 performance (it has far fewer graphics cores than embedded AMD graphics), so I'd like to check the performance of other GPUs as well (i.e. low-cost nVidia or AMD cards).
I plan to keep the OpenCL code as portable as possible, and I'd like to know whether it is possible to run it and measure its performance on all of these GPUs in Intel GPA, or whether it is better to use separate tools for each platform.
The platform will be a Core i7-3770 + HD4000, or a Core i7-3770 + the external card to be tested.
I hope this answers at least part of your questions.
Looking forward to your comments.
Thank you.
Neal_Pierman
Hello Mikhail,

So re-reading your comments, I believe that what you really care about is the time to complete a specific image-processing task; that is, how long it takes from "start" to "end" for a specific task on different architectures. You also indicated that you hope to improve performance by pushing more of the processing to the GPU -- so, keeping the CPU fixed, what is the overall performance of this task on Intel, nVidia, and AMD GPUs?
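
As a rough illustration of that "start" to "end" measurement, here is a minimal Python sketch of a timing harness. It is not part of Intel GPA and it does not use OpenCL: `time_task`, `summarize`, and `process_image` are hypothetical names, and `process_image` is only a CPU stand-in for the real image-processing step. In an actual comparison the timed callable would presumably enqueue the OpenCL work on the device under test and block until it completes (for example by calling `clFinish`) so that the timer covers the whole task.

```python
import statistics
import time


def time_task(task, *args, warmup=2, repeats=10):
    """Return per-run wall-clock times (seconds) for task(*args)."""
    # Warm-up runs absorb one-time costs (caches, lazy initialization,
    # and, for a real OpenCL path, kernel compilation).
    for _ in range(warmup):
        task(*args)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        task(*args)  # assumed to block until the work is really finished
        samples.append(time.perf_counter() - start)
    return samples


def process_image(pixels):
    """Hypothetical stand-in workload: invert an 8-bit grayscale image."""
    return [255 - p for p in pixels]


def summarize(samples):
    """Reduce raw samples to figures that can be compared across devices."""
    return {
        "min_s": min(samples),
        "median_s": statistics.median(samples),
        "max_s": max(samples),
    }


if __name__ == "__main__":
    # Synthetic 640x480 grayscale image; real input would be actual frames.
    image = [i % 256 for i in range(640 * 480)]
    stats = summarize(time_task(process_image, image))
    print(f"median {stats['median_s'] * 1e3:.3f} ms "
          f"(min {stats['min_s'] * 1e3:.3f}, max {stats['max_s'] * 1e3:.3f})")
```

Running the same harness with the CPU held fixed and only the GPU changed gives one set of figures per device. The median is reported because a single run is easily skewed by background activity.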

But I'm still not quite clear about your original question: are you looking for benchmarks for the various GPUs with OpenCL, or do you really want to analyze and optimize your workloads? I think you are really looking for both. If you want benchmarks, your favorite search engine can help you find various OpenCL benchmark tools; if you want help understanding what's happening across both the CPU and GPU cores in order to improve performance, then that's where Intel GPA can help -- this link provides information about using Intel GPA with OpenCL.

One last thing -- have you seen IPP (Intel® Integrated Performance Primitives)? It may help in optimizing the total image processing task (independent of the GPU).

Is this the kind of information you wanted?

Regards,

Neal
Mikhail_Smirnov
Hi Neal, thank you very much for the information provided. This is very close to what I was looking for!
Neal_Pierman
Hello,

I'm glad that you found this helpful -- after you've read through some of this information please let me know if you have any follow-up questions.

Regards,

Neal
sureshgupta22
I would like some insurance during the render that at least the current progress is recoverable by performing a periodic write to disk. In outputmode it is not rendering the entire image each pass and instead rendering individual tiles to the max SPP


Neal_Pierman
Hello,

I'm not sure how your comment/question is related to the topic of this thread (that is, "checking GPU performance") -- did you mean to include it here, or did you mean to start a new thread?

Also, can you provide more specific information on your comment/question -- including your hardware and graphics platform, your software and APIs in use, and more details on the exact problem that you are seeing (screen shots, error logs, etc.)?

Thanks!

Neal