I have a SYCL lambda function that uses a lot of host memory (more than 500 MB) right after it is submitted to the GPU, and it takes a few tens of seconds before it enters its forall loop. Profiling the program with VTune shows that zeModuleCreate is doing something during that period. Is there any way to find out what kind of objects it is creating?
Hello,
zeModuleCreate creates a module either by just-in-time (JIT) compiling intermediate language code or by loading a native binary that was compiled ahead of time (AOT). If you are concerned about JIT compilation time, you can try the options for AOT compilation. See here for how to do AOT compilation. If your concern is strictly benchmarking kernel performance, you can include a warm-up phase, since JIT compilation time only affects the first kernel invocation. You can then time subsequent calls to the kernel, which excludes the JIT compilation time.
Thanks.
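The warm-up approach described above can be sketched roughly as follows. This is only a minimal illustration in plain C++, not SYCL code: `run_kernel` is a hypothetical stand-in for whatever submits your kernel and waits for it to finish, and the names `Timing` and `time_with_warmup` are made up for this example.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Timings from one benchmark run, in milliseconds.
struct Timing {
    double first_call_ms;  // cold call: includes any one-time JIT/module creation cost
    double warm_mean_ms;   // mean of the later calls, after the warm-up
};

// Assumption: run_kernel submits the kernel AND blocks until it has completed
// (for example by waiting on the returned event). Otherwise only the submission
// cost would be measured.
inline Timing time_with_warmup(const std::function<void()>& run_kernel,
                               int timed_runs) {
    assert(timed_runs > 0);
    using clock = std::chrono::steady_clock;
    using ms = std::chrono::duration<double, std::milli>;

    // Warm-up call: timed separately so the one-time cost stays visible.
    const auto t0 = clock::now();
    run_kernel();
    const auto t1 = clock::now();

    // Timed calls: these should no longer pay the first-invocation cost.
    for (int i = 0; i < timed_runs; ++i) {
        run_kernel();
    }
    const auto t2 = clock::now();

    return Timing{ms(t1 - t0).count(), ms(t2 - t1).count() / timed_runs};
}
```

Comparing `first_call_ms` with `warm_mean_ms` gives a rough idea of how much of the first invocation is one-time setup rather than kernel execution.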
Hello,
Thank you for the reply. My program uses pseudo double precision math on the device side. It seems the host (in zeModuleCreate) is creating some large data structure on the host side before submitting the kernel to the device. When I watch it run, host memory usage starts increasing, up to a few hundred megabytes (this takes 20-30 seconds), and then the kernel starts and runs quickly. I want to know what is causing this behaviour. Is there any way to find out?