Is there a disassembler for the OpenCL SDK that will display the GPU code produced for a kernel?
3 Replies
Hi Jerome,
We do not provide a disassembler for displaying GPU code, and we currently have no plans to provide one. I am curious why you want to look at the disassembly. Do you just want to understand what code gets generated, or is it for debugging purposes? Please let me know your reasons and I will pass them on to the graphics guys.
Thanks,
Raghu
Hi, Raghu,
I am interested in seeing what optimizations the compiler is performing. For example, I have some code that says
X = (cos(a) - cos(b))*(cos(a) + cos(b)). Looking through the disassembly, I was surprised to discover that the VS 2010 C++ compiler generates four calls to cos for that code. I would have expected it to cache and reuse the results of the first two. I am moving this code into a kernel, and I am wondering what optimizations I can expect the kernel compiler to perform.
Jerome
Hi Raghu,
Like Jerome, I would just like to see the generated device-dependent asm so that I have the chance to optimize manually, i.e. change the C code and compile again. Looking at the LLVM code in the .ir file was already a good help.
It would be great to have a --gpu_disasm option for the compiler. AMD and NVIDIA already have something like this.
Best regards, Stephan
