Hello,
I work with the CLI tools of the Intel OpenCL SDK 1.2 on Scientific Linux. I'm interested in optimizing my kernels (1) with the oclopt program and (2) with assembly code for CPU or MIC.
Question (1): As I currently understand oclopt, the tool takes a built SPIR file plus optimization options such as prefetching or loop unrolling and produces an optimized version of it. Example:
oclopt -O3 -prefetch -loop-unroll kernel_x64.spir > kernel_x64_opt.spir
Can I do more with it, such as giving a hint about the prefetching distance? In my case, these flags do not influence the kernel! Is it possible to access the IMCI instruction set somehow?
Question (2): The ioc64 kernel builder creates assembly files. Is there any way to build binary code from assembly code, or is the assembly only meant for analyzing the generated binary kernel file?
Thanks a lot!
Simon,
1. Your understanding is correct. If you build your SPIR kernel first like this:
ioc64 -cmd=build -input=drawbox.cl -device=gpu -spir64=drawbox.bc -bo="-cl-std=CL1.2"
Then, you can apply optimizations to it, e.g.
oclopt -strip drawbox.bc > drawbox_stripped.bc
You can check the available options via oclopt --help. I haven't tried them all, so I'm not sure how to hint about the prefetching distance. The kernels are SPIR kernels. I'm not sure what you mean by the IMCI instruction set.
2. ioc64 can build many things, but if you use the -spir64= flag, it will generate the right SPIR file, which you should be able to load with clCreateProgramWithBinary.
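To see whether a given oclopt option changed the SPIR file at all, the input and output files can be compared byte by byte. The following is a minimal sketch, not part of the SDK: `spir_changed` is a hypothetical helper name, and the file names are the ones used in this thread. Note that the output is written to a separate file, because redirecting onto the input file would truncate it before the tool reads it.

```shell
#!/bin/sh
# Sketch: report whether an optimization pass changed a SPIR bitcode file.
# spir_changed is a hypothetical helper, not an SDK tool.
# Returns 0 (true) when the two files differ, 1 when they are byte-identical.
# A cmp error (for example a missing file) is also reported as "differ".
spir_changed() {
    # cmp -s prints nothing and exits 0 only when the files are identical
    if cmp -s "$1" "$2"; then
        return 1
    fi
    return 0
}

# Example with the file names from this thread.
# This part only runs if oclopt is on PATH and drawbox.bc exists.
if command -v oclopt >/dev/null 2>&1 && [ -f drawbox.bc ]; then
    oclopt -strip drawbox.bc > drawbox_stripped.bc
    if spir_changed drawbox.bc drawbox_stripped.bc; then
        echo "oclopt changed the file"
    else
        echo "oclopt left the file unchanged"
    fi
fi
```

Identical files mean the option had no effect at the SPIR level. Differing files only show that something changed in the bitcode, not that the final device code is different.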
Thanks for your reply!
(1) Please don't refer to the GPU. I'm interested in optimizing kernels specifically for the MIC (Xeon Phi) with OpenCL. To optimize at a low level, it is necessary to influence the optimization somehow. None of the offered options produces any change in the generated code. Does oclopt really apply to the Xeon Phi?
(2) IMCI is Intel's Initial Many Core Instructions set for the Xeon Phi. Is any assembly-level optimization possible with OpenCL?
ioc64 works well with SPIR and clCreateProgramWithBinary. That's not the problem. I want to know what else is possible!
Thanks
Simon,
I'll forward your question to the Xeon Phi pros.
Hello Yuri,
thanks for the reply!
Intel's OpenCL driver implicitly vectorizes with a width of 16 on Xeon Phi. Can I turn it off?
Thanks, Simon
Hi Yuri,
can I also influence the vectorization for double?
export CL_CONFIG_CPU_VECTORIZER_MODE=8
does not work for me. I get either a vectorization width of 16 or no vectorization at all.
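One thing worth ruling out is whether the variable actually reaches the process that loads the OpenCL runtime. The sketch below is only an illustration: `run_with_vectorizer_mode` is a hypothetical wrapper, and the variable name and the value 8 are taken from this thread. It sets the variable for a single command and prints what the child process sees.

```shell
#!/bin/sh
# Sketch: run one command with a vectorizer setting applied only to that command.
# run_with_vectorizer_mode is a hypothetical wrapper, not part of the SDK.
# The variable name comes from this thread; the value is whatever the caller passes.
run_with_vectorizer_mode() {
    mode=$1
    shift
    # env places the variable in the environment of the launched command only
    env CL_CONFIG_CPU_VECTORIZER_MODE="$mode" "$@"
}

# Confirm that a child process sees the value before blaming the runtime
run_with_vectorizer_mode 8 sh -c 'echo "mode=$CL_CONFIG_CPU_VECTORIZER_MODE"'
```

This does not show whether the runtime honors the value for double-precision kernels. It only rules out the environment as the cause.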
I cannot confirm that the variable CL_CONFIG_USE_VECTORIZER disables the implicit vectorizer. The output says that the code is not vectorized, but the assembly code is the same and the execution of the kernel shows no difference.
