To my surprise, even AMD now supports inline assembly (using AMDIL) in OpenCL kernels through asm("") statements, which gives access to features outside the OpenCL standard, such as cycle counter registers. With that, both AMD and NVIDIA GPUs can be programmed mainly in OpenCL syntax and, with minor modifications, can reach the special features of their hardware. So I'm asking Intel for the key steps to reach parity with the competition:
1. Finally release some sort of stable assembly language for its GPUs that is extensible to future GPUs (like NVIDIA's PTX, or AMD's AMDIL and, in the near future, HSAIL). Basically, some sort of documentation.
2. Commit backend support to LLVM (as exists for PTX and AMDIL, and soon HSAIL). It seems Intel's drivers already have something called GHAL3D and now IGIL (integrated graphics IL?).
3. Support OpenCL asm statements with GHAL3D/IGIL syntax.
PS: I know about the SPIR plans, but those are about exposing an ISA for existing features, so SPIR would not expose cycle counters or similar things, would it?
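To make point 3 concrete, here is a rough sketch of what the vendor asm route looks like today on the host side. The kernel text assumes the NVIDIA compiler accepts GCC-style asm statements with PTX inside, which is a vendor behavior and not part of the OpenCL standard. The names read_clock, clock_kernel_ptx, clock_kernel_portable and pick_kernel_source are made up for illustration. I have left out an AMDIL variant because I am not sure of its exact asm syntax.

```c
#include <stddef.h>
#include <string.h>

/* OpenCL C kernel that reads the PTX %clock special register.
 * Assumption: the NVIDIA OpenCL compiler accepts GCC-style asm statements. */
const char *const clock_kernel_ptx =
    "__kernel void read_clock(__global uint *out)\n"
    "{\n"
    "    uint t;\n"
    "    /* '%%' is a literal '%' inside an asm template with operands */\n"
    "    asm volatile(\"mov.u32 %0, %%clock;\" : \"=r\"(t));\n"
    "    out[get_global_id(0)] = t;\n"
    "}\n";

/* Portable fallback: standard OpenCL C only, no cycle counter available. */
const char *const clock_kernel_portable =
    "__kernel void read_clock(__global uint *out)\n"
    "{\n"
    "    out[get_global_id(0)] = 0u;\n"
    "}\n";

/* Hypothetical helper: choose the kernel text from the vendor string,
 * e.g. the one returned by clGetDeviceInfo(..., CL_DEVICE_VENDOR, ...). */
const char *pick_kernel_source(const char *vendor)
{
    if (vendor != NULL && strstr(vendor, "NVIDIA") != NULL)
        return clock_kernel_ptx;
    return clock_kernel_portable;
}
```

The chosen string would then go to clCreateProgramWithSource as usual. With IGIL asm support, Intel devices could get their own branch in the same helper.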
Thanks for the input.
We are looking at various options and directions, and your feedback is interesting. I don't have public information to share with you at this time, but again, thanks for the feedback.
Thanks for the response.
I'm happy that you are taking my suggestions seriously.
Just adding another suggestion on top of that:
If you do eventually go with that option, it would also be good to add one more feature: the ability to generate binaries in assembly format, which allows tweaking the generated code. Note that this is already possible in OpenCL on NVIDIA, where the binaries are plain PTX (its assembly language) files, similar to the CUDA toolchain. AMD, even before adding inline assembly, allowed generating binaries (ELF files) that contain AMDIL sections, so you can modify that assembly and build from the binary to get the same capability.
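As a small illustration of the difference between the two vendors' binaries, the sketch below classifies a blob such as one retrieved with clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...). It only checks for the ELF magic bytes or for plain printable text. The names binary_kind and classify_binary are hypothetical, and the text check is a heuristic, not a real PTX parser.

```c
#include <stddef.h>
#include <string.h>

enum binary_kind { BINARY_ELF, BINARY_TEXT, BINARY_UNKNOWN };

/* Guess whether a program binary is an ELF container (as described for AMD)
 * or editable text assembly (as described for NVIDIA PTX). */
enum binary_kind classify_binary(const unsigned char *data, size_t size)
{
    static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };
    size_t i;

    if (data == NULL || size == 0)
        return BINARY_UNKNOWN;
    if (size >= 4 && memcmp(data, elf_magic, 4) == 0)
        return BINARY_ELF;

    /* Heuristic: every byte is printable ASCII or common whitespace. */
    for (i = 0; i < size; i++) {
        unsigned char c = data[i];
        if (c == '\0' && i == size - 1)
            continue; /* tolerate one trailing NUL terminator */
        if (c != '\n' && c != '\r' && c != '\t' && (c < 0x20 || c > 0x7e))
            return BINARY_UNKNOWN;
    }
    return BINARY_TEXT;
}
```

A text result means the blob can be edited directly and fed back through clCreateProgramWithBinary. An ELF result means the assembly section has to be extracted and patched inside the container first.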
It also seems OpenCL should be extended to expose multiple binary formats, each one for a different purpose, as OpenGL supports right now (could we have that in OpenCL 2.0?):
* one would be SPIR, for portable binaries
* another would be a proprietary binary for the fastest building (i.e. pure device binaries)
* the last would be a vendor text assembly format, for tweaking and for using the advanced ISA of each architecture while that functionality gets standardized in OpenCL (NV: PTX, AMD: AMDIL, Intel: IGIL)