If ioc64 is run without specifying a device, it targets the GPU by default. CPUs and GPUs have different instruction sets, so the assembly generated for one will not match the other.
Here is more info on Gen assembly: https://software.intel.com/en-us/articles/introduction-to-gen-assembly
And more about the underlying architecture: https://software.intel.com/en-us/file/the-compute-architecture-of-intel-processor-graphics-gen9-v1d...
While I'm not an expert on why the Intel compiler doesn't have more OpenCL integration, I agree that it is a very interesting idea. For now, though, the rich set of abstractions for memory management in OpenCL makes it a great choice for taking advantage of Gen GPU hardware. You can of course use both together: icc for host-side code, ioc64 for device-side.
Thanks for that. You said: "You can of course use both together: icc for host-side code, ioc64 for device-side."
How would that be possible if icc does not have an OpenCL switch?
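OpenCL kernels do not need to be compiled from source at runtime. You can also load several forms of precompiled kernels. The kernel precompile happens as a separate offline step (for example with ioc64), and the host program built with icc simply loads the result through the OpenCL API.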
Developer guide: https://software.intel.com/en-us/node/539388
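To make the "separate step" concrete, here is a minimal host-side sketch in C, not a definitive implementation. It assumes the kernel was already compiled offline into a binary file. The helper names `load_kernel_binary` and `program_from_binary` are hypothetical, and the `USE_OPENCL` macro is only an assumption of this sketch so that the file-reading part compiles on machines without the OpenCL headers. The OpenCL calls themselves (`clCreateProgramWithBinary`, `clBuildProgram`) are standard API functions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a whole precompiled kernel file into memory.
   Returns NULL on failure; on success stores the byte count in *size_out.
   The caller owns the buffer and must free() it. */
unsigned char *load_kernel_binary(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return NULL; }
    rewind(f);

    /* Allocate at least one byte so an empty file still yields a valid pointer. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }

    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }

    *size_out = got;
    return buf;
}

#ifdef USE_OPENCL  /* assumption: define this when OpenCL headers and libs are available */
#include <CL/cl.h>

/* Hypothetical helper: create and build a program from a precompiled binary.
   No kernel source is compiled here; the binary comes from the offline step. */
cl_program program_from_binary(cl_context ctx, cl_device_id dev,
                               const char *path, cl_int *err)
{
    size_t size = 0;
    unsigned char *bin = load_kernel_binary(path, &size);
    if (!bin) { *err = CL_INVALID_VALUE; return NULL; }

    const unsigned char *bins[1] = { bin };
    cl_int binary_status = CL_SUCCESS;
    cl_program prog = clCreateProgramWithBinary(ctx, 1, &dev, &size, bins,
                                                &binary_status, err);
    if (*err == CL_SUCCESS) {
        /* Binaries still need a build call before kernels can be created. */
        *err = clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    }
    free(bin);
    return prog;
}
#endif
```

The host file containing this code can be compiled with icc like any other C source and linked against the OpenCL library. icc never needs to understand OpenCL C, because the device code arrives as data at runtime.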