I am interested in gathering application properties by reading GPU hardware counters for GPU-based workloads running on an i7-6700K (Skylake) processor on a Linux-based machine. I should note that these are not graphics workloads but general-purpose workloads written in OpenCL and run on the GPU using the Beignet runtime.
To be specific, I would be interested in metrics such as EU utilization and stall cycles. Tools like Intel VTune provide these metrics, but I need to be able to perform online profiling and adapt the system accordingly. The intel_perf_counter tool (source code available) seems to read some of these counters, but its format is not compatible with the latest-generation processors. Can anyone point me to any documentation on how to do this?