I recently needed to find out exactly what data is communicated (how much and when) between the host and the MIC while running an offload application.
Unlike Nvidia, Intel doesn't provide a tool like NV Profiler.
So I thought VTune Amplifier XE might be able to do this job. Unfortunately, I found that when I analyze an offload application, Amplxe only analyzes the MIC's performance.
So I've come to the forum for help. Is there anyone who can help me?