Hello,
From compiler-assisted offload code, I get the following offload report:
[Offload] [HOST] [Tag 2] [CPU Time] 5.053540(seconds)
[Offload] [MIC 0] [Tag 2] [CPU->MIC Data] 1080 (bytes)
[Offload] [MIC 0] [Tag 2] [MIC Time] 6.122002(seconds)
[Offload] [MIC 0] [Tag 2] [MIC->CPU Data] 1032 (bytes)
However, I expected the total CPU Time on the host to always be greater than the MIC Time on the coprocessor, since it should include the execution time on the MIC, according to the documentation:
"[CPU Time]
The total time measured for that offload pragma on the host.
[MIC Time]
The total time measured for executing the offload on the target. This excludes the data transfer time between the host and the target, and counts only the execution time on the target." [ https://software.intel.com/en-us/node/522521 ]
Is my assumption wrong? If so, I do not quite get which time is measured on the host exactly. Could you clarify that for me?
Thank you very much!
[The program calls Intel MKL's pardiso in offload mode.]
Please post the source code from which this report was obtained.
A basic example is the attached file, a modified version of the pardiso_unsym.c example shipped with Parallel Studio 2016.0.109. In it, the single call to pardiso with phase 13 is split into three calls with phases 11, 22 and 33.
The offload report is:
phase = 11
[Offload] [MIC 0] [File] pardiso_unsym.c
[Offload] [MIC 0] [Line] 166
[Offload] [MIC 0] [Tag] Tag 0
[Offload] [HOST] [Tag 0] [CPU Time] 0.974192(seconds)
[Offload] [MIC 0] [Tag 0] [CPU->MIC Data] 1120 (bytes)
[Offload] [MIC 0] [Tag 0] [MIC Time] 0.161118(seconds)
[Offload] [MIC 0] [Tag 0] [MIC->CPU Data] 556 (bytes)
phase = 22
[Offload] [MIC 0] [File] pardiso_unsym.c
[Offload] [MIC 0] [Line] 190
[Offload] [MIC 0] [Tag] Tag 1
[Offload] [HOST] [Tag 1] [CPU Time] 0.003326(seconds)
[Offload] [MIC 0] [Tag 1] [CPU->MIC Data] 1128 (bytes)
[Offload] [MIC 0] [Tag 1] [MIC Time] 0.000986(seconds)
[Offload] [MIC 0] [Tag 1] [MIC->CPU Data] 556 (bytes)
phase = 33
[Offload] [MIC 0] [File] pardiso_unsym.c
[Offload] [MIC 0] [Line] 214
[Offload] [MIC 0] [Tag] Tag 2
[Offload] [HOST] [Tag 2] [CPU Time] 0.113857(seconds)
[Offload] [MIC 0] [Tag 2] [CPU->MIC Data] 1128 (bytes)
[Offload] [MIC 0] [Tag 2] [MIC Time] 0.134102(seconds)
[Offload] [MIC 0] [Tag 2] [MIC->CPU Data] 556 (bytes)
In phase 33 the CPU Time is smaller than the MIC Time.
This is an example of my general question above: what exactly are the times measured in the offload report?
Thanks again.
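To check a longer report for this pattern without reading it by eye, the timing lines can be parsed and compared per tag. The following is only a minimal sketch: `parse_time_line` is a hypothetical helper name, and the line layout is assumed to be exactly the one shown in the report above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the offload runtime.
 * Parses one report line of the assumed form
 *   [Offload] [HOST] [Tag 2] [CPU Time] 0.113857(seconds)
 * `label` is "CPU Time" or "MIC Time".
 * Returns 1 and fills *tag and *seconds on a match, 0 otherwise. */
int parse_time_line(const char *line, const char *label,
                    int *tag, double *seconds)
{
    char key[32];
    const char *p;

    assert(line && label && tag && seconds);

    /* Find the "[Tag N]" field and read N. */
    p = strstr(line, "[Tag ");
    if (p == NULL || sscanf(p, "[Tag %d]", tag) != 1)
        return 0;

    /* Find the "[CPU Time]" or "[MIC Time]" field after it. */
    snprintf(key, sizeof key, "[%s]", label);
    p = strstr(p, key);
    if (p == NULL)
        return 0;

    /* The number follows the label; "(seconds)" is left unread. */
    return sscanf(p + strlen(key), "%lf", seconds) == 1;
}
```

Applied to the two Tag 2 timing lines of phase 33, this yields 0.113857 for CPU Time and 0.134102 for MIC Time, so the MIC Time is about 0.02 seconds larger.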
The "CPU Time" and "MIC Time" are as described in the documentation.
They are currently timed using __rdtsc. More recent host processors vary their CPU frequency dynamically, so on these CPUs __rdtsc is not a reliable measure of elapsed time. We will change the way elapsed time is measured in a future compiler release.
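Until then, one way to cross-check the reported host time is to time the offloaded call yourself with a monotonic wall clock. This is only a minimal sketch and not what the offload runtime does internally: `wall_seconds` is a hypothetical helper name, and it assumes a POSIX host where `clock_gettime` with `CLOCK_MONOTONIC` is available.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Hypothetical helper: seconds read from a monotonic wall clock.
 * Assumes a POSIX host providing CLOCK_MONOTONIC. */
double wall_seconds(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);   /* the clock is expected to exist on the host */
    (void)rc;          /* avoid an unused warning when NDEBUG is set */

    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
```

Calling `wall_seconds()` immediately before and after the pardiso call and subtracting the two values gives a host-side elapsed time that can be compared with the reported [CPU Time].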
Are you saying that the assumption in my first post (CPU Time should always be larger than MIC Time) is correct, and that only the way it is measured is unreliable?
