Dear forum,
I've got a question about the power consumption of my MIC code. Here are the specs of my systems:
CPU code runs on: i7-4700MQ, 4 cores (Haswell), in my laptop, TDP = 47 W, Windows
MIC code runs on: Xeon Phi 5110P, 60 cores (Knights Corner), in my server with a 6-core Xeon X5650 CPU, TDP = 225 W, GNU/Linux
I found that with 8 threads my CPU usage is 100% and the power always reaches the TDP, but my MIC with 240 threads draws only ~150 W, well below its TDP. Since my MIC code contains no vectorized parts, can I conclude that the VPUs on the MIC are not used, and that this is why the power is so much lower than the TDP? Thanks for any clarification!
Best,
The Intel compiler will generate vector instructions for scalar arithmetic on Xeon Phi, with the mask set so that only the 0 element of each vector is actually updated. So unless you have reviewed the assembly code and verified that there are actually no vector instructions present, you should probably not assume that there are not any there.
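As a rough first pass at that assembly review, something like the following could count how many instructions in a compiler-generated listing (for example, one produced with the compiler's `-S` option) touch the 512-bit `zmm` registers, and how many of those carry a write mask. This is only a sketch: the function name `summarize_vector_use`, the register patterns, and the sample listing are illustrative assumptions, not output from the original poster's code, and a write mask on its own does not prove that an instruction is doing scalar work.

```python
import re

# Assumption: vector instructions can be recognized by their use of
# 512-bit zmm registers, and masked ones by a {k1}-style write mask.
ZMM = re.compile(r"\bzmm\d+\b", re.IGNORECASE)
MASK = re.compile(r"\{%?k[0-7]\}", re.IGNORECASE)


def summarize_vector_use(asm_text):
    """Count instructions, zmm-using instructions, and masked zmm instructions."""
    total = vector = masked = 0
    for line in asm_text.splitlines():
        # Drop trailing '#' comments, then skip blanks, directives and labels.
        code = line.split("#", 1)[0].strip()
        if not code or code.startswith(".") or code.endswith(":"):
            continue
        total += 1
        if ZMM.search(code):
            vector += 1
            if MASK.search(code):
                masked += 1
    return {"instructions": total, "vector": vector, "masked": masked}


# Hypothetical listing, written by hand for illustration only.
SAMPLE = """
..B1.2:                         # hypothetical loop body
        vaddpd    %zmm1, %zmm0, %zmm2{%k1}
        movq      %rax, %rbx
        vmovapd   %zmm2, (%rdi)
        addq      $8, %rdi
"""

if __name__ == "__main__":
    print(summarize_vector_use(SAMPLE))
```

A nonzero `vector` count would mean the VPUs are being exercised even though the source contains no explicit vectorization; reading the flagged instructions by hand is still needed to tell masked scalar arithmetic from genuinely vectorized loops.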
Power consumption on Xeon Phi depends on lots of factors, including the rate at which instructions are graduated (retired). This depends not only on the visible instruction sequences, but also on the fraction of stall cycles due to cache misses (and other long-latency operations). Your Haswell processor has more cache per core, lower memory latency, and much more aggressive hardware prefetchers than the Xeon Phi, all of which will affect the fraction of stall cycles and hence the power consumption. The out-of-order core on Haswell may also burn more power "spinning" while waiting on cache misses than the in-order Xeon Phi core. (That is something I probably ought to test some day...)
Question - does the code you were running on the host contain vector operations?
I suspect there is more going on here than just not doing vector operations, but right now I don't know what. What were you using to monitor power and CPU utilization? My simplistic approach would be to start up micsmc and just watch the screen for a bit while the program ran, but you are probably being much more sophisticated than that.