Hi, I'm writing a simple kernel module to test the MONITOR/MWAIT instructions on my machine, which has an i7-7700K processor. I use a char[64] for each core, so false wakeups should be minimized.
I was expecting two things: first, a low enough wakeup latency; second, a (maybe slight) performance improvement on the other cores. However, the results surprised me, as neither expectation is fully met.
For the wakeup latency, I get about 1200 cycles (the threads are pinned to different cores), which is only a few hundred cycles less than an IPI. Even though the latency is not very low, it is still lower than an IPI, so that is OK.
For the remaining cores, performance does not improve at all. I use stress-ng (compiled from the latest source) with the following commands.
Result when core 5 and core 7 are in mwait:
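A minimal sketch of the per-core layout described above, with one caveat made explicit: a plain char[64] only avoids false wakeups if it is also aligned to the monitored line, otherwise two cores' buffers can straddle the same line. The names `mwait_slot`, `slots`, `NR_SLOTS` and `slots_share_line` are hypothetical, and the 64-byte line size is an assumption (CPUID leaf 05H reports the actual monitor-line size). In a kernel module the same effect is usually obtained with the kernel's own cache-line alignment annotations rather than C11 `alignas`.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define MONITOR_LINE_SIZE 64 /* assumed; check CPUID leaf 05H for the real value */
#define NR_SLOTS 8           /* assumed: one slot per logical CPU (4 cores, 8 threads) */

/* One wakeup flag per logical CPU, aligned and padded to a full monitored line
 * so that a write to one CPU's flag cannot wake a CPU monitoring another slot. */
struct mwait_slot {
    alignas(MONITOR_LINE_SIZE) volatile uint32_t flag;
    char pad[MONITOR_LINE_SIZE - sizeof(uint32_t)];
};

/* Fail the build if padding or alignment ever drifts from one line per slot. */
static_assert(sizeof(struct mwait_slot) == MONITOR_LINE_SIZE,
              "each slot must occupy exactly one monitored line");

static struct mwait_slot slots[NR_SLOTS];

/* Returns nonzero if two slots fall into the same monitored line. */
static int slots_share_line(const struct mwait_slot *a, const struct mwait_slot *b)
{
    return ((uintptr_t)a / MONITOR_LINE_SIZE) == ((uintptr_t)b / MONITOR_LINE_SIZE);
}
```

Each waiting CPU would pass the address of its own `slots[cpu].flag` to MONITOR, and the waker would write only to that flag.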
$ ./stress-ng --matrix 4 -t 10 --taskset 0,1,2,3 --metrics # core 5 and core 7 is in mwait state, while core 4 and core 6 unaffected
stress-ng: info: [7809] dispatching hogs: 4 matrix
stress-ng: info: [7809] successful run completed in 10.00s
stress-ng: info: [7809] stressor      bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
stress-ng: info: [7809]                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info: [7809] matrix          136206     10.00     39.98      0.00     13620.61      3406.85
Below is the result in the normal case, when no core is in mwait. Compared with this baseline, the mwait run above is even slightly slower (with -c 4, the drop in cpu bogo ops is more obvious than with matrix):
$ ./stress-ng --matrix 4 -t 10 --taskset 0,1,2,3 --metrics
stress-ng: info: [7893] dispatching hogs: 4 matrix
stress-ng: info: [7893] successful run completed in 10.00s
stress-ng: info: [7893] stressor      bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
stress-ng: info: [7893]                          (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info: [7893] matrix          137242     10.00     39.99      0.00     13724.22      3431.91
I suspect that mwait might be getting woken up too frequently by some events, but I don't see how to disable any of them. I tried setting the extensions and hints to non-zero values, but that either causes a #GP or hangs my PC.
=============== Here are my questions ================
1. Why is mwait not bringing any performance improvement? Am I using it the wrong way?
2. Even if I only mwait on a single core (e.g., core 5), the whole machine's disk IO seems to get completely 'stuck'. When I switch to a new git repo in oh-my-zsh (which scans the repo automatically), it gets 'stuck', then my Firefox gets 'stuck', then everything gets 'stuck' and I have to reboot my machine. By 'stuck', I mean that I can click things and switch tabs in the GUI, but they do not respond. Does anyone know why this would happen? Am I missing anything?
3. The MONITOR instruction allows extensions and hints, but where can I find these extensions and hints documented? Has Intel disabled these extensions since the Pentium 4?
Thanks!
Zihan
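On question 3, the supported extensions and hints are enumerated by CPUID leaf 05H and described in the MONITOR and MWAIT entries of the Intel SDM instruction reference. The sketch below decodes that leaf and builds an MWAIT EAX hint; it is an illustration under stated assumptions, not the poster's code. The names `mwait_caps`, `decode_leaf5`, `substates_for` and `mwait_hint` are hypothetical. As far as the SDM describes it, MONITOR requires ECX to be 0, and MWAIT accepts only ECX bit 0 (and only when leaf 05H ECX bit 1 is set), which would be consistent with the #GP seen when arbitrary non-zero extension values are passed.

```c
#include <stdint.h>

/* Fields of CPUID leaf 05H (the MONITOR/MWAIT leaf). */
struct mwait_caps {
    uint32_t min_line;       /* EAX[15:0]: smallest monitor-line size in bytes */
    uint32_t max_line;       /* EBX[15:0]: largest monitor-line size in bytes */
    int has_extensions;      /* ECX[0]: enumeration of extensions is supported */
    int irq_break_supported; /* ECX[1]: masked interrupts may act as break events */
    uint32_t substates;      /* EDX: 4 bits per C-state, count of MWAIT sub-states */
};

/* Decode raw register values returned by CPUID with EAX = 5. */
static struct mwait_caps decode_leaf5(uint32_t eax, uint32_t ebx,
                                      uint32_t ecx, uint32_t edx)
{
    struct mwait_caps c;
    c.min_line = eax & 0xffffu;
    c.max_line = ebx & 0xffffu;
    c.has_extensions = (int)(ecx & 1u);
    c.irq_break_supported = (int)((ecx >> 1) & 1u);
    c.substates = edx;
    return c;
}

/* Number of MWAIT sub-states for C-state index n (0 = C0, 1 = C1, ... up to 7). */
static unsigned substates_for(const struct mwait_caps *c, unsigned n)
{
    return (n < 8) ? ((c->substates >> (4u * n)) & 0xfu) : 0u;
}

/* Build the MWAIT EAX hint: bits 7:4 select the target C-state
 * (a field value of 0 means C1, 1 means C2, ...), bits 3:0 the sub-state. */
static uint32_t mwait_hint(unsigned cstate_field, unsigned substate)
{
    return ((cstate_field & 0xfu) << 4) | (substate & 0xfu);
}
```

The raw register values would come from executing CPUID with EAX = 5 on the target machine (for example through the compiler's `<cpuid.h>` helpers in user space); a hint is only meaningful if `substates_for` reports at least one sub-state for the corresponding C-state.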
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
1. Why is mwait not bringing any performance improvement? Am I using it the wrong way?
I think you may be expecting more than it can give. Use of mwait can allow OoO resources to be shuffled between the logical CPUs which share the same physical CPU ("core"), i.e. between "hyper-threads" if you prefer that name. Thus the only gains on your (4 core, 8 thread) machine would occur when you are using two threads per core and one of them is in mwait. If you're running four threads, the OS will almost certainly distribute them across the four cores before using the second thread within a core, so you'd expect no gain (to first order, ignoring power consumption and turbo effects, of course).
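To check which logical CPUs actually share a physical core on a given Linux machine, the kernel exposes a sibling list per CPU in sysfs (/sys/devices/system/cpu/cpuN/topology/thread_siblings_list), in a form such as "1,5" or "0-1". The sketch below parses such a string; reading the file is left out, the helper names `parse_cpu_list` and `are_siblings` are hypothetical, and the 64-CPU limit is an assumption made to keep the example small.

```c
#include <stdint.h>
#include <stdlib.h>

/* Parse a sysfs-style CPU list such as "1,5" or "0-3,8" into a bitmask
 * covering CPUs 0..63. Returns 0 if the string is malformed or out of range. */
static uint64_t parse_cpu_list(const char *s)
{
    uint64_t mask = 0;

    while (*s != '\0' && *s != '\n') {
        char *end;
        unsigned long lo = strtoul(s, &end, 10);
        unsigned long hi;

        if (end == s)
            return 0;               /* no digits where a CPU number was expected */
        hi = lo;
        s = end;
        if (*s == '-') {            /* range form "lo-hi" */
            s++;
            hi = strtoul(s, &end, 10);
            if (end == s)
                return 0;
            s = end;
        }
        if (hi < lo || hi > 63)
            return 0;
        for (unsigned long cpu = lo; cpu <= hi; cpu++)
            mask |= (uint64_t)1 << cpu;
        if (*s == ',')
            s++;                    /* more entries follow */
        else if (*s != '\0' && *s != '\n')
            return 0;               /* unexpected character */
    }
    return mask;
}

/* Nonzero if logical CPUs a and b are both present in the sibling mask. */
static int are_siblings(uint64_t mask, unsigned a, unsigned b)
{
    return a < 64 && b < 64 && ((mask >> a) & 1u) && ((mask >> b) & 1u);
}
```

If the CPU running a benchmark thread and the CPU placed in mwait are not in the same sibling list, the resource-sharing argument above does not apply to that pair.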
As for how you are hanging up your machine by hacking in kernel space... I'm afraid that's one you need to solve yourself!
Cownie, James H (Intel) wrote:
I think you may be expecting more than it can give. Use of mwait can allow OoO resources to be shuffled between the logical CPUs which share the same physical CPU ("core"), i.e. between "hyper-threads" if you prefer that name. Thus the only gains on your (4 core, 8 thread) machine would occur when you are using two threads per core and one of them is in mwait. If you're running four threads, the OS will almost certainly distribute them across the four cores before using the second thread within a core, so you'd expect no gain (to first order, ignoring power consumption and turbo effects, of course).
As for how you are hanging up your machine by hacking in kernel space... I'm afraid that's one you need to solve yourself!
Thanks for the reply, I think I misunderstood the meaning of 'reallocate'.