Zihan_Y_
Beginner
204 Views

MWAIT is not improving performance, and why does my machine get stuck?

Hi, I'm writing a simple kernel module to test the monitor/mwait instructions on my machine, which has an i7-7700K processor. I use a char[64] for each core, so false wakeups should be minimized.
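A char[64] only covers exactly one monitored line if it is also aligned to the line size; an unaligned buffer can straddle two lines and share one of them with a neighbour. Below is a minimal user-space sketch of an aligned per-CPU slot, not the module's actual layout. `MONITOR_LINE`, `NR_SLOTS`, `struct wake_slot` and `slots_are_isolated` are hypothetical names, and 64 bytes is an assumption that should be checked against CPUID leaf 05H. In kernel code the `____cacheline_aligned` attribute would likely play the role of `_Alignas`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MONITOR_LINE 64  /* assumed monitor-line size; verify via CPUID leaf 05H */
#define NR_SLOTS 8       /* hypothetical: one slot per logical CPU */

/* One wake-up flag per logical CPU, aligned and padded to a full line. */
struct wake_slot {
    _Alignas(MONITOR_LINE) volatile char flag; /* the byte MONITOR would arm */
    char pad[MONITOR_LINE - 1];                /* keeps neighbours off this line */
};

/* Compile-time check that a slot fills exactly one line. */
_Static_assert(sizeof(struct wake_slot) == MONITOR_LINE,
               "slot must fill exactly one monitor line");

static struct wake_slot slots[NR_SLOTS];

/* Return nonzero if slots a and b live on different monitor lines,
 * i.e. a write to one cannot wake a waiter armed on the other. */
static int slots_are_isolated(int a, int b)
{
    uintptr_t line_a = (uintptr_t)&slots[a] / MONITOR_LINE;
    uintptr_t line_b = (uintptr_t)&slots[b] / MONITOR_LINE;
    return line_a != line_b;
}
```

With this layout, `slots_are_isolated(5, 7)` holds for any two distinct indices, so a store to one core's flag should not produce a false wakeup on another core's monitor.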

I was expecting two things: first, a low enough wakeup latency; second, a (maybe slight) performance improvement on the other cores. However, the results surprised me, as neither expectation is fully met.

For the wakeup latency, I get 1200 cycles (the threads are pinned to different cores), which is only a few hundred cycles less than an IPI. Even though the latency is not very low, it is still comparatively lower, so that is OK.

For the remaining cores, performance does not improve at all. I use stress-ng (compiled from the latest source) with the commands below.

Result when core 5 and core 7 are in mwait:

$ ./stress-ng --matrix 4 -t 10 --taskset 0,1,2,3 --metrics  # core 5 and core 7 are in mwait state, while core 4 and core 6 are unaffected
stress-ng: info:  [7809] dispatching hogs: 4 matrix
stress-ng: info:  [7809] successful run completed in 10.00s
stress-ng: info:  [7809] stressor       bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
stress-ng: info:  [7809]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [7809] matrix           136206     10.00     39.98      0.00     13620.61      3406.85

Below is the result in the normal case, when no core is in mwait. Compared with this baseline, the mwait run is even slightly slower (136206 vs. 137242 bogo ops, about 0.75% fewer). If I use -c 4, the degradation in cpu bogo ops is more obvious than with matrix:

$ ./stress-ng --matrix 4 -t 10 --taskset 0,1,2,3 --metrics
stress-ng: info:  [7893] dispatching hogs: 4 matrix
stress-ng: info:  [7893] successful run completed in 10.00s
stress-ng: info:  [7893] stressor       bogo ops real time  usr time  sys time   bogo ops/s   bogo ops/s
stress-ng: info:  [7893]                           (secs)    (secs)    (secs)   (real time) (usr+sys time)
stress-ng: info:  [7893] matrix           137242     10.00     39.99      0.00     13724.22      3431.91

I suspect that mwait is being woken up too frequently by some events, but I don't see how to disable any of them. I tried setting the extensions and hints to non-zero values, but that either causes a #GP or hangs my PC.

 

=============== Here are my questions ================

1. Why is mwait not bringing any performance improvement? Am I using it the wrong way?

2. Even if I only mwait on a single core (e.g., core 5), the whole machine's disk I/O seems to get completely 'stuck'. When I switch to a new git repo in oh-my-zsh (which scans the repo automatically), the shell gets 'stuck', then Firefox gets 'stuck', then everything gets 'stuck' and I have to reboot the machine. By 'stuck', I mean that I can still click things and switch tabs in the GUI, but nothing responds. Does anyone know why this would happen? Am I missing anything?

3. The MONITOR instruction allows extensions and hints, but where can I find these extensions and hints? Has Intel disabled these extensions since the Pentium 4?
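On question 3, the supported extensions and hints are enumerated by CPUID leaf 05H, and MONITOR/MWAIT support itself by CPUID.01H:ECX[3]. MWAIT takes its hints in EAX (target C-state in bits 7:4, sub-state in bits 3:0) and its extensions in ECX, and passing bits the processor does not enumerate is a plausible source of the #GP. The sketch below decodes leaf 05H in user space; `struct mwait_info`, `decode_leaf5` and `query_mwait` are hypothetical names, and `__get_cpuid` from `<cpuid.h>` assumes a GCC or Clang toolchain on x86.

```c
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang CPUID helper; assumed toolchain */
#endif

struct mwait_info {
    unsigned smallest_line; /* CPUID.05H:EAX[15:0], bytes */
    unsigned largest_line;  /* CPUID.05H:EBX[15:0], bytes */
    int has_extensions;     /* CPUID.05H:ECX[0]: extensions are enumerated */
    int irq_break_event;    /* CPUID.05H:ECX[1]: masked interrupts may break MWAIT */
    unsigned substates[8];  /* CPUID.05H:EDX, 4 bits each for C0..C7 */
};

/* Decode raw leaf 05H register values into named fields. */
static struct mwait_info decode_leaf5(uint32_t eax, uint32_t ebx,
                                      uint32_t ecx, uint32_t edx)
{
    struct mwait_info m;
    m.smallest_line = eax & 0xffffu;
    m.largest_line = ebx & 0xffffu;
    m.has_extensions = (int)(ecx & 1u);
    m.irq_break_event = (int)((ecx >> 1) & 1u);
    for (int i = 0; i < 8; i++)
        m.substates[i] = (edx >> (4 * i)) & 0xfu;
    return m;
}

/* Fill *out from the running CPU. Returns 1 on success, 0 if MONITOR/MWAIT
 * is not reported or the build target is not x86. */
static int query_mwait(struct mwait_info *out)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned a, b, c, d;
    if (!__get_cpuid(1, &a, &b, &c, &d) || !(c & (1u << 3)))
        return 0; /* CPUID.01H:ECX[3] clear: no MONITOR/MWAIT */
    if (!__get_cpuid(5, &a, &b, &c, &d))
        return 0; /* leaf 05H not available */
    *out = decode_leaf5(a, b, c, d);
    return 1;
#else
    (void)out;
    return 0;
#endif
}
```

A hint whose target C-state has a `substates` count of zero, or an ECX extension bit set while `irq_break_event` is 0, would be worth ruling out before looking elsewhere.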

Thanks!

Zihan

James_C_Intel2
Employee

1. Why is mwait not bringing any performance improvement? Am I using it the wrong way?

I think you may be expecting more than it can give. Use of mwait can allow OoO resources to be shuffled between logical CPUs which share the same physical CPU ("core"), i.e. between "hyper-threads" if you prefer that name. Thus the only gains on your (4 core, 8 thread) machine would occur when you are using two threads per core and one of them is in mwait. If you're running four threads, the OS will almost certainly distribute them across the four cores before using the second thread within a core, so you'd expect no gain (to first order, ignoring power consumption and turbo effects, of course).
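To check which logical CPUs actually share a core, Linux exposes each CPU's hyper-thread siblings in sysfs. The sketch below parses that list into a bitmask; `parse_cpulist` and `thread_siblings` are hypothetical helper names, and the code assumes a Linux system with at most 64 logical CPUs.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a Linux cpulist string such as "1,5" or "0-3" into a bitmask
 * covering CPUs 0..63. Parsing stops at the first malformed token. */
static unsigned long long parse_cpulist(const char *s)
{
    unsigned long long mask = 0;
    while (*s) {
        char *end;
        long lo = strtol(s, &end, 10);
        if (end == s || lo < 0)
            break; /* no digits or a negative id: stop */
        long hi = lo;
        if (*end == '-') { /* range form "lo-hi" */
            const char *p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p)
                break;
        }
        for (long c = lo; c <= hi && c < 64; c++)
            mask |= 1ULL << c;
        if (*end != ',')
            break; /* end of list (newline or NUL) */
        s = end + 1;
    }
    return mask;
}

/* Return the sibling mask of one logical CPU, or 0 if sysfs is unreadable. */
static unsigned long long thread_siblings(int cpu)
{
    char path[96], buf[128];
    snprintf(path, sizeof path,
             "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list", cpu);
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    unsigned long long m = fgets(buf, sizeof buf, f) ? parse_cpulist(buf) : 0;
    fclose(f);
    return m;
}
```

If `thread_siblings(5)` includes one of the CPUs in the `--taskset` list, that CPU is the only one that could benefit from CPU 5 sitting in mwait, and only relative to a sibling that would otherwise be busy rather than idle.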

As for how you are hanging up your machine by hacking in kernel space... I'm afraid that's one you need to solve yourself!

 

Zihan_Y_
Beginner

Cownie, James H (Intel) wrote:

I think you may be expecting more than it can give. Use of mwait can allow OoO resources to be shuffled between logical CPUs which share the same physical CPU ("core"), i.e. between "hyper-threads" if you prefer that name. Thus the only gains on your (4 core, 8 thread) machine would occur when you are using two threads per core and one of them is in mwait. If you're running four threads, the OS will almost certainly distribute them across the four cores before using the second thread within a core, so you'd expect no gain (to first order, ignoring power consumption and turbo effects, of course).

As for how you are hanging up your machine by hacking in kernel space... I'm afraid that's one you need to solve yourself!

Thanks for the reply. I think I misunderstood the meaning of 'reallocate'.
