while I was trying to solve some particular multi-thread problem, it occurred to me that it could be solved more efficiently with special assistance from the CPU.
The situation is as follows: say one thread needs to block until the content of a particular 4-byte (or can be other size) location in the memory is changed. (It think the usefulness of this is very obvious and there is no need to give concrete examples to demonstrate it).
What are the currently available options:
1) the thread can spin-wait, but this is very inefficient because wastes CPU time/power.
2) spin-wait but with sleeping for some small time in the spin loop. This would preserve CPU time and power, but is not good if fast reaction is needed .
3) the thread can use some OS primitive (e.g. semaphore) to sleep and any other thread that writes to the memory location will also signal the OS primitive. The problems here are that :
a) the signaling thread must know about this primitive. This can be a problem depending on the case. In my particular case it is impossible because although the 2 threads see the same memory, they reside in different OSes (one is inside a virtual machine and the other is on the host).
b) the signaling thread will pay performance price for the signaling (which has to be a system-call)
In theory this can be done even now by the OS-es by using the x86 debug registers, but no OS provides such service mainly because the debug registers are too scarce resource - they are only 4.
Could it be feasible their number to be increased? If we can have say 128 or more such registers it would be wonderful. I don't have any idea what the hardware price would be though.
In my previous post, the reasoning of b) is not quite valid because the price will be virtually the same it the CPU rise exception in response to the memory-access (it is a switch to kernel-mode, the same as a manual system call)
Intel C/C++ Compiler Intrinsic Equivalent
MONITOR: void _mm_monitor(void const *p, unsigned extensions,unsigned hints)
Intel C/C++ Compiler Intrinsic Equivalent
MWAIT: void _mm_mwait(unsigned extensions, unsigned hints)
The MONITOR instruction arms address monitoring hardware using an address specified in EAX (the address range that the monitoring hardware checks for store operations can be determined by using CPUID). A store to an address within the specified address range triggers the monitoring hardware. The state of monitor hardware is used by MWAIT.
The content of EAX is an effective address (in 64-bit mode, RAX is used). By default, the DS segment is used to create a linear address that is monitored. Segment overrides can be used.
ECX and EDX are also used. They communicate other information to MONITOR. ECX specifies optional extensions. EDX specifies optional hints; it does not change the architectural behavior of the instruction. For the Pentium 4 processor (family 15, model 3), no extensions or hints are defined. Undefined hints in EDX are ignored by the processor; undefined extensions in ECX raises a general protection fault.
The address range must use memory of the write-back type. Only write-back memory will correctly trigger the monitoring hardware. Additional information on determining what address range to use in order to prevent false wake-ups is described in Chapter 8, “Multiple-Processor Management” of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.
The MONITOR instruction is ordered as a load operation with respect to other memory transactions. The instruction is subject to the permission checking and faults associated with a byte load. Like a load, MONITOR sets the A-bit but not the D-bit in page tables.
CPUID.01H:ECX.MONITOR[bit 3] indicates the availability of MONITOR and MWAIT in the processor. When set, MONITOR may be executed only at privilege level 0 (use at any other privilege level results in an invalid-opcode exception). The operating system or system BIOS may disable this instruction by using the IA32_MISC_ENABLE MSR; disabling MONITOR clears the CPUID feature flag and causes execution to generate an invalid-opcode exception. The instruction’s operation is the same in non-64-bit modes and 64-bit mode.
MONITOR sets up an address range for the monitor hardware using the content of EAX (RAX in 64-bit mode) as an effective address and puts the monitor hardware in armed state. Always use memory of the write-back caching type. A store to the specified address range will trigger the monitor hardware. The content of ECX and EDX are used to communicate other information to the monitor hardware.
MWAIT instruction provides hints to allow the processor to enter an implementation-dependent optimized state.
There are two principal targeted usages: address-range monitor and advanced power management. Both usages of MWAIT require the use of the MONITOR instruction. CPUID.01H:ECX.MONITOR[bit 3] indicates the availability of MONITOR and MWAIT in the processor. When set, MWAIT may be executed only at privilege level 0 (use at any other privilege level results in an invalid-opcode exception). The operating system or system BIOS may disable this instruction by using the IA32_MISC_ENABLE MSR; disabling MWAIT clears the CPUID feature flag and causes execution to generate an invalid-opcode exception.
This instruction’s operation is the same in non-64-bit modes and 64-bit mode.
ECX specifies optional extensions for the MWAIT instruction. EAX may contain hints such as the preferred optimized state the processor should enter. The first processors to implement MWAIT supported only the zero value for EAX and ECX. Later processors allowed setting ECX to enable masked interrupts as break events for MWAIT (see below). Software can use the CPUID instruction to determine the extensions and hints supported by the processor.
MWAIT for Address Range Monitoring
For address-range monitoring, the MWAIT instruction operates with the MONITOR instruction. The two instructions allow the definition of an address at which to wait (MONITOR) and a implementation-dependent-optimized operation to commence at the wait address (MWAIT). The execution of MWAIT is a hint to the processor that it can enter an implementation-dependent-optimized state while waiting for an event or a store operation to the address range armed by MONITOR.
The following cause the processor to exit the implementation-dependent-optimized state: a store to the address range armed by the MONITOR instruction, an NMI or SMI, a debug exception, a machine check exception, the BINIT# signal, the INIT# signal, and the RESET# signal. Other implementation-dependent events may also cause the processor to exit the implementation-dependent-optimized state.
In addition, an external interrupt causes the processor to exit the implementation-dependent-optimized state either (1) if the interrupt would be delivered to software (e.g., as it would be if HLT had been executed instead of MWAIT); or (2) if ECX = 1. Software can execute MWAIT with ECX = 1 only if CPUID.05H:ECX[bit 1] = 1.
(Implementation-specific conditions may result in an interrupt causing the processor to exit the implementationdependent-optimized state even if interrupts are masked and ECX = 0.)
Following exit from the implementation-dependent-optimized state, control passes to the instruction following the MWAIT instruction. A pending interrupt that is not masked (including an NMI or an SMI) may be delivered before execution of that instruction. Unlike the HLT instruction, the MWAIT instruction does not support a restart at the MWAIT instruction following the handling of an SMI.
If the preceding MONITOR instruction did not successfully arm an address range or if the MONITOR instruction has not been executed prior to executing MWAIT, then the processor will not enter the implementation-dependent-optimized state. Execution will resume at the instruction following the MWAIT.
A hint that allow the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events.
INSTRUCTION SET REFERENCE, A-M
Vol. 2A 3-603
MWAIT for Power Management
MWAIT accepts a hint and optional extension to the processor that it can enter a specified target C state while waiting for an event or a store operation to the address range armed by MONITOR. Support for MWAIT extensions for power management is indicated by CPUID.05H:ECX[bit 0] reporting 1.
EAX and ECX are used to communicate the additional information to the MWAIT instruction, such as the kind of optimized state the processor should enter. ECX specifies optional extensions for the MWAIT instruction. EAX may contain hints such as the preferred optimized state the processor should enter. Implementation-specific conditions may cause a processor to ignore the hint and enter a different optimized state. Future processor implementations may implement several optimized “waiting” states and will select among those states based on the hint argument.