Intel® ISA Extensions

monitor/mwait performance differs in different memory addresses

Hamid_b_
Beginner
396 Views

Hi everybody,

We are working on a new research operating system. For message passing we use several mechanisms, including polling, IPIs, and monitor/mwait. To benchmark them, we send a ping-pong message between two processes running on two different cores and count the cycles for the round trip on the sender core. What confuses us is that monitor/mwait performance seems to differ by a few hundred cycles when we change the address of the monitored area. I should mention that we use the write-back (WB) cache policy, and the processor is an Intel(R) Xeon(R) CPU E31270 @ 3.40GHz, which is not NUMA. We used two addresses that are relatively close to each other: the first was 0x100 and the second was 0xA0800.

Is monitor/mwait performance address dependent?

7 Replies
Bernard
Valued Contributor I

Maybe because the address is in a very low memory range?

jimdempseyatthecove
Honored Contributor III

Are these physical addresses or virtual addresses?

Do you "touch" memory in the same page as the monitored location before executing MONITOR in both cases (i.e., to preload the page-table entry in case it is not already loaded)?

Can you set up a test that uses addresses 0x100 and 0xA0100 (i.e., the same relative offset within a page)?
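To make the "same relative offset" point concrete, here is a small sketch that splits an address into page number, page offset, and cache-line index within the page. It assumes 4 KiB pages and 64-byte cache lines, which are typical for x86 but are not stated anywhere in this thread; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions (not from the thread): 4 KiB pages, 64-byte cache lines. */
enum { HYP_PAGE_BYTES = 4096, HYP_LINE_BYTES = 64 };

static_assert((HYP_PAGE_BYTES & (HYP_PAGE_BYTES - 1)) == 0, "page size must be a power of two");
static_assert((HYP_LINE_BYTES & (HYP_LINE_BYTES - 1)) == 0, "line size must be a power of two");

/* Index of the page that contains addr. */
static inline uint64_t page_number(uint64_t addr) { return addr / HYP_PAGE_BYTES; }

/* Byte offset of addr inside its page. */
static inline uint64_t page_offset(uint64_t addr) { return addr % HYP_PAGE_BYTES; }

/* Index of the cache line inside the page that contains addr. */
static inline uint64_t line_in_page(uint64_t addr) { return page_offset(addr) / HYP_LINE_BYTES; }
```

Under those assumptions, 0x100 and 0xA0100 both have page offset 0x100, while 0xA0800 has page offset 0x800, so the original pair of addresses differs in both page and offset. Testing 0x100 against 0xA0100 would change only the page.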

Jim Dempsey

SergeyKostrov
Valued Contributor II
>>...Is monitor/mwait performance address dependent?

Take a look at the description of the MONITOR instruction on page 560 of the latest Instruction Set Reference. There is a statement:

... the address range that the monitoring hardware checks for store operations can be determined by using CPUID...

If one of the addresses you mentioned (say, 0x100) is outside the allowed memory range, then the manual does not make clear what happens.

perfwise
Beginner

That range is 64 bytes on the hardware; I recently tested that. CPUID reports the minimum and maximum range, and it matches what I said.
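For anyone who wants to check this on their own machine, below is a rough sketch of reading the monitor-line size from CPUID leaf 05H, where EAX[15:0] holds the smallest and EBX[15:0] the largest monitor-line size in bytes. It assumes GCC or Clang (for `<cpuid.h>` and `__get_cpuid`); the struct and function names are mine, and it does not check the MONITOR feature flag (CPUID.01H:ECX bit 3), so treat a result of 0 as "not reported".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang header that provides __get_cpuid */
#endif

/* Hypothetical container for the two sizes reported by CPUID leaf 05H. */
struct monitor_line {
    uint32_t smallest; /* EAX[15:0], bytes */
    uint32_t largest;  /* EBX[15:0], bytes */
};

/* Extract the low 16 bits of EAX and EBX as returned by CPUID leaf 05H. */
static inline struct monitor_line decode_monitor_leaf(uint32_t eax, uint32_t ebx)
{
    struct monitor_line m = { eax & 0xFFFFu, ebx & 0xFFFFu };
    return m;
}

/* Returns 1 and fills *out if leaf 05H could be read, 0 otherwise
 * (including on non-x86 builds). */
static inline int query_monitor_line(struct monitor_line *out)
{
    assert(out != NULL);
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(5, &eax, &ebx, &ecx, &edx))
        return 0; /* leaf 05H is above the maximum supported leaf */
    *out = decode_monitor_leaf(eax, ebx);
    return 1;
#else
    (void)out;
    return 0;
#endif
}
```

If both fields come back as 64, the monitored window is one 64-byte line, which would agree with the measurement described above. Virtual machines often hide this leaf, so a zero result there says nothing about the underlying hardware.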

Perfwise

jimdempseyatthecove
Honored Contributor III

Presumably one logical processor is monitoring 0x100 and a different logical processor is monitoring 0xA0800, and each has a monitor "window" within one cache line. Hamid, would you comment on this?

Jim Dempsey

Bernard
Valued Contributor I

Is such a low memory address as 0x100 (if it is a physical address) available to user-mode software?

Bernard
Valued Contributor I

I suppose that those are really relative offsets within the process's memory page(s).
