I am running VTune on a dual Xeon processor system and would like to measure the coherency misses between the two processors. Are there any events or parameters in VTune for this?
Also, is it possible to use the CPUID assembly instruction to uniquely identify the processor (on a DP system)? Basically, I would like each of my user-level pthreads running on the DP Xeon processors to be able to identify which processor it is running on. Is that possible?
Thanks
Gautham
5 Replies
Hi Gautham, you can measure reads and writes to the same cache line by two different processors with the Memory Order Machine Clear event. Reading the same cache line from multiple processors is always fine. However, if you mix reads and writes (i.e. one thread reads while another thread writes to the same 128 bytes) across multiple processors, you will pay a pretty high performance penalty. The Memory Order Machine Clear event fires every time this happens.
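To make "the same 128 bytes" concrete, here is a minimal sketch in C of a check for whether two addresses land in the same aligned block. The helper name `same_line` is made up for illustration, and the block size is simply passed in; 64 and 128 are the figures quoted in this thread, not values queried from the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true when addresses a and b fall in the
   same aligned block of `line` bytes. `line` must be a power of two,
   e.g. 64 or 128 (the sizes mentioned in this thread). */
static bool same_line(const void *a, const void *b, uintptr_t line)
{
    assert(line != 0 && (line & (line - 1)) == 0);  /* power of two */
    return ((uintptr_t)a / line) == ((uintptr_t)b / line);
}
```

If this returns true for a field one thread writes and a field another thread reads, that pair is a candidate for the penalty described above.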
For OS threads you can set which processor you are running on with SetThreadAffinityMask(). However, this is not possible for user-mode threads, since they have their own scheduler and each user-mode thread may not map to a separate OS thread. You can check whether it does by looking at the thread view for your process in VTune and seeing whether the number of threads matches the number of pthreads you are using. As for programmatically determining which processor you are executing on, I am not sure if there is a way to do this. In kernel mode you can call KeGetCurrentProcessorNumber. I am not sure if there is a user-mode equivalent.
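For what it's worth, later OS releases do expose a user-mode query: `GetCurrentProcessorNumber()` on Windows (Vista / Server 2003 and later) and `sched_getcpu()` on Linux with glibc. The sketch below wraps both; the wrapper name `current_cpu` is made up, and the Linux branch is an assumption that goes beyond the Windows calls discussed in this thread. Keep in mind the answer can be stale by the time you use it unless the thread is pinned.

```c
#define _GNU_SOURCE          /* exposes sched_getcpu() on glibc */
#include <assert.h>          /* only needed for the sanity check in the usage note */
#ifdef _WIN32
#include <windows.h>
#else
#include <sched.h>
#endif

/* Hypothetical wrapper: returns the index of the logical processor the
   calling thread is executing on at the moment of the call, or -1 if
   the query fails (Linux only; the Windows call cannot fail). */
static int current_cpu(void)
{
#ifdef _WIN32
    return (int)GetCurrentProcessorNumber();
#else
    return sched_getcpu();
#endif
}
```

A quick sanity check from inside any thread would be `assert(current_cpu() >= 0);`. On the CPUID side, leaf 1 reports the initial APIC ID in EBX bits 31:24, which is the usual way to tell logical processors apart from user mode on x86.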
When I first saw this post, I was thinking of the P-III Xeon, and waiting for someone with knowledge of the past to answer! Many people continue to post questions about MT for those older models.
We have verified experimentally that false sharing occurs between logical processors, as well as between separate processors, when one thread reads and the other writes to the same 128-byte line, as Birju pointed out. When two threads write to the same cache line, the problem is restricted to a 64-byte line.
I'm not certain whether BIOS variations could affect these conclusions. I assume that Birju's tip about MOMC (Memory Order Machine Clear) events should help when diagnosing any of these false sharing cases.
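One common way to sidestep both cases is to give each thread's hot data its own 128-byte block. Here is a minimal C11 sketch; `SECTOR_SIZE`, `NUM_THREADS`, `padded_counter` and `bump` are illustrative names, and 128 is just the figure from this thread rather than something detected at run time.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignas (C11) */

#define SECTOR_SIZE 128  /* assumption: the 128-byte figure quoted above */
#define NUM_THREADS 4    /* hypothetical thread count */

/* Aligning the only member to SECTOR_SIZE forces the struct's size and
   alignment up to SECTOR_SIZE, so array elements never share a block. */
struct padded_counter {
    alignas(SECTOR_SIZE) volatile long value;
};

static_assert(sizeof(struct padded_counter) == SECTOR_SIZE,
              "each counter should occupy exactly one 128-byte block");

static struct padded_counter counters[NUM_THREADS];

/* Each thread updates only its own slot. */
static void bump(int thread_index)
{
    counters[thread_index].value++;
}
```

The cost is memory: each `long` now takes 128 bytes, which is usually acceptable for a handful of per-thread counters.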
Thanks for your replies. One more question. I launch 4 threads on an MP Xeon system (2 logical processors per processor = 4 logical processors). So I guess even in this case the OS will keep switching my 4 threads between the 4 logical processors? Is there any way I could pin each thread to a particular logical processor, so that it always runs on only that one and the OS does not keep moving it?
Thanks
Yes, assuming you're using one of the more recent versions of Windows. There are OS calls to set processor affinity.
You ought to be able to find some working examples on the developer.intel.com index to Hyper-Threading Technology.
For example, Khang Nguyen's CPU Counting Utility Code Sample may help.
Yes, you can do this using SetThreadAffinityMask() on Windows.
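A rough sketch of what that looks like, called from inside the thread you want to pin. The function names `pin_current_thread` and `first_allowed_cpu` are made up for illustration, and the non-Windows branch (Linux `sched_setaffinity`) is an assumption added for pthreads users; it is not something the replies above cover.

```c
#define _GNU_SOURCE          /* exposes CPU_SET and sched_setaffinity on glibc */
#include <assert.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <sched.h>
#endif

/* Hypothetical helper: pins the calling thread to logical processor
   `cpu`. Returns 0 on success and -1 on failure. */
static int pin_current_thread(int cpu)
{
    assert(cpu >= 0);
#ifdef _WIN32
    DWORD_PTR mask = (DWORD_PTR)1 << cpu;
    /* SetThreadAffinityMask returns the previous mask, or 0 on failure. */
    return SetThreadAffinityMask(GetCurrentThread(), mask) != 0 ? 0 : -1;
#else
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 means "the calling thread", so this works inside a pthread. */
    return sched_setaffinity(0, sizeof(set), &set);
#endif
}

/* Hypothetical helper: lowest-numbered logical processor in the current
   affinity mask (thread mask on Linux, process mask on Windows), or -1. */
static int first_allowed_cpu(void)
{
#ifdef _WIN32
    DWORD_PTR proc_mask, sys_mask;
    if (!GetProcessAffinityMask(GetCurrentProcess(), &proc_mask, &sys_mask))
        return -1;
    for (int i = 0; i < (int)(8 * sizeof(proc_mask)); i++)
        if (proc_mask & ((DWORD_PTR)1 << i))
            return i;
    return -1;
#else
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int i = 0; i < CPU_SETSIZE; i++)
        if (CPU_ISSET(i, &set))
            return i;
    return -1;
#endif
}
```

With 4 threads on 4 logical processors, each thread would call `pin_current_thread()` with its own index at startup, assuming all four indices are in the process's allowed mask.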