Hi all,
I have a multi-core processor (Core 2 Duo) which has 4 logical processors. I would like to switch the processor on which my code runs. For example:
; Assuming the code is running on CPU-0
mov eax, ecx
; I want to switch to CPU-1 here (I don't know how to go about it)
mov ebx, eax
; Switch back to CPU-0 here
mov ecx, eax
Note: The above is just sample code. My intention is to learn how to switch between CPUs and how to set the affinity of code to a specific CPU.
Help appreciated.
Regards
Gupta
and you desire for it to migrate from one CPU to a different CPU (hardware thread)
This is an operating system request on most O/Ss.
In Windows, the function is SetThreadAffinityMask.
In Linux, the function is pthread_setaffinity_np (the "_np" suffix indicates that it is non-portable and may or may not be supported on your operating system).
The affinity mask is a bit mask in which each bit position corresponds to a logical processor number (hardware thread number) on your system. There are additional function calls with somewhat different behaviors (search the documentation for "affinity").
The two function calls above restrict a software thread to run only on the logical processors whose bits are set, provided the process (program) has permission to run on those processors (implicitly the list of logical processors available to the system, or a subset thereof).
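mov eax, ecx
; --- move to the logical processor(s) selected by bitMask ---
push eax                     ; eax, ecx and edx are not preserved across the call
push ecx
; push any other registers that need saving
push dword [bitMask]         ; second argument (affinity mask) is pushed first
push dword [hThread]         ; first argument (thread handle) is pushed last
call SetThreadAffinityMask   ; stdcall: the callee removes both arguments
; the return value in eax (previous mask, or 0 on failure) is discarded below
; pop any other registers that were saved
pop ecx
pop eax
mov ebx, eax
; --- move to the logical processor(s) selected by otherbitMask ---
push eax
push ecx
; push any other registers that need saving
push dword [otherbitMask]
push dword [hThread]
call SetThreadAffinityMask
; pop any other registers that were saved
pop ecx
pop eax
mov ecx, eax
The cost of a call to SetThreadAffinityMask can be thousands of clock ticks, so migrating a thread is only worthwhile when the migration really matters to the application.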
Jim Dempsey
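For readers on Linux, the same idea can be sketched in C with pthread_setaffinity_np. This is only a minimal sketch, assuming glibc on Linux; the helper names first_allowed_cpu, pin_current_thread and migrate_and_report are made up for illustration and are not part of any library.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: lowest-numbered logical processor this thread is
   currently allowed to run on, or -1 if the mask cannot be read. */
int first_allowed_cpu(void)
{
    cpu_set_t allowed;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &allowed))
            return cpu;
    return -1;
}

/* Hypothetical helper: restrict the calling thread to one logical processor.
   Returns 0 on success or an errno-style code (for example EINVAL when the
   requested processor does not exist or is not permitted). */
int pin_current_thread(int cpu)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);   /* out-of-range values leave the mask empty */
    return pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);
}

/* Hypothetical helper: pin to `cpu`, then ask the kernel where the thread
   is running now. Returns the processor number, or -1 on failure. */
int migrate_and_report(int cpu)
{
    if (pin_current_thread(cpu) != 0)
        return -1;
    return sched_getcpu();
}
```

Calling pin_current_thread(1) plays the role of SetThreadAffinityMask with only bit 1 set in the mask. The same caveat about cost applies: each call is a system call and may trigger a migration.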
[bash]
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, size;

    MPI::Init(argc, argv);
    size = MPI::COMM_WORLD.Get_size();   // number of processes in the job
    rank = MPI::COMM_WORLD.Get_rank();   // index of this process
    if (rank == 0) {
        // Run code (executed by rank 0 only)
    }
    MPI::Finalize();
    return 0;
}
[/bash]
You can automatically switch rank using a counter.
In the above code example, you pushed some registers onto the stack on CPU-0. Assume I have 4 logical CPUs and wish to switch to CPU-1. Will CPU-0 and CPU-1 have the same stack segment? If they are different, how does the OS handle the CPU switch for a thread? Does it copy the stack frame for that thread from CPU-0 to CPU-1?
If SS is the same for CPU-1 and CPU-0, wouldn't the stack contents get corrupted, because CPU-1 and CPU-0 can overwrite each other's data on the shared stack? In my view, that must be handled by providing a different stack for each logical CPU. Please correct me if I am wrong.
I'm looking for a way to run some tests with the CPUID instruction on every logical CPU in the system, but the CPU that runs my program is selected "randomly" by the scheduler. Please help.
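One possible approach, sketched here under the assumption of Linux on x86 with GCC or Clang, is to pin the thread to each logical CPU in turn and execute CPUID while pinned. The function names apic_id_on_cpu and query_all_cpus are invented for this sketch; __get_cpuid comes from the compiler's cpuid.h header, and leaf 1 is used only as an example because it returns the initial APIC ID in EBX bits 31..24.

```c
#define _GNU_SOURCE
#include <cpuid.h>   /* compiler-provided wrapper for CPUID (x86 only) */
#include <sched.h>
#include <stdio.h>

/* Hypothetical helper: pin the calling thread to `cpu`, execute CPUID
   leaf 1 there, and return the initial APIC ID (EBX bits 31..24).
   Returns -1 if pinning fails or the leaf is not supported. */
int apic_id_on_cpu(int cpu)
{
    cpu_set_t mask;
    unsigned int eax, ebx, ecx, edx;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return -1;
    return (int)(ebx >> 24);
}

/* Hypothetical helper: visit every logical processor the thread was allowed
   to use on entry, print its APIC ID, then restore the original affinity.
   Returns the number of processors queried, or -1 on error. */
int query_all_cpus(void)
{
    cpu_set_t original;
    int visited = 0;

    if (sched_getaffinity(0, sizeof(original), &original) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (!CPU_ISSET(cpu, &original))
            continue;
        int id = apic_id_on_cpu(cpu);
        if (id >= 0) {
            printf("logical CPU %d: initial APIC ID %d\n", cpu, id);
            visited++;
        }
    }
    /* Put the affinity mask back the way we found it. */
    sched_setaffinity(0, sizeof(original), &original);
    return visited;
}
```

On Windows the same loop could be built around SetThreadAffinityMask; the CPUID part would stay the same.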
In an SMP system, a software thread can run on any logical processor, but may be restricted to a selected subset of logical processors (SetThreadAffinityMask). In other words, the O/S is free to migrate the thread from logical processor to logical processor as it sees fit, subject to the affinity mask restrictions. In the sample code I posted earlier, the application changed its own affinity mask restrictions (to move from one logical processor to another).
SetThreadAffinityMask will (when required) eventually issue an interrupt, push the thread state (registers), then issue an inter-processor interrupt to the target processor (or wait for the scheduler to perform the context switch lazily), and then switch the first processor to another available thread. The second processor, upon receiving the inter-processor interrupt, saves the context of whatever thread it was running, enters the scheduler, and then resumes your thread.
Whether this occurs immediately or is deferred depends on the O/S scheduler.
When your thread (eventually) resumes on the alternate logical processor, it resumes with a copy of the same stack pointer and pops the saved context off the stack. The stack itself lives in memory that all logical processors share, so nothing has to be copied, and each thread keeps its own stack wherever it runs.
While there are circumstances in which you will want to migrate a single thread from logical processor to logical processor, it is often more advantageous to use multiple threads within the same application. You may need to add code to coordinate the threads to avoid conflicts in memory access.
I suggest you begin looking at multi-threading using OpenMP or Cilk++, then as needs arise, look at Threading Building Blocks (TBB) or QuickThread. Start with simple concepts, then work up to more complicated capabilities. There are several basic concepts you will need to become aware of, and starting with the simple capabilities will help you get used to them.
Jim Dempsey
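The multi-threaded alternative mentioned above can be sketched with plain POSIX threads. This is an illustration only, assuming Linux with glibc: one worker is started per allowed logical processor, each worker pins itself, and a mutex coordinates access to a shared counter. The names worker_main and run_workers, and the cap of 64 workers, are arbitrary choices for the sketch.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

#define MAX_WORKERS 64   /* arbitrary cap chosen for this sketch */

struct worker {
    int cpu;             /* logical processor this worker should run on */
    long iterations;     /* number of increments to perform */
};

static long shared_total;
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker_main(void *arg)
{
    const struct worker *w = arg;
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(w->cpu, &mask);
    /* Best effort: if pinning fails, the worker still runs, just unpinned. */
    pthread_setaffinity_np(pthread_self(), sizeof(mask), &mask);

    for (long i = 0; i < w->iterations; i++) {
        pthread_mutex_lock(&total_lock);     /* avoid conflicting updates */
        shared_total++;
        pthread_mutex_unlock(&total_lock);
    }
    return NULL;
}

/* Start one worker per allowed logical processor, wait for all of them,
   and return the final counter value (-1 on error). The number of workers
   actually started is stored in *worker_count. */
long run_workers(long iterations_each, int *worker_count)
{
    cpu_set_t allowed;
    pthread_t threads[MAX_WORKERS];
    struct worker workers[MAX_WORKERS];
    int count = 0;

    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
        return -1;
    shared_total = 0;
    for (int cpu = 0; cpu < CPU_SETSIZE && count < MAX_WORKERS; cpu++) {
        if (!CPU_ISSET(cpu, &allowed))
            continue;
        workers[count].cpu = cpu;
        workers[count].iterations = iterations_each;
        if (pthread_create(&threads[count], NULL, worker_main,
                           &workers[count]) != 0)
            break;
        count++;
    }
    for (int i = 0; i < count; i++)
        pthread_join(threads[i], NULL);
    *worker_count = count;
    return shared_total;
}
```

Taking a lock for every increment is deliberately naive. It keeps the sketch correct, but real code would usually accumulate per-thread results and combine them once at the end.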
Let's assume the following code is running in kernel mode (ring 0):
void MySwitchFunction(void)
{
    int a, b, c;
    a = 100;
    b = 200;
    c = 300;
    // At this point the code is running on CPU-0's stack, and a/b/c are placed on that stack
    SwitchCPU(1); // This call saves the context (I don't know what context there is besides thread registers such as eax, ebx, etc.)
    // c must now be on CPU-1's stack (how does the OS handle this?)
    c = 100; // for example, mov 4[ebp], 100
    // Switch back to CPU-0
    SwitchCPU(0);
    OutputContents(a, b, c); // What should c be at this point?
}
When a thread is switched from CPU-0 to CPU-1, does the OS save/restore the whole stack frame for that thread?
Any context-switching source-code example would help.
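A user-mode analogue of MySwitchFunction can be tried directly. The sketch below assumes Linux; switch_cpu is a hypothetical stand-in for the SwitchCPU call in the question, built on sched_setaffinity. Because the stack belongs to the thread and lives in ordinary memory rather than inside a particular CPU, the locals keep their values across the migration, and c ends up as 100.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Hypothetical stand-in for SwitchCPU(): ask the kernel to run the calling
   thread only on `cpu`. Returns 0 on success, -1 on failure (for example
   when that processor does not exist). */
int switch_cpu(int cpu)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);
}

/* Mirrors the pseudocode from the question: set three locals, hop to
   another processor, modify one local, hop back, and return c if a and b
   are still intact (-1 otherwise). */
int locals_survive_migration(int first_cpu, int second_cpu)
{
    /* volatile keeps the variables in memory on this thread's stack */
    volatile int a = 100, b = 200, c = 300;

    switch_cpu(second_cpu);   /* same thread, same stack, other processor */
    c = 100;
    switch_cpu(first_cpu);    /* and back again */

    return (a == 100 && b == 200) ? c : -1;
}
```

If the requested processor is not available, switch_cpu simply fails and the thread stays where it is; the result is the same either way, which is the point of the example.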
I suggest you consult:
IA-32 Intel Architecture Software Developer's Manual, Volume 3: System Programming Guide
for information relating to this subject.
Jim Dempsey
I have, however, figured out how the OS (Linux) handles context switching between different threads. Each task (thread) is allocated a 4 KB/8 KB kernel (ring 0) stack at creation time. Switching between threads is simply a matter of changing ESP0 in the TSS (task state segment) to the top of the incoming thread's kernel stack. There is more to task switching, but the above outlines the main concept behind Linux context switching.
While going through the Intel documentation, I found that Pentium 4 and later processors provide a task-switch capability implemented in hardware. Because a hardware task switch doesn't save all CPU registers, OSes (Linux, Windows) rely on software logic for task switching.
I sort of got confused as to which way to proceed, software vs. hardware. One downside (please correct me if I am wrong) of going the hardware way is the limit of 8192 entries in the GDT, which means that fewer than 8192 TSS descriptors can be defined (a TSS descriptor must reside in the GDT, as per the Intel documentation). If I have more tasks than that, surely the OS will have to change the GDT all the time to support them. I don't know why the TSS descriptor has to be in the GDT. The Intel chip designers must surely have had a very strong reason.
Well, I thank Jim and everybody in the group for clarifying some of my doubts.
Regards
Gupta
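The "one stack per task, switch by changing the stack pointer" idea described above can be imitated in user space with the POSIX ucontext functions (getcontext, makecontext, swapcontext). This is only an analogy under that assumption, not how the Linux kernel is implemented: each task gets its own malloc'ed stack, and swapcontext saves the current registers and stack pointer and loads the other task's. The names task_body and run_tasks and the stack size are choices made for this sketch.

```c
#include <stdlib.h>
#include <ucontext.h>

#define TASK_COUNT      2
#define STEPS_PER_TASK  2
#define TASK_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

static ucontext_t scheduler_ctx;          /* the code that dispatches tasks */
static ucontext_t task_ctx[TASK_COUNT];   /* saved registers of each task */
static int *trace;                        /* records the order of execution */
static int trace_len, trace_cap;

static void task_body(int id)
{
    int local = id * 10;   /* lives on this task's private stack */

    for (int step = 0; step < STEPS_PER_TASK; step++) {
        if (trace_len < trace_cap)
            trace[trace_len++] = local + step;
        swapcontext(&task_ctx[id], &scheduler_ctx);   /* yield */
    }
}

/* Run the tasks round-robin, recording what each one did into trace_out.
   Returns the number of entries recorded, or -1 on allocation failure. */
int run_tasks(int *trace_out, int capacity)
{
    void *stacks[TASK_COUNT] = { NULL };

    trace = trace_out;
    trace_len = 0;
    trace_cap = capacity;

    for (int id = 0; id < TASK_COUNT; id++) {
        stacks[id] = malloc(TASK_STACK_SIZE);
        if (stacks[id] == NULL || getcontext(&task_ctx[id]) != 0) {
            for (int i = 0; i <= id; i++)
                free(stacks[i]);
            return -1;
        }
        task_ctx[id].uc_stack.ss_sp = stacks[id];
        task_ctx[id].uc_stack.ss_size = TASK_STACK_SIZE;
        task_ctx[id].uc_link = &scheduler_ctx;   /* resume here on return */
        makecontext(&task_ctx[id], (void (*)(void))task_body, 1, id);
    }

    /* One extra round lets each task leave its loop and return. */
    for (int round = 0; round <= STEPS_PER_TASK; round++)
        for (int id = 0; id < TASK_COUNT; id++)
            swapcontext(&scheduler_ctx, &task_ctx[id]);

    for (int id = 0; id < TASK_COUNT; id++)
        free(stacks[id]);
    return trace_len;
}
```

With two tasks and two steps each, the recorded trace interleaves the tasks (0, 10, 1, 11), and each task's local variable keeps its value between switches because it sits on that task's own stack.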
"Task" is a relative term.
To the O/S, a task could mean a "process", and that process may contain multiple software threads (process threads, each running a "task" within the process).
Alternatively, the O/S need not (frequently) use the hardware "task" system to manage processes and/or software tasks.
Therefore the required number of TSS descriptors in the GDT could potentially be reduced to a working set (approximately) equal to the number of hardware threads supported by the CPU (typically 2, 4, 6, 8, 12, 16). The O/S also has the possibility of keeping a separate GDT per CPU (processor chip).
Jim Dempsey
