Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Circumventing inherited process affinity

capens__nicolas
New Contributor I
990 Views
Hi all,

I'm creating a multi-threaded Windows DLL, and face a problem with process affinity. Certain applications which are not multi-core safe set the affinity mask to just one core, but this means that my DLL also only runs on just one core.

So does anyone know how to circumvent this and have my DLL use all cores while the application itself is restricted to one core? Note that I do not have control over the application code.

Thanks for any ideas,

Nicolas
0 Kudos
15 Replies
gaston-hillar
Valued Contributor I
990 Views
Quoting - c0d1f1ed
Hi all,

I'm creating a multi-threaded Windows DLL, and face a problem with process affinity. Certain applications which are not multi-core safe set the affinity mask to just one core, but this means that my DLL also only runs on just one core.

So does anyone know how to circumvent this and have my DLL use all cores while the application itself is restricted to one core? Note that I do not have control over the application code.

Thanks for any ideas,

Nicolas

Hi Nicolas,

What kind of applications are using your DLL? It sounds really strange. Most applications do not modify their process affinity.
0 Kudos
capens__nicolas
New Contributor I
990 Views
Quoting - Gastón C. Hillar
What kind of applications are using your DLL? It sounds really strange. Most applications do not modify their process affinity.
Mainly games. Even some triple-A titles set process affinity to just one core to 'fix' dodgy multi-threading. But this prevents any DLL it uses from using any other core as well.

Do you know any way to circumvent this? Thanks.
0 Kudos
gaston-hillar
Valued Contributor I
991 Views
Quoting - c0d1f1ed
Mainly games. Even some triple-A titles set process affinity to just one core to 'fix' dodgy multi-threading. But this prevents any DLL it uses from using any other core as well.

Do you know any way to circumvent this? Thanks.

Hi Nicolas,

I see... You can get more help from other great experts in this forum.
One possible solution is to use named pipes to create an IPC (inter-process communication) channel. However, this is much slower than calling DLL functions directly.

If you're working with C#, there is a nice article about how to use IPC and named pipes, written by Danny Barack, here: http://www.codeproject.com/KB/DLL/MultiProcess.aspx

Using named pipes, you can run the work in an independent process, which won't inherit the affinity mask. However, your clients will have to change their code.

Another solution: the DLL can start an independent process and wrap the IPC calls to that independent process. This will be transparent to the games. However, you'll face a performance cost whose size depends on how often your functions are called.
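As an illustration only (not Windows code; the names "worker" and "RemoteWrapper" are invented), the second approach can be sketched with Python's multiprocessing module. A real DLL would spawn the helper with CreateProcess and talk to it over a named pipe, but the shape is the same:

```python
# Hypothetical sketch: the "DLL" side starts a helper process once and
# forwards each call over a pipe, so callers never see the extra process.
import multiprocessing as mp

def worker(conn):
    # Helper process: receive requests, do the real work, send replies.
    while True:
        msg = conn.recv()
        if msg is None:            # shutdown sentinel
            break
        conn.send(msg * 2)         # stand-in for the DLL's real computation

class RemoteWrapper:
    """Facade that is transparent to the caller: compute() looks like a
    plain function call but actually runs in the helper process, which
    is free to use every core."""
    def __init__(self):
        self._conn, child_conn = mp.Pipe()
        self._proc = mp.Process(target=worker, args=(child_conn,))
        self._proc.start()

    def compute(self, x):
        self._conn.send(x)         # one round trip per call: this is the
        return self._conn.recv()   # performance cost mentioned above

    def close(self):
        self._conn.send(None)
        self._proc.join()
```

Usage is just w = RemoteWrapper(); w.compute(21) returns 42; w.close(). Every call pays a pipe round trip, which is why the call frequency matters so much.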

0 Kudos
capens__nicolas
New Contributor I
990 Views
Quoting - Gastón C. Hillar
Another solution: the DLL can start an independent process and wrap the IPC calls to that independent process. This will be transparent to the games. However, you'll face a performance cost whose size depends on how often your functions are called.


Hi Gastón,

Thanks for the suggestion. I was hoping for a simpler solution with no performance impact, but it looks like I'll have to bite the bullet and find the best compromise if I really want this feature.

I assume that creating shared memory is the fastest approach to inter-process communication, but it also leaves synchronization and the like entirely up to me to implement?

Cheers,

Nicolas
0 Kudos
gaston-hillar
Valued Contributor I
990 Views
Quoting - c0d1f1ed

Hi Gastón,

Thanks for the suggestion. I was hoping for a simpler solution with no performance impact, but it looks like I'll have to bite the bullet and find the best compromise if I really want this feature.

I assume that creating shared memory is the fastest approach to inter-process communication, but it also leaves synchronization and the like entirely up to me to implement?

Cheers,

Nicolas

Hi Nicolas,

Shared memory should work faster.

This is a very old article, but it briefly explains the most important principles of shared memory IPC:
http://www.codeproject.com/KB/threads/sharedmemipc.aspx?display=PrintAll&fid=154&df=90&mpp=25&noise=3&sort=Position&view=Quick&fr=26&select=572188
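Purely as an illustration of the principle (using Python's multiprocessing.shared_memory here rather than the Win32 file-mapping calls a C++ implementation would typically use): one side creates a named block, the other attaches to it by name, and both see the same bytes.

```python
# Minimal shared-memory sketch. On Win32 the equivalent steps would be
# CreateFileMapping/MapViewOfFile; the principle is identical.
from multiprocessing import shared_memory

# "Server" side: create a named block and write into it.
shm = shared_memory.SharedMemory(create=True, size=64)
shm.buf[:5] = b"hello"

# "Client" side (normally a different process): attach by name and read.
view = shared_memory.SharedMemory(name=shm.name)
data = bytes(view.buf[:5])

# Cleanup: detach both views, then destroy the block.
view.close()
shm.close()
shm.unlink()
```

Note that nothing here synchronizes the two sides; as discussed in this thread, that part is left entirely to you.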
0 Kudos
Chris_M__Thomasson
New Contributor I
990 Views
Quoting - Gastón C. Hillar

Hi Nicolas,

Shared memory should work faster.

This is a very old article, but it briefly explains the most important principles of shared memory IPC:
http://www.codeproject.com/KB/threads/sharedmemipc.aspx?display=PrintAll&fid=154&df=90&mpp=25&noise=3&sort=Position&view=Quick&fr=26&select=572188


AFAICT, the article does not mention anything about robust data recovery techniques, which is, IMHO, _easily_ the most _important_ aspect of shared memory programming. What will happen to the shared data if a "client" process happens to die while it holds the mutex? The code in the article does not even reference WAIT_ABANDONED! The code has a _major_ flaw in that respect.

What a shame.

;^(...
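To make the failure mode concrete, here is a tiny single-process simulation (invented names; a plain "dirty" flag stands in for the abandoned-mutex signal, which is WAIT_ABANDONED on Win32 and EOWNERDEAD for POSIX robust mutexes): the owner marks the shared data dirty while mutating it, so the next owner can detect a mid-update death and repair.

```python
# Simulation of the abandoned-mutex repair pattern: mark state "dirty"
# before mutating it and "clean" after, keeping a checkpoint to roll
# back to if the owner dies mid-update.
class SharedState:
    def __init__(self):
        self.dirty = False   # set while an update is in flight
        self.value = 0       # the critical data
        self.backup = 0      # checkpoint used for repair

    def update(self, new_value, crash=False):
        self.backup = self.value
        self.dirty = True            # analogous to acquiring the mutex
        self.value = new_value       # ... mutation in progress ...
        if crash:
            return                   # simulate the owner dying mid-update
        self.dirty = False           # clean hand-off, a normal release

    def acquire(self):
        # In Win32 this is where WaitForSingleObject returns WAIT_ABANDONED;
        # with POSIX robust mutexes, pthread_mutex_lock returns EOWNERDEAD.
        if self.dirty:
            self.value = self.backup # roll back to the checkpoint
            self.dirty = False
        return self.value
```

After s.update(1) and then s.update(99, crash=True), the next s.acquire() detects the abandonment and returns the repaired value 1 instead of the torn 99.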
0 Kudos
gaston-hillar
Valued Contributor I
990 Views


AFAICT, the article does not mention anything about robust data recovery techniques, which is, IMHO, _easily_ the most _important_ aspect of shared memory programming. What will happen to the shared data if a "client" process happens to die while it holds the mutex? The code in the article does not even reference WAIT_ABANDONED! The code has a _major_ flaw in that respect.

What a shame.

;^(...

Hi Chris,

You're right. I should have said it was an introductory and old article on the topic.
I think that nobody should build a shared memory IPC just from reading one article; that's the main point. An article can only serve as a first step. Most articles and blogs show just a small part of bigger problems, and any serious developer should know that.
However, I didn't use the right words when linking to the article.
0 Kudos
levicki
Valued Contributor I
990 Views
Quoting - c0d1f1ed
Hi all,

I'm creating a multi-threaded Windows DLL, and face a problem with process affinity. Certain applications which are not multi-core safe set the affinity mask to just one core, but this means that my DLL also only runs on just one core.

So does anyone know how to circumvent this and have my DLL use all cores while the application itself is restricted to one core? Note that I do not have control over the application code.

Thanks for any ideas,

Nicolas

I presume that if you change the process affinity from your thread, you will change it for the whole application, which is not safe. The only reasonable solution would be to create a process which isn't affinity-restricted and schedule/queue the work using some sort of messaging. Async I/O and event signaling come to mind.

0 Kudos
Chris_M__Thomasson
New Contributor I
990 Views
Quoting - Gastón C. Hillar

Hi Chris,

You're right. I should have said it was an introductory and old article on the topic.
I think that nobody should build a shared memory IPC just from reading one article; that's the main point. An article can only serve as a first step. Most articles and blogs show just a small part of bigger problems, and any serious developer should know that.
However, I didn't use the right words when linking to the article.

Sorry for coming across so harshly. I think I should post a brief article on the subject. IMVHO, if you're programming __shared_memory__, and you actually _care_ about __robustness__ properties/fault-tolerance in general, go ahead and think about how an _unfortunate_ "process death" could possibly __influence_other_concurrent/subsequent_observers__ of said memory...
0 Kudos
jimdempseyatthecove
Honored Contributor III
990 Views

Chris,

I think you need to make a distinction between

Inter-process shared memory (between different processes)
Intra-process shared memory (between different threads within a single process)

Process affinity is usually controlled on a large system as a measure of load balancing. This is usually done by the system administrator (by way of a software tool). Usually, but not always, the affinities, once assigned to the process, remain with the process and are not exclusive to the process. Affinities may change for the process under unusual circumstances (e.g. shutdown/startup of a hot-swap processor card). User applications do not control the process affinity restrictions; the sysadmin does (or the O/S will, when signaled with a fault).

Thread affinity control, within the process affinity set, is in the domain of the application.

Should a thread crash within a process (and not bring down the process), then the process itself could have defensive code written to account for this (kill, clean up, restart the thread, ...).

Should a process crash (a thread within the process kills all threads within the process), then unless you have a separate control process monitoring progress, the process is dead for good. When a monitoring process is used, the monitored process is usually written to provide checkpointing for restarts.

Jim Dempsey



0 Kudos
Chris_M__Thomasson
New Contributor I
990 Views

Chris,

I think you need to make a distinction between

Inter-process shared memory (between different processes)
Intra-process shared memory (between different threads within a single process)

Process affinity is usually controlled on a large system as a measure of load balancing. This is usually done by the system administrator (by way of a software tool). Usually, but not always, the affinities, once assigned to the process, remain with the process and are not exclusive to the process. Affinities may change for the process under unusual circumstances (e.g. shutdown/startup of a hot-swap processor card). User applications do not control the process affinity restrictions; the sysadmin does (or the O/S will, when signaled with a fault).

Thread affinity control, within the process affinity set, is in the domain of the application.

Should a thread crash within a process (and not bring down the process), then the process itself could have defensive code written to account for this (kill, clean up, restart the thread, ...).

Should a process crash (a thread within the process kills all threads within the process), then unless you have a separate control process monitoring progress, the process is dead for good. When a monitoring process is used, the monitored process is usually written to provide checkpointing for restarts.

Jim Dempsey





Sorry about that... I was referring to inter-process shared memory. More specifically: how can one repair critical data structures when a process dies while it holds the mutex protecting said structure? The next process that comes along and acquires the mutex will get `WAIT_ABANDONED' in Windows, or `EOWNERDEAD' with POSIX robust mutexes. There are many techniques; however, I do think they are off-topic wrt this specific thread.
0 Kudos
jimdempseyatthecove
Honored Contributor III
990 Views

The "repair" of the critical structures can be made several ways:

a) roll back to a checkpoint
b) avoid use of a mutex and write the code in a wait-free manner
c) use a non-traditional mutex where the mutex contains an exception handler to be run by any process other than the process that died.

Note, in a), the checkpoint need not be a rollback of the entire (collection of) processes, e.g. transaction processing with exception handling.

The repair information would have to reside in a place where reclamation of the dying process's memory by the O/S does not affect the repair capability (in the shared memory, on disk, in multiple/all processes, ...).

In all cases, process affinity (or circumventing of inherited process affinity) would have no effect on the repairability of the critical data.

In the case where a process dies (in a multi-process application), the recovery routine (run by one of the other processes or a daemon) would start a replacement process (with restart information).

If you want to keep an extra copy of the process loaded (hot standby), then I would suggest that the process not be affinity-bound at startup. Then, on normal startup, have the process itself perform the affinity selection. When running as a hot standby, defer setting affinity, wait for recovery directions, then set affinity accordingly.

You might consider reading papers relating to systems with hot-swappable processors. Affinity-restricted threads and processes have to be (or are strongly advised to be) capable of being evicted from a desired hardware thread (node, processor, core, thread), as any given processor could be shut down (usually with notice).

Jim
0 Kudos
gaston-hillar
Valued Contributor I
990 Views

Sorry for coming across so harshly. I think I should of post a brief article on the subject.IMVHO, if your programming __shared_memory__, and you actually _care_ about __robustness__ properties/fault-tolerance in general, go ahead and think about how an _unfortunate_ "process death" could possibly __influence_other_concurrent/subsequent_observers__ of said memory...

Hi Chris,

I think that the replies added by you, Jim and Igor are very clear. I do believe Nicolas should do some research about the performance goals and the desired fault tolerance he is looking for.
I thought he was just looking for some ideas about how to solve the affinity problem. I think there should be another post discussing alternatives for safe or fault-tolerant IPC.
However, IMVHO, if Nicolas is looking for performance, using IPC is not the way to go. I'd rather convince developers not to change process affinity. It is really incredible that some developers change processor affinity instead of learning how to develop for multi-core. I'm not talking about Nicolas; I'm talking about the developers using Nicolas' DLLs. Do you really believe that someone would change a process' affinity to a single core to avoid learning multi-core programming? Ahhh. Too lazy developers... :)
0 Kudos
capens__nicolas
New Contributor I
990 Views
Quoting - Gastón C. Hillar
I thought he was just looking for some ideas about how to solve the affinity problem.

Exactly. I vaguely knew that creating a separate process was a solution, but I was really hoping for a simpler one. You quickly made it clear that one doesn't exist. And I agree it's probably not worth the effort to go with IPC; instead, application developers should be made more aware of the consequences of setting process affinity.

Thanks all for the discussion about IPC safety and performance, though. It will be really useful for the situations where I absolutely must create a separate process and communicate with it.
0 Kudos
Chris_M__Thomasson
New Contributor I
990 Views

The "repair" of the critical structures can be made several ways:

a) roll back to a checkpoint
b) avoid use of a mutex and write the code in a wait-free manner
c) use a non-traditional mutex where the mutex contains an exception handler to be run by any process other than the process that died.

Note, in a), the checkpoint need not be a rollback of the entire (collection of) processes, e.g. transaction processing with exception handling.

The repair information would have to reside in a place where reclamation of the dying process's memory by the O/S does not affect the repair capability (in the shared memory, on disk, in multiple/all processes, ...).

In all cases, process affinity (or circumventing of inherited process affinity) would have no effect on the repairability of the critical data. [...]

Yes; those techniques do work well. I am quite fond of using an efficient shared-memory-based message-passing algorithm and completely decoupling client processes from any critical shared state. If you're interested in a brief outline of a working design, I could start another thread to discuss it.
0 Kudos
Reply