Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Multi-threaded app not using all available cores

mtraudt
We have a multi-threaded app written in C# running on Windows. On most PCs, we see that the app is able to use all the cores on the PC effectively. However, we have a customer running the app on a PC with Windows XP and an Intel Xeon "Conroe" CPU, and even though the specs say this is a dual-core CPU, the app seems to use only one core. In fact, even if we run two instances of the app simultaneously, it seems that both are running on one core. Could this be due to OS configuration?

We are using the .NET Threading APIs to perform calculations on multiple concurrent threads, and in our development environment we see all available cores being used.

Also, Task Manager reports four processors for this PC. How is it that a dual-core CPU can show four processors?

gaston-hillar
A dual-core CPU can show four processors when it combines Hyper-Threading with multiple cores: each physical core then appears as two logical processors. For example, the new Core i7 CPUs combine Hyper-Threading with multicore, so they double the number of logical cores. You can use the CPU-Z software to check the number of logical and physical cores.

You can also use Process Explorer and Intel Concurrency Checker 2.1. I've recently uploaded some videos to YouTube about these tools; you can check my channel: www.youtube.com/gastonhillar2009

You can also check the Process affinity settings. I've uploaded a video about that (you should see 4 CPUs there and they should be checked for that process).
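
The same check can also be made from inside the application. The following is only a minimal sketch, assuming a plain console program; the class name AffinityCheck and the helper CountProcessors are made up for illustration. It prints the number of logical processors the runtime sees and the process affinity mask, where each set bit is one logical processor the process is allowed to run on.

```csharp
using System;
using System.Diagnostics;

static class AffinityCheck
{
    // Counts the set bits in an affinity mask; each bit is one logical processor.
    // (Hypothetical helper, not part of any library.)
    public static int CountProcessors(long mask)
    {
        int count = 0;
        for (ulong m = (ulong)mask; m != 0; m >>= 1)
            count += (int)(m & 1);
        return count;
    }

    static void Main()
    {
        // Logical processors visible to the runtime (Hyper-Threading siblings included).
        Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);

        // Bit mask of the processors this process may be scheduled on.
        long mask = Process.GetCurrentProcess().ProcessorAffinity.ToInt64();
        Console.WriteLine("Process affinity mask: 0x{0:X} ({1} processors allowed)",
                          mask, CountProcessors(mask));
    }
}
```

Note that this reports the affinity of the whole process only. An individual thread can carry a narrower mask than its process, so a full process mask does not by itself prove that every thread is free to move between cores.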

Cheers,

Gastón Hillar

mtraudt
Thanks for the response. You are correct that hyper-threading is enabled, but this does not seem to be relevant to the problem we are seeing. To replicate, we ran our application on a PC running Vista, and then ran it again on the same PC but with XP. With Vista, we are seeing all cores being used, but with XP we are only seeing one core being used, no matter what. We get the same results whether HT is enabled or not.

The application is built using .NET 3.0 and the threading is done via standard .NET Threading API.

Any ideas as to what might explain this difference in behavior?

gaston-hillar
I would use Process Explorer to see how XP is running the threads. Process Explorer is a free tool. http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx

I would use Process Explorer in both XP and Vista and I'd compare results. Maybe something is wrong with XP configuration.

gaston-hillar
I had forgotten.

You can also test installing this patch in Windows XP: http://support.microsoft.com/kb/896256

Sometimes it helps, especially on certain dual-core CPUs.

Cheers,

Gastón

robert-reed
Does Task Manager report four processors on both Vista and XP? The Microsoft knowledge base article to which Gastón refers is about performance limitations that need additional patches once SP2 is installed. Does your customer have SP2 installed on their XP system? If Task Manager can see the processors then I don't see any reason why a multi-threaded .NET application should not see them all as well.

One other small clarification I might offer: "Conroe" was the internal project name for the first Intel® Core™2 Duo processor, which did not have Intel Hyper-Threading Technology; that technology first reappears in the Intel Core i7 processor.

mtraudt
Just to clarify and add further details after some more investigation:

Two customers have reported this behavior, one using a dual-core PC running XP, and another with a dual-core server running Windows Server 2003. I do not have the exact chip specs and OS patch levels handy, but could get them if necessary. The problem is not related to hyper-threading (we see the same problem whether HT is enabled or not) and so far does not appear to be specific to a particular CPU type.

We can reproduce the problem internally on a dual-core (Presler) PC running XP SP2. We disabled HT, just to eliminate that variable. If we then run on the same PC but with Vista (by swapping out the hard drive containing the OS) then we see both cores being used. Applying the patch in the KB article did not help. We ran Process Explorer and the "affinity" settings showed both cores enabled; however, the performance statistics show only one core being used (consistent with Task Manager).

In case it's relevant, the application does a large amount of floating-point calculations.

For me the strangest part is that if we run two instances of the app, they both seem to run on the same core.

gaston-hillar
No doubt you are facing a "software" problem. The problem is in the operating system. You have to work there.
For example, if you have access to these PCs, you can install the Visual Studio IDE and test the number of cores that it returns and test some for loops and see what's going on there.

robert-reed
This seems very strange. I have several multi-core machines, running both XP and Windows Server 2003, all running at least their respective SP2s, and I don't have any problem seeing all the cores, either in Task Manager or in my Visual Studio .NET work, although all of my VS work is with native code. Mostly when I see performance peaking on one processor or another, not all, it's because of serialization limits in the application itself. However, you report a difference merely by switching from XP (or Server 2003) to Vista, which seems to blow most theories about serialization in the app. Perhaps it's some idiosyncrasy of the CLR side of .NET?

One more sanity check: You say, "the performance statistics show only one core being used (consistent with Task Manager)." Does that mean that Task Manager displays the correct number of processors but you only see activity on one of them? Or do you not even see the correct number of processors displayed in Task Manager?

mtraudt
The correct number of processors (2) is displayed, but only one of them is used according to Task Manager.

We are going to try an experiment with a standalone .NET app that spins off a couple of worker threads and see if we can reproduce the problem that way.

gaston-hillar
Hey, perhaps the problem is that the PC is using an illegally tuned Windows XP... For example, there are many Windows Phenix or Fenix out there, and Windows uE, which are illegal Windows XP versions with modified DLLs and with fewer services running. Their intention is to optimize performance. However, I repeat, they are illegal and really, really horrible. You can get that kind of unexpected behavior in these illegally cracked Windows XP versions. You should check that.

gaston-hillar
Hey,

You can use another free tool that can help you with this problem: PC WIZARD. You can download it from http://www.cpuid.com/pcwizard.php
This excellent hardware detection application includes a multithreading benchmark. If it doesn't take advantage of more than 1 core when running with more than 1 thread, you are facing a horrible version of the operating system or a hacker was working on that computer :)
I think it can help you.

Cheers,

Gastón

mtraudt
We are closer to understanding what is causing the problem. Our worker threads capture per-host CPU statistics using the PerformanceCounter class. It turns out that a side effect of this on XP is that the threads end up bound to one core. The problem is easily reproducible for us with the code sample below. If this program is compiled and run with an argument of False, then all available cores are used, but if run with an argument of True, then only one core is used.

This was tested using .NET 2.0 on several different dual-core PCs running XP (SP2 or SP3).

My theory is that the XP process scheduler will assign all the worker threads initially to one core. Normally the threads would move between cores as context switches occur; however, something about the PerformanceCounter stuff binds the thread permanently to the initial core.

Note that the problem only occurs if the PerformanceCounter logic is called within the worker thread. If we move this to the main thread (which makes more sense anyway) then we use all available cores. So there is an easy workaround, but I wanted to document this in case anybody else comes across it in the future.

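using System;
using System.Threading;
using System.Diagnostics;

namespace ThreadingSample
{
  class Program
  {
    static bool usePerfCounter;

    static void Main(string[] args)
    {
      if (args.Length != 1 || !bool.TryParse(args[0], out usePerfCounter))
      {
        Console.WriteLine("usage: OneOff ");
        return;
      }

      int numThreads = Environment.ProcessorCount;
      Console.WriteLine("Starting {0} Threads.", numThreads);
      // NOTE: the text between this loop header and the inner "y" loop was
      // lost from the posted sample. The thread start-up, the method name
      // Work, and the loop bounds are a reconstruction (an assumption) based
      // on the rest of the discussion.
      for (int x = 0; x < numThreads; x++)
        new Thread(new ThreadStart(Work)).Start();
    }

    static void Work()
    {
      if (usePerfCounter)
      {
        // Query the OS-provided "Processor" category from the worker thread.
        PerformanceCounterCategory cat = new PerformanceCounterCategory("Processor");
        foreach (string instanceName in cat.GetInstanceNames())
          Console.WriteLine("Processor [{0}]", instanceName);
      }

      // Busy floating-point loop so CPU usage is visible in Task Manager.
      while (true)
      {
        double d = 1.0;
        for (int y = 1000000; y >= 0; y--)
          d = d*y;
      }
    }
  }
}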
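
The workaround mentioned above, moving the performance counter query to the main thread, might look roughly like the sketch below. It is an illustration only, not the real application code: the class name MainThreadQuery and the methods Spin and Work are hypothetical, and the finite loop merely stands in for the real floating-point work. It assumes a Windows machine, and on newer .NET versions the PerformanceCounterCategory type may require a separate package.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class MainThreadQuery
{
    // Finite busy loop standing in for the real floating-point work.
    public static double Spin(int iterations)
    {
        double d = 1.0;
        for (int y = 1; y <= iterations; y++)
            d = d * 1.0000001;
        return d;
    }

    // Worker body: pure computation, no performance counter calls.
    static void Work()
    {
        Spin(200000000);
    }

    static void Main()
    {
        // Query the OS-provided "Processor" category once, on the main thread,
        // so that no worker thread ever runs the performance counter code.
        PerformanceCounterCategory cat = new PerformanceCounterCategory("Processor");
        string[] instanceNames = cat.GetInstanceNames();
        foreach (string name in instanceNames)
            Console.WriteLine("Processor [{0}]", name);

        // Start one worker per logical processor and wait for all of them.
        int numThreads = Environment.ProcessorCount;
        Thread[] workers = new Thread[numThreads];
        for (int x = 0; x < numThreads; x++)
        {
            workers[x] = new Thread(Work);
            workers[x].Start();
        }
        foreach (Thread t in workers)
            t.Join();
    }
}
```

If the workers need the instance names, they can read the instanceNames array that the main thread filled in before starting them, rather than querying the category themselves.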

robert-reed
Oooo, Danger Will Robinson! According to Microsoft's documentation:

It is strongly recommended that new performance counter categories be created during the installation of the application, not during the execution of the application. This allows time for the operating system to refresh its list of registered performance counter categories. If the list has not been refreshed, the attempt to use the category will fail.

So the code that you shared, with the PerformanceCounterCategory instance creation in the worker code, appears to be a problem even if the code is correct. I presume that by moving "this" to the main thread, you're talking just about the object creation?

gaston-hillar
Having the code is easier!!!
Hey, the problem is not in the PerformanceCounterCategory instance creation. It isn't OK to create it in a thread; I agree with the other comment. However, the problem is in the following lines:
//foreach (string instanceName in cat.GetInstanceNames())
// Console.WriteLine("Processor [{0}]", instanceName);

If you comment out those lines, you'll see all your cores at 100%. I've checked that on a Core i7 with HT (8 threads) and without HT (4 threads).

You are getting instances and writing to the console in many worker threads. You should update the UI using delegates. You shouldn't update the UI from independent threads.

If you need some code regarding delegates and how to update the UI, you can download the code from my book "C# 2008 and 2005 threaded programming". The examples work with .NET 2.0 and you'll find some patterns to solve your problem. This is the link: http://www.packtpub.com/beginners-guide-for-C-sharp-2008-and-2005-threaded-programming/book, then click Code download.

You'll find some samples about updating the UI from many concurrent threads. Hence, you'll be able to run this application using a delegate. Delegates are complex, so I am giving the link instead, as there is code there that can help you.

Cheers,

Gastón

mtraudt
I am not creating a new PerformanceCounterCategory, I am creating an instance of an object to access the "Processor" category, which is provided by the OS. I believe your quote refers to application-specific categories, which would presumably need to be registered during the install.

mtraudt
This is a very simple example intended only to demonstrate an issue when using PerformanceCounterCategory within a worker thread. The real code does not write to the console and does use delegates to communicate from the worker threads back to the main thread.

robert-reed
Caught me! I guess I've demonstrated that I'm not a .NET programmer. Still, I'm curious about the code snippet you provided. It does appear that you've run into some XP bug fixed in Vista. Beyond that, I'll need to leave you in the hands of someone more expert in this environment.

gaston-hillar
Hi,

There's nothing else I can do without the real code. I've tried to help you in many ways.

Hope to see you here again, and you can find my contact information in my profile. Feel free to drop me a few lines if you consider I can be helpful.

Cheers,

Gastón

mtraudt
Using the previously posted code sample, Microsoft was able to reproduce the problem. We just got back their response, which I am passing along in case anybody is interested:

===
We've investigated this and this is definitely a bug in XP. There is a particular code path that sets the thread's affinity when it queries for process and/or processor information. It is supposed to roll back the affinity to its previous mask before returning, but that was accidentally left out. This issue has been addressed in Server 2003, and the issue never existed on Vista and on.

The product group is looking to get this into the next Service Pack for XP. If you require a fix for this now, we will need to file a hotfix request. For that you will need to provide a business impact statement. If this is something you would like to pursue, please let me know. Note, filing the bug with the product group doesn't necessarily mean it will be approved to be fixed; it is nothing more than a request.
===
===
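
Given that description, one conceivable application-side mitigation is to put the affinity mask back by hand after the query returns. The sketch below is untested on XP and is not something Microsoft or anyone in this thread has confirmed; the confirmed workaround remains running the query on the main thread. The class name AffinityRestore and the helpers ToMask and GetProcessorInstances are hypothetical. GetCurrentThread and SetThreadAffinityMask are the Win32 functions from kernel32.dll, so this is Windows only. It also assumes the thread originally had the default, process-wide mask.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class AffinityRestore
{
    // Win32: pseudo-handle for the calling thread.
    [DllImport("kernel32.dll")]
    static extern IntPtr GetCurrentThread();

    // Win32: sets a thread's affinity mask; returns the previous mask, or 0 on failure.
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern UIntPtr SetThreadAffinityMask(IntPtr hThread, UIntPtr mask);

    // Converts the signed IntPtr mask reported by .NET into an unsigned mask
    // without overflowing on 32-bit or 64-bit processes.
    public static UIntPtr ToMask(IntPtr value)
    {
        return IntPtr.Size == 4
            ? new UIntPtr(unchecked((uint)value.ToInt32()))
            : new UIntPtr(unchecked((ulong)value.ToInt64()));
    }

    // Runs the "Processor" category query, then re-applies the process-wide
    // affinity mask to the calling thread.
    public static string[] GetProcessorInstances()
    {
        try
        {
            return new PerformanceCounterCategory("Processor").GetInstanceNames();
        }
        finally
        {
            // Assumption: before the query, this thread was allowed on every
            // processor in the process mask.
            UIntPtr processMask = ToMask(Process.GetCurrentProcess().ProcessorAffinity);
            UIntPtr previous = SetThreadAffinityMask(GetCurrentThread(), processMask);
            if (previous == UIntPtr.Zero)
                Console.Error.WriteLine("SetThreadAffinityMask failed: {0}",
                                        Marshal.GetLastWin32Error());
            else
                Console.WriteLine("Thread mask before restore: 0x{0:X}", previous.ToUInt64());
        }
    }
}
```

Because SetThreadAffinityMask returns the mask that was in effect before the call, the printed value also works as a diagnostic: if it is narrower than the process mask, the query did change the thread's affinity.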

gaston-hillar
Hi,

It would be great if you could post the exact part of the code that generates this problem. It could help many others. :)
I'm not sure if Windows XP is going to have an SP4... now that Windows 7 is around the corner.