Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

NPTL Support on IA-64 Linux?

chilywily
Beginner
687 Views
Does anyone know whether NPTL threads are available for 2.4.x
Linux kernels? If they are, how do I tell?

TIA.
6 Replies
Intel_C_Intel
Employee
Hi chilywily,
NPTL will be released as part of Red Hat Enterprise Linux 3.0 (Taroon), which is now in its beta 1 release.
It is based on kernel 2.4.21 and comes in two flavors, Advanced Server and Workstation. Both will support Itanium 2 as well as IA-32.

On another note, NPTL is already part of Red Hat Linux 9 Professional, which was released earlier this year.
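
For the "how do I tell?" part, one option on glibc-based systems is to ask the C library which thread implementation it is using. What follows is a minimal sketch, assuming glibc 2.3.2 or later, where confstr() accepts the _CS_GNU_LIBPTHREAD_VERSION name. The helper names thread_library_version() and is_nptl() are made up for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Copies the thread library name and version reported by the C library
   into buf. Returns 0 on success, or -1 if the query is not supported
   or the buffer is too small. */
int thread_library_version(char *buf, size_t len)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    size_t needed = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    return (needed == 0 || needed > len) ? -1 : 0;
#else
    (void)buf;
    (void)len;
    return -1;   /* not a glibc system, or glibc too old */
#endif
}

/* Returns 1 if the reported version string starts with "NPTL". */
int is_nptl(const char *version)
{
    return strncmp(version, "NPTL", 4) == 0;
}
```

The reported string typically starts with either "NPTL" or "linuxthreads". The same value should also be available from the shell with: getconf GNU_LIBPTHREAD_VERSION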

Could you share with us which NPTL features you need? Is it something that you could not get from LinuxThreads?

Shuo
chilywily
Beginner
Shuo,

I'm looking forward to the M:N thread scheduling and futexes that NPTL is going to provide. Basically, I have an application that does a lot of (slow) disk I/O on one end and sends/receives the results over the network. I'm hoping to split the work up so that the disk and network portions run as separate threads. Oh, and my machine is running two 1.4GHz Pentium 4s in an SMP configuration, so I'm hoping I can share the load between them.

Is this possible via LinuxThreads?

Thanks,

Chily
Intel_C_Intel
Employee
Hi Chily,

I am not sure you can get M:N scheduling from NPTL. NPTL uses a 1:1 threading model, which means that it creates one kernel thread for each user-level thread. To get M:N, you would have to find an M:N threading library that can run on top of NPTL. There isn't such a thing yet, partly because NPTL is very new and partly because people want to give NPTL's 1:1 implementation a try. The general consensus so far seems to be that M:N is a good idea but difficult to get right, and that the combination of a good 1:1 kernel scheduling policy and thread pooling will deliver most of the benefits M:N promises.

By the way, the futex (fast userspace mutex) should be part of the Red Hat Enterprise Linux 3.0 release.

As for your application, disk I/O and network data transfer are good candidates for threads. Since neither of them is computationally intensive and they do not compete for the same execution resources, you can expect good application performance on a processor with Hyper-Threading Technology. Don't be surprised if a single Hyper-Threading-enabled system performs just as well as your DP SMP system.
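
As a rough illustration of that split, the sketch below runs a "disk" thread and a "network" thread connected by a small bounded queue. Everything in it is hypothetical: the names (struct queue, disk_thread, net_thread, run_pipeline), the queue size, and the use of plain integers as stand-ins for blocks of data are assumptions, not details of the application being discussed. It uses only standard POSIX thread calls, so it is not tied to either LinuxThreads or NPTL. Build with -pthread.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 8

/* Bounded queue that hands blocks from the disk thread to the network thread. */
struct queue {
    int items[QUEUE_CAP];
    size_t head, count;
    int closed;                      /* set once the producer has finished */
    pthread_mutex_t lock;
    pthread_cond_t not_empty, not_full;
};

struct pipeline {
    struct queue q;
    int nblocks;                     /* blocks the "disk" side will produce */
    long sent;                       /* running total on the "network" side */
};

/* Blocks while the queue is full, then appends one item. */
static void queue_push(struct queue *q, int item)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->items[(q->head + q->count) % QUEUE_CAP] = item;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Returns 1 and stores an item, or 0 once the queue is closed and drained. */
static int queue_pop(struct queue *q, int *item)
{
    int ok = 0;
    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count > 0) {
        *item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        ok = 1;
        pthread_cond_signal(&q->not_full);
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Marks the queue as finished and wakes any waiting consumer. */
static void queue_close(struct queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Stand-in for the slow disk reader: produces block numbers 0..nblocks-1. */
static void *disk_thread(void *arg)
{
    struct pipeline *p = arg;
    for (int i = 0; i < p->nblocks; i++)
        queue_push(&p->q, i);
    queue_close(&p->q);
    return NULL;
}

/* Stand-in for the network sender: "sends" each block by adding it to a total. */
static void *net_thread(void *arg)
{
    struct pipeline *p = arg;
    int block;
    while (queue_pop(&p->q, &block))
        p->sent += block;
    return NULL;
}

/* Runs both threads to completion and returns the sum of the blocks sent.
   Error checks on the pthread calls are omitted to keep the sketch short. */
long run_pipeline(int nblocks)
{
    struct pipeline p = { .nblocks = nblocks, .sent = 0 };
    pthread_t disk, net;

    pthread_mutex_init(&p.q.lock, NULL);
    pthread_cond_init(&p.q.not_empty, NULL);
    pthread_cond_init(&p.q.not_full, NULL);

    pthread_create(&disk, NULL, disk_thread, &p);
    pthread_create(&net, NULL, net_thread, &p);
    pthread_join(disk, NULL);
    pthread_join(net, NULL);

    pthread_cond_destroy(&p.q.not_full);
    pthread_cond_destroy(&p.q.not_empty);
    pthread_mutex_destroy(&p.q.lock);
    return p.sent;
}
```

The bounded queue keeps a fast producer from running arbitrarily far ahead of a slow consumer. In a real program the queue would carry buffers read from disk rather than integers.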

For a long time, LinuxThreads has been the standard POSIX thread implementation on Linux. Its biggest weaknesses are scalability and signal handling. For example, you cannot create more than about 1,024 threads, thread creation is slow, and signal delivery behaves in non-standard ways.

If your application does not create more than 100 threads and does not expect signals to be delivered to the process as a whole, you should still be able to use LinuxThreads, although I would expect the same application to perform better with NPTL.

Shuo
chilywily
Beginner
Hi Shuo,

I thought M:N was available but thanks for the clarification - I've seen some benchmarks for 1:1 models and they look pretty impressive.

I agree about some of the missing stuff. My understanding is that most of it has been addressed (like the non-standard signal behavior and getpid() returning thread pids instead of the real (parent) pid). When I last investigated (about 2-3 months back), it was slated for 2.5.x testing.

My application won't have 100s of threads, more like 7-8 threads. Does LinuxThreads provide a means to allocate these threads to specific CPUs? One of my concerns is that the two most I/O intensive threads will end up on the same CPU and potentially compete with each other while the other threads which end up on the other CPU will be bottlenecked by these two. I guess the only way I know to get out of that situation is to ensure they run on different CPUs - Is there a better way?

Thanks,

Chily
ClayB
New Contributor I
Chily -

> My application won't have 100s of threads, more like
> 7-8 threads. Does LinuxThreads provide a means to
> allocate these threads to specific CPUs? One of my
> concerns is that the two most I/O intensive threads
> will end up on the same CPU and potentially compete
> with each other while the other threads which end up
> on the other CPU will be bottlenecked by these two. I
> guess the only way I know to get out of that
> situation is to ensure they run on different CPUs -
> Is there a better way?

You bring up valid concerns, and the easiest advice to give is "let the operating system take care of this for you." If two processors have each been assigned two threads, but one pair of threads doesn't need its CPU much, the OS should realize this and migrate one of the highly active threads to the lightly loaded CPU. There are some problems with the Linux scheduler that may prevent this from happening as well as we'd like. There's a link to an article that details this "fault" and some discussion about Windows processor affinity functions in this thread.

No one has posted any corresponding Linux functions for thread affinity, and I've never used or seen any either. I doubt that anything exists in LinuxThreads for thread affinity, and I don't remember seeing any such functionality in NPTL or hearing of any plans to include affinity functions.

-- clay

Message Edited by intel.software.network.support on 12-09-2005 02:18 PM

Intel_C_Intel
Employee
> Hi Shuo,
>
> I thought M:N was available but thanks for the
> clarification - I've seen some benchmarks for 1:1
> models and they look pretty impressive.
>
As you probably know already, not long ago there was a debate about which threading model the next POSIX threading package on Linux should use, M:N or 1:1.
One camp was led by Bill Abt from the IBM Cambridge lab. He favored the M:N model and got an early start on the problem: he patched the Linux 2.4.18+ kernel, introduced a few new system calls, and implemented an M:N threading library known as NGPT (Next Generation POSIX Threads).
Almost all of his kernel fixes were accepted into Linux and made it into 2.5.*. The expectation was that NGPT would be integrated into the GNU C runtime library and become the default threading library.
However, Ulrich Drepper, the owner of the C runtime, was not totally sold on the M:N idea. He took all the kernel changes from NGPT, got rid of the LinuxThreads oddities (removing the manager thread and the kludged signal handling), and drastically improved synchronization and scheduling efficiency. He called the new threading package NPTL. NPTL uses a 1:1 threading model: for each user thread created, NPTL turns around and creates a kernel thread.
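
One way to observe the 1:1 model is to compare the process ID and the kernel thread ID seen from two different threads. This is only a sketch under assumptions: it relies on the Linux-specific gettid system call, invoked through syscall(), and struct ids and compare_thread_ids() are names invented for the example. Under NPTL both threads should report the same getpid() value but different kernel thread IDs, whereas under LinuxThreads getpid() itself differed from thread to thread. Build with -pthread.

```c
#define _GNU_SOURCE              /* for the syscall() declaration */
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

struct ids {
    pid_t pid;                   /* value returned by getpid() */
    pid_t tid;                   /* kernel thread ID from the gettid system call */
};

/* Records the IDs visible to whichever thread runs this function. */
static void *record_ids(void *arg)
{
    struct ids *out = arg;
    out->pid = getpid();
    out->tid = (pid_t)syscall(SYS_gettid);
    return NULL;
}

/* Fills 'self' from the calling thread and 'child' from a newly created
   thread. Returns 0 on success, -1 if the thread could not be run. */
int compare_thread_ids(struct ids *self, struct ids *child)
{
    pthread_t t;

    record_ids(self);
    if (pthread_create(&t, NULL, record_ids, child) != 0)
        return -1;
    return pthread_join(t, NULL) == 0 ? 0 : -1;
}
```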

Some benchmarks showed that NPTL is better than NGPT, while Bill Abt quoted many instances where NGPT performs and scales better than NPTL. The results vary and are highly application and platform dependent. What's less controversial is that the M:N model is more complicated and more difficult to debug than 1:1. Thus M:N was eliminated by Occam's razor.

> I agree about some of the missing stuff - My
> understanding is that most of that stuff has been
> addressed (like the non-standard signal behavior,
> getpid() returning thread pids vs. the real (parent)
> pid). Last I investigated (about 2/3 months back) it
> was slated for 2.5.x testing.
>
> My application won't have 100s of threads, more like
> 7-8 threads. Does LinuxThreads provide a means to
> allocate these threads to specific CPUs? One of my
> concerns is that the two most I/O intensive threads
> will end up on the same CPU and potentially compete
> with each other while the other threads which end up
> on the other CPU will be bottlenecked by these two. I
> guess the only way I know to get out of that
> situation is to ensure they run on different CPUs -
> Is there a better way?
>

NPTL also brought thread affinity to the C runtime library. The two calls below did not exist on Red Hat Linux 8.0 or older, nor on RHEL 2.1. They are supported now on RHL 9.0 and RHEL 3.0.

#include <sched.h>

// set and get a process's CPU affinity mask

int sched_setaffinity(pid_t pid, unsigned int len, unsigned long *mask);

int sched_getaffinity(pid_t pid, unsigned int len, unsigned long *mask);
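
A usage sketch may help here, with one loud assumption: later glibc releases changed these prototypes to take a size_t and a cpu_set_t * (filled in with the CPU_ZERO and CPU_SET macros) instead of an unsigned long *, and the code below uses that newer form, so it will not compile against headers that declare the calls as shown above. The helper names first_allowed_cpu(), allowed_cpu_count() and pin_to_cpu() are hypothetical. Passing 0 as the pid applies the mask to the calling thread.

```c
#define _GNU_SOURCE              /* exposes cpu_set_t and the CPU_* macros */
#include <sched.h>

/* Returns the lowest-numbered CPU the caller may run on, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Returns how many CPUs the caller may run on, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_COUNT(&set);
}

/* Restricts the calling thread to a single CPU.
   Returns 0 on success, -1 on error or if cpu is out of range. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -1;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}
```

For the situation Chily describes, each of the two I/O-heavy threads could call pin_to_cpu() with a different CPU number at the start of its thread function, leaving the remaining threads for the scheduler to place.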
