oreilly.comSafari Books Online.Conferences.


Linux Multithreading Advances

by Jerry Cooperstein

Recent advances in Linux's threading implementation are expected to continue to ease migration from other Unix-like operating systems. These advancements have arrived with intense activity on two fronts. First, thread-handling improvements have greatly enhanced the kernel's scalability even to thousands of threads. Second, there are now two fresh, competing implementations of the POSIX pthreads standard (NGPT and NPTL) set to replace the aging LinuxThreads library.

In typical open source fashion, only time will tell exactly who wins out in which arena. However, both new library implementations should be API-compatible with the standard, so the choice will depend on performance and stability. The required changes will appear in the upcoming Linux 2.6 (or 3.0) kernel and can already be tested in late development versions.

Multithreading on Linux

Threading implementations typically have components in both user and kernel space. It is possible to do everything from the one side or the other, but each approach has problems. With everything on the user side, all related threads are part of one single process (which can only run on one CPU at a time), and multi-processor systems are underutilized. With everything on the kernel side, the kernel scheduler must bear a heavy load.

Related Reading

Understanding the Linux Kernel
By Daniel P. Bovet, Marco Cesati

Approaches have ranged from the 1:1 pure kernel thread model in which each user thread has its own kernel thread, to the M:1 model in which the kernel sees only one normal process, with an arbitrary number of threads with which to schedule in user space. The M:N model falls in between, associating M user threads with each of N kernel threads.

The Linux kernel uses the clone() function to create new processes. Flags control parent/child resource sharing, where resources range from everything (memory, signal handlers, file descriptors, etc.) to nothing. While the usual fork() inherits resources from the parent, it may share nothing. Copy-on-write techniques ensure each process gets its own copy as soon as either one tries to modify a shared resource.

Programs can call the clone() function as a system call, using it directly to produce multithreaded programs. However, it is completely Linux-specific and non-portable. Since there is no external standard, there is no guarantee that its interface will be stable. Threading library implementations do in fact use the clone() system call, and it is the job of library maintainers to keep up with kernel changes.

The LinuxThreads implementation of the POSIX threads standard (pthreads), originally written by Xavier Leroy, has been the dominant one for years and is now incorporated and maintained in glibc. It has two problems on Linux:

  • Compliance issues as compared to the POSIX standard
  • Performance issues, especially when dealing with many threads; i.e., a lack of scalability

Compliance Issues

Virtually all compliance problems can be traced to the decision to use lightweight processes (LWPs, or the 1:1 model described above) as the basis of the implementation. New processes are created by clone(), with a shared-everything approach. While the new process is lighter due to the sharing, fundamentally it is still a process in its own right, with its own process identifier (pid), process descriptor, etc.

This led to the following standards compatibility problems:

  • Signal handling is incorrect.
  • An extra management thread is created by the pthreads library.
  • ps shows all threads in a process.
  • Core dumps don't contain the stack and machine register information for all threads.
  • getpid() returns a different result for each thread.
  • A thread cannot wait for a thread created by another thread.
  • Threads have parent-child, not peer, relationships.
  • Threads don't share user and group IDs.

If a pthreads application were written for Linux, one could expect easy portability. However, the inverse process, porting to Linux, was more difficult and slowed Linux deployment since important applications were now broken.

Some problems were resolvable by relatively minor kernel adjustments. For example, by modifying the basic data structure describing each process, (struct task_struct) to store a thread group identifier and some other bookkeeping, and then modifying the getpid() system call to return this identifier rather than the process identifier, one problem could be solved.

However, many key kernel developers resisted attempts to modify the kernel for compliance sake. On one hand, their taste runs to technically superior solutions rather than to "cut the toes to fit the shoes" to comply with standards. On the other hand, there was an aversion to creating many threads. Sentiments like "there is no need to create more threads than there are processors" were common. Thread-prolific languages such as Java were looked at with contempt for many reasons.

Performance Issues

The main performance problem has been scalability with growing numbers of threads. These difficulties are not unique to threads, but apply in all cases where the number of processes grows large.

Consider the process of obtaining a new pid. In the 2.4 kernel, Linux has to loop through all processes to ensure a candidate pid is not already assigned. With an outer loop on possible candidates, the time spent may scale quadratically; if there are thousands of processes, the system can slow down to a crawl. Since each thread has its own pid, creating zillions of threads is poisonous.

Pages: 1, 2

Next Pagearrow

Linux Online Certification

Linux/Unix System Administration Certificate Series
Linux/Unix System Administration Certificate Series — This course series targets both beginning and intermediate Linux/Unix users who want to acquire advanced system administration skills, and to back those skills up with a Certificate from the University of Illinois Office of Continuing Education.

Enroll today!

Linux Resources
  • Linux Online
  • The Linux FAQ
  • Linux Kernel Archives
  • Kernel Traffic

  • Sponsored by: