I read that realpath(3) is now thread-safe. Does this release include any improvements for SMP systems?
Niklas Hallqvist: There is often confusion between threading issues and MP issues. These are actually orthogonal, at least in userland. No matter how many (or how few) execution units are actually available, if the execution model contains threads (lightweight execution contexts), the code executed needs to be "thread safe" in order to be correct. This is true even for uniprocessor systems. On the other hand, if you have an execution model that does not contain threads but rather just processes (i.e., execution contexts that do not share main memory resources), you don't need to have thread-safe code in order to execute your processes on many processors.
In OpenBSD, so far, this orthogonality is even clearer, since we don't map threaded processes onto several processors. The scheduling entity is the process, not the thread. This may seem suboptimal, and it is, but it is the conservative approach. You don't need to make code thread-safe in order to take advantage of multiprocessing. Thus, performance comes at a low cost. I am not saying we are not going to provide thread libraries that will take advantage of SMP; I'm just saying that was not the primary target. My impression is that people have been happy that so few things broke when we implemented SMP. We are not here to provide users with the fanciest performance figures on earth; we are here because we personally want better performance without risking functionality.
As to the improvement part, yes, there have been bug fixes, making SMP machines even more stable, but no new functionality has been added, as far as I can recall.
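The distinction between processes and threads drawn above can be sketched in a few lines of C. This is a minimal illustration, not OpenBSD code: the helper name `parent_counter_after_child_update` and the global `counter` are hypothetical. It assumes a POSIX system with `fork(2)` and `waitpid(2)`. Because a forked child works on its own copy of memory, its update never reaches the parent, which is why plain processes can run on several processors without thread-safe code.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical global; each process gets its own private copy after fork(). */
static int counter = 0;

/*
 * Fork a child that modifies its copy of `counter`, wait for it,
 * and return the value the parent still sees.
 */
int parent_counter_after_child_update(void)
{
    pid_t pid = fork();
    assert(pid >= 0);           /* fork() failing would make the demo meaningless */

    if (pid == 0) {
        counter += 100;         /* changes only the child's private copy */
        _exit(0);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    assert(waited == pid);

    return counter;             /* the parent's copy was never touched */
}
```

Calling `parent_counter_after_child_update()` returns 0 in the parent, since no memory is shared. With threads, the same update would land in shared memory and would need locking to be correct, even on a single processor.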
Quoting from the talk that Ted Unangst will give at EuroBSDCon: "The existing userland pthreads library in use by OpenBSD is hampered by poor performance, inability to utilize multiple CPUs, and unnecessary complexity. A replacement library, rthreads, utilizes a modified rfork() system call to create kernel threads. It is both simpler and more scalable than the library it replaces." Could you share some details about it?
Ted Unangst: First, rthreads is not included in 3.8; it's not clear when it will be incorporated into OpenBSD. rthreads started as an experiment to see how much effort would be involved in developing support for kernel-aware threads. It turns out that if you don't overcomplicate things, it's remarkably simple. Initially it seemed that we should support the M:N (or scheduler activation) model, because it was the "right way" to do things. After some more consideration, it became clear that you can get 95 percent of the way there with 1:1 threads, at about 20 percent of the complexity. Although rthreads is not finished, it currently provides a substantial portion of the pthreads API.
Could you talk about the new malloc(3) implementation and how it improves security?
Theo de Raadt: Traditionally, Unix malloc(3) has always just "extended the brk", which means extending the traditional Unix process data segment to allocate more memory. malloc(3) would simply extend the data segment, and then calve off little pieces to requesting callers as needed. It also remembered which pieces were which, so that free(3) could do its job.
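The "extend the segment, then hand out pieces" scheme described above can be sketched as a toy bump allocator. This is only an illustration under stated assumptions: a fixed static array stands in for the process data segment, `arena_break` plays the role of the brk, and `toy_alloc`, `ARENA_SIZE`, and `TOY_ALIGN` are hypothetical names. It is not the historical malloc(3) code and it omits the bookkeeping that free(3) would need.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 4096   /* assumed size of the pretend data segment */
#define TOY_ALIGN  16     /* assumed allocation granularity in bytes */

/* The union only exists to give the byte array a generous alignment. */
static union {
    unsigned char bytes[ARENA_SIZE];
    long double   align_ld;
    void         *align_ptr;
} arena;

/* Offset of the first unused byte; stands in for the "brk". */
static size_t arena_break = 0;

/* Hand out the next piece of the arena, or NULL when it is exhausted. */
void *toy_alloc(size_t size)
{
    size_t rounded = (size + TOY_ALIGN - 1) / TOY_ALIGN * TOY_ALIGN;

    /* Reject arithmetic wrap-around and requests that do not fit. */
    if (rounded < size || rounded > ARENA_SIZE - arena_break)
        return NULL;

    void *piece = arena.bytes + arena_break;
    arena_break += rounded;     /* "extend the brk" past the new piece */
    return piece;
}
```

Two consecutive calls such as `toy_alloc(10)` return pieces that sit directly next to each other in memory. That adjacency is what makes this layout fragile: a write past the end of one piece lands in its neighbor without any fault.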
The way this was always done in Unix has had a number of consequences, some of which we wanted to get rid of. In particular, malloc and free have not been able to provide strong protection against overflows or other corruption. Our malloc implementation is a lot more resistant (than Linux) to "heap overflows in the malloc arena", but we wanted to improve things even more.
Starting a few months ago, the following changes were made:
- We made the mmap(2) system call return random memory addresses. As well, the kernel ensures that two objects are not mapped next to each other; in effect, this creates unallocated memory, which we call a "guard page."
- We have changed malloc(3) to use mmap(2) instead of extending the data segment via brk.
- We also changed free(3) to return memory to the kernel, unallocating it out of the process.
- As before, objects smaller than a page are allocated within shared pages that malloc(3) maintains. But their allocation is now somewhat randomized as well.
- A number of other similar changes which are too dangerous for normal software or cause too much of a slowdown are available as malloc options as described in the manual page. These are very powerful for debugging buggy applications.
- When you free an object that is >= 1 page in size, it is actually returned to the system. Attempting to read or write to it after you free is no longer acceptable. That memory is unmapped. You get a SIGSEGV.
- For a decade and a bit, we have been fixing software for buffer overflows. Now we are finding a lot of software that reads before the start of the buffer, or reads too far off the end of the buffer. You get a SIGSEGV.
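The effect of returning freed pages to the kernel can be sketched with plain mmap(2) and munmap(2). This is a minimal sketch, not the OpenBSD malloc(3) code: the helper name `signal_after_touching_unmapped_page` is hypothetical, and the faulting access runs in a forked child so the caller survives. It assumes a POSIX system that provides `MAP_ANON`. The parent reports which signal killed the child.

```c
#define _DEFAULT_SOURCE     /* asks glibc to expose MAP_ANON; harmless elsewhere */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one page, unmap it, then touch it again in a child process.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 if fork() failed.
 */
int signal_after_touching_unmapped_page(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        void *raw = mmap(NULL, page, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANON, -1, 0);
        if (raw == MAP_FAILED)
            _exit(2);

        volatile char *p = raw;
        p[0] = 'x';                 /* fine: the page is mapped */

        if (munmap(raw, page) != 0) /* "free": hand the page back to the kernel */
            _exit(3);

        p[0] = 'y';                 /* use after free: the page is gone */
        _exit(0);                   /* not reached if the access faults */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    assert(waited == pid);
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On the systems this sketch targets, the function is expected to return `SIGSEGV`. With a brk-style allocator the same stale write would often succeed silently, which is the class of bug these changes are meant to expose.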
To some of you, this will sound like what the Electric Fence toolkit used to be for. But these features are enabled by default. Electric Fence was also very slow. It took nearly three years to write these OpenBSD changes, since performance was a serious consideration. (Early versions caused a nearly 50 percent slowdown.)
Our changes have tremendous benefits, but until some bugs in external packages are found and fixed, there are some risks as well. Some software making incorrect assumptions will be running into these new security technologies.
We expect that our malloc will find more bugs in software, and this might hurt our user community in the short term. We know that what this new malloc is doing is perfectly legal but that realistically some open source software is of such low quality that it is just not ready for these things to happen.
We ask our users to help us uncover and fix more of these bugs in applications. Some will even be exploitable. Instead of saying that OpenBSD is busted in this regard, please realize that the software which is crashing is showing how shoddily it was written. Then help us fix it. For everyone ... not just OpenBSD users.
Do you plan to make other modifications to memory management functions?
malloc has been changed to use the mmap and munmap pair of memory-mapping functions, but the page accounting is still done by keeping a private list of page mappings. And page guarding is still done the old way, by mapping an extra page that gets mprotect(2)ed to forbid any access. We think that page guarding can simply be ensured by modifying mmap(2) so that it always returns nonadjacent memory regions, thus removing that responsibility from malloc(3). Moreover, keeping the page mapping list is an expensive operation, and we should look at some way to improve it.
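The "old way" of page guarding mentioned above, mapping an extra page and forbidding access to it with mprotect(2), can be sketched as follows. This is an illustration under stated assumptions, not the OpenBSD implementation: the helper name `signal_after_overrunning_into_guard` is hypothetical, the overrun happens in a forked child, and a POSIX system with `MAP_ANON` is assumed. Some systems report a protection fault as `SIGBUS` rather than `SIGSEGV`.

```c
#define _DEFAULT_SOURCE     /* asks glibc to expose MAP_ANON; harmless elsewhere */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map two pages, turn the second into a guard page, then write one
 * byte past the end of the first page in a child process.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 if fork() failed.
 */
int signal_after_overrunning_into_guard(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        void *raw = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANON, -1, 0);
        if (raw == MAP_FAILED)
            _exit(2);

        /* The extra page: no reads, no writes, no execution. */
        if (mprotect((char *)raw + page, page, PROT_NONE) != 0)
            _exit(3);

        volatile char *buf = raw;
        buf[page - 1] = 'x';        /* last valid byte of the usable page */
        buf[page] = 'y';            /* first byte of the guard page: faults */
        _exit(0);                   /* not reached if the access faults */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    assert(waited == pid);
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

The guard costs one extra mapped page and one mprotect(2) call per allocation. That overhead is what would disappear if mmap(2) itself guaranteed nonadjacent regions.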
Federico Biancuzzi is a freelance interviewer. His interviews have appeared in publications such as ONLamp.com, LinuxDevCenter.com, SecurityFocus.com, NewsForge.com, Linux.com, TheRegister.co.uk, ArsTechnica.com, the Polish print magazine BSD Magazine, and the Italian print magazine Linux&C.