A BSD Rootkit Primer
How much effort is needed to write a rootkit that works on multiple branches (4.x, 5.x, 6.x, 7.x)?
Joseph Kong: Honestly, not much (or at least in my opinion, not much). In general, I find that it doesn't require much effort to write a rootkit that is backwards/forwards compatible. Of course, this is all subjective and dependent on what gets changed between versions. For example, in FreeBSD 4 system call functions had the following function prototype:
typedef int sy_call_t __P((struct proc *, void *));
This got changed to the following in FreeBSD 5:
typedef int sy_call_t(struct thread *, void *);
The main difference here is that the first parameter got changed from struct proc (a process structure) to struct thread (a thread structure). So, if you had a kernel-mode rootkit that hooked one or more system calls on FreeBSD 4, you would have to, at the bare minimum, change one or more parameters/arguments to get it to run on FreeBSD 5. On the other hand, if your rootkit patched some data within struct proc on FreeBSD 4, you now have to take into account that FreeBSD 5 executes processes at thread granularity, and that you may have to patch struct thread instead. This might sound like a lot of work, but it's really not that hard or time consuming. The main thing is identifying how the objects you interact with differ between versions (and having the full kernel source makes that pretty easy).
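As a rough illustration of how small that change can be, the following user-space sketch hides the first-parameter difference behind a typedef and an accessor macro. It is not taken from the interview: `caller_t`, `CALLER_PROC`, `sample_handler`, and `SKETCH_FREEBSD_MAJOR` are hypothetical names, and the two structures are stand-ins that model only the `p_pid` and `td_proc` fields.

```c
/* Hypothetical stand-ins for the kernel structures. Only the two fields
 * used below are modeled; the real definitions live in the kernel headers. */
struct proc   { int p_pid; };
struct thread { struct proc *td_proc; };

/* SKETCH_FREEBSD_MAJOR is an assumed build-time knob for this sketch.
 * Kernel code would normally test __FreeBSD_version from <sys/param.h>. */
#ifndef SKETCH_FREEBSD_MAJOR
#define SKETCH_FREEBSD_MAJOR 5
#endif

#if SKETCH_FREEBSD_MAJOR < 5
typedef struct proc caller_t;            /* FreeBSD 4: process argument */
#define CALLER_PROC(c) (c)
#else
typedef struct thread caller_t;          /* FreeBSD 5 and later: thread argument */
#define CALLER_PROC(c) ((c)->td_proc)
#endif

/* Same shape as sy_call_t on either branch; the body never names
 * struct proc or struct thread directly. */
int
sample_handler(caller_t *caller, void *uap)
{
    (void)uap;                           /* arguments unused in this sketch */
    return CALLER_PROC(caller)->p_pid;
}
```

With the default setting the handler takes a struct thread pointer and reaches the process through td_proc; building with -DSKETCH_FREEBSD_MAJOR=4 switches it to a struct proc pointer without touching the function body.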
Now that multi-CPU/core systems are common, does it mean that rootkits must be carefully designed to be SMP safe? Maybe some sort of regression suite might spot syscalls that stop being SMP safe because they are handled by a non-SMP-safe rootkit...
Joseph Kong: It does, and it doesn't. On a symmetric multiprocessing (SMP) system, if a thread running on one CPU is manipulating an object and a thread on another CPU begins manipulating the same object, data corruption can result. Therefore, all kernel code, including kernel-mode rootkits, must prevent this situation from occurring. Typically, this is achieved by employing some sort of synchronization scheme (e.g., mutexes).
However, this same type of situation can occur on a uniprocessor (UP) system, because kernel code (like user-mode code) can be interrupted. For example, if a thread is manipulating an object, gets interrupted, and another thread gets scheduled to run, which manipulates the same object, data corruption can result. As you can see, this situation is, in effect, identical to the SMP situation described above, and as such, it can be prevented in the same way.
So yes, rootkits need to be designed SMP safe, but in effect, that's the same as designing a thread-safe UP rootkit, which rootkit developers have had to do for a while now. Thus, from a rootkit development standpoint, the prevalence of SMP systems isn't that big of an issue. (On the other hand, from an OS development standpoint, SMP is a very big issue--but that's a completely different discussion altogether).
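The synchronization idea can be sketched in user space with POSIX threads, as a stand-in for the kernel primitives (the FreeBSD kernel has its own mutex interface, documented in mutex(9)). The names `shared_counter`, `counter_add`, `run_workers`, and `MAX_WORKERS` are hypothetical and exist only for this illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* Hypothetical shared object: a counter updated by several threads. */
struct shared_counter {
    pthread_mutex_t lock;
    long value;
};

/* The read-modify-write happens entirely under the lock, so a second
 * thread (on another CPU, or scheduled after a preemption) cannot
 * interleave with it. */
void
counter_add(struct shared_counter *c, long n)
{
    pthread_mutex_lock(&c->lock);
    c->value += n;
    pthread_mutex_unlock(&c->lock);
}

static void *
worker(void *arg)
{
    struct shared_counter *c = arg;

    for (int i = 0; i < 100000; i++)
        counter_add(c, 1);
    return NULL;
}

/* Start up to MAX_WORKERS threads, wait for them, and return the total. */
long
run_workers(int nthreads)
{
    struct shared_counter c;
    pthread_t tid[MAX_WORKERS];
    int started = 0;

    if (nthreads > MAX_WORKERS)
        nthreads = MAX_WORKERS;
    pthread_mutex_init(&c.lock, NULL);
    c.value = 0;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tid[started], NULL, worker, &c) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_mutex_destroy(&c.lock);
    return c.value;
}
```

The same locking protects the counter whether the threads truly run in parallel or are merely interleaved on one CPU, which is the point made above about SMP-safe and thread-safe UP code being the same design problem.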
On an SMP system, each processor has its own interrupt descriptor table (IDT). Thus, a UP-designed rootkit that manipulates the IDT will have to take this into account when retooled for an SMP system. This is the only point that I can think of (i.e., know about) at which an SMP system might mess up a UP-designed rootkit that is thread-safe.
Some people need to use binary-only kernel modules to support their hardware, for example some NVIDIA or ATI graphics cards. This means loading something like 2 MB of code into your kernel space. Is there anything that could be done to reduce the risk of running that code?
Joseph Kong: I guess you could use super verbose audit logs (and that would help), but overall, I don't think there is much you can do to reduce the risk of running that code. For example, assuming the code is malicious and you load it into kernel space, it can now disable/mitigate any user/kernel space protection schemes you have installed (that it's aware of).
Obviously with this specific case, prevention is (completely) useless, and you have to depend on detection.
There is a port of DTrace for FreeBSD. Would it help to analyze binary-only code and maybe spot anomalies by profiling code performance?
Joseph Kong: DTrace is a very powerful tracing framework, so it can be used to analyze binary blobs, "pinpoint" system anomalies (e.g., performance bottlenecks), profile code performance (e.g., the elapsed time of each function call), and so on.
However, the problem is "properly" designed malware wouldn't (or shouldn't) introduce any sort of system anomaly (e.g., performance bottlenecks)--it wouldn't be too stealthy if it did. This makes detecting "malicious" binaries with DTrace somewhat difficult.
So, while DTrace is great for analyzing your system, for detecting malware, it really depends on the malware itself, and how well you know your system.
You can get an accurate measure of how much time a function or syscall takes to complete with DTrace: you set it to take a timestamp whenever the desired function or syscall is entered and again when it returns, then subtract the first time from the second. However, the added overhead of a call hook is so minute (assuming the hook is written properly) that you probably wouldn't notice it. Also, there are lots of factors that affect how long a function takes to complete (e.g., branching code paths, sleeps, and so on), so you would need to know your system's (average) performance times in order to realize that an additional millisecond was the result of a hook and not something else.
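The timestamp-and-subtract measurement, and the averaged baseline it needs, can be sketched in plain user-space C. This is not DTrace and not from the interview; it uses the POSIX `clock_gettime` call, and `now_ns`, `mean_elapsed_ns`, and `sample_work` are hypothetical names for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds. */
uint64_t
now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Take a timestamp on entry and on return, subtract, and average over
 * `runs` calls to smooth out per-call noise. Returns 0 when runs is 0. */
uint64_t
mean_elapsed_ns(void (*fn)(void), unsigned runs)
{
    uint64_t total = 0;

    if (runs == 0)
        return 0;
    for (unsigned i = 0; i < runs; i++) {
        uint64_t start = now_ns();

        fn();
        total += now_ns() - start;
    }
    return total / runs;
}

/* Hypothetical workload standing in for the function being profiled. */
void
sample_work(void)
{
    volatile unsigned long sink = 0;

    for (unsigned long i = 0; i < 10000; i++)
        sink += i;
}
```

Calling mean_elapsed_ns(sample_work, 1000) on an otherwise idle machine gives a baseline figure; a later measurement only means something when compared against such a baseline, and even then run-to-run variation can easily exceed the cost of a well-written hook.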
At the end of the day, it's just much easier to scan for hooks/patches than to profile code performance in order to find a rootkit--especially if you are looking for (basic) syscall hooks.
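One simple form of such a scan is to compare a table of function pointers against a snapshot taken while the system was known to be clean. The sketch below does this for an ordinary user-space array; `handler_t`, `find_changed_entries`, and the three sample handlers are hypothetical, and a real detector would also have to locate the live table and obtain a trustworthy baseline.

```c
#include <stddef.h>

typedef int (*handler_t)(void);

/* Compare a live dispatch table with a known-good snapshot. Indices of
 * differing entries are written to `changed` (at most `cap` of them);
 * the return value is the total number of differing entries. */
size_t
find_changed_entries(const handler_t *live, const handler_t *baseline,
    size_t n, size_t *changed, size_t cap)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (live[i] != baseline[i]) {
            if (count < cap)
                changed[count] = i;
            count++;
        }
    }
    return count;
}

/* Hypothetical handlers used only to populate example tables. */
int handler_a(void) { return 1; }
int handler_b(void) { return 2; }
int handler_other(void) { return 3; }
```

Given a baseline of { handler_a, handler_b } and a live table of { handler_a, handler_other }, the function reports one changed entry at index 1. A pointer comparison like this only catches entries that were replaced; it would not notice code patched in place behind an unchanged pointer.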
FreeBSD handles Linux emulation pretty well. I am wondering if it handles it so well that some rootkits for Linux could work on FreeBSD too...
Joseph Kong: Well, a kernel-mode rootkit for Linux would not work on FreeBSD (because of different data structures/kernel internals), but a Linux user-mode rootkit could. For example, if the rootkit simply trojaned the output of ps(1) or ls(1), that could work. However, if the user-mode rootkit modified some specific Linux kernel data structure (e.g., the system call table, as SucKIT does), that would fail. So, I guess the answer is: it depends on the Linux rootkit.
I had this question when I heard that Luigi Rizzo is working on a framework to use Linux drivers on FreeBSD. I thought that this framework, if/when integrated, might open the way to malicious code designed for Linux.
Joseph Kong: First off, thanks for pointing this framework out to me. I've been playing around with it for the last day or so. It's actually very cool from a device driver developer's standpoint, as it lets you take Linux LKM source code and compile it directly under FreeBSD, without any modifications, to produce a native FreeBSD binary--that's awesome!
However, from a rootkit developer's standpoint, it's not that useful (at least in my opinion). To understand why, you need to understand how the framework is designed. Essentially, the framework is for building Linux device drivers on FreeBSD and it works in two ways. First, it remaps Linux functions, header files, structures, and so on to equivalents in FreeBSD. If that's inadequate or infeasible, the various Linux functions are reimplemented in FreeBSD. (Of course, this description is an oversimplification, but you get the idea.) The problem is most LKM rootkits hook/patch various low-level functions/data structures that, more often than not, don't get explicitly called/interacted with from within a device driver. Thus, the framework doesn't have a remapping for these items, and as such, it's not going to pave the way for Linux LKM rootkits to be run under FreeBSD. Of course, that doesn't mean I didn't at least give it a try. :)
I wrote a simple Linux LKM which hooked the Linux system call table (sys_call_table), compiled it under the framework (with no errors), but when I loaded it, I got the following error message:
link_elf: symbol sys_call_table undefined
Basically, the framework didn't remap the Linux system call table to the FreeBSD system call table. Of course, it wouldn't--when are you ever going to need to directly manipulate the system call table when writing a device driver?
Does loading our packet filter (pf, ipfw, ...) as a KLD let a rootkit mess with it more than if it was statically built in the kernel?
Joseph Kong: No, it does not; while an attacker could Trojan your KLD packet filter, I just don't see anyone going through all that trouble. Allow me to explain.
Let's assume that an attacker does have a Trojan KLD and that you don't use a KLD for your packet filtering. Now, this means that their Trojan is useless, and that they are going to need (or have to write) another program to attack your packet filter. But what's the point? When you load a KLD into your system, for all intents and purposes, it's as if it was statically built into the kernel. In other words, after a KLD is loaded, you can list/dump/find the address of its functions/symbols by examining the currently running kernel image (i.e., the image in main memory). Thus, the code to hook/patch a loaded KLD is the same as the code to hook/patch the currently running kernel. In other words, writing a Trojan KLD is unnecessary.
How can an attacker control a rootkit on a remote machine?
Joseph Kong: In general, there are only two reasons to communicate with a rootkit on a remote machine:
- For data exfiltration.
- For remote command/control.
Both of these tasks are typically achieved through a covert channel. A covert channel is defined (by the US DOD) as follows:
"Any communication channel that can be exploited by a process to transfer information in a manner that violates the system's security policy."
Typically, covert channels work in one of two ways:
- Information is transferred by placing it within an unused field inside a network protocol (e.g., the optional data segment of an ICMP echo request message).
- Information is transferred by changing some field inside a network protocol (e.g., manipulating the IP header identification field).
The following is an example covert channel that can be used for data exfiltration:
Whenever a TCP connection is attempted (i.e., a SYN packet is sent out) on the owned machine (i.e., the machine with the rootkit installed), the Initial Sequence Number (ISN) is modified, such that it stores the data the rootkit wishes to push out.
In order to achieve this, the rootkit must sit in between the TCP/IP stack of the system and the network interface. Additionally, the covert channel engine (within the rootkit) must constantly change the sequence (SEQ) and acknowledgment (ACK) numbers generated by the kernel and carried in the packets received from the wire. This is because if the covert channel engine only changed the first SEQ number (within the first SYN packet), the kernel would not understand the corresponding ACK number (within the second SYN | ACK packet), which would break the connection. Remember, the kernel believes that the ISN hasn't been modified.
Additionally, the covert channel engine should encrypt the ISN so that it looks like a random number and not data.
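The reason every later SEQ and ACK number has to be adjusted comes down to modular arithmetic: once the ISN on the wire differs from the one the kernel chose, the two sides disagree by a fixed offset for the life of the connection. The sketch below shows only that arithmetic, which is the same bookkeeping any in-path device that rewrites sequence numbers has to do. It touches no packets, and the function names are hypothetical.

```c
#include <stdint.h>

/* Fixed offset between the ISN seen on the wire and the ISN the local
 * stack chose. Unsigned 32-bit arithmetic wraps modulo 2^32, which
 * matches the TCP sequence number space. */
uint32_t
seq_delta(uint32_t wire_isn, uint32_t local_isn)
{
    return wire_isn - local_isn;
}

/* Outbound: shift the stack's SEQ number into the wire's numbering. */
uint32_t
outbound_seq(uint32_t local_seq, uint32_t delta)
{
    return local_seq + delta;
}

/* Inbound: shift the peer's ACK number back into the stack's numbering,
 * so it matches what the local stack expects to see acknowledged. */
uint32_t
inbound_ack(uint32_t wire_ack, uint32_t delta)
{
    return wire_ack - delta;
}
```

For example, with a local ISN of 0xFFFFFFF0 and a wire ISN of 0x00000010, the offset is 0x20. The peer acknowledges 0x00000011, and subtracting the offset gives 0xFFFFFFF1, which is the local ISN plus one and therefore what the kernel expects. Without that inbound correction the ACK would fall outside the kernel's window and the handshake would fail.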
Notice that with this scheme, the rootkit doesn't create/establish a connection; it simply rides on one going out, which makes it stealthier and allows it to work around stateful packet filters. However, in order to gather the data, the attacker must also own a machine in between the compromised computer and the one it's communicating with.
For more on this particular covert channel see "The Implementation of Passive Covert Channels in the Linux Kernel" by Joanna Rutkowska.
Creating a covert channel for remote command/control is similar to creating one for data exfiltration, except that it must be able to send and receive data, as opposed to just sending data. The following is an example covert channel that can be used for remote command/control. It is assumed that the "owned" machine is behind a stateful packet filter.
Periodically (or at some set time), the rootkit on the owned machine will send out a DNS request that's arranged as follows:
This will cause the owned machine's local DNS server to connect to a nameserver for domainname.tld. The nameserver(s) at domainname.tld are, of course, under the rootkit owner's control. Thus, the response is controlled. Essentially, this is a connect back, covert, remote shell. The rootkit can send data up in the form of queries, and get control information in the form of responses. Naturally, the queries and responses are designed to look legitimate.
Of course, DNS isn't the only option; any request/response protocol can be used as a covert channel for remote command/control. For more information on covert channels, see the Gray-World Team website.
The only real criterion when choosing a protocol is that its packets are fairly common on the network, so that you can hide in plain sight.
Federico Biancuzzi is a freelance interviewer. His interviews have appeared in publications such as ONLamp.com, LinuxDevCenter.com, SecurityFocus.com, NewsForge.com, Linux.com, TheRegister.co.uk, ArsTechnica.com, the Polish print magazine BSD Magazine, and the Italian print magazine Linux&C.