[Misc] PCTF2020 - golf.so

Golf.so
Solves: 104

Points: 500

Description:
Upload a 64-bit ELF shared object of size at most 1024 bytes. It should spawn a shell (execute execve(“/bin/sh”, [“/bin/sh”], …)) when used like
LD_PRELOAD=<upload> /bin/true

golf.so.pwni.ng

The objective of this challenge is to create an ELF shared library that spawns a shell when used like this:

$ LD_PRELOAD=<upload> /bin/true

The shared library must be at most 1024 bytes to pass the first level. The first thing I tried was plain GCC.

First, I used Ghidra to inspect the /bin/true binary. It exits immediately if it is given fewer than 2 arguments, so our options are to hijack the entry point or to override __libc_start_main.

After searching online for the function signature of __libc_start_main, I wrote this C file:

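#include <unistd.h> /* execve */

/* Overrides glibc's __libc_start_main, which runs before main(),
   so the shell is spawned before /bin/true gets a chance to exit. */
int __libc_start_main(
    void *func_ptr,
    int argc,
    char *argv[],
    void (*init_func)(void),
    void (*fini_func)(void),
    void (*rtld_fini_func)(void),
    void *stack_end) {
    char *args[] = {"/bin/sh", NULL};
    execve("/bin/sh", args, NULL);
    return 1; /* only reached if execve fails */
}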

Compiling it using gcc:

$ gcc -shared lol.c -o lol.so
$ LD_PRELOAD=./lol.so /bin/true
$ id
uid=0(root) gid=0(root) groups=0(root)

We got a shell, but unfortunately the file is too big:

$ ls -ltah lol.so
-rwxr-xr-x 1 root root 16K Apr 20 10:08 lol.so*

16K is far over the limit, so we need to shrink it. After reading the gcc man page and some recommendations online, I tried the following options:

  • Disabling RELRO with the norelro linker option (-Wl,-z,norelro).
  • Stripping the binary (-s).
  • Skipping the start files (-nostartfiles).
  • Skipping the default libraries (-nodefaultlibs).
  • Turning on optimizations with -O3.

This reduced the file size by a considerable amount:

$ gcc -shared -nostartfiles -nodefaultlibs -Wl,-z,norelro -s -O3 lol.c
$ ls -ltah a.out
-rwxr-xr-x 1 root root 9.5K Apr 20 10:13 a.out

9.5K was the smallest I could get using gcc alone, and we need at most 1K. I then found a post online about creating tiny ELF binaries by hand in assembly. However, that post targets 32-bit ELFs of type ET_EXEC, while we need a 64-bit ELF of type ET_DYN. The possible file types of an ELF are:

ET_NONE  An unknown type.     (0x0)
ET_REL   A relocatable file.  (0x1)
ET_EXEC  An executable file.  (0x2)
ET_DYN   A shared object.     (0x3)
ET_CORE  A core file.         (0x4)
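
To check which of these types a given file has, the e_type field can be read straight from the header. This is a minimal sketch in Python, assuming a little-endian ELF; the helper name elf_type is made up for illustration and is not part of the original solution:

```python
import struct

# e_type values from the table above
ELF_TYPES = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}

def elf_type(header: bytes) -> str:
    """Return the symbolic e_type of a little-endian ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_type is the 16-bit field right after the 16-byte e_ident
    (e_type,) = struct.unpack_from("<H", header, 16)
    return ELF_TYPES.get(e_type, hex(e_type))

# First 18 bytes of a 64-bit shared object header: e_ident followed by e_type = 3
sample = bytes([0x7F]) + b"ELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<H", 3)
print(elf_type(sample))  # ET_DYN
```

For a real file, passing the first 18 bytes read from disk should work the same way.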

We want ET_DYN, a shared object, so I searched GitHub for examples of shared objects written in assembly and found a template. The search string I used was:

db    0x7f, "ELF" ET_DYN

To open a shell, we invoke the execve syscall: set RAX to 0x3b (the execve syscall number), RDI to a pointer to the string /bin/sh, RSI to a pointer to the array [“/bin/sh”, 0x0], and RDX to 0 (no environment), then execute the syscall instruction.

My first shellcode was:

_start:
    mov rdi,0x68732f6e69622f ; "/bin/sh" plus NUL as a little-endian 64-bit immediate
    push rdi                 ; push the string onto the stack
    push rsp                 ; push the current stack pointer (address of the string)
    pop rdi                  ; RDI = pointer to "/bin/sh"
    push 59                  ; push 0x3b, the execve syscall number
    pop rax                  ; RAX = 0x3b
    push 0                   ; NULL terminator of the argv array
    push rdi                 ; argv[0] = pointer to "/bin/sh"
    mov rsi,rsp              ; RSI = pointer to ["/bin/sh",0x0]
    cdq                      ; sign-extend EAX into EDX, which zeroes RDX here (envp = NULL); see https://www.aldeid.com/wiki/X86-assembly/Instructions/cdq
    syscall                  ; execve("/bin/sh",["/bin/sh",0x0],0x0)
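
The constant loaded into RDI is just the string /bin/sh plus its terminating NUL byte, read as a little-endian 64-bit integer. A quick check in Python (an illustration only, not part of the exploit):

```python
# "/bin/sh" plus the NUL terminator is exactly 8 bytes, so it fits in one 64-bit register
binsh = b"/bin/sh\x00"
imm = int.from_bytes(binsh, "little")
print(hex(imm))  # 0x68732f6e69622f

# Decoding the constant used in the shellcode gives the string back
decoded = (0x68732F6E69622F).to_bytes(8, "little")
print(decoded)
```

Because the top byte of the immediate is zero, the string is already NUL-terminated once it is pushed on the stack.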

Putting this code in the template:

; build with:
; nasm elf_dll_x64_template.s -f bin -o template_x64_linux_dll.bin

BITS 64
org 0
ehdr:
db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
db 0, 0, 0, 0, 0, 0, 0, 0
dw 3 ; e_type = ET_DYN
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry = _start
dq phdr - $$ ; e_phoff
dq shdr - $$ ; e_shoff (64-bit field)
dd 0 ; e_flags (32-bit field)
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
dw shentsize ; e_shentsize
dw 2 ; e_shnum
dw 1 ; e_shstrndx
ehdrsize equ $ - ehdr

phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0xDEADBEEF ; p_filesz
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
dd 2 ; p_type = PT_DYNAMIC
dd 7 ; p_flags = rwx
dq dynsection ; p_offset
dq dynsection ; p_vaddr
dq dynsection ; p_paddr
dq dynsz ; p_filesz
dq dynsz ; p_memsz
dq 0x1000 ; p_align

shdr:
dd 1 ; sh_name
dd 6 ; sh_type = SHT_DYNAMIC
dq 0 ; sh_flags
dq dynsection ; sh_addr
dq dynsection ; sh_offset
dq dynsz ; sh_size
dd 0 ; sh_link
dd 0 ; sh_info
dq 8 ; sh_addralign
dq 7 ; sh_entsize
shentsize equ $ - shdr
dd 0 ; sh_name
dd 3 ; sh_type = SHT_STRTAB
dq 0 ; sh_flags
dq strtab ; sh_addr
dq strtab ; sh_offset
dq strtabsz ; sh_size
dd 0 ; sh_link
dd 0 ; sh_info
dq 0 ; sh_addralign
dq 0 ; sh_entsize
dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
; DT_STRSZ
dq 0x0a
dq 0
; DT_SYMENT
dq 0x0b
dq 0
; DT_NULL
dq 0x00
dq 0
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab
global _start
_start:
;db 0xcc
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Compiling it:

$ nasm -f bin -o a.out full.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 427 Apr 20 11:28 a.out
$ LD_PRELOAD=./a.out ./true
$ id
uid=0(root) gid=0(root) groups=0(root)
$ exit

With this, we get a shared object of 427 bytes, less than half of the allowed 1024 bytes, so let’s upload it to the site:

You made it to level 1: considerable! You have 127 bytes left to be thoughtful. This effort is worthy of 0/2 flags.

As expected, this effort is not enough for a flag: we need to save at least 127 more bytes for the first one. Next, I removed parts of the ELF that the binary can do without, starting with the section header table (shdr).

It’s not really required, so the changes made to full.asm were:

  • e_shoff in the ELF header (ehdr) now points to the program header (phdr).
  • e_shentsize in the ELF header (ehdr) is set to zero.
  • e_shnum in the ELF header (ehdr) is set to zero (there are no section headers left because we completely removed them).
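
The saving can be estimated from the sizes of the ELF64 structures. The sketch below is a back-of-the-envelope calculation in Python; the struct format strings are my own transcription of the Elf64_Ehdr, Elf64_Phdr, Elf64_Shdr and Elf64_Dyn layouts, and the 300-byte target is inferred from the "127 bytes left" message rather than stated by the challenge:

```python
import struct

# Little-endian, unpadded layouts of the ELF64 structures
EHDR = struct.calcsize("<16sHHIQQQIHHHHHH")  # ELF header
PHDR = struct.calcsize("<IIQQQQQQ")          # program header entry
SHDR = struct.calcsize("<IIQQQQIIQQ")        # section header entry
DYN = struct.calcsize("<QQ")                 # dynamic section entry

print(EHDR, PHDR, SHDR, DYN)  # 64 56 64 16

size_with_shdr = 427               # size reported by ls above
target = size_with_shdr - 127      # "127 bytes left" suggests a 300-byte target
size_without_shdr = size_with_shdr - 2 * SHDR  # the template has two section headers
print(size_without_shdr, size_without_shdr <= target)  # 299 True
```

Dropping the two 64-byte section headers is therefore just enough to reach the next level.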

The full script, now saved as cuted.asm:

; build with:
; nasm elf_dll_x64_template.s -f bin -o template_x64_linux_dll.bin

BITS 64
org 0
ehdr:
db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
db 0, 0, 0, 0, 0, 0, 0, 0
dw 3 ; e_type = ET_DYN
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry = _start
dq phdr - $$ ; e_phoff
dq phdr - $$ ; e_shoff (changed to phdr instead of shdr, 64-bit field)
dd 0 ; e_flags (32-bit field)
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
dw 0 ; e_shentsize (changed to 0)
dw 0 ; e_shnum (changed to 0)
dw 1 ; e_shstrndx
ehdrsize equ $ - ehdr

phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0xDEADBEEF ; p_filesz
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
dd 2 ; p_type = PT_DYNAMIC
dd 7 ; p_flags = rwx
dq dynsection ; p_offset
dq dynsection ; p_vaddr
dq dynsection ; p_paddr
dq dynsz ; p_filesz
dq dynsz ; p_memsz
dq 0x1000 ; p_align
; shdr header removed here
dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
; DT_STRSZ
dq 0x0a
dq 0
; DT_SYMENT
dq 0x0b
dq 0
; DT_NULL
dq 0x00
dq 0
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab
global _start
_start:
;db 0xcc
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

This was enough to get us the first flag:

You made it to level 2: thoughtful! 
You have 75 bytes left to be hand-crafted.
This effort is worthy of 1/2 flags.
PCTF{th0ugh_wE_have_cl1mBed_far_we_MusT_St1ll_c0ntinue_oNward}

From here, many improvements are possible, such as removing unnecessary entries from the dynamic section: DT_STRSZ, DT_SYMENT, and DT_NULL. Removing them saves 48 bytes (three 16-byte entries):

...truncated...
dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab
global _start
_start:
;db 0xcc
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Rebuilding and checking the size:

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 evilgod evilgod 251 Apr 20 11:58 a.out

We reduced it to 251 bytes, still far from the 194 bytes needed for the second flag. More improvements can be made. For example, we can cut the last three fields of the ELF header (e_shentsize, e_shnum, and e_shstrndx), which relate to the section headers we removed earlier.

We saved 6 bytes by doing so.

It is possible to save even more bytes by removing the last fields of the PT_DYNAMIC entry in the program header (phdr). Thankfully, this does not break the library: the truncated entry simply overlaps with the dynamic section that follows it, which is perfectly fine. The fields to remove are p_paddr, p_filesz, p_memsz, and p_align.

The assembly file now looks like this:

; build with:
; nasm elf_dll_x64_template.s -f bin -o template_x64_linux_dll.bin

BITS 64
org 0
ehdr:
db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
db 0, 0, 0, 0, 0, 0, 0, 0
dw 3 ; e_type = ET_DYN
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry = _start
dq phdr - $$ ; e_phoff
dq phdr - $$ ; e_shoff (changed to phdr instead of shdr, 64-bit field)
dd 0 ; e_flags (32-bit field)
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
ehdrsize equ $ - ehdr

phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0xDEADBEEF ; p_filesz
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
dd 2 ; p_type = PT_DYNAMIC
dd 7 ; p_flags = rwx
dq dynsection ; p_offset
dq dynsection ; p_vaddr
dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab
global _start
_start:
;db 0xcc
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Compiling it gives a file of 213 bytes:

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 213 Apr 20 12:13 a.out
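
The sizes reported so far can be reproduced with simple bookkeeping. This is a sketch in Python, assuming the standard ELF64 field sizes (16 bytes per dynamic entry, 2 bytes per 16-bit ELF header field, 8 bytes per 64-bit program header field):

```python
size = 427 - 2 * 64  # after dropping the two 64-byte section headers
print(size)          # 299

size -= 3 * 16       # DT_STRSZ, DT_SYMENT and DT_NULL dynamic entries
print(size)          # 251

size -= 3 * 2        # e_shentsize, e_shnum and e_shstrndx (16-bit fields)
print(size)          # 245

size -= 4 * 8        # last four 64-bit fields of the PT_DYNAMIC entry
print(size)          # 213

print(size - 194)    # 19 bytes still to save for the second flag
```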

We still need to save 19 bytes for the final flag, so the next step is to optimise the shellcode. Some header fields can hold arbitrary values without breaking the binary, so I stored the /bin/sh string in one of them. That way we don’t need to push it on the stack and juggle pointers, which saves some bytes.

The /bin/sh string goes in the p_filesz field of the PT_LOAD entry in the program header.

One thing that helped me a lot while debugging the shellcode was to put an int3 instruction (0xcc) before it, which raises SIGTRAP and makes gdb stop there as if it were a breakpoint:

_start:
db 0xcc ; SIGTRAP (int 3 instruction)
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Now we’ll put the /bin/sh string in the p_filesz field:

...
phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0x68732f6e69622f ; p_filesz (now has /bin/sh here)
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
...

I also need the address of this field at run time. Like libc, our file is a shared library, so the loader maps it at an address that is not known in advance. Fortunately, when the init code is executed, the RAX register holds a pointer into our library. We can calculate the offset from there using gdb:

pwndbg> set environment LD_PRELOAD ./a.out
pwndbg> r

RAX holds an address inside our library, so we can verify where /bin/sh is located relative to it:

pwndbg> x/s $rax-0x62
0x7fff194f205a: "/bin/sh"

After this, we can use the lea instruction to load the address of /bin/sh into RDI, which saves 9 bytes:

_start:
lea rdi,[rax-0x62]
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall
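
The 0x62 offset can also be derived from the file layout instead of gdb. The numbers are consistent with RAX pointing at _start when the init code runs; that is my reading of the gdb output above, not something the post states outright. A sketch in Python, using the sizes of the truncated headers from the listing above:

```python
EHDR_SIZE = 64 - 6    # ELF header without the three section-header fields
PT_LOAD_SIZE = 56     # full PT_LOAD entry
PT_DYNAMIC_SIZE = 24  # truncated entry: p_type, p_flags, p_offset, p_vaddr
DYN_SIZE = 3 * 16     # DT_INIT, DT_STRTAB, DT_SYMTAB
STRTAB_SIZE = 2       # the two db 0 bytes

# p_filesz comes after p_type (4), p_flags (4), p_offset (8), p_vaddr (8), p_paddr (8)
binsh_off = EHDR_SIZE + 32
start_off = EHDR_SIZE + PT_LOAD_SIZE + PT_DYNAMIC_SIZE + DYN_SIZE + STRTAB_SIZE

print(hex(binsh_off), hex(start_off), hex(start_off - binsh_off))  # 0x5a 0xbc 0x62
```

The 0x5a file offset also matches the low bits of the address gdb printed for the string.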

Let’s check how much is left:

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 204 Apr 20 12:46 a.out

Also, since nothing actually needs a dedicated string table, DT_STRTAB and DT_SYMTAB can point to _start instead of to a separate strtab label with two db bytes.

Updating the script from:

dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab

To:

dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq _start
; DT_SYMTAB
dq 0x06
dq _start
dynsz equ $ - dynsection

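Two bytes are now saved (since _start moves two bytes closer to the /bin/sh string, the offset in the lea instruction has to become 0x60):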

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 202 Apr 20 12:49 a.out

We now need one final tweak to get the final flag. We can control the p_offset field of the PT_DYNAMIC entry without breaking the ELF, so we can start the dynamic section right there and turn p_offset and p_vaddr into a fake DT_STRTAB entry: p_offset holds the tag (0x5) and p_vaddr holds the value. The dynamic section now overlaps the PT_DYNAMIC program header, and the old standalone DT_STRTAB entry is removed, which saves 0x10 bytes.

Because this moves _start, we also need to update the offset used in _start (now 0x50).
My final payload was:

BITS 64
org 0
ehdr:
db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
db 0, 0, 0, 0, 0, 0, 0, 0
dw 3 ; e_type = ET_DYN
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry = _start
dq phdr - $$ ; e_phoff
dq phdr - $$ ; e_shoff (changed to phdr instead of shdr, 64-bit field)
dd 0 ; e_flags (32-bit field)
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
ehdrsize equ $ - ehdr

phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0x68732f6e69622f ; p_filesz
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
dd 2 ; p_type = PT_DYNAMIC
dd 7 ; p_flags = rwx
dynsection:
; DT_STRTAB
dq 0x5 ; p_offset (OVERLAPPED)
dq dynsection ; p_vaddr
; DT_INIT
dq 0x0c
dq _start
; DT_SYMTAB
dq 0x06
dq _start
global _start
_start:
lea rdi,[rax-0x50]
push 59
pop rax
push 0
push rdi
mov rsi,rsp
;cdq ; may be needed locally, but the website accepts the file without it (saves 1 byte)
syscall

We get a file of 185 bytes :) which is enough to get the final flag.

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 185 Apr 20 12:57 a.out
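
Both the 0x50 offset and the 185-byte size follow from the final layout. This is a sketch in Python of the same bookkeeping; the instruction lengths are the encodings I would expect NASM to emit for this shellcode and are not taken from the original post:

```python
EHDR_SIZE = 58       # truncated ELF header
PT_LOAD_SIZE = 56    # full PT_LOAD entry
PT_DYNAMIC_HEAD = 8  # p_type and p_flags; the rest overlaps the dynamic section
DYN_SIZE = 3 * 16    # DT_STRTAB, DT_INIT, DT_SYMTAB

binsh_off = EHDR_SIZE + 32  # p_filesz sits 32 bytes into the PT_LOAD entry
start_off = EHDR_SIZE + PT_LOAD_SIZE + PT_DYNAMIC_HEAD + DYN_SIZE
print(hex(start_off - binsh_off))  # 0x50

# lea, push 59, pop rax, push 0, push rdi, mov rsi,rsp, syscall
code_len = 4 + 2 + 1 + 2 + 1 + 3 + 2
print(start_off + code_len)  # 185
```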

The flag was:

You made it to level 5: record-breaking! You have 9 bytes left to be astounding.
This effort is worthy of 2/2 flags.
PCTF{th0ugh_wE_have_cl1mBed_far_we_MusT_St1ll_c0ntinue_oNward} PCTF{t0_get_a_t1ny_elf_we_5tick_1ts_hand5_in_its_ears_rtmlpntyea}