PyDroid
Description:
The objective of this challenge is to find the correct login/password that leads to “Access Granted”.
Attachment:
download
a0b07e97197e2dfe48bb7df65dba4f145d485660ecf4bd0d3ab65b14039ec8d6
Author: romainthomas
The application has a simple login screen:
Checking the source code in jadx:
$ jadx-gui apks/challenge-pydroid.apk >/dev/null 2>&1 &
We see that the code behind the check is inside a native function:
Installing the app and running frida:
$ adb install apks/challenge-pydroid.apk
$ adb root
$ curl -L https://github.com/frida/frida/releases/download/15.1.13/frida-server-15.1.13-android-arm64.xz | unxz | adb shell "cat > /data/local/frida-server-15.1.13 && chmod 755 /data/local/frida-server-15.1.13"
$ adb shell "/data/local/frida-server-15.1.13 &"
$ pip install frida==15.1.13
Searching in jadx with Ctrl+Shift+F for system.load, we can find where the lib is being loaded:
Let’s write a script to decrypt the string and see the name of the lib that is being loaded:
Java.perform(function(){
Injecting the script on boot:
$ frida -U -l decryptString.js -f re.obfuscator.challenge01 --no-pause
[Pixel 4 XL::re.obfuscator.challenge01 ]-> a1re03
It seems like the library name is a1re03. Since the app uses the API call System.loadLibrary, we should find a file with the lib prefix, liba1re03.so:
$ apktool d apks/challenge-pydroid.apk -o challenge-pydroid
Opening the library in ghidra, we can check the entry points and see that the .init_array is not initialised:
I searched the symbol tree for functions with the prefix Java_ but didn’t find any, so I believe the linking between Java and the native code is done with the RegisterNatives function somewhere in the JNI_OnLoad function:
It seems like there will be an indirect call, so instead of diving into the code, I used jnitrace to trace the JNI function RegisterNatives and locate in ghidra the code behind the Java native function:
$ jnitrace -l liba1re03.so re.obfuscator.challenge01 -i RegisterNatives
We can see that RegisterNatives is called at offset 0x1720ac. To see the code in ghidra, we can just go to the address 0x1720ac + 0x100000 (we need to add 0x100000 because ghidra loads the lib at that base address by default).
The logic we truly want to check is in the function PGPyIMEWUxFr. jnitrace gives us the base address of the lib and the start address of the function, so to calculate its address in ghidra we can just do 0x7b66211428 - 0x7b66039000 + 0x100000 = 0x2d8428.
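The rebasing arithmetic above can be wrapped in a small helper. This is a minimal Python sketch, assuming the 0x100000 image base that ghidra used in this write-up; the function names are my own and are not part of jnitrace or ghidra.

```python
GHIDRA_BASE = 0x100000  # image base ghidra assigned to the lib in this write-up


def offset_to_ghidra(offset: int, ghidra_base: int = GHIDRA_BASE) -> int:
    """Translate a module-relative offset into a ghidra address."""
    return offset + ghidra_base


def runtime_to_ghidra(runtime_addr: int, module_base: int,
                      ghidra_base: int = GHIDRA_BASE) -> int:
    """Translate an absolute runtime address into a ghidra address."""
    offset = runtime_addr - module_base
    if offset < 0:
        raise ValueError("address lies below the module base")
    return offset_to_ghidra(offset, ghidra_base)


# Values reported by jnitrace in the text above.
print(hex(offset_to_ghidra(0x1720AC)))                     # 0x2720ac
print(hex(runtime_to_ghidra(0x7B66211428, 0x7B66039000)))  # 0x2d8428
```

The module base changes on every run because of ASLR, so only the offset (runtime address minus base) is stable between sessions.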
A lot of functions are not decompiled by ghidra, which also fails to follow references back through the code, mostly because of some of the techniques used, described here.
Due to this problem, I decided to dump the library from memory and fix the ELF, which works around some of these issues. We also know that omvll is based on o-llvm, and some versions use globals for the strings. Based on experience, the fastest way to circumvent string encryption for global variables is to use a dump; this is also described in the documentation.
We could write our own frida script to dump from memory but, to save time, there are already scripts that perform the dump and fix the ELF for us. One example is frida_dump.
We may encounter a problem while trying to dump (the script dumps the specified lib from the frontmost application):
$ python dump_so.py liba1re03.so
...
frida.core.RPCException: Error: access violation accessing 0x7b6cdcf000
at <anonymous> (frida/runtime/core.js:138)
at dumpmodule (/script1.js:12)
at apply (native)
at <anonymous> (frida/runtime/message-dispatcher.js:13)
at c (frida/runtime/message-dispatcher.js:23)
It seems there is a region of the lib that doesn’t have read permissions. To solve this, we must adapt dump_so.js to change the protection of that memory region. This line of code also doesn’t seem to fully work:
...
If we investigate the address mapping:
$ adb shell "ps | grep -i 're.obfuscator.challenge01'"
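To spot the offending region quickly, the mapping can be filtered for entries of the lib that lack the read bit. The following is a rough Python sketch that parses text in the /proc/<pid>/maps format. The sample lines, including the path and the region boundaries, are made up for illustration; only the address 0x7b6cdcf000 comes from the error above.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    start: int
    end: int
    perms: str
    path: str


def parse_maps(text: str) -> List[Region]:
    """Parse lines of the form 'start-end perms offset dev inode [path]'."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # not a mapping line
        start_s, end_s = fields[0].split("-")
        path = fields[5].strip() if len(fields) > 5 else ""
        regions.append(Region(int(start_s, 16), int(end_s, 16), fields[1], path))
    return regions


def unreadable(regions: List[Region], libname: str) -> List[Region]:
    """Return the mappings of `libname` whose permissions lack 'r'."""
    return [r for r in regions
            if r.path.endswith(libname) and not r.perms.startswith("r")]


# Hypothetical sample; real output comes from `cat /proc/<pid>/maps` on the device.
SAMPLE = """\
7b6cc00000-7b6cdcf000 r-xp 00000000 fd:05 4242 /data/app/example/lib/arm64/liba1re03.so
7b6cdcf000-7b6cdd4000 ---p 001cf000 fd:05 4242 /data/app/example/lib/arm64/liba1re03.so
7b6cdd4000-7b6cde0000 rw-p 001d4000 fd:05 4242 /data/app/example/lib/arm64/liba1re03.so
"""

for region in unreadable(parse_maps(SAMPLE), "liba1re03.so"):
    print(hex(region.start), hex(region.end), region.perms)
```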
This may be because changing the permissions of the entire lib at once causes problems. To solve this, we change only that specific region of memory:
Memory.protect(ptr(0x7b6cdcf000), libso.size-(0x7b6cdcf000-libso.base), 'rwx');
Since we are attaching to the running process, we don’t need to update the address 0x7b6cdcf000, but if you are trying to do the same, you will need to update the address according to your own error.
$ python dump_so.py liba1re03.so
Now if we view the .init_array section, we can see a bunch of pointers to functions that initialize globals and other important state for the lib:
_INIT_4 seems to have some python code related to the flag:
Extracting the code from the string we get:
import android
It seems the flag check is done here, and the flag is the combination of login and password. The function hash appears to come from a custom module named android. For now we still don’t know the value of android.__FLAG__ or what the function hash does, but if we look into adb logcat we can see that the function print is just a logging function whose output appears in the logcat:
$ adb shell logcat | grep 'omvll'
It looks like the concatenation of the login and password should be e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 after applying the hash function. From the size of the hash (64 hex characters) it looks like sha256, but we need confirmation.
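The size argument can be checked locally with a few lines of Python using only hashlib. As a side observation, the logged value is also the well-known SHA-256 digest of the empty input, so it is worth keeping in mind that no realistic non-empty login/password pair would be expected to hash to it. This is only a sanity check on my machine, not something taken from the app.

```python
import hashlib

# Target value printed by the app in logcat.
FLAG_HEX = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

# A SHA-256 digest is 32 bytes, i.e. 64 hex characters.
size_matches = len(bytes.fromhex(FLAG_HEX)) == hashlib.sha256().digest_size

# Compare against the digest of the empty byte string.
is_empty_digest = hashlib.sha256(b"").hexdigest() == FLAG_HEX

print(size_matches, is_empty_digest)
```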
We could try to look in the native lib for where the module is initialized or loaded, but since we know that the global variable holding the python code is located at offset 0x548778 - 0x100000, we can just write a frida script and inject our own python code to inspect this module!
var libname = "liba1re03.so";
The output:
$ echo -n 'abc' | sha256sum
ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
$ frida -Ul inj_k.js -F --no-pause
$ adb shell logcat | grep 'omvll' # login in the app to trigger the print
07-23 01:54:37.928 24105 24105 I omvll : ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
This confirms that we are indeed dealing with a sha256 hash. When I got this confirmation, I told myself there is no way this challenge is meant to be solved by bruteforcing the login and password with a dictionary attack. I started to believe that the dev may have left something within the custom android module, unused by the main script, that could give us some hints about how the hash was generated:
var libname = "liba1re03.so";
We see three interesting fields: MvtKNJXCOGJe, __bc__ and __doc__.
07-23 15:28:56.809 27698 27698 I omvll : ['MvtKNJXCOGJe', '__FLAG__', '__bc__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'decode', 'hash', 'print']
MvtKNJXCOGJe is a function that receives a string and returns bytes:
android.print(str(android.MvtKNJXCOGJe.__doc__));
The documentation of the function:
I omvll : MvtKNJXCOGJe(arg0: str) -> bytes
__bc__ seems to be a sequence of python bytecode: after removing the newlines and decoding the hex data, we get something very similar to a pyc file (the header seems to be different, and decompilers won’t work).
android.print(str(android.__bc__));
7-23 21:24:54.840 29800 29800 I omvll : 700d0d0a000000004aaf626335010000e300000000000000000000000000000000040000004
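To see how the header differs, the first bytes of the dump can be split with struct. This is a small Python sketch that assumes the 16-byte header layout used by CPython 3.7+ (2-byte magic, "\r\n", flags, mtime, source size) and that, to my knowledge, stock CPython 3.10 uses the magic number 3439. Only the hex prefix is taken from the log line above.

```python
import struct

# First 17 bytes of android.__bc__ as logged above (the log line is truncated).
BC_PREFIX_HEX = "700d0d0a000000004aaf626335010000e3"

# Assumption: magic number of an unmodified CPython 3.10 interpreter.
STOCK_310_MAGIC = 3439


def parse_pyc_header(blob: bytes) -> dict:
    """Split the 16-byte .pyc header and peek at the first marshal byte."""
    magic, crlf, flags, mtime, size = struct.unpack_from("<H2sIII", blob, 0)
    return {
        "magic": magic,
        "crlf_ok": crlf == b"\r\n",
        "flags": flags,
        "mtime": mtime,
        "source_size": size,
        # 0x63 ('c') marks a code object; bit 0x80 is marshal's reference flag.
        "first_marshal_byte": blob[16],
    }


header = parse_pyc_header(bytes.fromhex(BC_PREFIX_HEX))
print(header)
print("magic differs from stock 3.10:", header["magic"] != STOCK_310_MAGIC)
```

With this prefix the magic decodes to 3440 rather than 3439, while the marshal payload still starts with a code object, which is consistent with a pyc produced by a patched interpreter.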
__doc__ contains a hex string that looks like a hash, similar to the sha256 one, but we don’t know yet what it is used for.
android.print(str(android.__doc__));
07-23 21:25:44.407 29800 29800 I omvll : 9c16a9c3017d2b3876323bc4f9dad2b7530c
My next step was to see what code is behind MvtKNJXCOGJe. I tried using the built-in module dis to disassemble it, but the call raises an error:
Abort message: 'terminating with uncaught exception of type pybind11::error_already_set: TypeError: don't know how to disassemble builtin_function_or_method objects
This means that this module is implemented in native code, using the CPython C API or pybind11.
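The same failure is easy to reproduce in a stock CPython interpreter with any natively implemented callable, for example the built-in len. The sketch below only illustrates the failure mode; the helper name try_disassemble is mine, and nothing here touches the challenge's android module.

```python
import dis


def try_disassemble(obj):
    """Return (True, instruction count) or (False, error message)."""
    try:
        return True, len(list(dis.Bytecode(obj)))
    except TypeError as exc:
        return False, str(exc)


def pure_python(x):
    return x + 1


# A function written in Python carries bytecode that dis can walk.
print(try_disassemble(pure_python))

# A function implemented in C has no Python bytecode at all.
print(try_disassemble(len))
```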
To understand this a little better, I did some research on google and learned that you can create a python module in C with the CPython API.
In a main program we could do something like this:
#include <Python.h>
#include <stdio.h>
#include <stdlib.h>

/* Module init function, defined in the module's own source file. */
PyMODINIT_FUNC PyInit_android(void);

int main(int argc, char* argv[]) {
    wchar_t** wide_argv = (wchar_t**)malloc(sizeof(wchar_t*) * argc);
    for (int i = 0; i < argc; i++) {
        wide_argv[i] = Py_DecodeLocale(argv[i], NULL);
        if (wide_argv[i] == NULL) {
            fprintf(stderr, "Error decoding argument %d\n", i);
            return 1;
        }
    }
    // Add the "android" module to the inittab (must happen before Py_Initialize)
    PyImport_AppendInittab("android", &PyInit_android);
    // Initialize the Python interpreter
    Py_Initialize();
    // Start the interpreter
    Py_Main(argc, wide_argv);
    // Finalize the Python interpreter
    Py_Finalize();
    return 0;
}
After running:
$ gcc main.c -o interpreter -I/usr/include/python3.11 -L/usr/lib/python3.11/config-3.11-x86_64-linux-gnu -lpython3.11
A good strategy here is to find where the string “android” is referenced in the native code:
This already looks promising:
Diving into FUN_00280c08, we can see a function that looks like it is adding some kind of variable __flag__ to the module:
Searching for xrefs to those functions led me to more assignments, for __bc__ and __doc__:
But the most important one was the function FUN_002c22b0, which contains the print string and is therefore probably responsible for attaching functions to the module. Searching for xrefs to it didn’t find anything, which means it is probably reached through some kind of proxy call, so we might need to check some of the internal calls:
Searching for MvtKNJXCOGJe, I didn’t find anything, which means the author might have used StringEncOptStack instead of StringEncOptGlobal to hide this string. I assumed that these internal functions attach functions to the python module, probably through pybind11, so I decided to hook FUN_002ce5cc. We know that its second parameter is the name of the function, so we can write a frida script to hook it:
var do_dlopen = null;
The code above is not entirely necessary. I added it in case you want to hook something before a function in .init_array executes, which involves hooking some android linker functions. If you don’t need that, you can just attach to the app and keep only the code inside the before_init_initarray function.
$ frida -Ul inj_k2.js -f re.obfuscator.challenge01 --no-pause
Looking at the call address 0x1c24c4 + 0x100000 in ghidra:
If we instruct ghidra to disassemble the code:
Let’s hook that line and trigger the call by injecting python:
var libname = "liba1re03.so";
We get the address:
$ frida -Ul inj_k2.js -F --no-pause
After disassembling, we get a huge function:
I didn’t want to dive into this function before understanding its context, since I could end up reversing an entire function for nothing. So, after analysing the application more carefully, I noticed some files that were dropped into the cache folder /data/data/re.obfuscator.challenge01/cache:
$ adb shell "ls /data/data/re.obfuscator.challenge01/cache/WebView/Default/Web3"
By reading the license file, I realized this seems to be the source code of python; some of the files here are python built-ins. After finding this, I pulled the folder:
$ adb pull /data/data/re.obfuscator.challenge01/cache/WebView
By searching with a recursive grep for one of the strange variables we found in the android module, I found it was referenced in one of the files:
$ grep -ria '__bc__' WebView
The python file:
import importlib
It looks like __bc__ is a hidden module. As mentioned before, I tried to decompile this variable, but it looks like Romain Thomas changed the python source code, making it harder to recover the original code. Running this code on our machine also wouldn’t work because of these modifications: the bytecode would throw errors. Then I had the idea of injecting this code into the interpreter inside the application, as we did before for other purposes. That way we could list all objects in the module and maybe use the built-in dis on the functions to view a better representation of the bytecode:
var libname = "liba1re03.so";
$ adb logcat | grep 'omvll'
I found an interesting function named check, so let’s use dis to disassemble the function and view the code:
var libname = "liba1re03.so";
var moduleBaseAddress = Module.findBaseAddress(libname);
var ghidra_base = 0x100000;
const inject_python = `import importlib
from importlib.machinery import SourcelessFileLoader
from importlib.util import spec_from_file_location
import sys
import android,dis,string
class FileLoader(SourcelessFileLoader):
    def __init__(self):
        super().__init__("checker", "checker.cpython-310.pyc")
    def get_data(self, path: str):
        import android
        return bytes.fromhex(android.__bc__.replace("\\n", "").strip().replace(" ", ""))
loader = FileLoader()
spec = spec_from_file_location('checker', "checker.cpython-310.pyc",loader=loader)
module = importlib._bootstrap._load(spec)
def get_instruction_repr(instruction):
    import dis
    opcode, arg, lineno = instruction.opname,instruction.argval, instruction.starts_line
    if instruction.arg is not None:
        arg_str = f" {arg}"
        return f"{lineno}: {opcode}{arg_str}"
    else:
        return f"{lineno}: {opcode}"
bytecode = dis.Bytecode(module.check)
for instruction in bytecode:
    android.print(get_instruction_repr(instruction))
android.print("android.__doc__ -> "+android.__doc__)`;
const python_addr = moduleBaseAddress.add(0x548778-ghidra_base);
python_addr.writeUtf8String(inject_python);
The code is simple to understand, and it is very similar to the code we saw in the global string that compares against the sha256 hash:
07-24 01:35:30.516 31523 31523 I omvll : 5: LOAD_GLOBAL json
07-24 01:35:30.516 31523 31523 I omvll : None: LOAD_METHOD_ENC loads
07-24 01:35:30.516 31523 31523 I omvll : None: LOAD_FAST data
07-24 01:35:30.516 31523 31523 I omvll : None: CALL_METHOD 1
07-24 01:35:30.516 31523 31523 I omvll : None: UNPACK_SEQUENCE 2
07-24 01:35:30.516 31523 31523 I omvll : None: STORE_FAST login
07-24 01:35:30.516 31523 31523 I omvll : None: STORE_FAST password
07-24 01:35:30.516 31523 31523 I omvll : 6: LOAD_GLOBAL android
07-24 01:35:30.516 31523 31523 I omvll : None: LOAD_METHOD_ENC decode
07-24 01:35:30.516 31523 31523 I omvll : None: LOAD_FAST login
07-24 01:35:30.516 31523 31523 I omvll : None: CALL_METHOD 1
07-24 01:35:30.516 31523 31523 I omvll : None: STORE_FAST login
07-24 01:35:30.517 31523 31523 I omvll : 7: LOAD_GLOBAL android
07-24 01:35:30.517 31523 31523 I omvll : None: LOAD_METHOD_ENC decode
07-24 01:35:30.517 31523 31523 I omvll : None: LOAD_FAST password
07-24 01:35:30.517 31523 31523 I omvll : None: CALL_METHOD 1
07-24 01:35:30.517 31523 31523 I omvll : None: STORE_FAST password
07-24 01:35:30.517 31523 31523 I omvll : 8: LOAD_GLOBAL android
07-24 01:35:30.517 31523 31523 I omvll : None: LOAD_METHOD_ENC __obfuscated__
07-24 01:35:30.517 31523 31523 I omvll : None: LOAD_FAST login
07-24 01:35:30.517 31523 31523 I omvll : None: LOAD_FAST password
07-24 01:35:30.517 31523 31523 I omvll : None: BINARY_ADD
07-24 01:35:30.517 31523 31523 I omvll : None: CALL_METHOD 1
07-24 01:35:30.517 31523 31523 I omvll : None: LOAD_METHOD_ENC hex
07-24 01:35:30.517 31523 31523 I omvll : None: CALL_METHOD 0
07-24 01:35:30.517 31523 31523 I omvll : None: LOAD_GLOBAL android
07-24 01:35:30.517 31523 31523 I omvll : None: LOAD_ATTR __doc__
07-24 01:35:30.517 31523 31523 I omvll : None: COMPARE_OP ==
07-24 01:35:30.517 31523 31523 I omvll : None: RETURN_VALUE
07-24 01:38:36.452 31523 31523 I omvll : 5: LOAD_GLOBAL json
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_METHOD_ENC loads
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_FAST data
07-24 01:38:36.453 31523 31523 I omvll : None: CALL_METHOD 1
07-24 01:38:36.453 31523 31523 I omvll : None: UNPACK_SEQUENCE 2
07-24 01:38:36.453 31523 31523 I omvll : None: STORE_FAST login
07-24 01:38:36.453 31523 31523 I omvll : None: STORE_FAST password
07-24 01:38:36.453 31523 31523 I omvll : 6: LOAD_GLOBAL android
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_METHOD_ENC decode
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_FAST login
07-24 01:38:36.453 31523 31523 I omvll : None: CALL_METHOD 1
07-24 01:38:36.453 31523 31523 I omvll : None: STORE_FAST login
07-24 01:38:36.453 31523 31523 I omvll : 7: LOAD_GLOBAL android
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_METHOD_ENC decode
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_FAST password
07-24 01:38:36.453 31523 31523 I omvll : None: CALL_METHOD 1
07-24 01:38:36.453 31523 31523 I omvll : None: STORE_FAST password
07-24 01:38:36.453 31523 31523 I omvll : 8: LOAD_GLOBAL android
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_METHOD_ENC __obfuscated__
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_FAST login
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_FAST password
07-24 01:38:36.453 31523 31523 I omvll : None: BINARY_ADD
07-24 01:38:36.453 31523 31523 I omvll : None: CALL_METHOD 1
07-24 01:38:36.453 31523 31523 I omvll : None: LOAD_METHOD_ENC hex
07-24 01:38:36.454 31523 31523 I omvll : None: CALL_METHOD 0
07-24 01:38:36.454 31523 31523 I omvll : None: LOAD_GLOBAL android
07-24 01:38:36.454 31523 31523 I omvll : None: LOAD_ATTR __doc__
07-24 01:38:36.454 31523 31523 I omvll : None: COMPARE_OP ==
07-24 01:38:36.454 31523 31523 I omvll : None: RETURN_VALUE
07-24 01:38:36.454 31523 31523 I omvll : android.__doc__ -> 9c16a9c3017d2b3876323bc4f9dad2b7530c
The most important part is that the function calls __obfuscated__, which we believe to be the same as MvtKNJXCOGJe, and, instead of comparing the input with android.__FLAG__, it compares it with android.__doc__, the hash whose purpose we didn’t know.
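Read back into source form, the disassembly corresponds roughly to the function below. This is my reconstruction from the opcodes, not the author's original source. The android object is a stand-in built with SimpleNamespace so the sketch runs outside the app: its decode, __obfuscated__ and __doc__ are placeholders, not the real implementations or values.

```python
import json
from types import SimpleNamespace


def _placeholder_obfuscated(text: str) -> bytes:
    # Placeholder only: the real native routine is unknown at this point.
    return bytes(b ^ 0x5A for b in text.encode("ascii"))


# Hypothetical stand-in for the native module, just enough to run check().
android = SimpleNamespace(
    decode=lambda s: s,
    __obfuscated__=_placeholder_obfuscated,
    __doc__=_placeholder_obfuscated("user" + "pass").hex(),
)


def check(data: str) -> bool:
    login, password = json.loads(data)            # source line 5
    login = android.decode(login)                 # source line 6
    password = android.decode(password)           # source line 7
    encrypted = android.__obfuscated__(login + password)
    return encrypted.hex() == android.__doc__     # source line 8


print(check(json.dumps(["user", "pass"])))
print(check(json.dumps(["user", "nope"])))
```

The point of the reconstruction is the last line: the comparison target is a hex string of transformed bytes, so the transform itself is the only thing standing between us and the input.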
Again, before going deep into the native code of MvtKNJXCOGJe, I did some tests with a few inputs and realized that the function is a simple encryption routine that encrypts the input byte by byte. Knowing this, we can just bruteforce the input one byte at a time and get the password:
var libname = "liba1re03.so";
After running we got the password:
07-24 01:45:27.152 31523 31523 I omvll : 0MvLL_And_dPr0t3ct
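The byte-by-byte recovery can be sketched in plain Python as follows. It assumes only the property observed above, namely that output byte i depends on nothing beyond input characters 0..i. toy_cipher is a hypothetical stand-in so the sketch runs offline; it is not the algorithm inside MvtKNJXCOGJe, and its output is not the real android.__doc__ value.

```python
import string
from typing import Callable


def recover_bytewise(encrypt: Callable[[str], bytes], target: bytes,
                     alphabet: str = string.printable) -> str:
    """Recover the plaintext one character at a time by matching prefixes."""
    known = ""
    for i in range(len(target)):
        for ch in alphabet:
            if encrypt(known + ch)[: i + 1] == target[: i + 1]:
                known += ch
                break
        else:
            raise ValueError(f"no candidate matches position {i}")
    return known


def toy_cipher(text: str) -> bytes:
    # Hypothetical stand-in: position-dependent XOR, one output byte per input byte.
    key = b"omvll"
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(text.encode("ascii")))


secret = "0MvLL_And_dPr0t3ct"
target = toy_cipher(secret)
print(recover_bytewise(toy_cipher, target))  # 0MvLL_And_dPr0t3ct
```

Inside the app, encrypt would be android.MvtKNJXCOGJe and target would be bytes.fromhex(android.__doc__). The cost is at most len(target) × len(alphabet) calls, instead of a search over every possible string.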