SingPass RASP Analysis

📅 [ Archival Date ]
Oct 18, 2022 11:49 AM
🏷️ [ Tags ]
ReverseiOS
✍️ [ Author ]
Romain Thomas
image

Introduction

I started to dig into the SingPass application which turned out to be obfuscated and protected with Runtime Application Self-Protection (RASP).

Retrospectively, this application is pretty interesting to analyze RASP functionalities since:

  1. It embeds advanced RASP functionalities (Jailbreak detection, Frida Stalker detection, …).
  2. The native code is lightly obfuscated.
  3. The application starts by showing an error message which is a good oracle to know whether we managed to circumvent the RASP detections.
image
⚖️
Context

All the findings and the details of this blog post have been shared with the editor of the obfuscator. The overall results have also been shared with SingPass. In addition, SingPass is part of a bug bounty program on HackerOne. Bypassing these RASP checks is a prerequisite to go further in the security assessment of this application.

By grepping some keywords in the install directory of the application, we actually get two results which reveal the name of the obfuscator:

iPhone:/private/[...]/SingPass.app root# grep -Ri +++++++ *
Binary file Frameworks/NuDetectSDK.framework/NuDetectSDK matches
Binary file SingPass matches

The NuDetectSDK binary also uses the same obfuscator but it does not seem involved in the early jailbreak detection shown in the previous figure. On the other hand, SingPass is the main binary of the application and we can observe strings related to threat detections:

$ SingPass.app strings ./SingPass|grep -i +++++++
+++++++ThreatLogAPI(headers:)
+++++++CallbackHandler(context:)
📄
Binary

For those who would like to follow this blog post with the original binary, you can download the decrypted SingPass Mach-O binary here. The name of the obfuscator has been redacted but it does not impact the content of the code.

Unfortunately, the binary does not leak other strings that could help to identify where and how the application detects jailbroken devices but fortunately, the application does not crash …

If we assume that the obfuscator decrypts strings at runtime, we can try to dump the content of the __data section when the error message is displayed. At this point of the execution, the strings used for detecting jailbroken devices are likely decoded and present in clear text in memory.

This is actually quite the same technique as the one used in the PokemonGO blog post (section “What About LIEF”):
  1. We run the application and we wait for the jailbreak message
  2. We attach to SingPass with Frida and we inject a library that:
    • Parses in-memory the SingPass binary (thanks to LIEF)
    • Dumps the content of the __data section
    • Writes the dump in the iPhone’s /tmp directory
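
The injected library relies on LIEF for the parsing step. As a rough, stdlib-only illustration of what "parsing the binary in memory and locating __data" involves, the sketch below walks the load commands of a 64-bit Mach-O image held in a bytes buffer. The function name find_section is hypothetical and this is not the code that produced the dump.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_SEGMENT_64 = 0x19


def find_section(image: bytes, sectname: str):
    """Return (addr, size) of the first section named `sectname`, or None."""
    magic, _, _, _, ncmds, _, _, _ = struct.unpack_from("<8I", image, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a 64-bit little-endian Mach-O image")
    offset = 32  # sizeof(struct mach_header_64)
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<2I", image, offset)
        if cmd == LC_SEGMENT_64:
            # In segment_command_64, nsects is at +64 and sections start at +72
            (nsects,) = struct.unpack_from("<I", image, offset + 64)
            for i in range(nsects):
                sect = offset + 72 + i * 80  # sizeof(struct section_64) == 80
                name = image[sect:sect + 16].rstrip(b"\0").decode()
                if name == sectname:
                    # section_64: addr at +32, size at +40
                    addr, size = struct.unpack_from("<2Q", image, sect + 32)
                    return addr, size
        offset += cmdsize
    return None
```

In a live process, the address returned here would still have to be adjusted by the ASLR slide before reading the section's bytes.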

Once the data section is dumped, we end up with the following changes in some parts of the __data section:

image
image

Fig 1. Slices of the __data section before and after the dump

ℹ️
Note: The string encoding routines will be analyzed in the second part of this series of blog posts.

In addition, we can observe the following strings which seem to be related to the RASP functionalities of the obfuscator:

image

Fig 2. Strings Related to the RASP Features

All the EVT_* strings are referenced by one and only one function that I named on_rasp_detection. This function turns out to be the threat detection callback used by the app’s developers to perform action(s) when a RASP event is triggered.

To better understand the logic of the checks behind these strings, let’s start with EVT_CODE_PROLOGUE which is used to detect hooked functions.

EVT_CODE_PROLOGUE: Hook Detection

While going through the assembly code close to the cross-references of on_rasp_detection, we can spot this pattern several times:

image

To detect if a given function is hooked, the obfuscator loads the first byte of the function and compares this byte with the value 0xFF. 0xFF might seem – at first glance – arbitrary but it’s not. Actually, regular functions start with a prologue that allocates space on the stack for saving the registers defined by the calling convention and the stack variables required by the function. In AArch64, this allocation can be performed in two ways:

stp REG, REG, [SP, 0xAA]!
; or
sub SP, SP, 0xBB
stp REG, REG, [SP, 0xCC]

These instructions are not equivalent but, with the right offsets, they can lead to the same result. In the second case, the instruction sub SP, SP, #CST is encoded with the following bytes:

image

As we can see, the encoding of this instruction starts with 0xFF (AArch64 instructions are stored in little-endian, so the first byte in memory is the least significant byte of the encoding). If it is not the case, then either the function starts with a different stack-allocation prologue or potentially starts with a hooking trampoline. Since the code of the application is compiled through the obfuscator’s compiler, the compiler is able to distinguish these two cases and insert the right check for the correct function’s prologue.
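
To make the 0xFF observation concrete, here is a small sketch that assembles the two prologue instructions by hand from the A64 encoding of SUB (immediate) and STP (pre-indexed). The helper names are hypothetical; only the bit layout comes from the architecture.

```python
import struct

SP = 31  # register number used for SP in these encodings


def encode_sub_sp_imm(imm12: int) -> bytes:
    """Encode `sub sp, sp, #imm12` (64-bit, unshifted) as little-endian bytes."""
    if not 0 <= imm12 < (1 << 12):
        raise ValueError("immediate must fit in 12 bits")
    # Rn = Rd = 31 sets the ten lowest bits, hence a first byte equal to 0xFF
    word = 0xD1000000 | (imm12 << 10) | (SP << 5) | SP
    return struct.pack("<I", word)


def encode_stp_fp_lr_preindex(offset: int) -> bytes:
    """Encode `stp x29, x30, [sp, #offset]!` (offset is a multiple of 8)."""
    imm7 = (offset // 8) & 0x7F
    word = 0xA9800000 | (imm7 << 15) | (30 << 10) | (SP << 5) | 29
    return struct.pack("<I", word)


def first_byte_check(code: bytes) -> bool:
    """Model of the check: True when the prologue starts with 0xFF."""
    return code[0] == 0xFF


print(encode_sub_sp_imm(0x40).hex())            # ff0301d1
print(encode_stp_fp_lr_preindex(-0x10).hex())   # fd7bbfa9
```

Whatever the immediate, the first byte of sub sp, sp, #imm stays 0xFF, whereas the stp form starts with a different byte, which is why the compiler has to pick the check that matches each function's prologue.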

If the first byte of the instruction of the function does not pass the check, it jumps to the red basic block. The purpose of this basic block is to trigger a user-defined callback that will process the detection according to the application’s design and the developers’ choices:

  • Printing an error
  • Crashing the application
  • Corrupting internal data

From the previous figure, we can observe that the detection callback is loaded from a static variable located at #hook_detect_cbk_ptr. When calling this detection callback, the obfuscator provides the following information to the callback:

  1. A detection code: 0x400 for EVT_CODE_PROLOGUE
  2. A corrupted pointer which could be used to crash the application.

Let’s now take a closer look at the design of the detection callback(s) as a whole.

Detection Callbacks

As explained in the previous section, when the obfuscator detects tampering, it reacts by calling a detection callback stored in the static variable at the address: 0x10109D760

By statically analyzing hook_detect_cbk, the implementation seems to corrupt the pointer provided in the callback’s parameters. On the other hand, when running the application we observe a jailbreak detection message and not a crash of the application.

If we look at the cross-references which read or write at this address, we get this list of instructions:

image

So actually only one instruction – init_and_check_rasp+01BC – is overwriting the default detection callback with another function:

image

Compared to the default callback hook_detect_cbk, the overriding function hook_detect_cbk_user_def does not corrupt a pointer that would make the application crash. Instead, it calls the function on_rasp_detection which references all the strings EVT_CODE_TRACING, EVT_CODE_SYSTEM_LIB, etc., listed in Figure 2.

👉
 hook_detect_cbk_user_def is called on a RASP event. That’s why this application does not crash.

By looking at the function init_and_check_rasp as a whole, we can notice that the X23 register is also used to initialize other static variables:

image

Fig 3. X23 Writes Instructions

These memory writes mean that the callback hook_detect_cbk_user_def is used to initialize other static variables. In particular, these other static variables are likely used for the other RASP checks. By looking at the cross-references of these static variables #EVT_CODE_TRACING_cbk_ptr, #EVT_ENV_JAILBREAK_cbk_ptr etc, we can locate where the other RASP checks are performed and under which conditions they are triggered.

EVT_CODE_SYSTEM_LIB

image

EVT_ENV_DEBUGGER

image

EVT_ENV_JAILBREAK

image

Thanks to the #EVT_* cross-references, we can go statically through all the basic blocks that use these #EVT_* variables and highlight the underlying checks that could trigger the RASP callback(s). Before detailing the checks, it is worth mentioning the following points:

  1. Whilst the application uses a commercial obfuscator which provides native code obfuscation in addition to RASP, the code is lightly obfuscated, which makes static analysis of the assembly code quite easy.
  2. As will be discussed in "RASP Weaknesses", the application sets up the same callback for all the RASP events. Thus, it eases the RASP bypass and the dynamic analysis of the application.

Anti-Debug

The version of the obfuscator used by SingPass implements two kinds of debug checks. First, it checks if the parent process id (ppid) is the pid of /sbin/launchd, which should be 1.

static constexpr pid_t LAUNCHD_PID = 1;
pid_t ppid = getppid();
if (ppid != LAUNCHD_PID) {
  // Trigger EVT_ENV_DEBUGGER
}

If it is not the case, it triggers the EVT_ENV_DEBUGGER event. The second check is based on sysctl which is used to access the extern_proc.p_flag value. If this flag contains the P_TRACED value, the RASP routine triggers the EVT_ENV_DEBUGGER event.

int names[] = {
  CTL_KERN,
  KERN_PROC,
  KERN_PROC_PID,
  getpid(),
};
kinfo_proc info = {};
size_t sizeof_info = sizeof(kinfo_proc);
// sysctl() takes a size_t* for the old length and a size_t for the new length
int ret = sysctl(names, 4, &info, &sizeof_info, nullptr, 0);
if (ret == 0 && (info.kp_proc.p_flag & P_TRACED)) {
  // Trigger EVT_ENV_DEBUGGER
}
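
The decision logic of the two checks can be summarized in a few lines. This is only a model of the conditions described above, operating on values that would come from getppid() and sysctl(); the function name is hypothetical and P_TRACED uses the value defined in XNU's sys/proc.h.

```python
LAUNCHD_PID = 1
P_TRACED = 0x00000800  # from XNU's <sys/proc.h>


def debugger_suspected(ppid: int, p_flag: int) -> bool:
    """True when either anti-debug condition would raise EVT_ENV_DEBUGGER."""
    parent_is_not_launchd = ppid != LAUNCHD_PID
    process_is_traced = bool(p_flag & P_TRACED)
    return parent_is_not_launchd or process_is_traced
```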

In the SingPass binary, we can find an instance of these two checks in the following ranges of addresses:

ppid:   0x10071F420 – 0x10071F474
sysctl: 0x100151668 – 0x100151730

Jailbreak Detection

As for most of the jailbreak detections, the obfuscator tries to detect if the device is jailbroken by checking if some files exist (or not) on the device.

Files or directories are checked with syscalls or regular functions thanks to the following helpers:

pathconf:    0x100008EB0 -- 0x100008F28
utimes:      0x10000D8D4 -- 0x10000D948
stat:        0x100012188 -- 0x10001221C
open:        0x10002D478 -- 0x10002D4D8
fopen:       0x1000474E4 -- 0x100047554
stat64:      0x10006AA30 -- 0x10006AAD8
getfsstat64: 0x10047E82C -- 0x10047E914

While in the introduction, I mentioned that a dump of the section __data reveals strings related to jailbreak detection, the dump does not reveal all the strings used by the obfuscator.

By looking closely at the strings encoding mechanism, it turns out that some strings are decoded just-in-time in a temporary variable. I’ll explain the strings encoding mechanism in the second part of this series of blog posts but at this point, we can uncover the strings by setting hooks on functions like fopen, utimes and dumping the __data section right after these calls. Then, we can iterate over the different dumps to see if new strings appear.

$ python dump_analysis.py
Processing __data_0.raw
0x01010b935c h/.installed_unc0ver
0x01010b986a w/taurine/pspawn_payload.dylib
Processing __data_392.raw
0x01010b910e y__TEXT
0x01010b91b3 /System/Library/dyld/dyld_shared_cache_arm64e
0x01010b9174 /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e
0x01010b9136 /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64
0x01010b9126 dyld_v1  arm64e
0x01010b9116 dyld_v1   arm64
Processing __data_393.raw
0x01010afb90 /Users/xxxxxxx/Desktop/Xcode/ndi-sp-mobile-ios-swift/SingPass/[...]
0x01010b942c /var/jb
0x01010af910 https://bio-stream.singpass.gov.sg
0x01010af6a0 https://api.myinfo.gov.sg/spm/v3
0x01010b93b0 /.mount_rw
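
The dump_analysis.py script is not reproduced in this post. A minimal sketch of the idea, assuming each dump is a raw copy of the __data section and using hypothetical helper names, could look like this: extract the printable strings of two dumps and keep those that only appear in the later one.

```python
import re

# Runs of at least six printable ASCII characters
PRINTABLE = re.compile(rb"[\x20-\x7e]{6,}")


def extract_strings(blob: bytes, base: int = 0) -> dict:
    """Map address -> printable string found in `blob` loaded at `base`."""
    return {base + m.start(): m.group().decode("ascii")
            for m in PRINTABLE.finditer(blob)}


def new_strings(previous: bytes, current: bytes, base: int = 0) -> dict:
    """Strings of `current` that were not at the same address in `previous`."""
    before = extract_strings(previous, base)
    after = extract_strings(current, base)
    return {addr: s for addr, s in after.items() if before.get(addr) != s}
```

Iterating new_strings() over consecutive dumps gives the kind of incremental listing shown above, where each dump only reports what changed since the previous one.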

In the end, this approach does not decode all the strings but it provides good coverage. The list of the files used for detecting jailbreak is given in the Annexes.

There is also a particular check for detecting the unc0ver jailbreak which consists in trying to unmount /.installed_unc0ver:

0x100E4D814: _unmount("/.installed_unc0ver")

Environment

The obfuscator also checks environment variables that trigger the EVT_ENV_JAILBREAK event. Some of these checks seem to be related to code lifting detection while still triggering the EVT_ENV_JAILBREAK event.

if (strncmp(_dyld_get_image_name(0), "/private/var/folders", 0x14) == 0) {
  -> Trigger EVT_ENV_JAILBREAK
}
if (strncmp(getenv("HOME"), "/Users", 6) == 0) {
  -> Trigger EVT_ENV_JAILBREAK
}
if (strncmp(getenv("HOME"), "mobile", 6) != 0) {
  -> Trigger EVT_ENV_JAILBREAK
}
char buffer[0x400];
size_t buff_size = 0x400;
_NSGetExecutablePath(buffer, &buff_size);
if (buffer.startswith("/private/var/folders")) {
  -> Trigger EVT_ENV_JAILBREAK
}
⚙️
startswith()

From a reverse engineering perspective, startswith() is actually implemented as a succession of xor that are “or-ed” to get a boolean. This might be the result of an optimization from the compiler. You can observe this pattern in the basic block located at the address: 0x100015684.
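
As an illustration of that pattern (not a transcription of the basic block), the sketch below compares a prefix in chunks of up to 8 bytes, XORs each pair of chunks and ORs the results together. The accumulated value is zero only when every chunk matches. The function name is hypothetical.

```python
def startswith_xor_or(buffer: bytes, prefix: bytes) -> bool:
    """Branch-free style prefix comparison: OR together the XOR of each chunk."""
    if len(buffer) < len(prefix):
        return False
    acc = 0
    for offset in range(0, len(prefix), 8):
        chunk_len = min(8, len(prefix) - offset)
        lhs = int.from_bytes(buffer[offset:offset + chunk_len], "little")
        rhs = int.from_bytes(prefix[offset:offset + chunk_len], "little")
        acc |= lhs ^ rhs  # any differing bit survives in the accumulator
    return acc == 0
```

For the 0x14-byte prefix /private/var/folders, this boils down to two 8-byte comparisons and one 4-byte comparison.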

Advanced Detections

In addition to regular checks, the obfuscator performs advanced checks like verifying the current status of the SIP (System Integrity Protection), and more precisely, the KEXTS code signing status.

🧐
From my limited experience in iOS jailbreaking, I think that no jailbreak disables the CSR_ALLOW_UNTRUSTED_KEXTS flag. Instead, I guess that it is used to detect if the application is running on an Apple M1 which allows such deactivation.
csr_config_t buffer = 0;
if (__csrctl(CSR_ALLOW_UNTRUSTED_KEXTS, buffer, sizeof(csr_config_t))) {
  /*
   * SIP is disabled with CSR_ALLOW_UNTRUSTED_KEXTS
   * -> Trigger EVT_ENV_JAILBREAK
   */
}
Assembly range: 0x100004640 – 0x1000046B8

The obfuscator also uses the Sandbox API to verify if some paths exist:

int ret = __mac_syscall("Sandbox", /* Sandbox Check */ 2,
                        getpid(), "file-test-existence", SANDBOX_FILTER_PATH,
                        "/opt/homebrew/bin/brew");

The paths checked through this API are OSX-related directories, so I guess it is also used to verify that the current code has not been lifted on an Apple Silicon. Here is, for instance, a list of directories checked with the Sandbox API:

/Applications/Xcode.app/Contents/MacOS/Xcode
/System/iOSSupport/
/opt/homebrew/bin/brew
/usr/local/bin/brew
 Assembly range: 0x100ED7684 (function)

In addition, it uses the Sandbox attribute file-read-metadata as an alternative to the stat() function.

 Assembly range: 0x1000ECA5C – 0x1000ECE54

The application uses the sandbox API through private syscalls to determine whether some jailbreak artifacts exist. This is very smart but I guess it’s not really compliant with the Apple policy.

Code Symbol Table

The purpose of this check is to verify that the addresses of the resolved imports point to the right library. In other words, this check verifies that the import table is not tampered with pointers that could be used to hook imported functions.

 Initialization: part of sub_100E544E8
 Assembly range: 0x100016FC4 – 0x100017024

During the RASP checks initialization (sub_100E544E8), the obfuscator manually resolves the imported functions. This manual resolution is performed by iterating over the symbols in the SingPass binary, checking the library that imports the symbol, accessing (in-memory) the __LINKEDIT segment of this library, parsing the exports trie, etc. This manual resolution fills a table that contains the absolute address of the resolved symbols.

In addition, the initialization routine sets up – what I called – a metadata structure that follows this layout:

image

symbols_index is a kind of translation table that converts an index known by the obfuscator into an index in the __got or the __la_symbol_ptr section. The index’s origin (i.e __got or __la_symbol_ptr) is determined by the origins table which contains enum-like integers:

enum SYM_ORIGINS : uint8_t {
  NONE      = 0,
  LA_SYMBOL = 1,
  GOT       = 2,
};

The length of both tables: symbols_index and origins, is defined by the static variable nb_symbols which is set to 0x399. The metadata structure is followed by two pointers: resolved_la_syms and resolved_got_syms which point to the imports address table manually filled by the obfuscator.

ℹ️
 There is a dedicated table for each section: __got and __la_symbol_ptr.

Then, macho_la_syms points to the beginning of the __la_symbol_ptr section while macho_got_syms points to the __got section.

Finally, stub_helper_start / stub_helper_end holds the memory range of the __stub_helper section. I’ll describe the purpose of these values later.

All the values of this metadata structure are set during the initialization which takes place in the function sub_100E544E8.

In different places of the SingPass binary, the obfuscator uses this metadata information to verify the integrity of the resolved import(s). It starts by accessing the symbols_index and the origins with a fixed value:

image
👉
Since the symbols_index table contains uint32_t values, #0xCA8 matches #0x32A (index for the origins table) when divided by sizeof(uint32_t): 0xCA8 = 0x32A * sizeof(uint32_t)

In other words, we have the following operations:

const uint32_t sym_idx   = metadata.symbols_index[0x32a];
const SYM_ORIGINS origin = metadata.origins[0x32a];

Then, given the sym_idx value and depending on the origin of the symbol, the function accesses either the resolved __got table or the resolved __la_symbol_ptr table. This access is done with a helper function located at sub_100ED6CC0. It can be summed up with the following pseudo-code:

uintptr_t* section_ptr       = nullptr;
uintptr_t* manually_resolved = nullptr;
if      (origin == /* 1 */ SYM_ORIGINS::LA_SYMBOL) {
  section_ptr       = metadata.macho_la_syms;
  manually_resolved = metadata.resolved_la_syms;
}
else if (origin == /* 2 */ SYM_ORIGINS::GOT) {
  section_ptr       = metadata.macho_got_syms;
  manually_resolved = metadata.resolved_got_syms;
}

The entries at the index sym_idx of section_ptr and manually_resolved are compared and if they don’t match, the event #EVT_CODE_SYMBOL_TABLE is triggered.

Actually, the comparison covers different cases. First, the obfuscator handles the case where the symbol at sym_idx is not yet resolved. In that case, section_ptr[sym_idx] points to the symbols resolution stub located in the section __stub_helper. That’s why the metadata structure contains the memory range of this section:

const uintptr_t addr_from_section = section_ptr[sym_idx];
if (metadata.stub_helper_start <= addr_from_section && addr_from_section < metadata.stub_helper_end) {
  // Skip
}

In addition, if the pointers do not match, the function verifies their location using dladdr:

const uintptr_t addr_from_section    = section_ptr[sym_idx];
const uintptr_t addr_from_resolution = manually_resolved[sym_idx];
if (addr_from_section != addr_from_resolution) {
  Dl_info info_section;
  Dl_info info_resolution;
  dladdr(reinterpret_cast<const void*>(addr_from_section),    &info_section);
  dladdr(reinterpret_cast<const void*>(addr_from_resolution), &info_resolution);
  if (info_section.dli_fbase != info_resolution.dli_fbase) {
    // --> Trigger EVT_CODE_SYMBOL_TABLE;
  }
}
👉
 Two pointers might not match if, for instance, an imported function is hooked with Frida.

In the case where origins[sym_idx] is set to SYM_ORIGINS::NONE, the function skips the check. Thus, we can simply disable this RASP check by filling the origins table with 0. The number of symbols is stored close to the metadata structure and the address of the metadata structure is leaked by the ___atomic_load and ___atomic_store functions.
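
The whole check, including the effect of zeroing the origins table, can be modeled as follows. This is a sketch built from the pseudo-code above: the Metadata class mirrors the fields named in this section, while image_base() is a hypothetical stand-in for dladdr() that works on a list of (base, size) ranges.

```python
from dataclasses import dataclass
from enum import IntEnum


class SymOrigin(IntEnum):
    NONE = 0
    LA_SYMBOL = 1
    GOT = 2


@dataclass
class Metadata:
    symbols_index: list
    origins: list
    macho_la_syms: list
    macho_got_syms: list
    resolved_la_syms: list
    resolved_got_syms: list
    stub_helper_start: int
    stub_helper_end: int


def image_base(addr, images):
    """Hypothetical dladdr() stand-in: base of the image containing addr."""
    for base, size in images:
        if base <= addr < base + size:
            return base
    return None


def symbol_is_tampered(meta, idx, images) -> bool:
    origin = meta.origins[idx]
    if origin == SymOrigin.NONE:
        return False  # check skipped: this is what zeroing the table exploits
    sym_idx = meta.symbols_index[idx]
    if origin == SymOrigin.LA_SYMBOL:
        current, expected = meta.macho_la_syms[sym_idx], meta.resolved_la_syms[sym_idx]
    else:
        current, expected = meta.macho_got_syms[sym_idx], meta.resolved_got_syms[sym_idx]
    if meta.stub_helper_start <= current < meta.stub_helper_end:
        return False  # lazy symbol not bound yet
    if current == expected:
        return False
    # Different pointers are tolerated as long as they live in the same image
    return image_base(current, images) != image_base(expected, images)
```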

image

Code Tracing

The Code Tracing check aims to verify that the current thread is not traced. By looking at the cross-references of #EVT_CODE_TRACING_cbk_ptr, we can identify two kinds of verification.

GumExecCtx

EVT_CODE_TRACING seems able to detect if Frida’s Stalker is running. It’s the first time I have observed this kind of check and it’s very smart. For those who would like to follow this analysis with the raw assembly code, I will use this range of addresses from the SingPass binary:

 0x10019B6FC – 0x10019B82C

Here is the graph of the function that performs the Frida Stalker check:

image

Code associated with Frida Stalker Detection

Yes, this code is able to detect the Stalker. How? Let’s start with the first basic block. _pthread_mach_thread_np(_pthread_self()) aims at getting the thread id of the function that invokes this check.

Then more subtly, MRS(TPIDRRO_EL0) & #-8 is used to manually access the thread local storage area. On ARM64, Apple uses the lowest bits of TPIDRRO_EL0 (the three bits cleared by the #-8 mask) to store the CPU number while the upper bits contain the TLS base address.

{ } See also: dyld – threadLocalHelpers.s

Then, the second basic block – which is the loop’s entry – accesses the thread local variable with the key tlv_idx which ranges from 0x100 to 0x200 in the loop:

*(tlv_table + (tlv_idx << 3))
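
The two steps above (masking TPIDRRO_EL0 and indexing the TLS slots) can be sketched with plain integer arithmetic. The register value used in the example is made up, the helper names are hypothetical, and the upper bound of the loop is assumed to be exclusive.

```python
def tls_base(tpidrro_el0: int) -> int:
    """Clear the three lowest bits, as the `& #-8` does in the disassembly."""
    return tpidrro_el0 & ~0x7


def tlv_slot_addresses(tpidrro_el0: int, first: int = 0x100, last: int = 0x200):
    """Addresses of the TLS slots visited by the loop (8 bytes per slot)."""
    base = tls_base(tpidrro_el0)
    return [base + (tlv_idx << 3) for tlv_idx in range(first, last)]


slots = tlv_slot_addresses(0x16F5A3003)
print(hex(slots[0]), hex(slots[-1]))
```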

The following basic block which calls _vm_region_64(…) is used to verify that the tlv_addr variable contains a valid address with a correct size (i.e. larger than 0x30). Under these conditions, it jumps into the following basic block with these strange memory accesses:

image

Condition that (somehow) Triggers EVT_CODE_TRACING

To figure out the meaning of these memory accesses, let’s recall that this function is associated with the EVT_CODE_TRACING event. Which well-known public tool could be associated with code tracing? Without too much risk, we can assume Frida’s Stalker.

If we look at the implementation of the Stalker, we can notice this kind of initialisation (in gumstalker-arm64.c):

void gum_stalker_init (GumStalker* self) {
  [...]
  self->exec_ctx = gum_tls_key_new();
  [...]
}
void* _gum_stalker_do_follow_me(GumStalker* self, ...) {
  GumExecCtx* ctx = gum_stalker_create_exec_ctx(...);
  gum_tls_key_set_value (self->exec_ctx, ctx);
}

So the Stalker creates a thread local variable that is used to store a pointer to the GumExecCtx structure which has the following layout:

struct _GumExecCtx {
  volatile gint state;
  gint64 destroy_pending_since;
  GumStalker * stalker;
  GumThreadId thread_id;
  GumArm64Writer code_writer;
  GumArm64Relocator relocator;
  [...]
}

If we add the offsets of this layout and if we virtually inline the GumArm64Writer structure, we can get this representation:

struct _GumExecCtx {
  /* 0x00 */ volatile gint state;
  /* 0x08 */ gint64 destroy_pending_since;
  /* 0x10 */ GumStalker * stalker;
  /* 0x18 */ GumThreadId thread_id;
  GumArm64Writer code_writer {
  /* 0x20 */ volatile gint ref_count;
  /* 0x24 */ GumOS target_os;
  /* 0x28 */ GumPtrauthSupport ptrauth_support;
  ...
  };
}
👉
destroy_pending_since is located at the offset 0x08 and not 0x04 because of the alignment enforced by the compiler.

With this representation, we can observe that:

  • *(tlv_table + 0x18) effectively matches the GumThreadId thread_id attribute.
  • *(tlv_table + 0x24) matches GumOS target_os
  • *(tlv_table + 0x28) matches GumPtrauthSupport ptrauth_support

GumOS and GumPtrauthSupport are enums defined in gumdefs.h and gummemory.h with these values:

enum _GumOS {
  GUM_OS_WINDOWS,
  GUM_OS_MACOS,
  GUM_OS_LINUX,
  GUM_OS_IOS,
  GUM_OS_ANDROID,
  GUM_OS_QNX
};
enum _GumPtrauthSupport {
  GUM_PTRAUTH_INVALID,
  GUM_PTRAUTH_UNSUPPORTED,
  GUM_PTRAUTH_SUPPORTED
};

GumOS contains 6 entries starting from GUM_OS_WINDOWS = 0 up to GUM_OS_QNX = 5 and similarly, GUM_PTRAUTH_INVALID = 0 while the last entry is associated with GUM_PTRAUTH_SUPPORTED = 2

Therefore, the previous strange conditions are used to fingerprint the GumExecCtx structure:

image
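
The offsets and the fingerprinting idea can be replayed with ctypes. The structure below only models the first fields of the layout given earlier, with pointers and the thread id represented as 64-bit integers. The exact comparisons of the binary are those of the figure; the predicate here assumes they amount to "thread_id matches the current thread and the two enums hold valid values", which is an interpretation rather than a transcription.

```python
import ctypes


class GumExecCtxPrefix(ctypes.Structure):
    """First fields of _GumExecCtx as laid out on a 64-bit target."""
    _fields_ = [
        ("state", ctypes.c_int32),
        ("destroy_pending_since", ctypes.c_int64),
        ("stalker", ctypes.c_uint64),         # pointer
        ("thread_id", ctypes.c_uint64),
        ("ref_count", ctypes.c_int32),        # inlined GumArm64Writer starts here
        ("target_os", ctypes.c_int32),
        ("ptrauth_support", ctypes.c_int32),
    ]


def looks_like_gum_exec_ctx(blob: bytes, current_thread_id: int) -> bool:
    """Assumed fingerprint: matching thread id and in-range enum values."""
    size = ctypes.sizeof(GumExecCtxPrefix)
    if len(blob) < size:
        return False
    ctx = GumExecCtxPrefix.from_buffer_copy(blob[:size])
    return (ctx.thread_id == current_thread_id
            and 0 <= ctx.target_os <= 5          # GUM_OS_WINDOWS .. GUM_OS_QNX
            and 0 <= ctx.ptrauth_support <= 2)   # GUM_PTRAUTH_INVALID .. SUPPORTED
```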

One way to prevent this Stalker detection would be to recompile Frida with swapped fields in the _GumExecCtx structure.

Thread Check

An alternative to the previous Frida stalker check consists in accessing the current thread status through the following call:

thread_read_t target = pthread_mach_thread_np(pthread_self());
uint32_t count = ARM_UNIFIED_THREAD_STATE_COUNT;
arm_unified_thread_state state;
thread_get_state(target, ARM_UNIFIED_THREAD_STATE, &state, &count);

Then, it checks if state->ts_64.__pc is within the libsystem_kernel.dylib thanks to the following comparison:

const auto mach_msg_addr = reinterpret_cast<uintptr_t>(&mach_msg);
const uintptr_t delta = abs(state->ts_64.__pc - mach_msg_addr);
if (delta > 0x4000) {
  rasp_event_info info;
  info.event = 0x2000; // EVT_CODE_TRACING;
  info.ptr = (uintptr_t*)0x13b71a24724edfe;
  EVT_CODE_TRACING_cbk_ptr(info);
}

In other words, state->ts_64.__pc is considered to be in libsystem_kernel.dylib, if its distance from &mach_msg is smaller than 0x4000.
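
As a worked example of this comparison, with made-up addresses and a hypothetical function name:

```python
def triggers_code_tracing(pc: int, mach_msg_addr: int, window: int = 0x4000) -> bool:
    """True when pc is farther than `window` bytes from &mach_msg."""
    return abs(pc - mach_msg_addr) > window


MACH_MSG = 0x1B2C04000  # illustrative address only

# A pc a few instructions away from mach_msg passes the check...
print(triggers_code_tracing(MACH_MSG + 0x2C, MACH_MSG))
# ...while a pc located in an unrelated memory area raises the event.
print(triggers_code_tracing(0x10A7F0000, MACH_MSG))
```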

At first sight, I was a bit confused by this RASP check but since the previous checks, associated with EVT_CODE_TRACING, aim at detecting the Frida Stalker, this check is also likely designed to detect the Frida Stalker.

To confirm this hypothesis, I developed a small test case that reproduces this check, in a standalone binary and we can observe a difference depending on whether it runs through the Frida stalker or not:

image

Output of the Test Case with the Stalker

image

Output of the Test Case without the Stalker

This check can be bypassed without too much difficulty by using the function gum_stalker_exclude to exclude the library libsystem_kernel.dylib from the stalker:

GumStalker* stalker = gum_stalker_new();
exclude(stalker, "libsystem_kernel.dylib");
{
  // Stalker Check
}

As a result of this exclusion, state->ts_64.__pc is located in libsystem_kernel.dylib:

image

Output of the Test Case with Excluded Memory Ranges

App Loaded Libraries

The RASP event EVT_APP_LOADED_LIBRARIES aims at checking the integrity of the Mach-O’s dependencies. In other words, it checks that the Mach-O imported libraries have not been altered.

 Assembly ranges: 0x100E4CDF8 – 0x100e4d39c

The code associated with this check starts by accessing the Mach-O header thanks to the dladdr function:

Dl_info dl_info;
dladdr(&static_var, &dl_info);

Dl_info contains the base address of the library which encompasses the address provided in the first parameter and, since a Mach-O binary is loaded with its header, dl_info.dli_fbase actually points to a mach_header_64.

Then the function iterates over the LC_LOAD_DYLIB-like commands to access each dependency’s name:

image

This name contains the path to the dependency. For instance, we can access this list as follows:

import lief
singpass = lief.parse("./SingPass")
for lib in singpass.libraries:
  print(lib.name)
# Output:
/System/Library/Frameworks/AVFoundation.framework/AVFoundation
/System/Library/Frameworks/AVKit.framework/AVKit
...
@rpath/leveldb.framework/leveldb
@rpath/nanopb.framework/nanopb

The dependencies’ names are used to fill a hash table in which each hash value is encoded on 32 bits:

// Pseudo code
uint32_t TABLE[0x6d];
for (size_t i = 0; i < 0x6d; ++i) {
  TABLE[i] = hash(lib_names[i]);
}

Later in the code, this computed table is compared with another hash table – hard-coded in the code – which looks like this:

image

Fig 4. Examples of Hashes

If some libraries have been modified to inject, for instance, FridaGadget.dylib then the hash dynamically computed will not match the hash hard-coded in the code.
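
The mechanism can be sketched as follows. The obfuscator's hash is only described as a derivation of MurmurHash, so the standard MurmurHash3 (x86, 32-bit) is used here as a stand-in: the values it produces are not expected to match the ones hard-coded in SingPass. The helper names and the example list of libraries are illustrative.

```python
MASK32 = 0xFFFFFFFF


def _rotl32(value: int, shift: int) -> int:
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """Standard MurmurHash3 x86_32 (stand-in for the obfuscator's variant)."""
    c1, c2 = 0xCC9E2D51, 0x1B873593
    h = seed & MASK32
    nblocks = len(data) // 4
    for i in range(nblocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = _rotl32((k * c1) & MASK32, 15)
        k = (k * c2) & MASK32
        h = _rotl32(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    tail = data[4 * nblocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = _rotl32((k * c1) & MASK32, 15)
        h ^= (k * c2) & MASK32
    h ^= len(data)
    # Final avalanche
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def hash_table(lib_names):
    """32-bit hash of every dependency name, in load-command order."""
    return [murmur3_32(name.encode()) for name in lib_names]


def libraries_tampered(lib_names, expected_hashes) -> bool:
    return hash_table(lib_names) != expected_hashes
```

With a table computed from the pristine list of dependencies, adding an extra entry such as @rpath/FridaGadget.dylib changes the computed table and the comparison fails.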

Whilst the implementation of this check is pretty “standard”, there are a few points worth mentioning:

  • Firstly, the hash function seems to be a derivation of MurmurHash.
  • Secondly, the hash is encoded on 32 bits but the code in Figure 4 references the X11/X12 registers which are 64 bits wide. This is actually a compiler optimization to limit the number of memory accesses.
  • Thirdly, the hard coded hash values are duplicated in the binary for each instance of the check. In SingPass, this RASP check is present twice thus, we find these values at the following locations: 0x100E4CF38, 0x100E55678. This duplication is likely used to prevent a single spot location that would be easy to patch.

Code System Lib

This check is associated with the event EVT_CODE_SYSTEM_LIB which consists in verifying the integrity of the in-memory system libraries with their content in the dyld shared cache (on-disk).

 Assembly ranges: 0x100ED5BF8 – 0x100ED5D6C and 0x100ED5E0C – 0x100ED62D4

This check usually starts with the following pattern:

image

If the result of iterate_system_region with the given check_region_cbk callback is not 0, it triggers the EVT_CODE_SYSTEM_LIB event:

if (iterate_system_region(check_region_cbk) != 0) {
// Trigger `EVT_CODE_SYSTEM_LIB`
}

To understand the logic behind this check, we need to understand the purpose of the iterate_system_region function and its relationship with the callback check_region_cbk.

iterate_system_region

ℹ️
As for all the functions referenced in the blog post, their names come from my own analysis and might be inaccurate. Most of the functions related to the RASP checks were obviously stripped. In this case, iterate_system_region matches the original sub_100ED5BF8

This function aims to call the system function vm_region_recurse_64 and then, filter its output on conditions that could trigger the callback given in the first parameter: check_region_cbk.

iterate_system_region starts by accessing the base address of the dyld shared cache thanks to the SYS_shared_region_check_np syscall. This address is used to read and memoize a few attributes from the dyld_cache_header structure:

  1. The shared cache header
  2. The shared cache end address
  3. Other limits related to the shared cache

The following snippet gives an overview of these computations:

static dyld_cache_header* header = nullptr; /* At: 0x1010DE940 */
static uintptr_t g_shared_cache_end;        /* At: 0x1010DE948 */
static uintptr_t g_overflow_address;        /* At: 0x1010DE950 */
static uintptr_t g_module_last_addr;        /* At: 0x1010DE958 */
if (header != nullptr) {
  // Already memoized: the computations below are skipped
}
uintptr_t shared_cache_base;
syscall(SYS_shared_region_check_np, &shared_cache_base);
header = reinterpret_cast<dyld_cache_header*>(shared_cache_base);
g_shared_cache_end = shared_cache_base + header->mappings[0].size;
g_overflow_address = -1;
g_module_last_addr = g_shared_cache_end;
if (header->imagesTextCount > 0) {
  uintptr_t slide = shared_cache_base - header->mappings[0].address;
  uintptr_t tmp_overflow_address = -1;
  uintptr_t shared_cache_end_tmp = g_shared_cache_end;
  for (size_t i = 0; i < header->imagesTextCount; ++i) {
    const uintptr_t txt_start_addr = slide          + header->imagesText[i].loadAddress;
    const uintptr_t txt_end_addr   = txt_start_addr + header->imagesText[i].textSegmentSize;
    if (txt_start_addr >= shared_cache_end_tmp && txt_start_addr < tmp_overflow_address) {
      g_overflow_address   = txt_start_addr;
      tmp_overflow_address = txt_start_addr;
    }
    if (txt_end_addr >= shared_cache_end_tmp) {
      g_module_last_addr   = txt_end_addr;
      shared_cache_end_tmp = txt_end_addr;
    }
  }
}
🧐
From a reverse engineering point of view, the stack variable used to memoize this information is aliased with the parameter info of vm_region_recurse_64, which is called later. I don’t know if this aliasing is on purpose, but it makes the reverse engineering of the structures a bit more complicated.

Following this memoization, there is a loop on vm_region_recurse_64 which queries the vm_region_submap_info_64 information for these addresses in the range of the dyld shared cache. We can identify the type of the query (vm_region_submap_info_64) thanks to the mach_msg_type_number_t *infoCnt argument which is set to 19:

image

This loop breaks under certain conditions and the callback is triggered under other conditions. As explained a bit later, the callback verifies the in-memory integrity of the library present in the dyld shared cache.

The verification and the logic behind this check are likely to take time, which is why the authors of the check took care to filter the addresses to check and avoid useless (heavy) computations.

Basically, the callback that performs the in-depth inspection of the shared cache is triggered if:

image

check_region_cbk

When the conditions are met, iterate_system_region calls the check_region_cbk with the suspicious address in the first parameter:

int iterate_system_region(callback_t cbk) {
  int ret = 0;
  if (cond(address)) {
    ret = cbk(address) {
      // Checks on the dyld_shared_cache
    }
  }
  return ret;
}

During the analysis of SingPass, only one callback is used in pair with iterate_system_region, and its code is not especially obfuscated (except the strings). Once we know that the checks are related to the dyld shared cache, we can quite easily figure out the structures involved in this function. This callback is located at the address 0x100ed5e0c and renamed check_region_cbk.

Firstly, it starts by accessing the information about the address:

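int check_region_cbk(uintptr_t address) {
  Dl_info info;
  // dladdr expects a pointer to look up and a pointer to the output structure
  dladdr(reinterpret_cast<const void*>(address), &info);
  // ...
}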

This information is used to read the content of the __TEXT segment associated with the address parameter:

auto* header = reinterpret_cast<mach_header_64*>(info.dli_fbase);
segment_command_64 __TEXT = get_text_segment(header);
vm_offset_t data = 0;
mach_msg_type_number_t dataCnt = 0; // filled by vm_read
vm_read(task_self_trap(), reinterpret_cast<vm_address_t>(info.dli_fbase),
        __TEXT.vmsize, &data, &dataCnt);
🛡️
The __TEXT string is encoded, as are the different paths of the shared cache, like /System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64e, and the header’s magic values: 0x01010b9126: dyld_v1 arm64e or 0x01010b9116: dyld_v1 arm64.
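The helper get_text_segment used in the listing above is a name introduced for the pseudo-code, and its body is not shown. As a minimal sketch of what such a lookup usually looks like, the code below walks the load commands that follow the Mach-O header until it finds the LC_SEGMENT_64 command named __TEXT. The structures are redeclared locally (with a _t suffix) so that the example builds without <mach-o/loader.h>, and find_text_segment is a hypothetical name that returns a pointer instead of a copy.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Local mirrors of the Mach-O structures from <mach-o/loader.h>.
struct mach_header_64_t {
  uint32_t magic, cputype, cpusubtype, filetype, ncmds, sizeofcmds, flags, reserved;
};
struct load_command_t { uint32_t cmd, cmdsize; };
struct segment_command_64_t {
  uint32_t cmd, cmdsize;
  char     segname[16];
  uint64_t vmaddr, vmsize, fileoff, filesize;
  uint32_t maxprot, initprot, nsects, flags;
};

constexpr uint32_t kMagic64     = 0xfeedfacf; // MH_MAGIC_64
constexpr uint32_t kLcSegment64 = 0x19;       // LC_SEGMENT_64

// Returns the __TEXT segment command of the image, or nullptr when the
// header is not a 64-bit Mach-O header or no such segment exists.
const segment_command_64_t* find_text_segment(const mach_header_64_t* header) {
  assert(header != nullptr);
  if (header->magic != kMagic64) return nullptr;
  const auto* cursor = reinterpret_cast<const uint8_t*>(header + 1);
  const uint8_t* end = cursor + header->sizeofcmds;
  for (uint32_t i = 0; i < header->ncmds; ++i) {
    if (cursor + sizeof(load_command_t) > end) break;
    load_command_t lc;
    std::memcpy(&lc, cursor, sizeof(lc));
    if (lc.cmdsize < sizeof(load_command_t)) return nullptr; // malformed
    if (lc.cmd == kLcSegment64) {
      const auto* seg = reinterpret_cast<const segment_command_64_t*>(cursor);
      if (std::strncmp(seg->segname, "__TEXT", sizeof(seg->segname)) == 0) {
        return seg;
      }
    }
    cursor += lc.cmdsize; // jump to the next load command
  }
  return nullptr;
}
```

The vmsize field of the returned command is the value passed as the size argument of vm_read in the listing above.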

On the other hand, the function opens the dyld_shared_cache and looks for the section of the shared cache that contains the library associated with the address parameter:

int fd = open("/System/Library/Caches/com.apple.dyld/dyld_shared_cache_arm64");
(1) mmap(nullptr, 0x100000, VM_PROT_READ, MAP_NOCACHE | MAP_PRIVATE, fd, 0x0): 0x109680000
// Look for the shared cache entry associated with the provided address
(2) mmap(nullptr, 0xad000, VM_PROT_READ, MAP_NOCACHE | MAP_PRIVATE, fd, 0x150a9000): 0x109681000

The purpose of the second call to mmap() is to load the slice of the shared cache that contains the code of the library. Then, the function checks byte by byte that the content of the __TEXT segment read from the file matches the in-memory content. The loop that performs this comparison is located between these addresses: 0x100ED6C58 - 0x100ED6C70.
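The comparison itself is conceptually simple. The following sketch is not the decompiled loop: it only illustrates the idea with a hypothetical helper, first_text_mismatch, which takes the bytes read from memory and the reference bytes mapped from the shared cache file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns the offset of the first byte that differs between the in-memory
// __TEXT content and the reference bytes mapped from the file, or -1 when
// both buffers are identical over `size` bytes.
std::ptrdiff_t first_text_mismatch(const uint8_t* in_memory,
                                   const uint8_t* on_disk,
                                   std::size_t size) {
  assert(in_memory != nullptr && on_disk != nullptr);
  for (std::size_t i = 0; i < size; ++i) {
    if (in_memory[i] != on_disk[i]) {
      // A difference here means the code was patched after loading,
      // for instance by an inline hook.
      return static_cast<std::ptrdiff_t>(i);
    }
  }
  return -1;
}
```

Any return value other than -1 means that the library code in memory was modified after it was loaded, which is the situation this RASP check is meant to report.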

As we can observe from the description of this RASP check, the authors paid a lot of attention to avoiding performance issues and memory overhead. On the other hand, the callback check_region_cbk was never called during my experiments (even when I hooked system functions). I don’t know if it’s because I misunderstood the conditions but in the end, I had to manually force the conditions (by forcing pages_swapped_out to 1).

vm_region_recurse_64 also seems to always be paired with an anti-hooking verification that is slightly different from the check described at the beginning of this blog post. Its analysis is quite easy and can be a good exercise.

RASP Design Weaknesses

Thanks to the different #EVT_* static variables that hold function pointers, the obfuscator makes it possible to register a dedicated callback for each supported RASP event. Nevertheless, the function init_and_check_rasp defined by the application’s developers sets all these pointers to the same callback: hook_detect_cbk_user_def. In such a design, all the RASP events end up in a single function, which weakens the strength of the different RASP checks.

It means that we only have to target this function to disable or bypass the RASP checks.
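To make the impact concrete, here is a small model of that design. It is a sketch and not code from the application: the table g_event_callbacks stands in for the #EVT_* pointers, and the handler names and the distinct_targets helper are hypothetical. The helper counts how many different functions have to be neutralized to silence every event.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using rasp_callback_t = void (*)(uint32_t event);

// Hypothetical stand-in for the per-event #EVT_* function pointers.
constexpr std::size_t kEventCount = 8;
std::array<rasp_callback_t, kEventCount> g_event_callbacks{};

int g_generic_hits = 0;
int g_debugger_hits = 0;
void on_any_event(uint32_t) { ++g_generic_hits; }  // shared handler
void on_debugger(uint32_t)  { ++g_debugger_hits; } // dedicated handler

// Models the design described above: one handler for every event.
void init_all_events(rasp_callback_t cbk) {
  assert(cbk != nullptr);
  g_event_callbacks.fill(cbk);
}

// Number of distinct functions to neutralize to disable all the events.
std::size_t distinct_targets() {
  std::size_t count = 0;
  for (std::size_t i = 0; i < g_event_callbacks.size(); ++i) {
    if (g_event_callbacks[i] == nullptr) continue;
    bool seen = false;
    for (std::size_t j = 0; j < i; ++j) {
      seen = seen || (g_event_callbacks[j] == g_event_callbacks[i]);
    }
    if (!seen) ++count;
  }
  return count;
}
```

With a single shared handler, distinct_targets() is 1 regardless of how many events are enabled. Registering one dedicated handler per event would force an attacker to locate and neutralize each of them.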

Using Frida Gum, the bypass is as simple as using gum_interceptor_replace with an empty function:

enum class RASP_EVENTS : uint32_t {
  EVT_ENV_JAILBREAK        = 0x1,
  EVT_ENV_DEBUGGER         = 0x2,
  EVT_APP_SIGNATURE        = 0x20,
  EVT_APP_LOADED_LIBRARIES = 0x40,
  EVT_CODE_PROLOGUE        = 0x400,
  EVT_CODE_SYMBOL_TABLE    = 0x800,
  EVT_CODE_SYSTEM_LIB      = 0x1000,
  EVT_CODE_TRACING         = 0x2000,
};
struct event_info_t {
  RASP_EVENTS event;
  uintptr_t** ptr_to_corrupt;
};
void do_nothing(event_info_t info) {
  RASP_EVENTS evt = info.event;
  // ...
  return;
}
// This is **pseudo code**
gum_interceptor_replace(
  listener->interceptor,
  reinterpret_cast<void*>(&hook_detect_cbk_user_def),
  do_nothing,
  reinterpret_cast<void*>(&hook_detect_cbk_user_def)
);

Thanks to this weakness, I could prevent the error message from being displayed as soon as the application starts.

SingPass Jailbreak & RASP Bypass

🛡️
There are two other RASP checks, EVT_APP_MACHO and EVT_APP_SIGNATURE, which were not enabled by the developers and thus are not present in SingPass.

Conclusion

This first part is a good example of the challenges when using or designing an obfuscator with RASP features. On the one hand, the commercial solution implements strong and advanced RASP functionalities with, for instance, inlined syscalls spread in different places of the application. On the other hand, the app’s developers weakened the RASP functionalities by setting the same callback for all the events. In addition, it seems that the application does not use the native code obfuscation provided by the commercial solution, which leaves the RASP checks unprotected against static code analysis. It could be worth enforcing code obfuscation on these checks regardless of the configuration provided by the user.

From a developer’s point of view, it can be very difficult to understand the impact, in terms of reverse engineering, of setting up the same callback for every event, even though it can be a good design decision from an architecture perspective.

In the second part of this series about iOS code obfuscation, we will dig a bit more into native code obfuscation through another application, where the application reacts differently to the RASP events and where the code is obfuscated with MBA, Control-Flow Flattening, etc.

If you have questions feel free to ping me 📫.

Annexes

| JB Detection Files | Listed in PokemonGO |
| --- | --- |
| /.bootstrapped | No |
| /.installed_taurine | No |
| /.mount_rw | No |
| /Library/dpkg/lock | No |
| /binpack | Yes |
| /odyssey/cstmp | No |
| /odyssey/jailbreakd | No |
| /payload | No |
| /payload.dylib | No |
| /private/var/mobile/Library/Caches/kjc.loader | No |
| /private/var/mobile/Library/Sileo | No |
| /taurine | No |
| /taurine/amfidebilitate | No |
| /taurine/cstmp | No |
| /taurine/jailbreakd | No |
| /taurine/jbexec | No |
| /taurine/launchjailbreak | No |
| /taurine/pspawn_payload.dylib | No |
| /var/dropbear | No |
| /var/jb | No |
| /var/lib/undecimus/apt | No |
| /var/motd | No |
| /var/tmp/cydia.log | No |

Flagged Packages

/Applications/AutoTouch.app/AutoTouch
/Applications/iGameGod.app/iGameGod
/Applications/zxtouch.app/zxtouch
/Library/Activator/Listeners/me.autotouch.AutoTouch.ios8
/Library/LaunchDaemons/com.rpetrich.rocketbootstrapd.plist
/Library/LaunchDaemons/com.tigisoftware.filza.helper.plist
/Library/MobileSubstrate/DynamicLibraries/ATTweak.dylib
/Library/MobileSubstrate/DynamicLibraries/GameGod.dylib
/Library/MobileSubstrate/DynamicLibraries/LocalIAPStore.dylib
/Library/MobileSubstrate/DynamicLibraries/Satella.dylib
/Library/MobileSubstrate/DynamicLibraries/iOSGodsiAPCracker.dylib
/Library/MobileSubstrate/DynamicLibraries/pccontrol.dylib
/Library/PreferenceBundles/SatellaPrefs.bundle/SatellaPrefs
/Library/PreferenceBundles/iOSGodsiAPCracker.bundle/iOSGodsiAPCracker