Treasure Chest Party Quest: From DOOM to exploit
25/11/2020
Context and objectives
From an attacker's point of view, gaining code execution on a printer connected to the LAN can be interesting for several reasons:
- It can provide a long-term persistence mechanism, as printers are less likely to be reinstalled than workstations
- It can be used to perform lateral movement within the internal network
- It can give access to sensitive documents that may be scanned and printed, but never stored on a workstation
Security researchers from Contextis managed to run the famous FPS video game Doom on a Canon MG6450 printer, as shown in [1]. They exploited weaknesses in the encryption algorithm used to encrypt newer firmware versions in order to craft and deploy a custom firmware. Based on their work, we managed to obtain the firmware used by Canon printers from the MX920 series, such as the Pixma MX925.
Thus, we spent a week, not trying to run a video game on a printer, but rather trying to find vulnerabilities that may be triggered from a compromised workstation, sharing the same LAN as the targeted printer.
In this blogpost, we’ll explain the firmware update mechanism, highlight the operating system behind Canon firmware, and talk about the attack surface and vulnerabilities that we found.
Firmware analysis
Firmware download over HTTP
MX920 series firmware can be updated manually, through the dedicated section on the web-interface, used to configure and manage the printer. The following URL is hardcoded in the firmware, and is used to download an XML file containing update information in order to obtain the latest firmware for a specific model:
http://gdlp01.c-wss.com/rmds/ij/ijd/ijdupdate/176b.xml
curl http://gdlp01.c-wss.com/rmds/ij/ijd/ijdupdate/176b.xml
<?xml version="1.0" encoding="UTF-8" ?>
<update_info>
<version>3.020</version>
<url>http://pdisp01.c-wss.com/gdl/WWUFORedirectTarget.do?id=MDQwMDAwNDgwNjAx&cmp=Z01&lang=EN</url>
<size>37127366</size>
</update_info>
The ID used in the URL, 176b, looks like the USB Product ID, and is used by Canon to reference a unique model. As shown on https://devicehunt.com/view/type/usb/vendor/04A9, it is related to the PIXMA MX920 Series.
If the version specified in the XML file is newer than the current one, a second URL (extracted from the XML) is used to obtain an HTTP redirection leading to the final URL in order to get the new firmware.
curl "http://pdisp01.c-wss.com/gdl/WWUFORedirectTarget.do?id=MDQwMDAwNDgwNjAx&cmp=Z01&lang=EN"
<html><head><title>302 Moved Temporarily</title></head>
<body bgcolor="#FFFFFF">
<p>This document you requested has moved temporarily.</p>
<p>It's now at <a href="http://gdlp01.c-wss.com/gds/6/0400004806/01/176BV3020AN.bin">http://gdlp01.c-wss.com/gds/6/0400004806/01/176BV3020AN.bin</a>.</p>
</body></html>
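The update check described above can be sketched in a few lines of Python. This is only an illustration of the flow, not the firmware's actual logic: the helper names are ours, the tags are extracted with regular expressions (the URL shown above contains raw `&` characters that a strict XML parser would reject), and we assume versions are compared numerically, component by component.

```python
import re
from typing import Optional


def parse_update_info(xml_text: str) -> dict:
    """Extract the <version>, <url> and <size> fields from the update XML."""
    fields = {}
    for tag in ("version", "url", "size"):
        match = re.search(rf"<{tag}>([^<]*)</{tag}>", xml_text)
        if match is None:
            raise ValueError(f"missing <{tag}> element")
        fields[tag] = match.group(1).strip()
    fields["size"] = int(fields["size"])
    return fields


def version_key(version: str) -> tuple:
    """Turn '3.020' into (3, 20). Numeric comparison is an assumption."""
    return tuple(int(part) for part in version.split("."))


def is_newer(remote: str, current: str) -> bool:
    """True when the advertised version is strictly newer than the current one."""
    return version_key(remote) > version_key(current)


def redirect_target(html_body: str) -> Optional[str]:
    """Pull the final firmware URL out of the 302 response body."""
    match = re.search(r'href="([^"]+)"', html_body)
    return match.group(1) if match else None
```

Fed with the two responses shown above, `parse_update_info` yields the intermediate URL and `redirect_target` yields the final `.bin` location.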
It’s interesting to note that the firmware doesn’t embed an HTTP client like curl or wget but rather implements a custom one, using low-level sockets (User-Agent "IP Client/1.0.0.0").
Decrypting the firmware
As documented in Contextis's research [1], the firmware update file is encrypted using a custom XOR-based scheme.
Reimplementing Contextis's known-plaintext attack was just a matter of writing a script and analyzing the XOR patterns.
The script used for this attack is available in our Github repository at https://github.com/synacktiv/canon-tools
We are aware that newer printers released by Canon are fitted with firmware on which this attack doesn’t work anymore.
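To illustrate the general idea of a known-plaintext attack on an XOR scheme, here is a minimal sketch. It assumes a plain repeating key, which is deliberately simpler than the Canon scheme (the real tool is in the repository linked above), and all function names are ours.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (encryption and decryption are identical)."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def recover_keystream(ciphertext: bytes, known_plaintext: bytes) -> bytes:
    """XOR the start of the ciphertext with plaintext we already know."""
    return bytes(c ^ p for c, p in zip(ciphertext, known_plaintext))


def shortest_period(stream: bytes) -> int:
    """Find the smallest period after which the recovered keystream repeats."""
    for period in range(1, len(stream) + 1):
        if all(stream[i] == stream[i % period] for i in range(len(stream))):
            return period
    return len(stream)
```

If a region of the file is known to contain predictable bytes (padding or a fixed header, for instance), `recover_keystream` exposes the key material, and `shortest_period` tells how much of it is needed to decrypt the rest.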
Decompressing the main firmware
The decrypted firmware is a bootloader used to decompress and run the main firmware (ARM code). The first step is thus to take a look at the bootloader in order to find the decompression routine.
After identifying a few functions manually, IDA's disassembly engine starts to pay off and a few functions are automatically discovered. This one grabbed our attention:
_BYTE *__fastcall small_decompress_routine(_BYTE *dictionnary, _BYTE *dest, int uncompressed_length)
{
  _BYTE *end; // r2
  int first_byte; // r3
  int same_data_count; // r4
  int chunk_size; // r5
  int i; // r4
  char tmp_same_byte; // r6
  int v9; // r4
  unsigned int off_; // r3
  _BYTE *src_start; // r4
  char *src; // r4
  int chunk_size_; // r3
  char byte; // r6

  end = &dest[uncompressed_length];
  do
  {
    first_byte = (unsigned __int8)*dictionnary++;
    same_data_count = first_byte & 3;
    if ( (first_byte & 3) == 0 )
      same_data_count = (unsigned __int8)*dictionnary++;
    chunk_size = first_byte >> 4;
    if ( !(first_byte >> 4) )
      chunk_size = (unsigned __int8)*dictionnary++;
    for ( i = same_data_count - 1; i; --i )
    {
      tmp_same_byte = *dictionnary++;
      *dest++ = tmp_same_byte;
    }
    if ( chunk_size )
    {
      v9 = (unsigned __int8)*dictionnary++;
      off_ = (unsigned int)(first_byte << 28) >> 30;
      src_start = &dest[-v9];
      if ( off_ == 3 )
        off_ = (unsigned __int8)*dictionnary++;
      src = &src_start[-256 * off_];
      chunk_size_ = chunk_size + 1;
      do
      {
        byte = *src++;
        *dest++ = byte;
        --chunk_size_;
      }
      while ( chunk_size_ >= 0 );
    }
  }
  while ( dest < end );
  return dictionnary;
}
It implements a small dictionary-based decompression routine, similar to the LZ family of algorithms.
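For readers who prefer not to emulate, the routine above can be transcribed to Python. The following is a line-by-line port of the decompiled pseudocode and has not been validated against a real firmware image; the function name and the error handling for malformed streams are our additions.

```python
from typing import Tuple


def small_decompress(stream: bytes, out_len: int) -> Tuple[bytes, int]:
    """Port of small_decompress_routine; returns (output, bytes consumed)."""
    out = bytearray()
    pos = 0
    while True:
        first = stream[pos]
        pos += 1
        # Low 2 bits: literal count + 1 (0 means "read it from the next byte").
        literal_field = first & 3
        if literal_field == 0:
            literal_field = stream[pos]
            pos += 1
        # High nibble: match length field (0 means "read it from the next byte").
        chunk_size = first >> 4
        if chunk_size == 0:
            chunk_size = stream[pos]
            pos += 1
        if literal_field == 0:
            # The C loop counter would wrap around here; treat it as malformed.
            raise ValueError("malformed stream: zero literal field")
        literal_count = literal_field - 1
        out += stream[pos:pos + literal_count]
        pos += literal_count
        if chunk_size:
            low = stream[pos]
            pos += 1
            # (first << 28) >> 30 on 32 bits selects bits 2..3 of the first byte.
            high = (first >> 2) & 3
            if high == 3:
                high = stream[pos]
                pos += 1
            src = len(out) - low - 256 * high
            if not 0 <= src < len(out):
                raise ValueError("malformed stream: bad back-reference")
            # The do/while in the decompiled code copies chunk_size + 2 bytes,
            # one byte at a time so that overlapping copies behave as in C.
            for _ in range(chunk_size + 2):
                out.append(out[src])
                src += 1
        if len(out) >= out_len:
            break
    return bytes(out), pos
```

As in the original, the loop stops once the output reaches the requested length, so the last back-reference may produce a few bytes more than `out_len`.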
By looking at the calling function (X-Ref), we can identify that:
- The dictionary is located at 0x043ff000
- The uncompressed firmware will be stored at 0x1DF9DE00
- The uncompressed firmware size is 0x108A780
As we identified the dictionary at offset 0x3FF000, relative to the start of the ROM, we can deduce that the base address of the bootloader is 0x04000000.
A Unicorn [2] script was written to emulate the decompression routine and get the uncompressed firmware:
#!/usr/bin/env python3
from unicorn import *
from unicorn.arm_const import *

def hook_code(mu, address, size, user_data):
    if address == 0x04220384:
        R1 = mu.reg_read(UC_ARM_REG_R1)
        R2 = mu.reg_read(UC_ARM_REG_R2)
        print('Uncompressed bytes %x / %x' % (R1, R2))

BASE = 0x04000000
STACK_ADDR = 0xFFFFFFFF
STACK_SIZE = 2 * 1024 * 1024 # 2 MB stack size
FW_PATH = 'firmware/176BV3020AN_decrypted-fixed.bin'

mu = Uc(UC_ARCH_ARM, UC_MODE_ARM|UC_MODE_THUMB)

with open(FW_PATH, 'rb') as f:
    fw_data = f.read()

# Map stack
mu.mem_map(STACK_ADDR + 1 - STACK_SIZE, STACK_SIZE)

# Map firmware at 0x04000000
mu.mem_map(BASE, 16*1024*1024) # 16MB
mu.mem_write(BASE, fw_data)

# 0x1DF9DE00: address of decompression buffer of size 0x108A780
mu.mem_map(0x1DF9DE00 & (~(0x1000-1)) , (0x108A780 & (~(0x1000-1))) + 0x2000)

mu.hook_add(UC_HOOK_CODE, hook_code)
mu.reg_write(UC_ARM_REG_SP, STACK_ADDR & (~(0x1000-1)))

decompression_routine = 0x04220998+1
decompression_routine_end = 0x042209ae
mu.emu_start(decompression_routine, decompression_routine_end)

with open('firmware/176BV3020AN_decrypted-uncompressed.bin', 'wb') as f:
    memory = mu.mem_read(0x1DF9DE00, 0x108A780)
    f.write(memory)
Problems encountered during analysis
As the decrypted firmware is a raw binary blob rather than a structured file such as an ELF or a PE, IDA can’t easily tell data from code. In addition, the entry point, base address and memory map of the firmware were unknown. While these problems are common when analyzing firmware, they were sometimes a hindrance to our analysis.
At the end of our reverse-engineering week, we identified at least 58k functions in the MX920 series firmware. With a bit of scripting, we were able to automatically rename a few functions which used debugging primitives.
On reinventing the wheel
During the writing of this blogpost, we discovered that someone (leecher1337) had published exactly the same kind of decryption and decompression tool [3], several months before we even looked at the printer. So there are now at least two tools allowing you to decrypt Canon Pixma firmware ¯\_(ツ)_/¯.
DryOS
The operating system on the printer is based on a custom Real Time Operating System named “DryOS”:
DRYOS version 2.3, release #0049+SMP
This system is itself based on µITRON, a Japanese RTOS specification, as can be seen in the following string: "ITRON4.0"
This system is used throughout all of the printer firmware images we looked at, but is also used in Canon DSLR cameras [4][5].
RTOS Tasks
All of the services provided by the printer firmware are centered around more than 300 concurrent tasks. In the binary, these tasks are defined using the following structure:
struct dry_task {
    int field_0;
    int field_4;
    void *lpTaskFunction;
    int field_C;
    int field_10;
    int field_14;
    char *lpszTaskName;
    int field_1C;
};
Using this structure, we were able to identify the functions responsible for handling the different tasks in the firmware. For example, the following code extract shows the definition of the main HTTP server handler, and two of its workers.
ROM:005E3E8C task <0, 0, TSK_HTTPD_sub_24104+1, 0xA, 0x600, 0, aTskhttpd, 1>; 0x4C
ROM:005E3E8C task <0, 0, HTTP_WORKER1_sub_26458+1, 0xA, 0x3200, 0, aTskhttpwork0,1> ; 0x4D
ROM:005E3E8C task <0, 0, HTTP_WORKER_2sub_26466+1, 0xA, 0x3200, 0, aTskhttpwork1, 1> ; 0x4E
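Once the structure layout is known, walking the task table can be automated. The sketch below assumes 32-bit little-endian fields (eight 4-byte members, 0x20 bytes per entry) and a flat image loaded at a known base address; the helper names are ours and the meaning of the unnamed fields is left open.

```python
import struct
from typing import List, NamedTuple

# Eight 32-bit little-endian words per entry (endianness is an assumption).
TASK_STRUCT = struct.Struct("<8I")


class DryTask(NamedTuple):
    field_0: int
    field_4: int
    task_function: int
    field_c: int
    field_10: int
    field_14: int
    task_name_ptr: int
    field_1c: int

    @property
    def is_thumb(self) -> bool:
        # On ARM, an odd function address means the code runs in Thumb mode.
        return bool(self.task_function & 1)


def parse_tasks(image: bytes, offset: int, count: int) -> List[DryTask]:
    """Decode `count` consecutive dry_task entries starting at `offset`."""
    return [
        DryTask(*TASK_STRUCT.unpack_from(image, offset + i * TASK_STRUCT.size))
        for i in range(count)
    ]


def read_cstring(image: bytes, base: int, address: int) -> str:
    """Read the NUL-terminated task name that a pointer refers to."""
    start = address - base
    end = image.index(b"\x00", start)
    return image[start:end].decode("ascii", errors="replace")
```

The `+1` visible in the listing above is the Thumb bit, which is why `is_thumb` simply tests the lowest bit of the function pointer.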
Attack surface
In order to identify the attack surface, we ran a quick nmap [6] check on the printer, which indicated at least 8 listening network services.
Opened TCP ports
nmap -A -p- 192.168.1.36
Starting Nmap 7.80 ( https://nmap.org ) at 2020-08-07 14:00 CEST
Nmap scan report for 192.168.1.36
Host is up (0.047s latency).
Not shown: 65532 closed ports
PORT    STATE SERVICE VERSION
80/tcp  open  http    Canon Pixma printer http config (KS_HTTP 1.0)
|_http-title: Site doesn't have a title.
515/tcp open  printer
631/tcp open  ipp     CUPS 1.4
|_http-server-header: CUPS/1.4
|_http-title: 404 Not Found
Service Info: Device: printer
Service detection performed. Please report any incorrect results at https://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 78.21 seconds
Opened UDP ports
sudo nmap -sU -p- 192.168.1.36
Starting Nmap 7.80 ( https://nmap.org ) at 2020-08-07 14:03 CEST
Nmap scan report for 192.168.1.36
Host is up (0.031s latency).
Not shown: 65528 closed ports
PORT     STATE         SERVICE
68/udp   open|filtered dhcpc
500/udp  open|filtered isakmp
3702/udp open|filtered ws-discovery
5353/udp open          zeroconf
8611/udp open          canon-bjnp1
8612/udp open          canon-bjnp2
8613/udp open          canon-bjnp3
MAC Address: 60:12:8B:68:F8:77 (Canon)
Nmap done: 1 IP address (1 host up) scanned in 65.48 seconds
Custom HTTP Server
The firmware implements its own custom HTTP server. This custom server is recognizable thanks to the particular server string: KS_HTTP.
A Shodan [7] lookup shows that around 3500 such servers are publicly accessible over the Internet.
The web server is handled by one main task and several workers, which serve the available web pages using an array of the following structure:
struct web_page_handler {
    void *field_0;
    char *base_uri;
    char *filename;
    void *handler;
    int field_10;
    int field_14;
};
For example, the CGI script /English/pages_WinUS/cgi_oth.cgi, targeted in our exploit, is defined in the handlers array in the following manner:
ROM:0072A808 web_page_handler <null_byte, aEnglishPagesWi, aCgiOthCgi, \
ROM:0072A808 VULN_CGI_OTH_CGI+1, 0, 0>; 0x78
Each handler uses a global shared object to access the request’s data.
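The table-driven dispatch can be pictured with the following sketch. It is a hypothetical model rather than the firmware's code: we assume a request is matched when its path equals `base_uri` followed by `filename`, and we pass the request object explicitly where the firmware relies on a global one. All class and function names are ours.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Request:
    path: str
    params: Dict[str, str] = field(default_factory=dict)


@dataclass
class WebPageHandler:
    base_uri: str
    filename: str
    handler: Callable[[Request], str]


def find_handler(table: List[WebPageHandler], path: str) -> Optional[WebPageHandler]:
    """Linear scan of the handler array, matching base_uri + filename."""
    for entry in table:
        if path == entry.base_uri + entry.filename:
            return entry
    return None


def dispatch(table: List[WebPageHandler], request: Request) -> str:
    """Route a request to its handler, or answer 404 when none matches."""
    entry = find_handler(table, request.path)
    if entry is None:
        return "404 Not Found"
    return entry.handler(request)
```

With such a layout, every entry of the array is a distinct entry point worth auditing, which is how the CGI handlers discussed later were enumerated.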
BJNP
BJNP is a proprietary protocol, designed by Canon, in order to print documents over the network, and perform LAN service discovery.
Few resources are available on this protocol; a good starting point is the source code of the Debian package cups-backend-bjnp.
As this is a proprietary “binary” protocol (i.e. one handling many “size” fields), it is a target of choice when looking for out-of-bounds read/write or integer overflow vulnerabilities.
Vulnerabilities
CGI stack buffer overflow
CGI scripts are a prime target for exploitation because they parse user input.
This firmware is no exception: the parsing of textual arguments lacks length checks.
The vulnerable function is called from several CGI handlers, such as cgi_oth.cgi, cgi_ips.cgi or cgi_lan.cgi, thus allowing several stack-based buffer overflows.
The following code extract, taken from the cgi_oth.cgi page handler, illustrates the pattern for this vulnerability.
char szOutput[128]; // [sp+8h]
[...]
lpszInput = lpHTTPObject->vtable->get_param(lpHTTPObject, "OTH_TXT1");
URL_decode_sub_1E20EFC6(lpszInput, szOutput);
The function URL_decode_sub_1E20EFC6 is responsible for the overflow, as it copies the characters of the parameter into the provided stack buffer without checking its size. It also decodes "%"-encoded characters, which allows arbitrary bytes to be written to the stack buffer.
int __fastcall URL_decode_sub_1E20EFC6(unsigned __int8 *lpszInput, unsigned __int8 *lpszOutput)
{
  int cur_char; // r0
  char *v5; // r4
  int result; // r0
  char v7[24]; // [sp+0h] [bp-18h] BYREF

  while ( 1 )
  {
    result = *lpszInput; // Return when the parameter is finished
    if ( !*lpszInput )
      break;
    cur_char = *lpszInput;
    if ( cur_char == '%' ) {
      [...] // Convert % encoded characters
    }
    else if ( cur_char == '&' ) { // Terminate the parameter parsing if we attain the & separator
      ++lpszInput; *lpszOutput++ = 0;
    } else {
      if ( cur_char == '+' ) { // Replace + by spaces
        ++lpszInput; *lpszOutput = 0x20;
      } else {
        *lpszOutput = *lpszInput++; // Copy the character
      }
      ++lpszOutput;
    }
  }
  *lpszOutput = result;
  return result;
}
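To make the missing bound check concrete, here is a small Python model of the decoder. It is a simplification under stated assumptions: the "%" branch is elided in the listing above, so we assume standard two-digit hex decoding, and the 128-byte size comes from the `szOutput` declaration in the handler. Function names are ours.

```python
BUFFER_SIZE = 128  # size of szOutput in the cgi_oth.cgi handler
HEX_DIGITS = b"0123456789abcdefABCDEF"


def url_decode_unbounded(param: bytes) -> bytes:
    """Decode like the firmware routine: no limit on the output length."""
    out = bytearray()
    i = 0
    while i < len(param):
        c = param[i]
        digits = param[i + 1:i + 3]
        if c == ord("%") and len(digits) == 2 and all(d in HEX_DIGITS for d in digits):
            out.append(int(digits, 16))  # assumed behaviour of the elided branch
            i += 3
        elif c == ord("&"):
            out.append(0)                # separator is written as a NUL byte
            i += 1
        elif c == ord("+"):
            out.append(0x20)             # '+' becomes a space
            i += 1
        else:
            out.append(c)                # plain copy
            i += 1
    out.append(0)                        # final terminator
    return bytes(out)


def bytes_past_buffer(param: bytes, size: int = BUFFER_SIZE) -> int:
    """Number of bytes that would land beyond a stack buffer of `size` bytes."""
    return max(0, len(url_decode_unbounded(param)) - size)
```

Anything longer than 127 decoded bytes spills past the buffer, and the "%" decoding means the spilled bytes are not restricted to printable characters.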
A proof of concept that crashes a targeted printer fits in the following command:
curl 'http://target/English/pages_WinUS/cgi_oth.cgi' --data 'OTH_TXT2=++++AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA’'
Exploitation
Exploiting this vulnerability leads to remote code execution with full privileges.
While the firmware lacks exploit mitigations such as stack cookies, writing a shellcode for the CGI stack-based buffer overflow is a bit more complicated than usual because the underlying system differs a lot from common ones.
As we know our readers like challenges, we leave this as an exercise to them :)
BJNP out-of-bounds write
The BJNP protocol is handled by the following tasks:
tskBJNP
tskBJNPPrinterTCP
tskBJNPPrinterUDP
tskBJNPScannerTCP
tskBJNPScannerUDP
The task tskBJNPPrinterTCP initializes a context structure for handling BJNP messages received on TCP port 8611, and then uses socket(), bind(), listen(), accept() to receive incoming connections from BJNP clients. Each client is handled in BJNP_tcp_process_message, which reads a 16-byte structure from the socket. This structure is defined in cups-backend-bjnp as follows:
struct __attribute__((__packed__)) bjnp_header {
    char BJNP_id[4];      /* string: BJNP */
    uint8_t dev_type;     /* 1 = printer, 2 = scanner */
    uint8_t cmd_code;     /* command code/response code */
    uint16_t unknown1;    /* unknown, always 0? */
    uint16_t seq_no;      /* sequence number */
    uint16_t session_id;  /* session id for printing */
    uint32_t payload_len; /* length of command buffer */
};
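The same 16-byte layout can be expressed with Python's `struct` module, which is handy for inspecting captured traffic. This is a sketch: the helper names are ours, and multi-byte fields are assumed to be in network byte order (big-endian).

```python
import struct
from typing import NamedTuple

# 4-byte magic, two bytes, three 16-bit fields, one 32-bit length = 16 bytes.
# Big-endian (network order) is an assumption.
BJNP_HEADER = struct.Struct(">4sBBHHHI")
BJNP_MAGIC = b"BJNP"


class BjnpHeader(NamedTuple):
    magic: bytes
    dev_type: int
    cmd_code: int
    unknown1: int
    seq_no: int
    session_id: int
    payload_len: int


def build_header(dev_type: int, cmd_code: int, seq_no: int = 0,
                 session_id: int = 0, payload_len: int = 0) -> bytes:
    """Serialize a header; payload_len is whatever the sender declares."""
    return BJNP_HEADER.pack(BJNP_MAGIC, dev_type, cmd_code, 0,
                            seq_no, session_id, payload_len)


def parse_header(data: bytes) -> BjnpHeader:
    """Decode the first 16 bytes of a message and check the magic."""
    if len(data) < BJNP_HEADER.size:
        raise ValueError("short read: a BJNP header is 16 bytes")
    header = BjnpHeader(*BJNP_HEADER.unpack_from(data))
    if header.magic != BJNP_MAGIC:
        raise ValueError("bad magic")
    return header
```

Note that `payload_len` is a plain 32-bit value declared by the sender, so a receiver has to validate it against its own buffer size before reading that many bytes.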
After reading the 16-byte bjnp_header structure, the magic number is checked, and a dispatch function is called (using a function pointer) in order to process the message:
int __fastcall BJNP_tcp_process_message(bjnp_tcp_ctx *ctx)
{
  char *v2; // r1
  void *v3; // r2
  void *v4; // r3
  int v6; // r3

  while ( 1 )
  {
    if ( bjnp_tcp_read_n_bytes(ctx->sockclient, ctx->buff_addr, 16, 0, &ctx->last_read_retval) != 16 )// read a 16 bytes bjnp_header structure
    {
      if ( ctx->last_read_retval < 0 )
        return bjnp_close_context(ctx, v2, (int)v3, (int)v4);
      if ( !ctx->last_read_retval )
        break;
    }
    if ( bjnp_read_magic((unsigned int *)ctx->buff_addr) == 'BJNP' )// check the magic number correctness
    {
      if ( ((int (__fastcall *)(bjnp_tcp_ctx *))ctx->bnjp_callback)(ctx) < 0 ) // call a dispatch function in order to process the message
      {
        if ( ctx->last_read_retval < 0 )
          return bjnp_close_context(ctx, v2, (int)v3, (int)v4);
        if ( !ctx->last_read_retval )
          return sub_1E00106C(ctx, (int)v2, v3, v4);
      }
    }
    else // if the magic number is invalid, reply to the client with a special BJNP message
    {
      qmemcpy(ctx->buff_addr, "BJNP", 4);
      bjnp_build_response_header(ctx->buff_addr, 0x8200, 0);
      if ( NW_send(ctx->sockclient, (int)ctx->buff_addr, 0x10u, 0) < 0 )
      {
        bjnp_error("bjnp_tcp.c", 292, (int)"NW_send error");
        sub_1E1F0DDE(1, "bjnp_tcp.c", 293, v6);
      }
    }
  }
  return sub_1E00106C(ctx, (int)v2, v3, v4);
}
The dispatch function calls several routines according to the command code (cmd_code) specified in the header. An out-of-bounds write vulnerability has been identified within 2 routines, called when dev_type is set to 0 and cmd_code is set to either 1 or 2. Here is the code of the routine related to cmd_code 1:
int __fastcall bjnp_tcp_handle_msg_0x01(bjnp_tcp_ctx *ctx)
{
  unsigned int payload_len; // r5
  int v3; // r6

  payload_len = bjnp_read_payload_len((int)ctx->buff_addr);
  bjnp_build_response_header(ctx->buff_addr, 0, 0);
  v3 = bjnp_tcp_send(ctx->sockclient, (int)ctx->buff_addr, 16u);
  if ( bjnp_read_response(ctx, payload_len) != payload_len )
    v3 = -1;
  return v3;
}
The bjnp_read_payload_len function returns the payload_len field from the header structure filled by BJNP_tcp_process_message. As this size is specified by the TCP client which sent the header, it is entirely attacker-controlled. This size is then passed to bjnp_read_response to specify how many bytes must be read from the socket and copied into the destination buffer. This gives an out-of-bounds write primitive, as the destination buffer is only 0x6000 bytes long while the copy size is a 32-bit integer. The destination buffer is located in memory at 0x18998160, just after the BJNP TCP context structure. An exploitation scenario could be to overwrite the callback function pointer of the BJNP UDP context structure, located shortly after the destination buffer, at 0x1899E1A8.
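The distances involved can be worked out from the addresses given above. The following arithmetic assumes that bjnp_read_response writes linearly from the start of the destination buffer and that pointers are 4 bytes wide; the constant and function names are ours.

```python
DEST_BUFFER = 0x18998160       # start of the destination buffer
DEST_BUFFER_SIZE = 0x6000      # its declared size
UDP_CALLBACK_PTR = 0x1899E1A8  # callback pointer in the BJNP UDP context
POINTER_SIZE = 4               # 32-bit ARM

buffer_end = DEST_BUFFER + DEST_BUFFER_SIZE
gap_after_buffer = UDP_CALLBACK_PTR - buffer_end
offset_to_callback = UDP_CALLBACK_PTR - DEST_BUFFER
min_payload_len = offset_to_callback + POINTER_SIZE


def overwrites_callback(payload_len: int) -> bool:
    """True when a linear write of payload_len bytes fully covers the pointer."""
    return payload_len >= min_payload_len


print(hex(buffer_end), hex(gap_after_buffer), hex(min_payload_len))
```

Under these assumptions the pointer sits only 0x48 bytes past the end of the buffer, so a declared payload length of 0x604C would be enough to replace it entirely.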
Exploitation
We didn’t manage to trigger this vulnerability on a physical device, as our test model (an MX 475 series printer) doesn’t have TCP BJNP ports open by default (cf: Opened TCP ports).
If you have a Canon MX920 series printer at home, feel free to implement a POC and tell us whether you managed to trigger this vulnerability!
Countermeasures
As you'll see in the timeline section, Canon decided not to patch our vulnerabilities. In order to mitigate the risks posed by vulnerable devices, we recommend setting up authentication with strong passwords and, if possible, segregating the printers from the rest of the network. We also recommend keeping them updated when security patches are available. It's worth mentioning that Canon also released a document named "Useful tips for reducing the Risk of Unauthorized Access for Inkjet Printer", available at https://global.canon/en/support/security/pdf/inkjet-printer.pdf.
Conclusion
While it is really fun to play Doom on a printer, building on previous research can unlock new quests [8] for finding vulnerabilities in printer firmware, which is equally if not more fun :)
Timeline
- 06/07/2020 - Start of the research dedicated to firmware decryption / decompression and attack surface analysis (2 days)
- 03/08/2020 - Firmware reverse engineering / vulnerability research (5 days)
- 04/08/2020 - Stack Buffer-overflow vulnerability identified in cgi_oth.cgi
- 07/08/2020 - Second-hand Canon MX 425 Printer purchased and first vulnerability (CGI stack buffer overflow) confirmed with a POC
- 11/08/2020 - First mail sent to product-security@canon-europe.com in order to discuss vulnerability reporting process
- 12/08/2020 - Vulnerability details sent to Canon Europe Product Security
- 27/08/2020 - New mail sent to Canon as they didn't reply to our vulnerability report
- 28/08/2020 - Canon Europe Product Security replied that our findings have been forwarded to Canon Inc
- 17/11/2020 - More than 90 days have passed since the vulnerability details were reported; we asked Canon Europe Product Security for an update
- 17/11/2020 - Canon Europe Product Security replied that "Following some investigation, this issue appears to relate to CVE-2013-4615. By following the ‘Useful Tips for Reducing the Risk of Unauthorized Access for Inkjet Printer’ https://global.canon/en/support/security/pdf/inkjet-printer.pdf we believe will mitigate the vulnerability."
- 17/11/2020 - We notify Canon Europe Product Security that we understand our vulnerability won't be patched and should be mitigated using their recommendations. We also announce that we'll publish our work and findings.
- 25/11/2020 - CVE-2020-29073 assigned to the stack buffer overflow vulnerability
- 1. Hacking Canon Pixma Printers - Doomed Encryption
- 2. Unicorn - The ultimate CPU emulator
- 3. https://github.com/leecher1337/pixma
- 4. DryOS PIXMA Printer Shell
- 5. Reversing a Japanese Wireless SD Card: From Zero to Code Execution
- 6. Nmap: The network mapper
- 7. Shodan request
- 8. ALESTORM - Treasure Chest Party Quest