The fix for CVE-2020-9859 and the lightspeed vulnerability
TL;DR: iOS 13.5.1 fixes the vulnerability we discussed in a previous blogpost, but a memleak is still left in the code if you are FAST!
A quick follow-up
As expected, Apple was prompt to fix the vulnerability with the release of iOS 13.5.1. The new kernel strings show that it was compiled on May 26, 2020, only two days after the release of unc0ver. The security notes say:
Kernel
Available for: iPhone 6s and later, iPad Air 2 and later, iPad mini 4 and later, and iPod touch 7th generation
Impact: An application may be able to execute arbitrary code with kernel privileges
Description: A memory consumption issue was addressed with improved memory handling.
CVE-2020-9859: unc0ver
So we expect both the vulnerability and the memleak to be fixed. By reverse-engineering the new implementation of the lio_listio syscall, we can see that they refactored the way they do the clean-up. In the enqueuing loop, we still have 3 error cases where the asynchronous I/O that was supposed to be scheduled is not:
- the user provided a NULL address in the array of struct aiocb;
- lio_create_entry() returned NULL for entrypp as its output pointer, which can happen with LIO_NOP operations;
- the user already scheduled aio_max_requests_per_process I/O or the given one is already scheduled.
In each case, the lio_context is accessed, as it is necessary to decrement lio_context->io_issued. With the patch, we now have the following pattern, where free_context is set to true when this aborted I/O was the last and only one to be scheduled:
aio_proc_lock_spin(p);
lio_context->io_issued--;
if (lio_context->io_issued == 0)
    free_context = TRUE;
aio_proc_unlock(p);
continue;
This keeps track of the case where no I/O was dispatched at all, leaving the syscall thread responsible for freeing the context. Once something is scheduled, only the worker thread can free the context. Indeed, within the syscall loop, free_context cannot be set to true once aio_enqueue_work() has been called, as lio_context->io_issued would not be zero.
Moreover, if lio_context is still accessed while an I/O could have been scheduled, it's only because we are sure that the context was not freed. Indeed, we know for sure that lio_context->io_issued is still greater than lio_context->io_completed.
This is because, for any iteration of the loop, we have the invariant lio_context->io_completed <= P and lio_context->io_issued >= (P+1), with P being the number of times aio_enqueue_work() was called. Therefore the worker thread never sees lio_context->io_issued == lio_context->io_completed and cannot free the context.
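The invariant can be checked on a toy model. The sketch below is not kernel code: `struct lio_model` and `lio_model_invariant_holds()` are hypothetical names, io_issued is assumed to start at the number of entries passed to the syscall, and the worker is assumed to complete every I/O the instant it is enqueued, which is the worst case for the syscall thread.

```c
#include <stdbool.h>

/* Toy model of the two counters kept in the lio context (hypothetical type). */
struct lio_model {
    int io_issued;    /* assumed to start at nent, decremented per aborted entry */
    int io_completed; /* incremented by the worker for each finished I/O */
};

/*
 * Replays the enqueuing loop over nent entries. valid[i] tells whether
 * entry i is enqueued (true) or aborted (false).
 * Returns true if, at the start of every iteration, io_completed <= P and
 * io_issued >= P + 1, with P the number of entries enqueued so far.
 */
bool lio_model_invariant_holds(const bool *valid, int nent)
{
    struct lio_model ctx = { .io_issued = nent, .io_completed = 0 };
    int p = 0; /* P: number of modelled aio_enqueue_work() calls */

    for (int i = 0; i < nent; i++) {
        if (ctx.io_completed > p || ctx.io_issued < p + 1)
            return false;         /* the worker could have freed the context */
        if (valid[i]) {
            p++;
            ctx.io_completed = p; /* worst case: the worker is instantaneous */
        } else {
            ctx.io_issued--;      /* abort path of the patched loop */
        }
    }
    return true;
}
```

Calling it with any mix of valid and aborted entries should return true: at the start of iteration i, io_issued still counts the entries that have not been looked at yet, so it stays strictly above P.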
As an exit routine we now have something similar to the following (please note that this is a reconstruction from disassembly and not actual source):
if (free_context == TRUE)
    goto ExitRoutine;
switch (uap->mode) {
case LIO_WAIT:
    aio_proc_lock_spin(p);
    while (lio_context->io_completed < lio_context->io_issued) {
        result = msleep(lio_context, aio_proc_mutex(p), PCATCH | PRIBIO | PSPIN, "lio_listio", 0);
        if (result != 0) {
            break;
        }
    }
    if (lio_context->io_completed == lio_context->io_issued)
    {
        free_context = TRUE;
    }
    else
    {
        free_context = FALSE;
        lio_context->io_waiter = 0;
    }
    aio_proc_unlock(p);
    break;
case LIO_NOWAIT:
    {
        free_context = FALSE;
        break;
    }
}
if (call_result == -1) {
    call_result = 0;
    *retval = 0;
}
ExitRoutine:
if (entryp_listp != NULL) {
    FREE( entryp_listp, M_TEMP );
}
if (aiocbpp != NULL) {
    FREE( aiocbpp, M_TEMP );
}
if (free_context) {
    free_lio_context(lio_context);
}
From that, we understand that in the LIO_NOWAIT case the context is now never freed by the syscall unless an error was detected beforehand, meaning that no worker dealt with I/O.
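The decision taken by the exit routine can be condensed into one small function. This is only a reading aid built from the reconstruction, not kernel code: the enum, the function name and its parameters are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two modes handled by the exit routine. */
enum lio_mode_model { MODEL_LIO_WAIT, MODEL_LIO_NOWAIT };

/*
 * Returns true when the syscall thread itself frees the context.
 *  flag_set_in_loop:         free_context was set by an abort path
 *                            (io_issued dropped to zero in the loop)
 *  all_completed_after_wait: io_completed == io_issued once the msleep()
 *                            loop is left (only meaningful for LIO_WAIT)
 */
bool syscall_frees_context(bool flag_set_in_loop,
                           enum lio_mode_model mode,
                           bool all_completed_after_wait)
{
    if (flag_set_in_loop)
        return true;                     /* straight to ExitRoutine */
    if (mode == MODEL_LIO_WAIT)
        return all_completed_after_wait; /* an interrupted wait leaves it to the worker */
    return false;                        /* LIO_NOWAIT: the worker must free it */
}
```

In the LIO_NOWAIT row, everything therefore depends on the worker noticing that the two counters match.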
So the vulnerability is now fixed and there is no memleak anymore, right? Well... not really. The following race can still happen in the LIO_NOWAIT case:
- The syscall expects to schedule I/O, with at least one valid and one invalid (with LIO_NOP for instance).
- The syscall enqueues the valid I/O and a context switch happens before the invalid ones are handled.
- The kernel worker deals with the I/O without cleaning the context (as lio_context->io_issued is still too high).
- The syscall thread is resumed, and returns without freeing the context as lio_context->io_issued is not zero.
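The interleaving above can be replayed deterministically on a toy model. Everything in this sketch is an assumption made for illustration: `struct lio_race_model`, `model_worker_drain()` and `lio_nowait_model_leaks()` are hypothetical names, the worker is modelled as completing all pending I/O at one chosen point of the loop and once more after the syscall returns, and io_issued is assumed to start at the number of entries.

```c
#include <stdbool.h>

/* Toy state shared by the modelled syscall thread and worker (hypothetical). */
struct lio_race_model {
    int  io_issued;
    int  io_completed;
    int  pending; /* enqueued but not yet completed */
    bool freed;
};

/* Worker: complete everything pending, free once the counters match. */
static void model_worker_drain(struct lio_race_model *m)
{
    while (m->pending > 0) {
        m->pending--;
        m->io_completed++;
        if (m->io_completed == m->io_issued)
            m->freed = true;
    }
}

/*
 * Replays an LIO_NOWAIT call over nent entries; valid[i] is false for an
 * aborted entry (LIO_NOP for instance). The worker preempts the syscall
 * right before entry worker_runs_before is handled (pass nent for "only
 * after the loop"). Returns true if nobody ever frees the context.
 */
bool lio_nowait_model_leaks(const bool *valid, int nent, int worker_runs_before)
{
    struct lio_race_model m = { .io_issued = nent };
    bool free_context = false;

    for (int i = 0; i < nent; i++) {
        if (i == worker_runs_before)
            model_worker_drain(&m); /* context switch to the worker */
        if (valid[i]) {
            m.pending++;            /* modelled aio_enqueue_work() */
        } else {
            m.io_issued--;          /* abort path */
            if (m.io_issued == 0)
                free_context = true;
        }
    }
    if (free_context)
        m.freed = true;             /* LIO_NOWAIT exit only frees in this case */
    model_worker_drain(&m);         /* the worker eventually handles the rest */
    return !m.freed;
}
```

With one valid entry followed by one aborted entry, the model leaks only when the worker runs between the two; if the worker runs after the loop, it sees matching counters and frees the context.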
In that situation the context allocation is never freed. This is not a big deal as the leak is harder to trigger, but a single process can still exhaust the kalloc.16 memory pool, making the kernel crash (or, more likely, making jetsam kill critical processes).
Here is a jetsam log showing heavy usage of the kalloc.16 pool when we attempted to exhaust it.
{"ale_flag":true,"bug_type":"298","os_version":"iPhone OS 13.5.1 (17F80)","timestamp":"2020-06-03 09:16:34.45 -0700","incident_id":"8545A150-B2B2-45A1-BE27-BF92461C9277"}
{
"crashReporterKey" : "<REDACTED>",
"kernel" : "Darwin Kernel Version 19.5.0: Tue May 26 20:56:05 PDT 2020; root:xnu-6153.122.2~1\/RELEASE_ARM64_T8030",
"product" : "iPhone12,1",
"incident" : "<REDACTED>",
"date" : "<REDACTED>",
"build" : "iPhone OS 13.5.1 (17F80)",
"timeDelta" : 0,
"memoryStatus" : {
"compressorSize" : 0,
"compressions" : 545,
"decompressions" : 4,
"zoneMapCap" : 1439809536,
"largestZone" : "kalloc.16",
"largestZoneSize" : 1309753344,
"pageSize" : 16384,
"uncompressed" : 0,
"zoneMapSize" : 1368965120,
"memoryPages" : {
"active" : 13531,
"throttled" : 0,
"fileBacked" : 56769,
"wired" : 97240,
"anonymous" : 4880,
"purgeable" : 81,
"inactive" : 12406,
"free" : 83360,
"speculative" : 35712
}
}
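To put the numbers of this log in perspective, here is a small back-of-the-envelope helper. It assumes that kalloc.16 elements are exactly 16 bytes and that the zone is filled almost entirely by leaked contexts, so the result is only an upper bound; the function names are hypothetical.

```c
#include <stdint.h>

/* Values copied from the jetsam log above. */
#define LARGEST_ZONE_SIZE 1309753344ULL /* "largestZoneSize" (kalloc.16) */
#define ZONE_MAP_SIZE     1368965120ULL /* "zoneMapSize" */
#define ZONE_MAP_CAP      1439809536ULL /* "zoneMapCap" */

/* Number of fixed-size elements that fit in zone_bytes. */
uint64_t zone_element_count(uint64_t zone_bytes, uint64_t element_size)
{
    return zone_bytes / element_size;
}

/* Integer percentage of part over whole, rounded down.
 * Fine for these magnitudes; part * 100 would overflow for huge values. */
uint64_t percent_of(uint64_t part, uint64_t whole)
{
    return part * 100 / whole;
}
```

Under these assumptions, zone_element_count(LARGEST_ZONE_SIZE, 16) gives 81859584 elements, and kalloc.16 alone accounts for about 90% of the zone map cap while the whole zone map sits at about 95% of it.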
In conclusion, we are curious to see whether Apple will bother to fix the (unlikely) memleak and how they will do so.