A Programmer’s Apology

12 Sep 2017

XNU kernel debugging via VMware Fusion

Prelude

Being able to debug the XNU kernel is one of the basic skills my work requires.

It’s not daunting, but it is tedious.

So I decided to write up a post summarizing how to set up a debugging environment for the XNU kernel. Hopefully, it can serve as a canonical way to do that.

Confession

This article is partly taken from Damien DeVille’s Kernel debugging with LLDB and VMware Fusion.

The very first step in debugging the XNU kernel is to set up a proper debugging environment. That is easy, even if this is the first time you have encountered these tools. Just be patient :-)

Kernel debugging is typically a two-box affair, meaning that you need two machines to do it. You can satisfy this by installing a virtual machine as a guest that communicates with the host (the physical machine).

The basic assumption is that your current system is OS X.

In the following sections, I’ll take Sierra 10.12.6 (16G29) as an example.


Host machine

Install Xcode and the Kernel Debug Kit (KDK)

Xcode can be installed via the App Store (optional), and the KDK can be downloaded from Downloads for Apple Developers.

Note that the Kernel Debug Kit must match the virtual machine’s build version (as reported by sw_vers(1)), not your host machine’s build version. It should be installed on both the host and the virtual machine.

You can install multiple KDKs on the host if you have virtual machines running different OS versions.

See: /Library/Developer/KDKs/KDK_A.B.C_D.kdk/ReadMe.html

(The KDK for 16G29 is KDK_10.12.6_16G29.kdk)
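
Since the KDK has to match the VM’s build rather than the host’s, it can help to derive the expected install path from the VM’s version strings. The following is a minimal sketch: kdk_path is a hypothetical helper name, and it assumes the KDK_<version>_<build>.kdk naming shown above holds for your release.

```shell
# Hypothetical helper (not part of any Apple tool): build the expected KDK
# install path from a product version and a build version.
# Assumes the KDK_<product version>_<build>.kdk naming used above.
kdk_path() {
    printf '/Library/Developer/KDKs/KDK_%s_%s.kdk\n' "$1" "$2"
}

# Inside the VM, the two values come from sw_vers(1):
#   kdk_path "$(sw_vers -productVersion)" "$(sw_vers -buildVersion)"
```

For the VM used in this article, kdk_path 10.12.6 16G29 yields /Library/Developer/KDKs/KDK_10.12.6_16G29.kdk, which is the directory that should exist on the host before you attach.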

Install macOS in VMware Fusion

There are plenty of virtualization products you can use, such as VirtualBox, VMware Fusion, and Parallels Desktop. I’ll use VMware Fusion here.

After installing the software, install OS X in a VM. Basically, you need an ISO image to install OS X from scratch.

NOTE: macOS can also be installed from the recovery partition in Fusion.

Fortunately, there are already some scripts that do this job for you:

Simple bash script to create a Bootable ISO from macOS Sierra Install Image from Mac App Store

Create bootable ISO from HighSierra Installer

Also, here is a refined script based on the solutions above.

Before you run the script, download the macOS (Sierra, for example) installer application. The App Store automatically places it in /Applications/.

Update host firmware NVRAM boot-args

$ sudo nvram boot-args="debug=0x140 kext-dev-mode=1 -v"

The boot-args also need to be updated in the VM; I’ll explain them in the following sections.


Virtual machine

Install VMware Tools and the KDK

Once macOS is installed in the VM, you should also install VMware Tools (VMware Fusion menu -> Virtual Machine -> Install VMware Tools) so that Fusion-specific features work.

The Kernel Debug Kit must match the build version reported by sw_vers(1).

You may copy the development/debug kernels from the KDK

/Library/Developer/KDKs/KDK_A.B.C_D.kdk/System/Library/Kernels

into

/System/Library/Kernels

For 16G29, for example:

$ sudo cp /Library/Developer/KDKs/KDK_10.12.6_16G29.kdk/System/Library/Kernels/kernel.development /System/Library/Kernels

Update VM firmware NVRAM boot-args

NVRAM (non-volatile random-access memory) is firmware-managed storage that retains information even when the power is off. In our case, it stores the boot arguments that the system reads when booting.

This storage can be manipulated via nvram(8). Read the man page for its usage, since it varies between system versions.

A typical boot-args setting for kernel debugging is:

$ sudo nvram boot-args="debug=0x144 kext-dev-mode=1 kcsuffix=development pmuflags=1 -v"
  • debug=0x144: this debugging flag value is a combination of (DB_NMI | DB_ARP | DB_LOG_PI_SCRN), which ensures the VM can be reached by a remote debugger.

    DB_NMI (0x4) drops into the debugger on NMI (Command-Power, Command-Option-Control-Shift-Escape, or the interrupt switch).

    You usually don’t need DB_NMI in the host machine’s boot-args. If you do set it there, remap the Command-Option-Control-Shift-Escape key combination to another one; otherwise the host will trap into the debugger when you press that combination.

    DB_ARP (0x40) allows the debugger to ARP and route.

    DB_LOG_PI_SCRN (0x100) disables the graphical panic dialog.

    NOTE: If you mean to debug the kernel at boot time, you should also specify DB_HALT (0x1), which halts the machine at boot time and waits for a debugger to attach.

  • kext-dev-mode=1 allows us to load unsigned (without a Developer ID certificate) kernel extensions.

    If you don’t intend to debug kernel extensions, this argument is optional.

    NOTE: Since OS X El Capitan (10.11), the kext-dev-mode boot-arg is obsolete, so this flag is kept mainly for compatibility with older systems.

    Newer systems should use csrutil(8) to achieve the same functionality. For more, see Configuring System Integrity Protection.

    For older systems, you should first disable the system security policies so that you can alter NVRAM.

  • kcsuffix=development boots the system with the development kernel (the one previously copied into /System/Library/Kernels from the KDK). The alternative is kcsuffix=debug. If you don’t intend to debug the kernel itself, this argument is optional.

    [190214] NOTE: When you change the kcsuffix argument in boot-args, remember to clean up the prelinked kernel images (compressed kernel caches) in /System/Library/PrelinkedKernels.

  • pmuflags=1 disables the watchdog timers; this is used to avoid watchdog timer problems.

  • -v enables kernel verbose mode, which is useful when debugging.
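
To make the debug= bit mask less opaque, the sketch below decodes a value into the flag names discussed above. decode_debug_flags is a hypothetical helper, and it only knows the four flags named in this article (DB_HALT, DB_NMI, DB_ARP, DB_LOG_PI_SCRN); any other bits are silently ignored.

```shell
# Hypothetical helper: print the names of the debug= flags set in a value.
# Only the flags mentioned in this article are known to it.
decode_debug_flags() {
    value=$(($1))            # accepts hex (0x144) or decimal input
    names=""
    for entry in 0x1:DB_HALT 0x4:DB_NMI 0x40:DB_ARP 0x100:DB_LOG_PI_SCRN; do
        bit=${entry%%:*}     # text before the colon, e.g. 0x40
        name=${entry#*:}     # text after the colon, e.g. DB_ARP
        if [ $(($value & $bit)) -ne 0 ]; then
            names="${names:+$names }$name"
        fi
    done
    echo "$names"
}
```

With this, decode_debug_flags 0x144 prints DB_NMI DB_ARP DB_LOG_PI_SCRN, and the host value 0x140 used earlier decodes to DB_ARP DB_LOG_PI_SCRN, that is, the same setting without DB_NMI.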

The question “Is there a list of available boot-args for darwin / OS X” describes the boot arguments in more detail.

Invalidate the kext cache

If you specified the kcsuffix boot argument, you should invalidate the kext cache files. You can do that via kextcache(8):

# / means the root of the current volume, you can specify another volume if needed
$ sudo kextcache -invalidate /

Retrieve VM network info

To connect the debugger to the VM, you need the VM’s network information, typically obtained via ifconfig(8):

$ ifconfig
...
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1500
	options=b<RXCSUM,TXCSUM,VLAN_HWTAGGING>
	ether 00:0c:29:af:fa:a0 
	inet6 fe80::808:99a1:9108:fb70%en0 prefixlen 64 secured scopeid 0x4 
	inet 172.16.41.129 netmask 0xffffff00 broadcast 172.16.41.255
	nd6 options=201<PERFORMNUD,DAD>
	media: autoselect (1000baseT <full-duplex>)
	status: active
...

Later you can connect to the VM from the host machine with kdp-remote (lldb) or target remote-kdp (gdb).
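
Only the inet line of that output matters for the debugger. As a small sketch, the hypothetical helper vm_ipv4_addrs below filters ifconfig(8) output down to the non-loopback IPv4 addresses; it assumes the "inet <address> netmask ..." layout shown above.

```shell
# Hypothetical helper: read ifconfig(8) output on stdin and print every
# IPv4 address except loopback, one per line.
vm_ipv4_addrs() {
    awk '$1 == "inet" && $2 != "127.0.0.1" { print $2 }'
}

# Inside the VM:
#   ifconfig en0 | vm_ipv4_addrs
```

For the interface listed above this prints 172.16.41.129, which is the address to pass to kdp-remote on the host.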

In practice

To test our debugging setup, I’ll walk through two examples that show how to debug the kernel.

In both cases, I’ll use lldb(1) as our remote debugger.

Example 1 - debugging the kernel

Let’s say we want to debug the vn_getxattr function in the kernel and scrutinize its call stack and arguments.

/**
 * Retrieve the data of an extended attribute.
 * see: xnu/bsd/sys/vnode_internal.h
 */
int vn_getxattr(vnode_t vp, const char *name, uio_t uio, size_t *size, int options, vfs_context_t context);

In this case, we OR the debug value in the VM’s boot-args with DB_HALT (0x1):

$ sudo nvram boot-args="debug=0x145 kext-dev-mode=1 kcsuffix=development pmuflags=1 -v"

This way the VM kernel halts at boot time so that a debugger can attach.

When you reboot, the kernel halts automatically and the console shows output like:
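
The new value is just the previous mask with one more bit set: 0x144 | 0x1 = 0x145. A tiny sketch of that arithmetic, using a hypothetical helper named debug_with:

```shell
# Hypothetical helper: OR one or more extra flags into a debug= value and
# print the result in hex.
debug_with() {
    value=$(($1))
    shift
    for flag in "$@"; do
        value=$(($value | $flag))
    done
    printf '0x%x\n' "$value"
}

# Example: add DB_HALT (0x1) to the usual 0x144
#   debug_with 0x144 0x1
```

Because the operation is a bitwise OR, applying it to a value that already contains DB_HALT leaves the value unchanged.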

...
Darwin Bootstrapper Version 4.0.0: Sun May  7 19:34:52 PDT 2017; root:libxpc_executables-972.70.1~4/launchd/RELEASE_X86_64
boot-args = debug=0x145 kext-dev-mode=1 kcsuffix=development pmuflags=1 -v
...
IOKernelDebugger: registering debugger
ethernet MAC address: 00:0c:29:af:fa:a0
ip address: 172.16.41.129

Waiting for remote debugger connection.

Connect to the VM for debugging

$ lldb /Library/Developer/KDKs/KDK_10.12.6_16G29.kdk/System/Library/Kernels/kernel.development
...
Current executable set to '/Library/Developer/KDKs/KDK_10.12.6_16G29.kdk/System/Library/Kernels/kernel.development' (x86_64).

# Run all discovered debug scripts in this session
(lldb) settings set target.load-script-from-symbol-file true
...
xnu debug macros loaded successfully. Run showlldbtypesummaries to enable type summaries.

# Connect to the VM
(lldb) kdp-remote 172.16.41.129
Version: Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/DEVELOPMENT_X86_64; UUID=B8B972B8-220D-3E56-92F8-ADB8CF777CE2; stext=0xffffff8016c00000
Kernel UUID: B8B972B8-220D-3E56-92F8-ADB8CF777CE2
...
kernel.development was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 1 stopped
* thread #2, name = '0xffffff8021bc1b00', queue = '0x0', stop reason = signal SIGSTOP
    frame #0: 0xffffff8016dfec24 kernel.development`kdp_register_send_receive(send=(IONetworkingFamily`IOKernelDebugger::kdpTransmitDispatcher(void*, unsigned int) at IOKernelDebugger.cpp:369), receive=(IONetworkingFamily`IOKernelDebugger::kdpReceiveDispatcher(void*, unsigned int*, unsigned int) at IOKernelDebugger.cpp:353)) at kdp_udp.c:478 [opt]
Target 0: (kernel.development) stopped.

At this point, we are stopped in the debugger and the VM kernel is waiting for us to continue.

(lldb) breakpoint set --name vn_getxattr
Breakpoint 1: where = kernel.development`vn_getxattr + 51 at vfs_xattr.c:122, address = 0xffffff80170c7453

After setting the breakpoint, we can continue the interrupted process.

(lldb) continue
Process 1 resuming

The kernel then continues to run until it hits the breakpoint we set, vn_getxattr in this case (very likely during boot, since XNU itself uses xattrs heavily).

Process 1 stopped
* thread #5, name = '0xffffff8021a432c0', queue = '0x0', stop reason = breakpoint 1.1
    frame #0: 0xffffff80170c7453 kernel.development`vn_getxattr(vp=0xffffff8021fca078, name="com.apple.decmpfs", uio=0x0000000000000000, size=0xffffff80ae99b868, options=8, context=0xffffff8021a6d320) at vfs_xattr.c:122 [opt]
Target 0: (kernel.development) stopped.

Back in lldb, we can print information about the current stack trace and arguments:

(lldb) thread backtrace
* thread #5, name = '0xffffff8021a432c0', queue = '0x0', stop reason = breakpoint 1.1
  * frame #0: 0xffffff80170c7453 kernel.development`vn_getxattr(vp=0xffffff8021fca078, name="com.apple.decmpfs", uio=0x0000000000000000, size=0xffffff80ae99b868, options=8, context=0xffffff8021a6d320) at vfs_xattr.c:122 [opt]
    frame #1: 0xffffff80170e6a3f kernel.development`decmpfs_fetch_compressed_header(vp=0xffffff8021fca078, cp=0xffffff8021f16540, hdrOut=0xffffff80ae99b920, returnInvalid=1) at decmpfs.c:513 [opt]
    frame #2: 0xffffff80170e7a4b kernel.development`decmpfs_file_is_compressed(vp=<unavailable>, cp=0xffffff8021f16540) at decmpfs.c:754 [opt]
    frame #3: 0xffffff7f986d5852
    frame #4: 0xffffff7f986ccba7
    frame #5: 0xffffff80170d12d8 kernel.development`vnode_getattr [inlined] VNOP_GETATTR(vp=0x0000000000000000, vap=0xffffff8021fca078, ctx=0x0000000000000000) at kpi_vfs.c:3287 [opt]
    frame #6: 0xffffff80170d12a8 kernel.development`vnode_getattr(vp=0x0000000000000000, vap=0xffffff8021fca078, ctx=0x0000000000000000) at kpi_vfs.c:2352 [opt]
    frame #7: 0xffffff80172f710f kernel.development`exec_activate_image [inlined] exec_check_permissions(imgp=0xffffff8021c16000) at kern_exec.c:4307 [opt]
    frame #8: 0xffffff80172f7069 kernel.development`exec_activate_image(imgp=<unavailable>) at kern_exec.c:1388 [opt]
    frame #9: 0xffffff80172f6858 kernel.development`posix_spawn(ap=<unavailable>, uap=<unavailable>, retval=0xffffff8021a8de78) at kern_exec.c:2714 [opt]
    frame #10: 0xffffff80173e0c5b kernel.development`unix_syscall64(state=<unavailable>) at systemcalls.c:376 [opt]
    frame #11: 0xffffff8016dd9d96 kernel.development`hndl_unix_scall64 + 22

(lldb) p *((vnode_t) $rdi)  # Examine first argument of vn_getxattr
(vnode) $24 = {
  v_lock = {
    opaque = ([0] = 0, [1] = 18446744069414584320)
  }
...
  v_label = 0x0000000000000000
  v_resolve = 0x0000000000000000
}

(lldb) p ((vnode_t) $rdi)->v_name
(const char *) $26 = 0xffffff8022000710 "dirhelper"

(lldb) p (const char *) $rsi  # Second argument
(const char *) $31 = 0xffffff801757b283 "com.apple.decmpfs"

# (Other arguments omitted)
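
The session above reads the first two arguments from $rdi and $rsi. On x86_64, the System V calling convention passes the first six integer or pointer arguments in rdi, rsi, rdx, rcx, r8 and r9, so the omitted arguments can be inspected the same way. The sketch below prints the matching lldb expressions for a list of argument types; arg_register and lldb_arg_exprs are hypothetical helper names. It assumes the registers still hold the incoming arguments, which is only guaranteed at function entry; in optimized code they may already have been reused by the time a breakpoint fires.

```shell
# Map a 1-based argument index to the x86_64 System V register that
# carries it (integer and pointer arguments only).
arg_register() {
    case "$1" in
        1) echo rdi ;;
        2) echo rsi ;;
        3) echo rdx ;;
        4) echo rcx ;;
        5) echo r8 ;;
        6) echo r9 ;;
        *) return 1 ;;   # further arguments are passed on the stack
    esac
}

# Print one lldb "p" expression per argument type given on the command line.
lldb_arg_exprs() {
    index=1
    for ctype in "$@"; do
        reg=$(arg_register "$index") || break
        printf 'p (%s) $%s\n' "$ctype" "$reg"
        index=$((index + 1))
    done
}

# Example for vn_getxattr:
#   lldb_arg_exprs vnode_t 'const char *' uio_t 'size_t *' int vfs_context_t
```

The size argument is written as size_t * here because the backtrace above shows it holding a kernel address rather than a length.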

You can do almost everything you want!


Quoted from Damien DeVille’s article:

It’s important to note that once the kernel has launched and the debugger continued, the kernel cannot be halted again from the debugger. In fact, if you try you will get an error message:

(lldb) process interrupt
error: Failed to halt process: Halt timed out. State = running

For this reason, you should make sure that all your breakpoints are registered in the debugger before running continue for the kernel to complete its boot.

 

Yet after you continue execution of the kernel, you can still interrupt it manually with the Command-Option-Control-Shift-Escape key combination.

The screen shows output like:

Debugger called: <HID: USB Programmer Key>
ethernet MAC address: 00:0c:29:af:fa:a0
ip address: 172.16.41.129

Waiting for remote debugger connection.

Once you connect to the VM kernel, it is likely stopped at hw_atomic_sub, which is called (inlined) within Debugger.

Version: Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/DEVELOPMENT_X86_64; UUID=B8B972B8-220D-3E56-92F8-ADB8CF777CE2; stext=0xffffff801d600000
Kernel UUID: B8B972B8-220D-3E56-92F8-ADB8CF777CE2
...
kernel.development was compiled with optimization - stepping may behave oddly; variables may not be available.
Process 1 stopped
* thread #2, name = '0xffffff802880c770', queue = '0x0', stop reason = signal SIGSTOP
    frame #0: 0xffffff801d98957e kernel.development`Debugger [inlined] hw_atomic_sub(delt=1) at locks.c:1514 [opt]
Target 0: (kernel.development) stopped.

At this stage, you can follow the steps above to debug the kernel.


Example 2a - debugging a kernel extension (kext for short)

To debug a kext, we use an arrangement similar to the one above.

# Update boot-args
$ sudo nvram boot-args="debug=0x144 kext-dev-mode=1 kcsuffix=development pmuflags=1 -v"

# Invalidate kext cache
$ sudo kextcache -invalidate /

Say we have a kext named kext-panic:

#include <mach/mach_types.h>
#include <sys/systm.h>

#define KEXT_NAME "kext-panic"
#define LOG(fmt, ...) printf(KEXT_NAME ": " fmt "\n", ##__VA_ARGS__)

kern_return_t kext_panic_start(kmod_info_t *ki, void *d)
{
    LOG("loaded  id: %#x addr: %#lx", ki->id, ki->address);
    return KERN_SUCCESS;
}

kern_return_t kext_panic_stop(kmod_info_t *ki, void *d)
{
    LOG("unloaded  id: %#x addr: %#lx", ki->id, ki->address);
    char *p = (char *) 0xdeadbeef;
    *p = '\0';      /* Should panic */
    return KERN_SUCCESS;
}

Before you compile the code above, set Debug Information Format to DWARF with dSYM File for your current build scheme.

[ref]: It can be located anywhere that is indexed by Spotlight. When encountering an unknown symbol (for example a function in your kext), LLDB will look for a .dSYM file that matches this symbol’s Mach-O binary UUID. If the .dSYM file is on your host machine and was indexed by Spotlight then LLDB will symbolicate things nicely.

You may also turn off compiler optimization so that stepping doesn’t behave oddly.

In the VM, copy over the kext compiled on the host.
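
Because LLDB pairs a binary with its .dSYM by UUID, a quick sanity check before copying the kext to the VM is to compare the two UUIDs on the host. The sketch below assumes that dwarfdump --uuid prints lines of the form "UUID: <uuid> (<arch>) <path>"; first_uuid and same_uuid are hypothetical helpers, and the bundle paths in the comments are examples only.

```shell
# Hypothetical helper: read `dwarfdump --uuid` output on stdin and print the
# UUID from the first "UUID:" line.
first_uuid() {
    awk '$1 == "UUID:" { print $2; exit }'
}

# Hypothetical helper: succeed only if both UUIDs are non-empty and equal.
same_uuid() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# On the host (example paths, adjust to your build output):
#   bin=$(dwarfdump --uuid kext-panic.kext/Contents/MacOS/kext-panic | first_uuid)
#   sym=$(dwarfdump --uuid kext-panic.kext.dSYM | first_uuid)
#   same_uuid "$bin" "$sym" && echo "dSYM matches" || echo "dSYM mismatch"
```

If the two values differ, the .dSYM belongs to a different build and LLDB will not use it for this kext.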

$ cp -r $SOMEWHERE/kext-panic.kext /var/tmp
# The kext must be owned by root:wheel before it can be loaded
$ sudo chown -R root:wheel /var/tmp/kext-panic.kext

Load the kext:

$ sudo kextload /var/tmp/kext-panic.kext

The kernel log will show something like:

$ sudo dmesg | grep kext-panic
kext-panic: loaded  id: 0x6e addr: 0xffffff7f8766b000

When we unload the kext, the kernel should panic:

panic(cpu 1 caller 0xffffff8005583bd3): Kernel trap at 0xffffff7f8766bf34, type 14=page fault, registers:
CR0: 0x0000000080010033, CR2: 0x00000000deadbeef, CR3: 0x0000000085c1c080, CR4: 0x00000000003606e0
RAX: 0x0000000000000000, RBX: 0xffffff800fc54640, RCX: 0x00000000deadbeef, RDX: 0x00000000deadbeef
RSP: 0xffffff809c823ba0, RBP: 0xffffff809c823bc0, RSI: 0x0000000000000000, RDI: 0xffffff807658c000
R8:  0x00000011af78a0ae, R9:  0x0000000000000000, R10: 0x00000000002b0007, R11: 0x0000000002000001
R12: 0xffffff800eee1fc0, R13: 0xffffff800fc54640, R14: 0x0000000000000000, R15: 0xffffff7f8766bf8e
RFL: 0x0000000000010246, RIP: 0xffffff7f8766bf34, CS:  0x0000000000000008, SS:  0x0000000000000010
Fault CR2: 0x00000000deadbeef, Error code: 0x0000000000000002, Fault CPU: 0x1 VMM, PL: 0, VF: 1

Backtrace (CPU 1), Frame : Return Address
0xffffff809c823400 : 0xffffff8005430c64 
0xffffff809c8238a0 : 0xffffff8005583bd3 
0xffffff809c823a90 : 0xffffff80053d9593 
0xffffff809c823ab0 : 0xffffff7f8766bf34 
0xffffff809c823bc0 : 0xffffff80059ffedf 
0xffffff809c823bf0 : 0xffffff80059fd7da 
0xffffff809c823c30 : 0xffffff8005a06bfe 
0xffffff809c823c70 : 0xffffff8005a0c7e4 
0xffffff809c823cf0 : 0xffffff8005a1bcfc 
0xffffff809c823d70 : 0xffffff800548bd15 
0xffffff809c823dc0 : 0xffffff80054361cc 
0xffffff809c823e20 : 0xffffff800540d19c 
0xffffff809c823e70 : 0xffffff8005426057 
0xffffff809c823f00 : 0xffffff800556db7d 
0xffffff809c823fb0 : 0xffffff80053d9db6 
      Kernel Extensions in backtrace:
         cn.junkman.kext.kext-panic(1.0)[43915CCB-4282-3481-B711-9DC4A74ED9D7]@0xffffff7f8766b000->0xffffff7f8766cfff

BSD process name corresponding to current thread: kextunload
Boot args: debug=0x144 kext-dev-mode=1 kcsuffix=development pmuflags=1 -v

Mac OS version:
16G29

Kernel version:
Darwin Kernel Version 16.7.0: Thu Jun 15 17:36:27 PDT 2017; root:xnu-3789.70.16~2/DEVELOPMENT_X86_64
Kernel UUID: B8B972B8-220D-3E56-92F8-ADB8CF777CE2
Kernel slide:     0x0000000005000000
Kernel text base: 0xffffff8005200000
__HIB  text base: 0xffffff8005100000
System model name: VMware7,1 (Mac-E43C1C25D4880AD6)

System uptime in nanoseconds: 310848796046
ethernet MAC address: 00:0c:29:af:fa:a0
ip address: 172.16.41.129

Waiting for remote debugger connection.

Use lldb to remote-debug the problematic kext:

$ lldb /Library/Developer/KDKs/KDK_10.12.6_16G29.kdk/System/Library/Kernels/kernel.development
...
(lldb) settings set target.load-script-from-symbol-file true
...

(lldb) kdp-remote 172.16.41.129
...
Process 1 stopped
* thread #2, name = '0xffffff80148b0ee0', queue = '0x0', stop reason = signal SIGSTOP
    frame #0: 0xffffff7f8766bf34 kext-panic`kext_panic_stop(ki=0xffffff7f8766c000, d=0x0000000000000000) at kext_panic.c:17
   14  	{
   15  	    LOG("unloaded  id: %#x addr: %#lx", ki->id, ki->address);
   16  	    char *p = (char *) 0xdeadbeef;
-> 17  	    *p = '\0';      /* Should panic */
   18  	    return KERN_SUCCESS;
   19  	}
Target 0: (kernel.development) stopped.

The kext’s .dSYM file was loaded successfully, so we can easily debug our kext.

(lldb) thread backtrace
* thread #2, name = '0xffffff80148b0ee0', queue = '0x0', stop reason = signal SIGSTOP
  * frame #0: 0xffffff7f8766bf34 kext-panic`kext_panic_stop(ki=0xffffff7f8766c000, d=0x0000000000000000) at kext_panic.c:17
    frame #1: 0xffffff80059ffedf kernel.development`OSKext::stop(this=0xffffff800fc54640) at OSKext.cpp:6368 [opt]
    frame #2: 0xffffff80059fd7da kernel.development`OSKext::unload(this=0xffffff800fc54640) at OSKext.cpp:6470 [opt]
    frame #3: 0xffffff8005a06bfe kernel.development`OSKext::removeKext(aKext=<unavailable>, terminateServicesAndRemovePersonalitiesFlag=true) at OSKext.cpp:3540 [opt]
    frame #4: 0xffffff8005a0c7e4 kernel.development`OSKext::handleRequest(hostPriv=0x0000000000000000, clientLogFilter=<unavailable>, requestBuffer="؅?\x80???, requestLength=<unavailable>, responseOut=<unavailable>, responseLengthOut=<unavailable>, logInfoOut=<unavailable>, logInfoLengthOut=<unavailable>) at OSKext.cpp:8025 [opt]
    frame #5: 0xffffff8005a1bcfc kernel.development`::kext_request(hostPriv=<unavailable>, clientLogSpec=<unavailable>, requestIn=<unavailable>, requestLengthIn=<unavailable>, responseOut=0xffffff8016b8a888, responseLengthOut=0xffffff8016b8a8b0, logDataOut=<unavailable>, logDataLengthOut=<unavailable>, op_result=<unavailable>) at OSKextLib.cpp:291 [opt]
    frame #6: 0xffffff800548bd15 kernel.development`_Xkext_request(InHeadP=<unavailable>, OutHeadP=0xffffff8016b8a864) at host_priv_server.c:2997 [opt]
    frame #7: 0xffffff80054361cc kernel.development`ipc_kobject_server(request=<unavailable>, option=<unavailable>) at ipc_kobject.c:352 [opt]
    frame #8: 0xffffff800540d19c kernel.development`ipc_kmsg_send(kmsg=<unavailable>, option=<unavailable>, send_timeout=<unavailable>) at ipc_kmsg.c:1828 [opt]
    frame #9: 0xffffff8005426057 kernel.development`mach_msg_overwrite_trap(args=<unavailable>) at mach_msg.c:556 [opt]
    frame #10: 0xffffff800556db7d kernel.development`mach_call_munger64(state=0xffffff80176eb020) at bsd_i386.c:556 [opt]
    frame #11: 0xffffff80053d9db6 kernel.development`hndl_mach_scall64 + 22
    
(lldb) register read
General Purpose Registers:
       rax = 0x0000000000000000
       rbx = 0xffffff800fc54640
       rcx = 0x00000000deadbeef
       rdx = 0x00000000deadbeef
       rdi = 0xffffff807658c000
       rsi = 0x0000000000000000
       rbp = 0xffffff809c823bc0
       rsp = 0xffffff809c823ba0
        r8 = 0x00000011af78a0ae
        r9 = 0x0000000000000000
       r10 = 0x00000000002b0007
       r11 = 0x0000000002000001
       r12 = 0xffffff800eee1fc0
       r13 = 0xffffff800fc54640
       r14 = 0x0000000000000000
       r15 = 0xffffff7f8766bf8e  kext-panic`_stop
       rip = 0xffffff7f8766bf34  kext-panic`kext_panic_stop + 68 at kext_panic.c:17
    rflags = 0x0000000000010246
        cs = 0x0000000000000008
        fs = 0x0000000000000000
        gs = 0x00000000dead0000

Example 2b - debugging (tracing) a kext at the load stage

What if you want to debug the kext from its very first stage, at the source level?

To achieve this, there are two steps:

# Step 1 - Specify DB_HALT(0x1) in boot-args and clean cache

$ sudo nvram boot-args="debug=0x145 kext-dev-mode=1 kcsuffix=development pmuflags=1 -v"
$ sudo kextcache -invalidate /

We also need a launchd plist file, cn.junkman.kext.kext-panic.daemon.plist, which loads the kext at boot time.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
        "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>cn.junkman.kext.kext-panic.daemon</string>

    <key>ProgramArguments</key>
    <array>
        <string>/sbin/kextload</string>
        <string>/var/tmp/kext-panic.kext</string>
    </array>

    <key>RunAtLoad</key>
    <true/>

    <key>KeepAlive</key>
    <false/>

    <key>LaunchOnlyOnce</key>
    <true/>
</dict>
</plist>

It’s a bare-bones launchd plist file; for more advanced topics, see A launchd Tutorial.

# Step 2 - Copy the launchd plist to /Library/LaunchDaemons
#          launchd will load it when booting
$ sudo cp $SOMEWHERE/cn.junkman.kext.kext-panic.daemon.plist /Library/LaunchDaemons

When booting, the kernel halts automatically and the console shows:

...
IOKernelDebugger: registering debugger
ethernet MAC address: 00:0c:29:af:fa:a0
ip address: 172.16.41.129

Waiting for remote debugger connection.

$ lldb /Library/Developer/KDKs/KDK_10.12.6_16G29.kdk/System/Library/Kernels/kernel.development
...
(lldb) settings set target.load-script-from-symbol-file true
...

(lldb) kdp-remote 172.16.41.129
...
Process 1 stopped
* thread #2, name = '0xffffff802594aa30', queue = '0x0', stop reason = signal SIGSTOP
    frame #0: 0xffffff801a9fec24 kernel.development`kdp_register_send_receive(send=(IONetworkingFamily`IOKernelDebugger::kdpTransmitDispatcher(void*, unsigned int) at IOKernelDebugger.cpp:369), receive=(IONetworkingFamily`IOKernelDebugger::kdpReceiveDispatcher(void*, unsigned int*, unsigned int) at IOKernelDebugger.cpp:353)) at kdp_udp.c:478 [opt]
Target 0: (kernel.development) stopped.

(lldb) breakpoint set --name kext_panic_start
Breakpoint 1: no locations (pending).
WARNING:  Unable to resolve breakpoint to any actual locations.

(lldb) continue
...
1 location added to breakpoint 1
Process 1 stopped
* thread #16, name = '0xffffff8026104580', queue = '0x0', stop reason = breakpoint 3.1
    frame #0: 0xffffff7f9cd24ed7 kext-panic`kext_panic_start(ki=0xffffff7f9cd25000, d=0x0000000000000000) at kext_panic.c:9
   6   	
   7   	kern_return_t kext_panic_start(kmod_info_t *ki, void *d)
   8   	{
-> 9   	    LOG("loaded  id: %#x addr: %#lx", ki->id, ki->address);
   10  	    return KERN_SUCCESS;
   11  	}
   12  	
Target 0: (kernel.development) stopped.

We have now successfully attached to our kext; feel free to scrutinize it. :-)

NOTE: When we set the breakpoint for our kext, lldb reports it as pending because it has not yet read the kext-panic .dSYM file, so the symbol is not visible to lldb at that early stage.

 

You may use the kextutil(8) -i (-interactive) option to pause after loading each specified kext and wait for user input before starting it.

This allows for debugger setup when the kext needs to be debugged during its earliest stages of running.

So you can attach the debugger and begin debugging the kext before its start routine runs.

Summary

This article describes in detail how to debug the XNU kernel. It still omits some advanced kernel debugging topics, yet it is enough for a programmer to get started.

I walked through two examples showing how to debug the kernel and a kext respectively. Dark corners may still frustrate you, so I collected some tips that may be helpful.

Gotchas

  • Panic reports (kernel dumps) are located at /Library/Logs/DiagnosticReports; you can use Console.app to view them under System Reports.

  • The XNU kernel printf() implementation sucks: it doesn’t recognize some rarely used sub-specifiers such as %h, %i, %j, %t, etc., and using an unrecognized specifier may cause a kernel panic.

    Update: newer kernels seem to have fixed this bug, though they discard unsupported specifiers.

  • When possible, avoid using _MALLOC and _FREE. If you don’t free memory allocated by _MALLOC, it is leaked for certain, and the kernel won’t even panic when you unload the kext.

    The same frustration applies to OSMalloc; I discussed this issue in detail in a separate article.

References

Apple Documentation Archive

Kernel Programming Guide

Kernel Extension Programming Topics

Debugging macOS Kernel using VirtualBox

Kernel debugging with LLDB and VMware Fusion

Building and Debugging Kernels

Introduction to macOS Kernel Debugging

Two-box osx kernel development

Technical Note TN2063 - Understanding and Debugging Kernel Panics

x64 Cheat Sheet - 4.4 Stack Organization and Function Calls

Objective-See’s Blog

fusion_retrial.sh