This seems like a pretty obvious question at first — a crash is when your app unexpectedly stops working, and exits.
You might be surprised to learn that a crash is actually more like a controlled detonation: the system triggers the crash to keep your device safe.
Today, we’re delving deep into the underlying source code of Swift and iOS to understand what is actually going on when your app crashes.
A Hierarchy of Badness
A crash is far from the worst pain you can inflict on your users. Here’s the 9-Level hierarchy of the cardinal sins:
1. Perceptible issues: Frame-drops, slight device heating, unnecessary battery drain.
2. Mild glitches: Small UI bugs, animation hitches, unresponsive UI, and network timeouts.
3. Moderate bugs: Broken layout, navigation getting stuck, features not working, and long UI hangs.
4. Mining bitcoin in the background (but to be fair everybody does this*)
5. Data loss: Failing to submit or persist data, so they waste time and effort.
6. Crashing: Exit the process and break the user flow, probably also losing some data.
7. Data corruption: Losing pre-existing data for good. Your customer just became your biggest detractor.
8. Security breach: Exposing private information that might put your users at risk outside the context of your app.
9. System compromise: Escaping the app sandbox via system-level vulnerabilities to put the user’s device (and sometimes their life) at risk.
*my lawyer instructed me to say that not everybody does this.
A crash is a proactive measure to protect system integrity.
Put more simply, the system chooses to crash (a Level 6 sin) wherever it mitigates the chances of causing a Level 7, Level 8, or even Level 9 problem.
The Two Types of Crash
While there are countless reasons to crash, crashes in your iOS app originate from 1 of 2 places: the Swift Runtime and the XNU Kernel.
The Runtime
The Swift Runtime, A.K.A. libswiftCore, runs alongside every Swift program. It provides core functionality for executing Swift such as dynamic dispatch, error handling, & memory management.
The runtime has the authority to crash the running process whenever it detects violations of memory safety, such as accessing memory that’s already been deallocated.
The Kernel
The kernel is the core of the operating system. It creates and manages abstractions on top of hardware — converting physical components like CPU, RAM, & SSD into virtual resources such as threads, heap memory, & file systems.
iOS runs on the XNU Kernel (“X is Not Unix”). Kernel crashes are the last line of defence for memory access violations to the system; designed to protect the user from Level 7+ issues like data corruption and system compromise.
XNU’s Mach subsystem is responsible for virtual memory management. It creates the “virtual address space” abstraction for each app process, then maps this virtual memory to physical RAM. It fires out exceptions such as EXC_BAD_ACCESS to protect the integrity of these address spaces.
The kernel has many ways to kill your process (i.e. crash your app). For example, Watchdog kills any running process which is unresponsive for ~20 seconds, if you manage to block the main thread for that long.
Digging into the open source code
Crashes are far from the worst thing that code can do to your device.
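Memory safety issues can allow bad actors to crack open your entire system. If the runtime or kernel let you access deallocated objects, you would be reading or writing memory that may since have been reused for something else, outside any memory safety guardrails.
If the system let you access the 0xFFFFFFFF-th index of an array, code could reach far beyond the array’s own storage, which is the kind of foothold attackers use to break out of the process sandbox and compromise the entire OS.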
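To make the array case concrete, here is a minimal sketch of the kind of bounds check that stands between an index and the memory behind it. Swift’s own Array subscript already traps on an out-of-range index; the `checked` subscript below is a hypothetical helper (not part of the standard library) that makes the same check visible by returning nil instead of crashing.

```swift
extension Array {
    /// Hypothetical helper, not a standard library API.
    /// Returns the element at `index`, or nil when the index is out of range.
    subscript(checked index: Int) -> Element? {
        // `indices` is the valid range 0..<count, so anything else is rejected
        // before we ever touch the underlying storage.
        indices.contains(index) ? self[index] : nil
    }
}
```

The built-in subscript makes the opposite choice on purpose: it crashes rather than hand back whatever happens to live past the end of the buffer.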
Conveniently, both the Swift Runtime and the XNU Kernel are open-source, so we can dive straight into their code and check out both flavours of crash in action.
If you want to understand how dangerous deallocated memory can be, look no further than Jailbreak your Enemies with a Link: Remote Execution on iOS.
Swift Runtime Crash: Unowned
Triggering the crash with an Unowned reference
The Swift Runtime has many guardrails to ensure memory safety. For example, one mechanism crashes out whenever code tries to access the memory address of a deallocated object.
To help developers prevent retain cycles, Swift includes unowned references. These behave like weak references, except they include an assumption: we assume that the memory of the referenced object will always outlive the unowned pointer to it.
If we get these object lifetimes wrong, accessing an unowned reference to a deallocated object leads to a crash. This is the famous dangling pointer.
We’re making a trade-off for performance: Unowned references store less metadata than weak references, because they don’t create a side table on the heap object they point to. Accessing an unowned reference is therefore a little bit more performant — to get at the memory, it requires one less jump between pointers.
I’ve written about unowned references before. Honestly, I don’t like them much, but that’s another story. They are, however, a wonderful source of crashes if you get them wrong.
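As a rough illustration of getting the lifetimes wrong, here is a minimal sketch. The `Owner` and `Badge` classes and both functions are hypothetical names invented for this example. The unowned read itself is left as a comment so the snippet runs without crashing; the weak reference to the same object shows that the object really is gone.

```swift
final class Owner {
    let name: String
    init(name: String) { self.name = name }
}

final class Badge {
    // Assumption baked into `unowned`: the Owner always outlives the Badge.
    unowned let owner: Owner
    // A weak reference to the same object, for comparison.
    weak var weakOwner: Owner?

    init(owner: Owner) {
        self.owner = owner
        self.weakOwner = owner
    }
}

/// Returns a Badge whose Owner has already been released, because the
/// only strong reference to it was the local variable in this function.
func makeOrphanedBadge() -> Badge {
    let owner = Owner(name: "Jacob")
    return Badge(owner: owner)
}

/// Returns true when the weak reference has been cleared.
func weakReferenceWasCleared() -> Bool {
    let badge = makeOrphanedBadge()
    // Reading `badge.owner` here would trap with "Attempted to read an
    // unowned reference but object ... was already deallocated".
    return badge.weakOwner == nil
}
```

The weak reference degrades to nil, which the caller can handle. The unowned reference has no such fallback, so the runtime’s only safe option is to stop the process.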
Implementation of a Runtime Crash
We can handily look into the Swift runtime source code to see this in action. Let’s look at the runtime ABI for an unowned reference in include/swift/Runtime/HeapObject.h:
/// An unowned reference in memory. This is ABI.
struct UnownedReference {
  HeapObject *Value;
};

/// Aborts if the object has been deallocated.
SWIFT_RUNTIME_EXPORT
void swift_unownedCheck(HeapObject *value);
This unowned check is found in the implementation for a runtime heap object, stdlib/public/runtime/HeapObject.cpp:
void swift::swift_unownedCheck(HeapObject *object) {
  if (!isValidPointerForNativeRetain(object)) return;

  assert(object->refCounts.getUnownedCount() &&
         "object is not currently unowned-retained");

  if (object->refCounts.isDeiniting())
    swift::swift_abortRetainUnowned(object);
}
Finally, this abort function is defined in stdlib/public/runtime/Errors.cpp:
// Crash due to retain of a dead unowned reference.
// FIXME: can't pass the object's address from InlineRefCounts without hacks
void swift::swift_abortRetainUnowned(const void *object) {
  if (object) {
    swift::fatalError(FatalErrorFlags::ReportBacktrace,
                      "Fatal error: Attempted to read an unowned reference but "
                      "object %p was already deallocated", object);
  } else {
    swift::fatalError(FatalErrorFlags::ReportBacktrace,
                      "Fatal error: Attempted to read an unowned reference but "
                      "the object was already deallocated");
  }
}
We’ve tracked down the exact error message we saw in our example:
Fatal error: Attempted to read an unowned reference but object 0x6000000085e0 was already deallocated
What is a FatalError?
The swift::fatalError function itself is also defined in stdlib/public/runtime/Errors.cpp:
// Report a fatal error to system console, stderr, and crash logs, then abort.
SWIFT_NORETURN void swift::fatalError(uint32_t flags, const char *format, ...) {
  va_list args;
  va_start(args, format);

  fatalErrorv(flags, format, args);
}

// Report a fatal error to system console, stderr, and crash logs, then abort.
SWIFT_NORETURN void swift::fatalErrorv(uint32_t flags, const char *format,
                                       va_list args) {
  char *log;
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wuninitialized"
  swift_vasprintf(&log, format, args);
#pragma GCC diagnostic pop

  swift_reportError(flags, log);
  abort();
}
The abort() function comes from the C standard library. It raises a SIGABRT signal, which terminates the running process and crashes the app.
XNU Kernel Crash: EXC_BAD_ACCESS
Triggering the crash with Unsafe Swift
In order to get past the Swift Runtime, we need to bypass the basic guardrails provided by the language against undefined behaviour. Swift comes with a tool for exactly this: unsafe APIs.
Unsafe APIs are types which bypass Swift’s memory safety guarantees and type checks, providing the flexibility required to interoperate with C and C++. This is useful for ultra-performance-critical code, and helps with low-level systems programming.
UnsafeMutableRawPointer is Swift’s counterpart to the C void* pointer: it represents an untyped memory address. It allows developers to read and write directly to this address without any pesky overhead like “types” or “runtime memory integrity”.
Let’s see unsafe Swift in action and force the kernel’s hand.
func triggerKernelException() {
    let pointer = UnsafeMutableRawPointer(bitPattern: 0xdeadbeef)
    _ = pointer?.load(as: UInt8.self)
}
This method creates an unsafe pointer to an arbitrary memory address, 0xdeadbeef, and attempts to load an 8-bit unsigned integer (UInt8) from this. The kernel immediately steps in with an EXC_BAD_ACCESS exception to protect the system.
There’s no debug message this time, because the kernel doesn’t give the Swift Runtime time to generate debug information. It grimly crashes the process to protect the system from compromise.
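For contrast, here is a minimal sketch of the same kind of load against memory the process actually owns. The function name `roundTripThroughRawPointer` is hypothetical; the pointer calls are the standard library’s own raw pointer APIs. Because the allocator hands back an address that is mapped into our address space, the kernel has nothing to object to.

```swift
/// Hypothetical helper: writes one byte through a raw pointer and reads it back.
func roundTripThroughRawPointer(_ value: UInt8) -> UInt8 {
    // Ask the allocator for one byte; this address is mapped into our process.
    let pointer = UnsafeMutableRawPointer.allocate(byteCount: 1, alignment: 1)
    // Always hand the memory back, even though we return early below.
    defer { pointer.deallocate() }

    // Store and load with an explicit type, since a raw pointer carries none.
    pointer.storeBytes(of: value, as: UInt8.self)
    return pointer.load(as: UInt8.self)
}
```

The only difference from the crashing version is where the address came from: one was handed out by the system, the other was made up.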
EXC_BAD_ACCESS in XNU
We can look through the open-source Kernel code to inspect where this exception really takes place.
The XNU kernel code is a little tricky to parse because each submodule contains folders such as arm64/ and x86_64/. These are different implementations of the same kernel logic tailored to the specific requirements of each chip architecture (ARM vs Intel).
Typically, header .h files define the shared API and implementations .c live in these CPU-architecture-specific folders.
EXC_BAD_ACCESS is defined inside xnu/osfmk/mach/exception_types.h:
#include <mach/machine/exception.h>

/*
 * Machine-independent exception definitions.
 */

#define EXC_BAD_ACCESS          1       /* Could not access memory */
/* Code contains kern_return_t describing error. */
/* Subcode contains bad memory address. */

#define EXC_BAD_INSTRUCTION     2       /* Instruction failed */
/* Illegal or undefined instruction or operand */

#define EXC_ARITHMETIC          3       /* Arithmetic exception */
/* Exact nature of exception is in code field */

#define EXC_EMULATION           4       /* Emulation instruction */
/* Emulation support instruction encountered */
/* Details in code and subcode fields */

#define EXC_SOFTWARE            5       /* Software generated exception */
/* Exact exception is in code field. */
/* Codes 0 - 0xFFFF reserved to hardware */
/* Codes 0x10000 - 0x1FFFF reserved for OS emulation (Unix) */
We can also see some of its friends:
EXC_BAD_INSTRUCTION will be thrown when the CPU tries to execute an invalid assembly instruction. You can easily create one yourself by force unwrapping a nil optional.
EXC_ARITHMETIC protects the hardware from executing impossible arithmetic, most commonly when you attempt to divide by zero (many such cases when defining UI aspect ratios!).
EXC_SOFTWARE will be thrown when software, including the runtime, explicitly triggers it; this is where the abort() calls lead.
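Since aspect ratios came up, here is a minimal sketch of sidestepping the divide-by-zero case in layout code. The `scaledHeight` function and its parameter names are hypothetical. Swift’s integer division traps on a zero divisor, whichever exception type your platform reports it as, so the guard runs before the division does.

```swift
/// Hypothetical layout helper: scales a height to match a new width,
/// preserving the original aspect ratio.
/// Returns nil rather than dividing by a zero width.
func scaledHeight(forWidth newWidth: Int,
                  originalWidth: Int,
                  originalHeight: Int) -> Int? {
    // An image that has not loaded yet often reports a width of 0.
    guard originalWidth != 0 else { return nil }
    return newWidth * originalHeight / originalWidth
}
```

Returning an optional pushes the "no size yet" case back to the caller, where it can be handled as a layout decision instead of a crash.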
To understand how we trigger EXC_BAD_ACCESS, we need to understand what we’re really doing with our unsafe Swift code.
The Virtual Memory Subsystem
One of the kernel’s main jobs is managing the ‘virtual memory’ abstraction on top of hardware. In XNU, this is performed by the vm subsystem, inside OSFMK (Open Software Foundation Mach kernel).
When we attempt to load from UnsafeMutableRawPointer(bitPattern: 0xdeadbeef), we’re asking for an address that may not exist in the process’s mapped virtual address space. The virtual memory subsystem determines whether the address is valid in xnu/osfmk/vm/vm_map.c:
/*
 * vm_region:
 *
 * User call to obtain information about a region in
 * a task's address map. Currently, only one flavor is
 * supported.
 *
 */
kern_return_t
vm_map_region(
    vm_map_t map,
    vm_map_offset_t *address,       /* IN/OUT */
    vm_map_size_t *size,            /* OUT */
    vm_region_flavor_t flavor,      /* IN */
    vm_region_info_t info,          /* OUT */
    mach_msg_type_number_t *count,  /* IN/OUT */
    mach_port_t *object_name)       /* OUT */
{
    // ...
    vm_map_lock_read(map);

    start = *address;
    if (!vm_map_lookup_entry(map, start, &tmp_entry)) {
        if ((entry = tmp_entry->vme_next) == vm_map_to_entry(map)) {
            vm_map_unlock_read(map);
            return KERN_INVALID_ADDRESS;
    // ...
}
There are dozens of functions in vm_map.c, 24,000 lines of code, and 93 occurrences of KERN_INVALID_ADDRESS. I will bring no report to darken the light of day.
These functions are used in many memory-management contexts, but most call vm_map_lookup_entry. This searches for the address in the virtual address space. If the address lies outside the mapped memory, KERN_INVALID_ADDRESS is returned.
Waaaay down into the guts of the virtual memory subsystem, in vm_map_store_rb.c, we can see where this boolean originates:
#define VME_FOR_STORE(ptr) __container_of(ptr, struct vm_map_entry, store)

bool
vm_map_store_lookup_entry_rb(
    vm_map_t map, vm_map_offset_t address, vm_map_entry_t *vm_entry
) {
    struct vm_map_header *hdr = &map->hdr;
    struct vm_map_store *rb_entry = RB_ROOT(&hdr->rb_head_store);
    vm_map_entry_t cur = vm_map_to_entry(map);
    vm_map_entry_t prev = VM_MAP_ENTRY_NULL;

    while (rb_entry != (struct vm_map_store*)NULL) {
        cur = VME_FOR_STORE(rb_entry);
        if (address >= cur->vme_start) {
            if (address < cur->vme_end) {
                *vm_entry = cur;
                return TRUE;
            }
            rb_entry = RB_RIGHT(rb_entry, entry);
            prev = cur;
        } else {
            rb_entry = RB_LEFT(rb_entry, entry);
        }
    }
    if (prev == VM_MAP_ENTRY_NULL) {
        prev = vm_map_to_entry(map);
    }
    *vm_entry = prev;
    return FALSE;
}
This function iteratively navigates a red-black tree, rb_head_store, to identify whether address falls within the mapped memory address range, vm_map_entry.
In XNU, the virtual memory address space is implemented using a red-black tree for O(log(n)) search, insertion, and deletion. I guess the guys at Carnegie Mellon University actually paid attention in CS class.
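The walk in vm_map_store_lookup_entry_rb can be sketched in Swift to make the shape of the search easier to see. Everything here is a hypothetical stand-in: `RegionNode` plays the role of a vm_map_entry, and the tree is a plain, hand-built binary search tree rather than a self-balancing red-black tree, so this only mirrors the lookup loop and not the kernel’s data structure.

```swift
/// Hypothetical stand-in for a vm_map_entry: a half-open range [start, end).
final class RegionNode {
    let start: UInt64
    let end: UInt64
    var left: RegionNode?
    var right: RegionNode?

    init(start: UInt64, end: UInt64,
         left: RegionNode? = nil, right: RegionNode? = nil) {
        self.start = start
        self.end = end
        self.left = left
        self.right = right
    }
}

/// Mirrors the kernel loop: go right when the address is at or past a
/// region's start, go left otherwise, and stop when a region contains it.
func lookupRegion(containing address: UInt64, in root: RegionNode?) -> RegionNode? {
    var current = root
    while let node = current {
        if address >= node.start {
            if address < node.end { return node }   // the TRUE case
            current = node.right
        } else {
            current = node.left
        }
    }
    return nil                                       // the FALSE case
}

/// Builds a tiny three-region map for experimenting with the lookup.
func makeSampleMap() -> RegionNode {
    let low = RegionNode(start: 0x1000, end: 0x2000)
    let high = RegionNode(start: 0x8000, end: 0x9000)
    return RegionNode(start: 0x4000, end: 0x6000, left: low, right: high)
}
```

Looking up 0xdeadbeef in this sample map walks right twice, falls off the tree, and returns nil, which corresponds to the FALSE branch in the kernel code.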
Whenever this returns FALSE, the address is not inside any mapped entry, and the caller can return the KERN_INVALID_ADDRESS error code.
Mach and POSIX Signals
The KERN_INVALID_ADDRESS return code indicates an unsafe memory access outside the virtual address space of the process. This is a classic attack vector for Level 9 system compromise, so crashing really is the best outcome here.
The virtual memory subsystem vm ultimately invokes the exception handling subsystem, which transforms the kernel exception into a higher-level exception that terminates the process.
These systems are all part of Mach, the low-level microkernel that powers XNU. Mach manages virtual memory, inter-process communication, thread scheduling, and exception handling.
This high-level exception takes the form of a SIGSEGV POSIX signal: a notification delivered to the offending thread which, by default, kills the process it belongs to.
We can see this mapping from KERN_INVALID_ADDRESS to SIGSEGV in xnu/bsd/uxkern/ux_exception.c:
static int
ux_exception(int exception,
    mach_exception_code_t code,
    mach_exception_subcode_t subcode)
{
    int machine_signal = 0;

    /* Try machine-dependent translation first. */
    if ((machine_signal = machine_exception(exception, code, subcode)) != 0) {
        return machine_signal;
    }

    switch (exception) {
    case EXC_BAD_ACCESS:
        if (code == KERN_INVALID_ADDRESS) {
            return SIGSEGV;
        } else {
            return SIGBUS;
        }
    // ...
SIGSEGV is a segmentation fault, which is a fancy term for a dangerous memory access violation.
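The EXC_BAD_ACCESS branch of that translation can be modelled in a few lines of Swift. This is only a sketch: the three enums and the `posixSignal(for:code:)` function are hypothetical names standing in for the kernel’s integer constants, and every other exception type is collapsed into a single `other` case.

```swift
/// Hypothetical stand-ins for the kernel's integer constants.
enum MachException { case badAccess, other }
enum KernelReturnCode { case invalidAddress, protectionFailure }
enum PosixSignal { case segmentationFault, busError }

/// Models only the EXC_BAD_ACCESS branch of ux_exception.
/// Returns nil for exception types this sketch does not cover.
func posixSignal(for exception: MachException,
                 code: KernelReturnCode) -> PosixSignal? {
    switch exception {
    case .badAccess:
        // An unmapped address becomes SIGSEGV; any other
        // bad access is reported as SIGBUS.
        return code == .invalidAddress ? .segmentationFault : .busError
    case .other:
        return nil
    }
}
```

Our 0xdeadbeef load corresponds to `.badAccess` with `.invalidAddress`, which is why the crash surfaces as a segmentation fault rather than a bus error.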
Conclusion
As engineers we’re always on a mission to minimise crashes.
Once in a blue moon, it’s nice to appreciate what crashes do for us.
Crashes protect our users from the most dangerous system compromises. So the next time you get an array index-out-of-range error, you should grin at the protection provided by the Runtime and Kernel.
Next time your CTO admonishes you for the ludicrously high crash rate of your iOS app, you must take their hand, tearfully, and explain you are just keeping your users safe from dangerous memory access.
Woah. I had a lot of fun writing that. I need to mess around with the Swift source code more often.
…The Kernel source code I could take or leave.