Since I last talked about Mach ports, Mac OS X has changed a bit. In this essay, rather than investigating a particular issue, I will talk about some changes to the OS X Mach IPC layer.
bootstrap_register
When a new task is created on OS X, it is given a set of special Mach ports. Among these are its host port, which represents the machine on which the task is running; its task port, which is self-referential; and the bootstrap port, which is a connection to the bootstrap server. The bootstrap server provides a port namespace, in which tasks can register their own ports, which other tasks can look up and send messages to. Think of the bootstrap server as a telephone directory: a task can publish a well-known name that corresponds to a Mach port on which that task is listening.
To register a service with the bootstrap server, a task could use the bootstrap_register() function, which takes a string name and the Mach port to associate with it. However, Apple deprecated this function in 10.5 and recommended using launchd instead. There is a long thread about this on the darwin-dev mailing list, which largely centers around one problem: how to connect a parent task to a child.
The way one creates a process on OS X is to use the fork() system call, which is managed by the BSD part of XNU, since most of process management is handled via the BSD mechanisms for POSIX compatibility. The Mach process creation facility, task_create(), has been disabled since 10.5, because too many other system calls assumed the presence of a BSD process in an execution context. Mach ports are not inherited across fork() (unlike file descriptors), so passing ports to child processes requires some work. A typical way of passing a port from a parent was to create a port in the parent process, register it using a name that could be passed to the child, fork/exec, and then grab the port using the shared name. With the deprecation of bootstrap_register(), though, a new way to pass ports needed to be found.
A simple replacement is found in the -[NSMachBootstrapServer registerPort:name:] API, which wraps a private bootstrap_register2() function. This is what Chrome uses, but it has the unfortunate characteristic of not being plain C. If the application has an installer, you can use launchd and its configuration directives to create the bootstrap server entry for you; this is what Apple recommends, but it requires additional setup and it cannot be done dynamically.
mach_ports_register()
While scanning the Mach system calls for other possibilities to hand off a port to a child during fork(), I found mach_ports_register(). This function is documented as taking an array of ports that are passed to the child during task_create(). The child could then look these up using mach_ports_lookup(). This seemed incredibly promising, but my tests revealed that it did not work.
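For reference, the two calls look roughly like the sketch below. It only exercises the API inside a single task: it registers one port and reads the registered set back, without forking. The function name register_ports_demo() and its return codes are made up for illustration, and the non-Darwin stub exists only so the file compiles elsewhere. Note that registering replaces whatever set the task already had.

```c
#include <stddef.h>

#if defined(__APPLE__)
#include <mach/mach.h>

/* Hypothetical helper: returns 0 on success, or the number of the step
 * that failed. Not part of any Apple API. */
int register_ports_demo(void) {
    task_t self = mach_task_self();
    mach_port_t port = MACH_PORT_NULL;

    /* Create a receive right, then add a send right under the same name;
     * mach_ports_register() is handed send rights. */
    if (mach_port_allocate(self, MACH_PORT_RIGHT_RECEIVE, &port) != KERN_SUCCESS)
        return 1;
    if (mach_port_insert_right(self, port, port, MACH_MSG_TYPE_MAKE_SEND) != KERN_SUCCESS)
        return 2;

    /* Register an array of ports (one here) on the task.
     * This overwrites any previously registered set. */
    mach_port_t to_register[1] = { port };
    if (mach_ports_register(self, to_register, 1) != KERN_SUCCESS)
        return 3;

    /* Read the registered set back. The array is returned out of line,
     * so it has to be released with vm_deallocate() afterwards. */
    mach_port_array_t found = NULL;
    mach_msg_type_number_t count = 0;
    if (mach_ports_lookup(self, &found, &count) != KERN_SUCCESS)
        return 4;

    /* Within one task, a send right to our own port carries the same name. */
    int ok = (count >= 1 && found[0] == port);
    vm_deallocate(self, (vm_address_t)found, count * sizeof(found[0]));
    return ok ? 0 : 5;
}
#else
/* Mach only exists on Darwin; elsewhere this compiles to a stub. */
int register_ports_demo(void) { return -1; }
#endif
```

In the parent/child scenario, the child would be the one calling mach_ports_lookup() after fork().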
In the kernel, when the fork() system call is handled, it calls fork_create_child() [xnu/bsd/kern/kern_fork.c], which then itself calls task_create_internal() and then ipc_task_init() [xnu/osfmk/kern/ipc_tt.c]. Reading through this chain, it’s apparent that mach_ports_register() does in fact work, and it places up to TASK_PORT_REGISTER_MAX (3) ports into the itk_registered field of the task. During ipc_task_init(), if a parent task is present, it copies the send rights into the child task. This looked very promising, so it was unclear why this was not working.
A quick use of the dtrace command revealed why:
$ sudo dtrace -n 'fbt::mach_ports_register:entry { ustack() }' -c ./parent
CPU ID FUNCTION:NAME
4 266220 mach_ports_register:entry
libsystem_kernel.dylib`mach_msg_trap+0xa
libsystem_kernel.dylib`mach_ports_register+0x70
parent`main+0x116
libdyld.dylib`start
parent`0x1
4 266220 mach_ports_register:entry
libsystem_kernel.dylib`mach_msg_trap+0xa
libsystem_kernel.dylib`mach_ports_register+0x70
libxpc.dylib`xpc_atfork_prepare+0x2b
libSystem.B.dylib`libSystem_atfork_prepare+0x9
libsystem_c.dylib`fork+0xc
parent`main+0x138
libdyld.dylib`start
parent`0x1
And there’s the problem… mach_ports_register() works just fine, but the system, specifically libC and libxpc, clobbers the registered ports as part of the fork() routine. After my program parent calls mach_ports_register(), hooks in the libC fork() syscall wrapper call out to libSystem_atfork_prepare(), which then runs xpc_atfork_prepare(). XPC is not open source (I’ve filed rdar://problem/11192369 requesting it), but my guess is that it registers a special port for its own use. A disassembly of /usr/lib/system/libxpc.dylib reveals that:
_xpc_atfork_prepare:
000000000000d596 pushq %rbp
000000000000d597 movq %rsp, %rbp
000000000000d59a pushq %r14
000000000000d59c pushq %rbx
000000000000d59d subq $16, %rsp
000000000000d5a1 movl _xpc_bootstrap_port(%rip), %eax
000000000000d5a7 movl %eax, -20(%rbp)
000000000000d5aa movq 89031(%rip), %rax
000000000000d5b1 movl (%rax), %edi
000000000000d5b3 leaq -20(%rbp), %rsi
000000000000d5b7 movl $1, %edx
000000000000d5bc callq 0x1704e ## symbol stub for: _mach_ports_register
XPC sets its bootstrap port (which is set up in an XPC initialization routine as part of pre-main() initialization) as the array of ports to be registered for a child, overwriting anything that was set up pre-fork() by the parent.
I tested whether mach_ports_register() worked on 10.6, prior to the introduction of XPC, and it does pass the port down to the child as expected. You can find the sample code here. You can build and test using make && ./parent. I filed rdar://problem/15417334 as a regression, but Apple responded by saying, “This API was never shareable, and now the system owns it” (whatever that means).
With mach_ports_register() broken, the only other way to pass ports from a parent to a child is to use the special ports, declared in /usr/include/mach/task_special_ports.h. The special ports are connections to various system services, and they are also inherited across fork(). To pass a port from parent to child using special ports:
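1. The parent retrieves an existing special port with task_get_special_port() and stores it in a local variable.
2. The parent creates the port with mach_port_allocate() that it wants to pass to its child and sets the new port as that special port with task_set_special_port(). Alternatively, it could use mach_port_insert_right() to create an additional send right for the child on an existing port.
3. The parent calls fork(), and the child inherits the replaced special port.
4. The child retrieves the port with task_get_special_port(), and the parent restores the original special port from its local variable.
The parent and child are now connected over a port across fork(), with the parent holding the receive end of a port and the child having a send right to it.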
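The steps above can be sketched as follows. This is a minimal illustration under a few assumptions, not code from any particular project: it borrows TASK_BOOTSTRAP_PORT as the special port (the text above does not say which one to use), the function name special_port_demo(), the message ID, and the return codes are invented, and the child proves the connection by sending one empty message.

```c
#include <string.h>

#if defined(__APPLE__)
#include <mach/mach.h>
#include <mach/task_special_ports.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_MSG_ID 1234  /* arbitrary ID for the child's message */

/* Hypothetical helper: returns 0 on success, or the failing step number. */
int special_port_demo(void) {
    task_t self = mach_task_self();
    mach_port_t saved = MACH_PORT_NULL, port = MACH_PORT_NULL;

    /* Save the original special port so it can be restored later. */
    if (task_get_special_port(self, TASK_BOOTSTRAP_PORT, &saved) != KERN_SUCCESS) return 1;

    /* Create the port to hand to the child, plus a send right to install. */
    if (mach_port_allocate(self, MACH_PORT_RIGHT_RECEIVE, &port) != KERN_SUCCESS) return 2;
    if (mach_port_insert_right(self, port, port, MACH_MSG_TYPE_MAKE_SEND) != KERN_SUCCESS) return 3;
    if (task_set_special_port(self, TASK_BOOTSTRAP_PORT, port) != KERN_SUCCESS) return 4;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the inherited special port is a send right to the parent. */
        mach_port_t to_parent = MACH_PORT_NULL;
        if (task_get_special_port(mach_task_self(), TASK_BOOTSTRAP_PORT,
                                  &to_parent) != KERN_SUCCESS) _exit(1);
        mach_msg_header_t msg;
        memset(&msg, 0, sizeof(msg));
        msg.msgh_bits = MACH_MSGH_BITS(MACH_MSG_TYPE_COPY_SEND, 0);
        msg.msgh_size = sizeof(msg);
        msg.msgh_remote_port = to_parent;
        msg.msgh_id = DEMO_MSG_ID;
        kern_return_t skr = mach_msg(&msg, MACH_SEND_MSG, sizeof(msg), 0,
                                     MACH_PORT_NULL, MACH_MSG_TIMEOUT_NONE, MACH_PORT_NULL);
        _exit(skr == KERN_SUCCESS ? 0 : 2);
    }

    /* Parent: put the original special port back whether or not fork() worked. */
    kern_return_t restore = task_set_special_port(self, TASK_BOOTSTRAP_PORT, saved);
    if (pid < 0) return 5;

    /* Wait up to five seconds for the child's message on the receive right. */
    struct { mach_msg_header_t header; mach_msg_trailer_t trailer; } rcv;
    memset(&rcv, 0, sizeof(rcv));
    kern_return_t kr = mach_msg(&rcv.header, MACH_RCV_MSG | MACH_RCV_TIMEOUT, 0,
                                sizeof(rcv), port, 5000, MACH_PORT_NULL);
    int status = 0;
    waitpid(pid, &status, 0);

    if (restore != KERN_SUCCESS) return 6;
    if (kr != KERN_SUCCESS) return 7;
    return rcv.header.msgh_id == DEMO_MSG_ID ? 0 : 8;
}
#else
/* Mach only exists on Darwin; elsewhere this compiles to a stub. */
int special_port_demo(void) { return -1; }
#endif
```

Because the child's bootstrap port now points at the parent, a real program would normally have the parent send the original bootstrap port back over this channel so the child can reinstall it.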
Until 10.6, there was no way to pass a file descriptor over Mach IPC. This was likely a result of joining two kernels, BSD and Mach, into XNU: the POSIX layer of BSD is only coupled to the Mach part in specific places. Starting in OS X 10.6, however, Apple introduced a new way to bridge these parts of the kernel using two private APIs, one to “convert” an FD to a port right, and another to get the underlying FD back. These SPIs should not be used, since they are not public and are liable to change, but the underlying concept is interesting and worth discussing.
fileport_makeport() takes a file descriptor and an out-pointer to a new Mach port to be created. In the kernel, this creates a new port and allocates a send right to it, then associates the fileglob (the kernel structure that backs an fd) with the port. Since the fd is now represented by a Mach port right, it can be sent as a port descriptor to another process in a Mach message.
When another process receives a fileport, it can call fileport_makefd() to convert the send right back into a file descriptor. Since file descriptors represent files, pipes, and sockets, this is a convenient way to use POSIX conventions across Mach IPC.
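Purely to illustrate the concept, here is a sketch of a round trip within one process: an fd is wrapped in a fileport and then unwrapped into a second fd that refers to the same underlying pipe. The two prototypes are declared by hand because the SPI header is not in the public SDK; they are my assumption of the signatures and, being private, may change. The function name fileport_demo() and its return codes are invented. In real use the port would travel to another process in a Mach message between the two calls.

```c
#include <string.h>

#if defined(__APPLE__)
#include <mach/mach.h>
#include <unistd.h>

/* Assumed prototypes for the private SPIs; not declared in public headers. */
extern int fileport_makeport(int fd, mach_port_t *port);
extern int fileport_makefd(mach_port_t port);

/* Hypothetical helper: returns 0 on success, or the failing step number. */
int fileport_demo(void) {
    int fds[2];
    if (pipe(fds) != 0) return 1;

    /* Wrap the write end of the pipe in a send right. */
    mach_port_t fileport = MACH_PORT_NULL;
    if (fileport_makeport(fds[1], &fileport) != 0) return 2;

    /* Convert the send right back into a new descriptor for the same pipe. */
    int new_fd = fileport_makefd(fileport);
    if (new_fd < 0) return 3;

    /* Data written through the new fd should arrive at the read end. */
    const char msg[] = "hi";
    char buf[sizeof(msg)] = {0};
    if (write(new_fd, msg, sizeof(msg)) != (ssize_t)sizeof(msg)) return 4;
    if (read(fds[0], buf, sizeof(buf)) != (ssize_t)sizeof(msg)) return 5;

    /* Drop the send right and close all descriptors. */
    mach_port_deallocate(mach_task_self(), fileport);
    close(new_fd);
    close(fds[0]);
    close(fds[1]);
    return memcmp(msg, buf, sizeof(msg)) == 0 ? 0 : 6;
}
#else
/* Fileports only exist on Darwin; elsewhere this compiles to a stub. */
int fileport_demo(void) { return -1; }
#endif
```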