I have a dedicated render thread with a run loop that has a CADisplayLink added to it (that's the only input source attached). The render thread has this loop in it:
while (_continueRunLoop)
{
[runLoop runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]];
}
I have some code to stop the render thread that sets _continueRunLoop to false in a block, and then does a pthread_join on the render thread:
[_renderThreadRunLoop performBlock:^{
self->_continueRunLoop = NO;
}];
pthread_join(_renderThread, NULL);
I have noticed recently (iOS 18?) that if the Display Link is paused or invalidated before trying to stop the loop then the pthread_join blocks forever and the render thread is still sitting in the runMode:beforeDate: method. If the display link is still active then it does exit the loop, but only after one more turn of the display link callback.
The most likely explanation I can think of is there has been a behaviour change to performBlock - I believe this used to "consume" a turn of the run loop, and exit the runMode:beforeDate call but now it happens without leaving that function.
I can't find specific mention in the docs of the expected behaviour for performBlock - just that other RunLoop input sources cause the run method to exit, and timer sources do not. Is it possible that the behaviour has changed here?
The most likely explanation I can think of is there has been a behaviour change to performBlock - I believe this used to "consume" a turn of the run loop, and exit the runMode:beforeDate call but now it happens without leaving that function.
There has not, but that's because CFRunLoop is documented in a way that gives it plenty of room to do whatever it wants. In this case, from CFRunLoopPerformBlock:
"This method enqueues the block only and does not automatically wake up the specified run loop. Therefore, execution of the block occurs the next time the run loop wakes up to handle another input source. If you want the work performed right away, you must explicitly wake up that thread using the CFRunLoopWakeUp function."
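As a rough sketch of what that documentation describes, assuming ARC and a helper name (PerformBlockAndWake) that is purely illustrative and not part of any API, the enqueue can be paired with an explicit wake-up like this:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a system API): enqueue a block on another
// thread's run loop, then wake that run loop so the block is serviced
// without waiting for some unrelated input source to fire.
static void PerformBlockAndWake(NSRunLoop *runLoop, dispatch_block_t block)
{
    CFRunLoopRef cfRunLoop = [runLoop getCFRunLoop];

    // Enqueue only; per the documentation quoted above, this does not
    // wake the target run loop by itself.
    CFRunLoopPerformBlock(cfRunLoop, kCFRunLoopDefaultMode, block);

    // Explicitly wake the run loop so it processes the pending block.
    CFRunLoopWakeUp(cfRunLoop);
}
```

Note that this only addresses when the block runs. It does not, by itself, promise that runMode:beforeDate: returns afterwards, which is the larger problem discussed next.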
However, my bigger concern is that you're doing this at all:
I have some code to stop the render thread that sets _continueRunLoop to false in a block, and then does a pthread_join on the render thread:
In practice, returning from a runloop and destroying the thread:
while (_continueRunLoop)
{
[runLoop runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]];
}
...doesn't really work, at least not very well. Depending on the API usage, it either leaks mach ports (because the thread was destroyed) or the thread doesn't get destroyed (because an input source leak prevents a return). The actual issue here is specifically called out in the documentation for NSRunLoop.runMode:beforeDate:
"...Manually removing all known input sources and timers from the run loop does not guarantee that the run loop will exit immediately. macOS may install and remove additional input sources as needed to process requests targeted at the receiver’s thread. Those sources could therefore prevent the run loop from exiting."
In practical terms, this means that outside of extremely simple usage patterns, it's basically "impossible" to control what's using a given runloop and, by extension, guarantee that NSRunLoop.runMode:beforeDate: will ever return.
More specifically, this kind of code tends to:
- ...work fine in VERY simple, highly controlled cases where you know and control EVERY input source that interacts with the runloop. In practice, VERY few of our APIs actually fit within that category.
- ...work fine in development because your usage accounts for every input source that happens to be occurring in that particular system version. More importantly, when it fails to work (in development), the assumption is that the failure is "a bug", so you go to the trouble of tracking down and stopping the "extra" input source.
- ...fail in real world use because system level changes and your own code's evolution will inevitably cause "extra" sources to attach.
In terms of what you do about it, you basically have 3 choices:
- Track down the "extra" input sources and stop them. This is entirely possible (though often annoying and time consuming), however, it's important to understand that this will be an ongoing maintenance task, not a one time bug. The problem you're seeing here will happen again.
- You can use CFRunLoopStop to force the return, as shown in the code snippet I posted on this forum thread. Note that you can also enjoy My Take On RunLoops™, a topic Quinn and I have both spent many an hour trying to explain. In any case, while this does work, it also means that you'll leak mach ports, which is always worth avoiding*. In practice, CFRunLoopStop can be useful in very specific circumstances** but shouldn't really be used in long lived processes/apps.
*Mach port leaks are just as annoying as the input source leak you're tracking down in #1, except you can leak a lot of mach ports without noticing, and when you run out basically "anything" in your app can fail.
**For example, many command line tools work by using CFRunLoopStop to break out of their main thread runloop, running any finalization/cleanup code, and then calling "exit". Mach port leaks don't matter, as exit() solves all leaks.
- Use a single, long lived runloop thread that you never destroy. This is the system preferred solution and what I would recommend as well. The resource cost of a single thread is trivial and it's a far simpler architecture to manage and maintain.
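To make the third choice a bit more concrete, here is a minimal sketch, assuming ARC, iOS, and a hypothetical RenderThread class (all names here are illustrative, not taken from the original code). The thread is started once and never joined; rendering is switched on and off by pausing the display link rather than tearing the thread down:

```objc
#import <Foundation/Foundation.h>
#import <QuartzCore/QuartzCore.h>

// Hypothetical class: a render thread that lives for the life of the process.
@interface RenderThread : NSObject
- (void)start;                            // call once, early on
- (void)setRenderingPaused:(BOOL)paused;  // callable from any thread after -start
@end

@implementation RenderThread {
    NSThread *_thread;
    CADisplayLink *_displayLink;  // only touched on the render thread
}

- (void)start {
    if (_thread != nil) { return; }
    _thread = [[NSThread alloc] initWithTarget:self
                                      selector:@selector(threadMain)
                                        object:nil];
    _thread.name = @"Render";
    [_thread start];
}

- (void)threadMain {
    NSRunLoop *runLoop = [NSRunLoop currentRunLoop];
    _displayLink = [CADisplayLink displayLinkWithTarget:self
                                               selector:@selector(renderFrame:)];
    [_displayLink addToRunLoop:runLoop forMode:NSDefaultRunLoopMode];

    // Deliberately no exit condition: the thread is never destroyed,
    // so nothing depends on runMode:beforeDate: returning at a given time.
    for (;;) {
        @autoreleasepool {
            [runLoop runMode:NSDefaultRunLoopMode beforeDate:[NSDate distantFuture]];
        }
    }
}

- (void)setRenderingPaused:(BOOL)paused {
    if (_thread == nil) { return; }
    // Hop to the render thread so the display link is only used there.
    [self performSelector:@selector(applyPaused:)
                 onThread:_thread
               withObject:@(paused)
            waitUntilDone:NO];
}

- (void)applyPaused:(NSNumber *)paused {
    _displayLink.paused = paused.boolValue;
}

- (void)renderFrame:(CADisplayLink *)link {
    // Per-frame drawing would go here.
}

@end
```

The design choice is simply that "stop rendering" becomes "pause the display link", so there is no pthread_join and no need to force the run loop to return.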
The idea with #3 is that you may still need to track down input source leaks (as in #1), but the leak itself won't cause any immediate problems. That's somewhat true of #2 as well, however, #2 also tends to magnify the leak issue. Many input sources only attach to the runloop "once", which means #2 turns "my app leaks 1 mach port" into "my app leaks 1 mach port every time I create my thread".
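For reference, the #2 approach can be sketched roughly as follows. This is not the snippet from the forum thread mentioned above, just an illustration assuming ARC and a hypothetical helper name (RequestRunLoopExit), and it carries the mach port leak caveat already described:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not a system API): ask another thread's run loop
// to return from its current run call.
static void RequestRunLoopExit(NSRunLoop *runLoop, dispatch_block_t beforeStop)
{
    CFRunLoopRef cfRunLoop = [runLoop getCFRunLoop];

    CFRunLoopPerformBlock(cfRunLoop, kCFRunLoopDefaultMode, ^{
        // Runs on the target thread, e.g. to clear a "keep running" flag.
        if (beforeStop != nil) { beforeStop(); }

        // Force the current run of this thread's run loop to return,
        // regardless of which input sources are still attached.
        CFRunLoopStop(CFRunLoopGetCurrent());
    });

    // The enqueued block is not serviced until the run loop wakes up.
    CFRunLoopWakeUp(cfRunLoop);
}
```

With the code from the question, the stop path would then look something like `RequestRunLoopExit(_renderThreadRunLoop, ^{ self->_continueRunLoop = NO; });` followed by the existing pthread_join.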
__
Kevin Elliott
DTS Engineer, CoreOS/Hardware