We've recently updated a view which displays photos via a CoreImage chain from an NSOpenGLView subclass to an NSView with a backing CAMetalLayer.
Things are mostly working fine, but we occasionally hit a deadlock involving CALayer and CIMetalCommandQueue. I've made a spindump, and it appears none of our code is involved in the locked threads. Despite this, I'm assuming the problem is ours 😅
I saw the mention in the CAMetalLayer documentation about releasing drawables with an @autoreleasepool in drawRect. We have done this, and I can't find any places where we're retaining a drawable outside drawRect.
https://developer.apple.com/documentation/quartzcore/cametallayer?language=objc
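For reference, the @autoreleasepool pattern mentioned above looks roughly like the sketch below. This is a simplified illustration and not our actual code: the class name PhotoMetalView and the commandQueue property are hypothetical, the layer is assumed to already be a CAMetalLayer, and the Core Image rendering step is elided.

```objc
#import <Cocoa/Cocoa.h>
#import <Metal/Metal.h>
#import <QuartzCore/CAMetalLayer.h>

// Hypothetical view class; names are illustrative only.
@interface PhotoMetalView : NSView
// Assumed to be created once elsewhere (e.g. during view setup).
@property (nonatomic, strong) id<MTLCommandQueue> commandQueue;
@end

@implementation PhotoMetalView

- (void)drawRect:(NSRect)dirtyRect {
    // Scope the drawable to a local pool so autoreleased references to it
    // are drained when the pool exits rather than at some later point.
    @autoreleasepool {
        // Assumption: the view's backing layer is a CAMetalLayer.
        CAMetalLayer *metalLayer = (CAMetalLayer *)self.layer;
        id<CAMetalDrawable> drawable = [metalLayer nextDrawable];
        if (drawable == nil) {
            return; // No drawable available for this pass.
        }

        id<MTLCommandBuffer> commandBuffer = [self.commandQueue commandBuffer];
        // ... render the Core Image chain into drawable.texture here ...
        [commandBuffer presentDrawable:drawable];
        [commandBuffer commit];
    }
}

@end
```

The drawable is only ever held in a local variable inside the pool, so nothing in the view keeps a reference to it after drawRect returns.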
I am seeing this on macOS 15.0.1 on an M2 Max MacBook Pro. We haven't seen it on macOS 14.x, but that may be luck, as we have not tested much on that OS.
I don't know how to move forward with debugging this, so any help is much appreciated!
The two locking threads in the spindump are MainThread and CI::RenderCompletionQueue:
Thread 0xb3b0f8 DispatchQueue "com.apple.main-thread"(1)
…
CA::Layer::commit_if_needed(CA::Transaction*, void (CA::Layer*, unsigned int, unsigned int) block_pointer) + 364 (QuartzCore + 178484) [0x1a5dba934]
invocation function for block in CA::Context::commit_transaction(CA::Transaction*, double, double*) + 176 (QuartzCore + 1782676) [0x1a5f42394]
-[CALayer(CALayerPrivate) _copyRenderLayer:layerFlags:commitFlags:] + 720 (QuartzCore + 179304) [0x1a5dbac68]
-[NSImage(CALayerSupport) CA_copyRenderValue] + 52 (AppKit + 1517960) [0x1a0fe0988]
-[NSImage CGImageForProposedRect:context:hints:] + 440 (AppKit + 1246368) [0x1a0f9e4a0]
-[NSImage _usingBestRepresentationForRect:context:hints:body:] + 148 (AppKit + 1247980) [0x1a0f9eaec]
__48-[NSImage CGImageForProposedRect:context:hints:]_block_invoke + 80 (AppKit + 1248792) [0x1a0f9ee18]
-[NSCIImageRep CGImageForProposedRect:context:hints:] + 112 (AppKit + 6200292) [0x1a1457be4]
+[CIContext contextWithOptions:] + 40 (CoreImage + 549532) [0x1a8df129c]
-[CIContext initWithOptions:] + 588 (CoreImage + 65744) [0x1a8d7b0d0]
+[CIContext(Internal) internalContextWithMTLDevice:options:] + 76 (CoreImage + 66568) [0x1a8d7b408]
CIMetalCommandQueueCreate + 52 (CoreImage + 66692) [0x1a8d7b484]
-[CaptureMTLDevice newCommandQueue] + 168 (GPUToolsCapture + 130200) [0x1029e7c98]
-[CaptureMTLCommandQueue initWithBaseObject:captureDevice:] + 204 (GPUToolsCapture + 799812) [0x102a8b444]
GTMTLGuestAppClientAddMTLCommandQueueInfo + 108 (GPUToolsCapture + 313572) [0x102a148e4]
__ulock_wait2 + 8 (libsystem_kernel.dylib + 60540) [0x19d24bc7c]
*??? (kernel.release.t6020 + 6102048) [0xfffffe0008cd5c20] (blocked by turnstile waiting for Phocus [11343] [unique pid 1001657] thread 0xb41b08 - part of a deadlock)
and
Thread 0xb41b08 DispatchQueue "CI::RenderCompletionQueue"(535) 1000 samples (1-1000) priority 46 (base 46)
start_wqthread + 8 (libsystem_pthread.dylib + 52464) [0x1035f4cf0]
_pthread_wqthread + 288 (libsystem_pthread.dylib + 20736) [0x1035ed100]
_dispatch_workloop_worker_thread + 580 (libdispatch.dylib + 129956) [0x1026afba4]
_dispatch_root_queue_drain_deferred_wlh + 652 (libdispatch.dylib + 133360) [0x1026b08f0]
_dispatch_lane_invoke + 468 (libdispatch.dylib + 68516) [0x1026a0ba4]
_dispatch_lane_serial_drain + 860 (libdispatch.dylib + 64160) [0x10269faa0]
_dispatch_client_callout + 20 (libdispatch.dylib + 26788) [0x1026968a4]
_dispatch_call_block_and_release + 32 (libdispatch.dylib + 19300) [0x102694b64]
CI::Object::unref() const + 120 (CoreImage + 35360) [0x1a8d73a20]
CI::MetalContext::~MetalContext() + 16 (CoreImage + 192260) [0x1a8d99f04]
CI::MetalContext::~MetalContext() + 236 (CoreImage + 192536) [0x1a8d9a018]
-[CaptureMTLCommandQueue dealloc] + 44 (GPUToolsCapture + 797916) [0x102a8acdc]
GTMTLGuestAppClientRemoveMTLCommandQueueInfo + 236 (GPUToolsCapture + 314240) [0x102a14b80]
GTMTLGuestAppClient_allCaptureObjectsUnsafe + 392 (GPUToolsCapture + 298776) [0x102a10f18]
AllMetalLayers + 64 (GPUToolsCapture + 518224) [0x102a46850]
MakeLayerInfos + 320 (GPUToolsCapture + 518608) [0x102a469d0]
-[CALayer frame] + 88 (QuartzCore + 74624) [0x1a5da1380]
__ulock_wait2 + 8 (libsystem_kernel.dylib + 60540) [0x19d24bc7c]
*??? (kernel.release.t6020 + 6102048) [0xfffffe0008cd5c20] (blocked by turnstile waiting for Phocus [11343] [unique pid 1001657] thread 0xb3b0f8 - part of a deadlock)