Render advanced 3D graphics and perform data-parallel computations on graphics processors using Metal.

Posts under Metal tag

200 Posts

Post | Replies | Boosts | Views | Activity

MultiThreaded rendering with actor
Hi, I'm trying to modify the ScreenCaptureKit sample code by implementing an actor for Metal rendering, but I'm experiencing issues with the frame rendering sequence.

My app workflow is:

ScreenCapture -> createFrame -> setRenderData
Metal draw callback -> renderAsync (getData from renderData)

I've added timestamps to verify frame ordering, and I'm also using a binary search to insert each frame by timestamp. While the timestamps appear to be in sequence, the actual rendering output seems out of order.

    // ScreenCaptureKit sample
    func createFrame(for sampleBuffer: CMSampleBuffer) async {
        if let surface: IOSurface = getIOSurface(for: sampleBuffer) {
            await renderer.setRenderData(surface, timeStamp: sampleBuffer.presentationTimeStamp.seconds)
        }
    }

    class Renderer {
        ...
        func setRenderData(surface: IOSurface, timeStamp: Double) async {
            _ = await renderSemaphore.getSetBuffers(
                isGet: false,
                surface: surface,
                timeStamp: timeStamp
            )
        }

        func draw(in view: MTKView) {
            Task {
                await renderAsync(view)
            }
        }

        func renderAsync(_ view: MTKView) async {
            guard await renderSemaphore.beginRender() else { return }

            guard let frame = await renderSemaphore.getSetBuffers(
                isGet: true,
                surface: nil,
                timeStamp: nil
            ) else {
                await renderSemaphore.endRender()
                return
            }

            guard let texture = await renderSemaphore.getRenderData(
                device: self.device,
                surface: frame.surface) else {
                await renderSemaphore.endRender()
                return
            }

            guard let commandBuffer = _commandQueue.makeCommandBuffer(),
                  let renderPassDescriptor = await view.currentRenderPassDescriptor,
                  let renderEncoder = commandBuffer.makeRenderCommandEncoder(descriptor: renderPassDescriptor) else {
                await renderSemaphore.endRender()
                return
            }

            // Shaders ..
            renderEncoder.endEncoding()

            commandBuffer.addCompletedHandler() { @Sendable (_ commandBuffer)-> Swift.Void in
                updateFPS()
            }

            // commit frame in actor
            let success = await renderSemaphore.commitFrame(
                timeStamp: frame.timeStamp,
                commandBuffer: commandBuffer,
                drawable: view.currentDrawable!
            )
            if !success {
                print("Frame dropped due to out-of-order timestamp")
            }
            await renderSemaphore.endRender()
        }
    }

    actor RenderSemaphore {
        private var frameBuffers: [FrameData] = []
        private var lastReadTimeStamp: Double = 0.0
        private var lastCommittedTimeStamp: Double = 0
        private var activeTaskCount = 0
        private var activeRenderCount = 0
        private let maxTasks = 3
        private var textureCache: CVMetalTextureCache?

        init() {
        }

        func initTextureCache(device: MTLDevice) {
            CVMetalTextureCacheCreate(kCFAllocatorDefault, nil, device, nil, &self.textureCache)
        }

        func beginRender() -> Bool {
            guard activeRenderCount < maxTasks else { return false }
            activeRenderCount += 1
            return true
        }

        func endRender() {
            if activeRenderCount > 0 {
                activeRenderCount -= 1
            }
        }

        func setTextureLoaded(_ loaded: Bool) {
            isTextureLoaded = loaded
        }

        func getSetBuffers(isGet: Bool, surface: IOSurface?, timeStamp: Double?) -> FrameData? {
            if isGet {
                if !frameBuffers.isEmpty {
                    let frame = frameBuffers.removeFirst()
                    if frame.timeStamp > lastReadTimeStamp {
                        lastReadTimeStamp = frame.timeStamp
                        print(frame.timeStamp)
                        return frame
                    }
                }
                return nil
            } else {
                // Set
                let frameData = FrameData(
                    surface: surface!,
                    timeStamp: timeStamp!
                )
                // insert to the right position
                let insertIndex = binarySearch(for: timeStamp!)
                frameBuffers.insert(frameData, at: insertIndex)
                return frameData
            }
        }

        private func binarySearch(for timeStamp: Double) -> Int {
            var left = 0
            var right = frameBuffers.count
            while left < right {
                let mid = (left + right) / 2
                if frameBuffers[mid].timeStamp > timeStamp {
                    right = mid
                } else {
                    left = mid + 1
                }
            }
            return left
        }

        // for setRenderDataNormalized
        func tryEnterTask() -> Bool {
            guard activeTaskCount < maxTasks else { return false }
            activeTaskCount += 1
            return true
        }

        func exitTask() {
            activeTaskCount -= 1
        }

        func commitFrame(timeStamp: Double, commandBuffer: MTLCommandBuffer, drawable: MTLDrawable) async -> Bool {
            guard timeStamp > lastCommittedTimeStamp else {
                print("Drop frame at commit: \(timeStamp) <= \(lastCommittedTimeStamp)")
                return false
            }
            commandBuffer.present(drawable)
            commandBuffer.commit()
            lastCommittedTimeStamp = timeStamp
            return true
        }

        func getRenderData(
            device: MTLDevice,
            surface: IOSurface,
            depthData: [Float]
        ) -> (MTLTexture, MTLBuffer)? {
            let _textureName = "RenderData"
            var px: Unmanaged<CVPixelBuffer>?
            let status = CVPixelBufferCreateWithIOSurface(kCFAllocatorDefault, surface, nil, &px)
            guard status == kCVReturnSuccess, let screenImage = px?.takeRetainedValue() else {
                return nil
            }
            CVMetalTextureCacheFlush(textureCache!, 0)
            var texture: CVMetalTexture? = nil
            let width = CVPixelBufferGetWidthOfPlane(screenImage, 0)
            let height = CVPixelBufferGetHeightOfPlane(screenImage, 0)
            let result2 = CVMetalTextureCacheCreateTextureFromImage(
                kCFAllocatorDefault,
                self.textureCache!,
                screenImage,
                nil,
                MTLPixelFormat.bgra8Unorm,
                width,
                height,
                0,
                &texture)
            guard result2 == kCVReturnSuccess,
                  let cvTexture = texture,
                  let mtlTexture = CVMetalTextureGetTexture(cvTexture) else {
                return nil
            }
            mtlTexture.label = _textureName
            let depthBuffer = device.makeBuffer(bytes: depthData, length: depthData.count * MemoryLayout<Float>.stride)!
            return (mtlTexture, depthBuffer)
        }
    }

That's my code above. Could someone point out what might be wrong?
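The timestamp-ordered insert described in the post can be checked in isolation from Metal and the actor. The following is a minimal sketch, not the poster's implementation: `Frame`, `OrderedFrameQueue`, and `popNext` are hypothetical names, the IOSurface payload is left out, and unlike `getSetBuffers` it keeps popping until it finds a frame newer than the last one handed out instead of returning nil on the first stale frame.

```swift
// Hypothetical stand-in for the post's FrameData; the IOSurface is omitted.
struct Frame: Equatable {
    let timeStamp: Double
}

// Keeps frames sorted by ascending timestamp and hands them out oldest first.
struct OrderedFrameQueue {
    private(set) var frames: [Frame] = []
    private var lastReadTimeStamp = -Double.infinity

    // Upper-bound binary search: the first index whose timestamp is greater than `timeStamp`.
    private func insertionIndex(for timeStamp: Double) -> Int {
        var left = 0
        var right = frames.count
        while left < right {
            let mid = left + (right - left) / 2
            if frames[mid].timeStamp > timeStamp {
                right = mid
            } else {
                left = mid + 1
            }
        }
        return left
    }

    mutating func insert(_ frame: Frame) {
        frames.insert(frame, at: insertionIndex(for: frame.timeStamp))
    }

    // Returns the oldest frame that is newer than the last frame handed out,
    // discarding any stale frames that arrived too late.
    mutating func popNext() -> Frame? {
        while !frames.isEmpty {
            let frame = frames.removeFirst()
            if frame.timeStamp > lastReadTimeStamp {
                lastReadTimeStamp = frame.timeStamp
                return frame
            }
        }
        return nil
    }
}

// Frames arrive out of order but are read back in timestamp order.
var queue = OrderedFrameQueue()
for t in [0.3, 0.1, 0.2] {
    queue.insert(Frame(timeStamp: t))
}
var order: [Double] = []
while let frame = queue.popNext() {
    order.append(frame.timeStamp)
}
```

A queue like this only guarantees the order in which frames are read. It says nothing about the order in which independently spawned tasks later reach the commit step, which is a separate thing to verify.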
Replies: 4 | Boosts: 0 | Views: 107 | Activity: 4h
macOS 15.x crashes in MetalPerformanceShadersGraph
In our app we use CoreML. But ever since macOS 15.x was released, we have been getting a large number of crashes like this:

Incident Identifier: 424041c3-884b-4e50-bb5a-429a83c3e1c8
CrashReporter Key: B914246B-1291-4D44-984D-EDF84B52310E
Hardware Model: Mac14,12
Process: <REMOVED> [1509]
Path: /Applications/<REMOVED>
Identifier: com.<REMOVED>
Version: <REMOVED>
Code Type: arm64
Parent Process: launchd [1]
Date/Time: 2024-11-13T13:23:06.999Z
Launch Time: 2024-11-13T13:22:19Z
OS Version: Mac OS X 15.1.0 (24B83)
Report Version: 104
Exception Type: SIGABRT
Exception Codes: #0 at 0x189042600
Crashed Thread: 36

Thread 36 Crashed:
0 libsystem_kernel.dylib 0x0000000189042600 __pthread_kill + 8
1 libsystem_c.dylib 0x0000000188f87908 abort + 124
2 libsystem_c.dylib 0x0000000188f86c1c __assert_rtn + 280
3 Metal 0x0000000193fdd870 MTLReportFailure.cold.1 + 44
4 Metal 0x0000000193fb9198 MTLReportFailure + 444
5 MetalPerformanceShadersGraph 0x0000000222f78c80 -[MPSGraphExecutable initWithMPSGraphPackageAtURL:compilationDescriptor:] + 296
6 Espresso 0x00000001a290ae3c E5RT::SharedResourceFactory::GetMPSGraphExecutable(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, NSDictionary*) + 932
. . .
43 CoreML 0x0000000192d263bc -[MLModelAsset modelWithConfiguration:error:] + 120
44 CoreML 0x0000000192da96d0 +[MLModel modelWithContentsOfURL:configuration:error:] + 176
45 <REMOVED> 0x000000010497b758 -[<REMOVED> <REMOVED>] (<REMOVED>)

No similar crashes on macOS 12-14!

MetalPerformanceShadersGraph.log

Any clue what is causing this? Thanks! :)
Replies: 0 | Boosts: 0 | Views: 109 | Activity: 1d
How to get a null acceleration structure w/o triggering an API validation error
I want to turn off my ray-tracing conditionally. There's is_null_acceleration_structure but when I don't bind an acceleration structure (or pass nil to setFragmentAccelerationStructure), I get the following API validation error: -[MTLDebugRenderCommandEncoder validateCommonDrawErrors:]:5782: failed assertion `Draw Errors Validation Fragment Function(vol_deferred_lighting): missing instanceAccelerationStructure binding at index 6 for accelerationStructure[0]. I can turn off API validation and it works, but it seems like I should be able to use nil for the acceleration structure w/o triggering a validation error. Seems like a bug, right? I suppose I can work around this by creating a separate pipeline with the ray-tracing disabled via a function constant instead of using is_null_acceleration_structure. (Can we get a ray-tracing tag for questions?)
Replies: 1 | Boosts: 0 | Views: 82 | Activity: 1d
MTKView delegate ownership during view controller transitions
The Problem

When transitioning between view controllers that each have their own MTKView but share a Metal renderer backend, we run into delegate ownership conflicts. Only one MTKView can successfully render at a time, since setting the delegate on one view requires removing it from the other, leading to paused views during transitions. For my app, I need to display the same visuals across multiple views and have them all render correctly.

Current Implementation Approach

I've created a container object that manages the MTKView and its relationship with the shared renderer:

    class RenderContainer {
        let metalView: MTKView
        private let renderer: MetalRenderer

        func startRendering() {
            metalView.delegate = renderer
            metalView.isPaused = false
        }

        func stopRendering() {
            metalView.isPaused = true
            metalView.delegate = nil
        }
    }

View controllers manage the rendering lifecycle in their view appearance methods:

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        renderContainer.startRendering()
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)
        renderContainer.stopRendering()
    }

Observations & Issues

- During view controller transitions, one MTKView must stop rendering before the other can start. Also, there is no guarantee that the old view will stop rendering before the new one starts, with the current API design.
- This creates a visual "pop" during animated transitions.
- Setting isPaused = true helps prevent unnecessary render calls but doesn't solve the core delegate ownership problem.
- The shared renderer maintains its state but can only output to one view at a time.

Questions

- What's the recommended approach for handling MTKView delegate ownership during animated transitions?
- Are there ways to maintain visual continuity without complex view hierarchies?
- Should I consider alternative architectures for sharing the Metal content between views?

Any insights for this scenario would be appreciated.
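One alternative architecture of the kind the last question asks about is to give every view its own lightweight delegate object that forwards to the shared renderer, so no single delegate has to be moved between views. The sketch below shows only the ownership structure in plain Swift. `SharedRenderer` and `ViewDrawAdapter` are hypothetical stand-ins (no MetalKit types are used), and it assumes the renderer can encode into whichever target it is handed.

```swift
// Hypothetical stand-in for the post's shared MetalRenderer: it owns the scene
// state and can draw into whichever target it is handed.
final class SharedRenderer {
    private(set) var frameCount = 0
    private(set) var drawLog: [String] = []

    func draw(into targetName: String) {
        frameCount += 1
        drawLog.append(targetName)
    }
}

// Hypothetical per-view adapter: one instance per view, all forwarding to the
// same renderer. In a real app this role would be played by each view's delegate.
final class ViewDrawAdapter {
    let targetName: String
    private let renderer: SharedRenderer

    init(targetName: String, renderer: SharedRenderer) {
        self.targetName = targetName
        self.renderer = renderer
    }

    // Would be called from the owning view's draw callback.
    func draw() {
        renderer.draw(into: targetName)
    }
}

let renderer = SharedRenderer()
let outgoing = ViewDrawAdapter(targetName: "outgoing", renderer: renderer)
let incoming = ViewDrawAdapter(targetName: "incoming", renderer: renderer)

// During a transition both adapters stay attached, so both views keep receiving frames.
outgoing.draw()
incoming.draw()
```

With this shape, starting or stopping one view does not touch the other view's delegate; only pausing needs to be coordinated.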
Replies: 0 | Boosts: 0 | Views: 66 | Activity: 1d
ARSCNView ignores output of SCNTechnique (sometimes)
I am using SCNTechnique in combination with ARSCNView. The technique is doing some minor post-processing. I have written several filter variants for this post-processing, but I'm facing an issue with one of the filters/fragment shaders: SCNTechnique discards my output and just presents the plain camera feed on screen instead. This is clearly visible in the Metal pipeline, using the GPU frame debugger. Let me stress that my setup works for 90% of my filters, but not this one, and I want to know why.

iOS 18.1, iPhone 13 Mini. Xcode 16.1.

Encoders 0 & 1 are injected by the system. Render encoders 2 & 3 correspond to my SCNTechnique's render passes: one to manipulate pixel data (darken it in this case) and another to blit it back to the main texture. I know the separate buffer is not strictly necessary for this particular operation, but it shouldn't matter.

Note that the issue occurs in encoder 4 (not mine but ARKit's). In Render Encoder 4, scn_postprocess_AR_fragment handles my texture (#0, ending in f980) and another from the camera feed (Texture 2). I know this pass is typically used for grain because that's what it used to do before I disabled grain on ARSCNView (and the buffer still contains grain parameters).

I have other post-processing filters that work just fine. By what magic is ARKit determining to use Texture 2 instead of my Texture 0? Sure, I could keep digging into the minute differences between my shaders to find out which LoC affects how some ARKit shader down the line operates, but it's awfully opaque so far.
Replies: 0 | Boosts: 0 | Views: 122 | Activity: 4d
Which Apple technologies to use for simple 2d motion graphics software?
I plan to create a simple motion graphics application for macOS that animates text and basic shapes and handles audio. I'll use SwiftUI for the UI. What are the commonly used technologies for rendering animated graphics? Core Animation is suitable for UI animations but not for exporting and controlling UI animations.

Basic requirements:

- Timeline user interface
- Animation of text and basic shapes
- Viewer in SwiftUI GUI with transport control (play, pause, scrub, …)
- Export to video file

Is Metal or Core Graphics typically used directly? I want to keep it as simple as possible.
Replies: 0 | Boosts: 0 | Views: 150 | Activity: 6d
Rendering YCbCr input using Metal
I would like to take YCbCr CVPixelBuffers from AVCaptureVideoDataOutput, apply some processing in RGB space, render to an MTKView, and pass to AVAssetWriter for recording.

Right now, I'm doing this all manually – deswing the incoming data if necessary, choose the right matrix to convert to RGB, apply processing, etc. I also have to convert back to YCbCr before feeding the frames to AVAssetWriter because encoding performs much better if I do.

Is there any efficient, built-in way to achieve the same?

I can't use AVCaptureVideoPreviewLayer, since I need to do some further processing before display. I can't use AVCaptureVideoDataOutput's videoSettings to get automatic BGRA conversion because that would lose bit depth for 10 bit video formats (and isn't available on all formats anyway). I see these Accelerate functions, but they seemingly don't use the GPU, nor do they support all the formats and bit depths I'd need.

I found references to some undocumented MTLPixelFormats that seem to do exactly what I want, but I don't want to rely on something like this unless it's explicitly endorsed. This would also incur an RGB/YCbCr conversion on every texture read and write, right?

Is there anything I'm missing here?
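For reference, the "right matrix" step of the manual path can be written down from the luma coefficients of the color standard rather than hard-coded. The following is a CPU-side sketch only, assuming 8-bit video-range input (Y in 16...235, Cb/Cr in 16...240) and BT.709 coefficients; `LumaCoefficients` and `rgbFromVideoRange8Bit` are hypothetical names, and 10-bit or full-range buffers would need different offsets and scales. It does not answer whether a built-in GPU path exists.

```swift
// Luma coefficients of a color standard; Kg follows from Kr + Kg + Kb = 1.
struct LumaCoefficients {
    let kr: Double
    let kb: Double
    var kg: Double { 1.0 - kr - kb }

    // BT.709 constants.
    static let bt709 = LumaCoefficients(kr: 0.2126, kb: 0.0722)
}

// Converts one 8-bit video-range YCbCr sample to normalized RGB (nominally 0...1).
func rgbFromVideoRange8Bit(y: Double, cb: Double, cr: Double,
                           k: LumaCoefficients = .bt709) -> (r: Double, g: Double, b: Double) {
    // Remove the video-range offsets and scale luma to 0...1, chroma to -0.5...0.5.
    let yn = (y - 16.0) / 219.0
    let cbn = (cb - 128.0) / 224.0
    let crn = (cr - 128.0) / 224.0

    // Invert Cr = (R - Y) / (2(1 - Kr)) and Cb = (B - Y) / (2(1 - Kb)).
    let r = yn + 2.0 * (1.0 - k.kr) * crn
    let b = yn + 2.0 * (1.0 - k.kb) * cbn
    // Solve Y = Kr*R + Kg*G + Kb*B for G.
    let g = (yn - k.kr * r - k.kb * b) / k.kg
    return (r, g, b)
}

// Reference white and black in 8-bit video range.
let white = rgbFromVideoRange8Bit(y: 235, cb: 128, cr: 128)
let black = rgbFromVideoRange8Bit(y: 16, cb: 128, cr: 128)
```

The same three equations expand into the 3x3 matrix a shader would apply; swapping in another standard only changes `kr` and `kb`.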
Replies: 0 | Boosts: 0 | Views: 164 | Activity: 1w
OS choosing performance state poorly for GPU use case
I am building a macOS desktop app (https://anukari.com) that is using Metal compute to do real-time audio/DSP processing, as I have a problem that is highly parallelizable and too computationally expensive for the CPU.

However it seems that the way in which I am using the GPU, even when my app is fully compute-limited, the OS never increases the power/performance state. Because this is a real-time audio synthesis application, it's a huge problem to not be able to take advantage of the full clock speeds that the GPU is capable of, because the app can't keep up with real-time.

I discovered this issue while profiling the app using Instruments' Metal tracing (and Game tracing) modes. In the profiling configuration under "Metal Application" there is a drop-down to select the "Performance State." If I run the application under Instruments with Performance State set to Maximum, it runs amazingly well, and all my problems go away.

For comparison, when I run the app on its own, outside of Instruments, the expensive GPU computation it's doing takes around 2x as long to complete, meaning that the app performs half as well.

I've done a ton of work to micro-optimize my Metal compute code, based on every scrap of information from the WWDC videos, etc. A problem I'm running into is that I think that the more efficient I make my code, the less it signals to the OS that I want high GPU clock speeds!

I think part of why the OS is confused is that in most use cases, my computation can be done using only a small number of Metal threadgroups. I'm guessing that the OS heuristics see that only a small fraction of the GPU is saturated and fail to scale up the power/clock state.

I'm not sure what to do here; I'm in a bit of a bind. One possibility is that I intentionally schedule busy work -- spin threadgroups just to waste energy and signal to the OS that I need higher clock speeds. This is obviously a really bad idea, but it might work.
Is there any other (better) way for my app to signal to the OS that it is doing real-time latency-sensitive computation on the GPU and needs the clock speeds to be scaled up? Note that game mode is not really an option, as my app also runs as an AU plugin inside hosts like GarageBand, so it can't be made fullscreen, etc.
Replies: 3 | Boosts: 0 | Views: 233 | Activity: 1w
Cannot use Metal graphics overview HUD with multiple CAMetalLayers
I have multiple CAMetalLayers that I render content to and noticed that the graphics overview HUD does not function properly when I have more than one CAMetalLayer. The values reported will be very strange. For example, FPS may report 999 or some large negative value. Is the HUD simply not designed to work with multiple CAMetalLayers or MTKViews? When I disable all but one of my CAMetalLayers, the HUD works as expected.
Replies: 1 | Boosts: 0 | Views: 202 | Activity: 20h
Metal Inline Functions
Hi! How do I define and call an inline function in Metal, or just a simple function that returns a value?

Case:

    inline uint index4D(constant _4D& shape,
                        constant uint& n,
                        constant uint& c,
                        constant uint& h,
                        constant uint& w) {
        return n * shape.C * shape.H * shape.W + c * shape.H * shape.W + h * shape.W + w;
    }

When I call it in my kernel function, I get a "No matching function for call" error.

Thanks in advance.
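The flattening arithmetic in the snippet can be verified on the CPU independently of the compile error. This is a minimal Swift sketch of the same formula; `Shape4D` is a hypothetical mirror of the post's `_4D` struct (only its C, H, and W fields appear in the original, so the `n` extent here is an assumption).

```swift
// Hypothetical mirror of the post's `_4D` shape struct (extents along N, C, H, W).
struct Shape4D {
    let n: Int
    let c: Int
    let h: Int
    let w: Int
}

// Row-major flattening of an (n, c, h, w) coordinate, same formula as the post's index4D.
func index4D(_ shape: Shape4D, n: Int, c: Int, h: Int, w: Int) -> Int {
    return n * shape.c * shape.h * shape.w
         + c * shape.h * shape.w
         + h * shape.w
         + w
}

let shape = Shape4D(n: 2, c: 3, h: 4, w: 5)
let first = index4D(shape, n: 0, c: 0, h: 0, w: 0)
// The last coordinate maps to the last element of a 2*3*4*5 = 120 element buffer.
let last = index4D(shape, n: 1, c: 2, h: 3, w: 4)
```

Comparing values like `first` and `last` against what the kernel writes is a quick way to confirm the index math once the function compiles.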
Replies: 2 | Boosts: 0 | Views: 176 | Activity: 2w
Normally distributed MPSMatrixRandom number generation generates NaN
When generating large arrays of random numbers, NaNs show up. They also show up at the same indices when using the same seed, leading me to believe that this is a bug with MPSMatrixRandom's normally distributed Float32 random number distribution. Happens with both Philox and MTGP32. Is this intentional and how do I work around this? See the original post for a MWE in Swift and Julia: https://github.com/JuliaGPU/Metal.jl/issues/474
Replies: 0 | Boosts: 1 | Views: 178 | Activity: 3w
Metal and NVIDIA graphic driver
Hi,

A user sent us a crash report that indicates an error occurring just after loading the default Metal library of our app.

Application Specific Information:
Crashing on exception: *** -[__NSArrayM objectAtIndex:]: index 0 beyond bounds for empty array

The report pointed me to these (simplified) lines of code in the library setup:

    _vertexFunctions = [[NSMutableArray alloc] init];
    _fragmentFunctions = [[NSMutableArray alloc] init];
    id<MTLLibrary> library = [device newDefaultLibrary];

2 vertex shaders and 5 fragment shaders are then loaded and stored in these two arrays using this method:

    -(BOOL) addShaderNamed:(NSString *)name library:(id<MTLLibrary>)library isFragment:(BOOL)isFragment {
        id shader = [library newFunctionWithName:name];
        if (!shader) {
            ALOG(@"Error : Unable to find the shader named : “%@”", name);
            return NO;
        }
        [(isFragment ? _fragmentFunctions : _vertexFunctions) addObject:shader];
        return YES;
    }

As you can see, the arrays are not filled if the method fails. However, a few lines later, they are used without checking whether they are really filled, and that causes the crash.

But this coding error doesn't explain why no shaders of a certain type (or both types) were added to the array, meaning: why did -newFunctionWithName: return nil for all given names (since the array in question appears completely empty)?

Clue

This error has only been detected once, by a user running the app on macOS 10.13 with an NVIDIA Web Driver instead of the default macOS graphic driver. Moreover, it wasn't possible to reproduce the problem on the same OS using the native macOS driver.

So my question is: are there known conflicts between NVIDIA drivers and the use of Metal libraries? Or would this case require some specific options in the Metal implementation?

Any help appreciated, thanks!
Replies: 0 | Boosts: 0 | Views: 187 | Activity: 3w
Resolution for Games
Hi,

When using a High Definition Display, is there a way to render at exactly the target resolution on the physical screen?

My understanding is that the default behavior is to render to a backing store with a resolution (in pixels) which can be twice the size of the logical resolution (in points). Then we let the OS handle the down-scaling to the actual target resolution on the screen.

This is all nice for non-graphics-intensive apps, but it means that my game will render at a higher resolution than needed, which seems like an obvious loss of performance.

My expectation is that, for graphics-intensive applications such as games, we should be able to query and render to the final resolution on the display. Can it / should it be done?

Thank you for your help :)

FYI I did find a document which explains how to set up your CAMetalLayer to render at a custom resolution. I suspect that this may be what I have to do?
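The points-versus-pixels relationship described in the post comes down to a small calculation. The sketch below is an illustration in plain Swift: `Size` stands in for CGSize, `drawablePixelSize` and its `renderScale` parameter are hypothetical, and in an app the result would typically be assigned to the layer's `drawableSize`. Whether the chosen size matches the physical panel depends on the display mode, which this sketch does not query.

```swift
// Plain stand-in for CGSize so the sketch has no framework dependencies.
struct Size: Equatable {
    var width: Double
    var height: Double
}

// Pixel dimensions for a view measured in points.
// `backingScale` is the points-to-pixels factor (2.0 on many high-density displays);
// `renderScale` is a hypothetical knob for rendering below the backing resolution.
func drawablePixelSize(pointSize: Size, backingScale: Double, renderScale: Double = 1.0) -> Size {
    let factor = backingScale * renderScale
    return Size(width: (pointSize.width * factor).rounded(),
                height: (pointSize.height * factor).rounded())
}

let viewSize = Size(width: 1440, height: 900)

// Default behavior: the backing store has twice the point dimensions.
let native = drawablePixelSize(pointSize: viewSize, backingScale: 2.0)

// Custom resolution: render at half the backing resolution and let the system scale up.
let half = drawablePixelSize(pointSize: viewSize, backingScale: 2.0, renderScale: 0.5)
```

Here `native` has four times as many pixels as `half`, which is the fill-rate cost the post is concerned about.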
Replies: 2 | Boosts: 0 | Views: 377 | Activity: 3w
Proper way of handling opening ImmersiveSpace?
If you check the code here, https://developer.apple.com/documentation/compositorservices/interacting-with-virtual-content-blended-with-passthrough

    var body: some Scene {
        ImmersiveSpace(id: Self.id) {
            CompositorLayer(configuration: ContentStageConfiguration()) { layerRenderer in
                let pathCollection: PathCollection
                do {
                    pathCollection = try PathCollection(layerRenderer: layerRenderer)
                } catch {
                    fatalError("Failed to create path collection \(error)")
                }

                let tintRenderer: TintRenderer
                do {
                    tintRenderer = try TintRenderer(layerRenderer: layerRenderer)
                } catch {
                    fatalError("Failed to create tint renderer \(error)")
                }

                Task(priority: .high) { @RendererActor in
                    Task { @MainActor in
                        appModel.pathCollection = pathCollection
                        appModel.tintRenderer = tintRenderer
                    }

                    let renderer = try await Renderer(layerRenderer, appModel, pathCollection, tintRenderer)
                    try await renderer.renderLoop()

                    Task { @MainActor in
                        appModel.pathCollection = nil
                        appModel.tintRenderer = nil
                    }
                }

                layerRenderer.onSpatialEvent = {
                    pathCollection.addEvents(eventCollection: $0)
                }
            }
        }
        .immersionStyle(selection: .constant(appModel.immersionStyle), in: .mixed, .full)
        .upperLimbVisibility(appModel.upperLimbVisibility)

the only way it deals with the error is fatalError, and I don't think I can throw anything or return anything else. Is there a way I can gracefully handle this and show a message box in the UI?

I was hoping I could somehow trigger a failure and have https://developer.apple.com/documentation/swiftui/openimmersivespaceaction return a failure, but I couldn't find a nice way to do so.

Let me know if you have ideas.
Replies: 1 | Boosts: 0 | Views: 249 | Activity: 3w
CompositorServices Or RealityKit
I have been concentrating on developing a visionOS application. While I am currently quite familiar with RealityKit, CompositorServices has also captured my attention, though I have not yet learned it. Could you please clarify whether it is essential for me to learn CompositorServices? Additionally, I would appreciate any insights into the respective advantages of RealityKit and CompositorServices.
Replies: 1 | Boosts: 0 | Views: 266 | Activity: Oct ’24