Integrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.

ARKit Documentation

visionOS slow image tracking and Enterprise API questions
To set the stage: I made a prototype of an app for a company; the app is to be used internally for now. The prototype runs perfectly on iOS, so now I got a Vision Pro (VP) to port the app to its final destination. The first thing I found out is that image tracking on the VP is useless for moving images (and that's the core of my app). Also, the distance at which an image is lost seems to be much shorter on the VP. Now I'm trying to figure out whether it's possible to fix or work around this in any way, and I'm wondering whether the Enterprise APIs would change anything. So:

- Is it possible to request Enterprise API access as a single person with a basic Apple Developer subscription? I looked around the forum and only got more confused.
- Does QR code detection and tracking work any better than image detection, or are the anchor updates the same?
- Does the increased "object detection" frequency affect image/QR tracking in any way, or is it (as the name implies) only for object tracking?
- Would increasing the CPU/GPU headroom make any difference to the image/QR detection frequency?
- Is there something to disable to make anchor updates more frequent? I don't need complex models, shadows, physics, etc.

Greetings,
Michal
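For readers who haven't used the visionOS image-tracking API, here is a minimal sketch of the loop whose update rate is being discussed. It assumes the ReferenceImage array has already been loaded elsewhere (for example from an asset catalog group); the function name is illustrative.

```swift
import ARKit

// Minimal visionOS sketch: track reference images and observe anchor updates.
// Assumes `referenceImages` was loaded elsewhere.
func trackImages(referenceImages: [ReferenceImage]) async throws {
    let session = ARKitSession()
    let imageTracking = ImageTrackingProvider(referenceImages: referenceImages)

    try await session.run([imageTracking])

    for await update in imageTracking.anchorUpdates {
        let anchor = update.anchor
        // isTracked indicates whether the image is currently detected;
        // originFromAnchorTransform is its pose relative to the world origin.
        if anchor.isTracked {
            print("Image anchor \(anchor.id): \(anchor.originFromAnchorTransform)")
        }
    }
}
```

The questions above are essentially about how frequently this anchorUpdates stream fires and how quickly the transforms catch up with a moving image.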
Replies: 2 · Boosts: 0 · Views: 140 · Activity: 2d
Is this the easiest way to create scene planes that allow for collision with RealityKit entities?
In my visionOS app I am using plane detection, and I want to create planes that have physics so that my RealityKit entities rest on real-world detected planes. I was curious whether the code below, which I found in the samples, is the most efficient way of doing this.

```swift
func processPlaneDetectionUpdates() async {
    for await anchorUpdate in planeTracking.anchorUpdates {
        let anchor = anchorUpdate.anchor

        if anchorUpdate.event == .removed {
            planeAnchors.removeValue(forKey: anchor.id)
            if let entity = planeEntities.removeValue(forKey: anchor.id) {
                entity.removeFromParent()
            }
            return
        }

        planeAnchors[anchor.id] = anchor

        let entity = Entity()
        entity.name = "Plane \(anchor.id)"
        entity.setTransformMatrix(anchor.originFromAnchorTransform, relativeTo: nil)

        // Generate a mesh for the plane (for occlusion).
        var meshResource: MeshResource? = nil
        do {
            let contents = MeshResource.Contents(planeGeometry: anchor.geometry)
            meshResource = try MeshResource.generate(from: contents)
        } catch {
            print("Failed to create a mesh resource for a plane anchor: \(error).")
            return
        }

        var material = UnlitMaterial(color: .red)
        material.blending = .transparent(opacity: .init(floatLiteral: 0))

        if let meshResource {
            // Make this plane occlude virtual objects behind it.
            entity.components.set(ModelComponent(mesh: meshResource, materials: [material]))
        }

        // Generate a collision shape for the plane (for object placement and physics).
        var shape: ShapeResource? = nil
        do {
            let vertices = anchor.geometry.meshVertices.asSIMD3(ofType: Float.self)
            shape = try await ShapeResource.generateStaticMesh(positions: vertices,
                                                               faceIndices: anchor.geometry.meshFaces.asUInt16Array())
        } catch {
            print("Failed to create a static mesh for a plane anchor: \(error).")
            return
        }

        if let shape {
            entity.components.set(CollisionComponent(shapes: [shape], isStatic: true))
            let physics = PhysicsBodyComponent(mode: .static)
            entity.components.set(physics)
        }

        let existingEntity = planeEntities[anchor.id]
        planeEntities[anchor.id] = entity
        contentEntity.addChild(entity)
        existingEntity?.removeFromParent()
    }
}

extension MeshResource.Contents {
    init(planeGeometry: PlaneAnchor.Geometry) {
        self.init()
        self.instances = [MeshResource.Instance(id: "main", model: "model")]
        var part = MeshResource.Part(id: "part", materialIndex: 0)
        part.positions = MeshBuffers.Positions(planeGeometry.meshVertices.asSIMD3(ofType: Float.self))
        part.triangleIndices = MeshBuffer(planeGeometry.meshFaces.asUInt32Array())
        self.models = [MeshResource.Model(id: "model", parts: [part])]
    }
}

extension GeometrySource {
    func asArray<T>(ofType: T.Type) -> [T] {
        assert(MemoryLayout<T>.stride == stride, "Invalid stride \(MemoryLayout<T>.stride); expected \(stride)")
        return (0..<count).map {
            buffer.contents().advanced(by: offset + stride * Int($0)).assumingMemoryBound(to: T.self).pointee
        }
    }

    func asSIMD3<T>(ofType: T.Type) -> [SIMD3<T>] {
        asArray(ofType: (T, T, T).self).map { .init($0.0, $0.1, $0.2) }
    }

    subscript(_ index: Int32) -> (Float, Float, Float) {
        precondition(format == .float3, "This subscript operator can only be used on GeometrySource instances with format .float3")
        return buffer.contents().advanced(by: offset + (stride * Int(index))).assumingMemoryBound(to: (Float, Float, Float).self).pointee
    }
}

extension GeometryElement {
    subscript(_ index: Int) -> [Int32] {
        precondition(bytesPerIndex == MemoryLayout<Int32>.size,
                     """
                     This subscript operator can only be used on GeometryElement instances with bytesPerIndex == \(MemoryLayout<Int32>.size).
                     This GeometryElement has bytesPerIndex == \(bytesPerIndex)
                     """)
        var data = [Int32]()
        data.reserveCapacity(primitive.indexCount)
        for indexOffset in 0 ..< primitive.indexCount {
            data.append(buffer
                .contents()
                .advanced(by: (Int(index) * primitive.indexCount + indexOffset) * MemoryLayout<Int32>.size)
                .assumingMemoryBound(to: Int32.self).pointee)
        }
        return data
    }

    func asInt32Array() -> [Int32] {
        var data = [Int32]()
        let totalNumberOfInt32 = count * primitive.indexCount
        data.reserveCapacity(totalNumberOfInt32)
        for indexOffset in 0 ..< totalNumberOfInt32 {
            data.append(buffer.contents().advanced(by: indexOffset * MemoryLayout<Int32>.size).assumingMemoryBound(to: Int32.self).pointee)
        }
        return data
    }

    func asUInt16Array() -> [UInt16] {
        asInt32Array().map { UInt16($0) }
    }

    public func asUInt32Array() -> [UInt32] {
        asInt32Array().map { UInt32($0) }
    }
}
```

I was also curious to know if I can do this without ARKit, using SpatialTrackingSession. My understanding is that with SpatialTrackingSession in RealityKit I can only get the transforms of the AnchorEntities, but I won't have the geometry information to create the collision shapes.
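On the SpatialTrackingSession point, here is a minimal sketch of what the RealityKit-only route can look like, under the limitation described above that AnchorEntity exposes no plane geometry: the collision shape has to be an approximate fixed-size box rather than the detected plane's real outline. Entity names and dimensions are illustrative.

```swift
import RealityKit

// Minimal sketch: RealityKit-only plane anchoring with SpatialTrackingSession.
// The collider is a fixed-size box because AnchorEntity does not provide the
// detected plane's mesh.
@MainActor
func makeAnchoredFloor() async -> AnchorEntity {
    let session = SpatialTrackingSession()
    let configuration = SpatialTrackingSession.Configuration(tracking: [.plane])
    _ = await session.run(configuration)

    // Anchor to any horizontal plane that is at least 0.5 m x 0.5 m.
    let anchor = AnchorEntity(.plane(.horizontal,
                                     classification: .any,
                                     minimumBounds: [0.5, 0.5]))

    // Illustrative static collider approximating the plane surface.
    let collider = Entity()
    collider.components.set(CollisionComponent(
        shapes: [.generateBox(width: 1.0, height: 0.01, depth: 1.0)],
        isStatic: true))
    collider.components.set(PhysicsBodyComponent(mode: .static))
    anchor.addChild(collider)
    return anchor
}
```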
Replies: 2 · Boosts: 0 · Views: 106 · Activity: 3d
visionOS 2.0 Main Camera Access Enterprise Entitlement Not Recognized in Xcode
I am working on a project that requires access to the main camera on the Vision Pro. My main account holder applied for the necessary enterprise entitlement, we were approved, and we received the Enterprise.license file by email.

I have added the Enterprise.license file to my project, and I manually added the com.apple.developer.arkit.main-camera-access.allow entitlement to the entitlements file and set it to true, since it was not available in the list when I tried to use the + Capability button in the Signing & Capabilities tab.

I am getting this error: Provisioning profile "iOS Team Provisioning Profile: " doesn't include the com.apple.developer.arkit.main-camera-access.allow entitlement.

I have checked the provisioning profile settings online; there is no manual option for adding the main-camera-access entitlement, and it does not seem to be picking up the approval from the license.
Replies: 5 · Boosts: 0 · Views: 150 · Activity: 4d
Potential bug in Anchor updates on visionOS using the ARKit C API
I have an application running on visionOS 2.0 that uses the ARKit C API to create anchors and listen for updates. I am running an ARKit session with a WorldTrackingProvider (and a CameraFrameProvider, if that is relevant). I register a callback using ar_world_tracking_provider_set_anchor_update_handler_f, and when updates arrive I iterate over the updated anchors using ar_world_anchors_enumerate_anchors_f.

Then, as described in the https://developer.apple.com/documentation/visionos/tracking-points-in-world-space documentation, I walk around and hold down the Digital Crown to reposition the current space. This resets the world origin to my current position. When this happens, anchor updates arrive. In most cases the anchor updates return the new transform (using ar_world_anchor_get_origin_from_anchor_transform), but sometimes I get an anchor update that reports the transform of the anchor from before the world origin was repositioned, meaning that instead of staying in place in the physical world, the world anchor moves relative to me.

I can work around this by calling ar_world_tracking_provider_copy_all_world_anchors_f, which provides me with the correct transform, but this async method also adds some noticeable delay to the anchor updates. Is this already a known issue?
Replies: 0 · Boosts: 0 · Views: 105 · Activity: 6d
Cannot find the Enterprise API entitlement for Vision Pro
We are developing a visionOS app. We have applied for the Enterprise APIs for visionOS, including Main Camera Access for Vision Pro, and have already received the "Enterprise.license" file in the mail Apple sent us. We used our developer account to import the license file into Xcode, but in Xcode we cannot find the Enterprise API entitlement. If we put com.apple.developer.arkit.main-camera-access.allow into the project's entitlements file manually, Xcode shows an error, and we find that the app itself doesn't have the "Additional Capabilities" section that includes the Enterprise APIs.

What should we do to set up the entitlements file for the Enterprise API, so that we can use the Enterprise API?
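For reference, this is roughly what the manually edited .entitlements file ends up containing; the key is the one named in the post, and the surrounding plist structure is standard boilerplate. Note that, as the earlier post's error message shows, the provisioning profile must also include this entitlement; adding the key locally is not sufficient on its own.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <!-- Enterprise main-camera-access entitlement added by hand,
         as described in the post above. -->
    <key>com.apple.developer.arkit.main-camera-access.allow</key>
    <true/>
</dict>
</plist>
```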
Replies: 5 · Boosts: 1 · Views: 148 · Activity: 6d
What does setWorldOrigin() do?
I stumbled across the function setWorldOrigin(relativeTransform:) on ARSession, which is documented here: https://developer.apple.com/documentation/arkit/arsession/2942278-setworldorigin

I made a custom ARSession where I override this function and print and modify the relativeTransform parameter. The print shows that this function is called with an updated relativeTransform value, but it seems to have no impact, e.g. on the world origin when starting or continuing a scan, on the tiny puppet house in RoomPlan, or on any tracking position that I get from ARKit.

Does anybody have experience with this method, or know which parts are influenced by setWorldOrigin()?
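For context, here is a minimal sketch of how setWorldOrigin(relativeTransform:) is normally invoked on a session (rather than overridden), in this case re-expressing the world origin at an existing anchor's pose. The anchor parameter and function name are illustrative.

```swift
import ARKit

// Illustrative: move the ARKit world origin so that it coincides with a
// previously detected anchor. After this call, coordinates reported by ARKit
// are expressed relative to the new origin.
func recenterWorldOrigin(on referenceAnchor: ARAnchor, in session: ARSession) {
    // relativeTransform is interpreted relative to the *current* world origin.
    session.setWorldOrigin(relativeTransform: referenceAnchor.transform)
}
```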
Replies: 0 · Boosts: 0 · Views: 108 · Activity: 1w
Reality Composer Project and Xcode 16
After upgrading to Xcode 16, my app, which utilizes project files imported from my iPad's Reality Composer app, now has two issues that I have found so far. I am using an ARView as a UIViewRepresentable with SwiftUI. (Prior to upgrading to Xcode 16, everything worked well.)

First, there are now several duplicate rcp_export.usdz resources in the "Copy Bundle Resources" build phase section. Even though each file is in a separate folder with a unique UUID, this was causing a compile error saying there are duplicate files. I was able to open the RC project folder and delete the older rcp_project versions, which now allows the app to compile. I mention it as it may or may not be related to the second issue.

Second, Xcode isn't generating the project code for the rcproject, so when I call the RCProject.loadSceneAsync function I am getting an error that says "Cannot find 'RCProject' in scope".
Replies: 0 · Boosts: 2 · Views: 138 · Activity: 1w
AR app crashes on iOS 18: SlamAnchor.cpp:37 : HasValidPose()
We tried out our Unity-based AR app for the very first time under iOS 18 and noticed an immediate, repeatable crash. When run in Xcode 16, we get this error message:

Assert: /Library/Caches/com.apple.xbs/Sources/AppleCV3D/library/VIO/CAPI/src/SlamAnchor.cpp:37 : HasValidPose()
Assert: /Library/Caches/com.apple.xbs/Sources/AppleCV3D/library/VIO/CAPI/src/SlamAnchor.cpp:37 : HasValidPose()

That's a blocker for us. We're using Unity 2022.3.27f1.
Replies: 5 · Boosts: 0 · Views: 213 · Activity: 1w
RGB-D and Point Clouds in visionOS
Dear all,

We are building an XR application demonstrating our research on open-vocabulary 3D instance segmentation for assistive technology. We intend to bring it to visionOS using the new Enterprise APIs. Our method was trained on datasets resembling ScanNet, which contain localized (1) RGB camera frames (2) with depth (3) and camera intrinsics (4), plus a point cloud (5).

I understand we can query (1), (2), and (4) from the CameraFrameProvider. As for (3) and (5), it is unclear to me if/how we can obtain that data. In handheld ARKit, this example project demos how the depthMap can be used to simulate raw point clouds. However, this property doesn't seem to be available in visionOS.

Is there some way for us to obtain depth data associated with camera frames? "Faking" depth data from the SceneReconstructionProvider-generated meshes is too coarse for our method. I hope I'm just missing some detail and there's some way to configure CameraFrameProvider to also deliver depth and/or point clouds.

Thanks for any help or pointers in the right direction!
~ Alex
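For reference, the handheld-ARKit pattern the post refers to looks roughly like this minimal sketch: it opts into the sceneDepth frame semantic and reads each frame's depthMap alongside the RGB image and intrinsics. This is iOS API only; no claim is made that an equivalent exists on visionOS.

```swift
import ARKit

// Minimal iOS ARKit sketch: request per-frame scene depth alongside RGB.
final class DepthReceiver: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        let configuration = ARWorldTrackingConfiguration()
        // sceneDepth requires a LiDAR-equipped device.
        if ARWorldTrackingConfiguration.supportsFrameSemantics(.sceneDepth) {
            configuration.frameSemantics.insert(.sceneDepth)
        }
        session.delegate = self
        session.run(configuration)
    }

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // capturedImage: RGB pixel buffer; camera.intrinsics: 3x3 intrinsics;
        // sceneDepth?.depthMap: lower-resolution depth buffer covering the same field of view.
        guard let depthMap = frame.sceneDepth?.depthMap else { return }
        _ = (frame.capturedImage, frame.camera.intrinsics, depthMap)
    }
}
```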
Replies: 0 · Boosts: 0 · Views: 137 · Activity: 1w
WorldTrackingProvider not running, ARKitSession terminated
Hi everyone, I am working on a small project that requires World Anchors so that I can persist my content whenever the user chooses to leave/close the app. However, I can't manage to get my ARKit session to run, even though I think all the privacy permissions have been set and allowed correctly. Here is some sample code in an empty scene:

```swift
//
//  WorldTrackingView.swift
//  SH_AVP_Demo
//
//  Created by 李希 on 9/19/24.
//

import SwiftUI
import RealityKit
import RealityKitContent
//import VisionKit
import ARKit
import Foundation
import UIKit
import simd

struct WorldTrackingView_test: View {
    @State var myCube = Entity()
    @Environment(\.scenePhase) var myScenePhase

    var body: some View {
        RealityView { content in
            // Load scene
            if let Scene = try? await Entity.load(named: "WorldTrackingScene", in: realityKitContentBundle) {
                // Add scene to the view
                content.add(Scene)
                // Look for the cube entity
                if let cubeEntity = Scene.findEntity(named: "Cube") {
                    myCube = cubeEntity
                    // Create collision for the cube
                    myCube.generateCollisionShapes(recursive: true)
                    // Allow inputs to interact
                    myCube.components.set(InputTargetComponent(allowedInputTypes: .indirect))
                    // Set some ground shadows
                    myCube.components.set(GroundingShadowComponent(castsShadow: true))
                }
            }
        }
        // Add drag gesture that targets any entity in the scene
        .gesture(DragGesture().targetedToAnyEntity()
            // Do something when the cube position changes
            .onChanged { value in
                value.entity.position = value.convert(value.location3D, from: .local, to: value.entity.parent!)
                myCube = value.entity

                // Test and see if ARKit runs with different data providers
                var session = ARKitSession()
                var worldData = WorldTrackingProvider()
                let planeData = PlaneDetectionProvider()
                let sceneData = SceneReconstructionProvider()

                do {
                    Task {
                        try await session.run([worldData])
                        for await update in worldData.anchorUpdates {
                            switch update.event {
                            case .added, .updated:
                                // Update the app's understanding of this world anchor.
                                print("Anchor position updated.")
                            case .removed:
                                // Remove content related to this anchor.
                                print("Anchor position now unknown.")
                            }
                        }
                    }
                } catch {
                    print("session not running \(error.localizedDescription)")
                    return
                }
            }
            // At the end of the gesture, save the anchor
            .onEnded { value in
            }
        )
    }
}

#Preview(immersionStyle: .mixed) {
    WorldTrackingView()
}
```

All it does is generate a cube in an immersive view. The cube has collision and input components added so that I can interact with it using a drag gesture. I decided to start an ARKit session with a WorldTrackingProvider(), but I keep getting the following error:

ARPredictorRemoteService <0x117e0c620>: Service configured with error: Error Domain=com.apple.arkit.error Code=501 "(null)"
Remote Service was invalidated: <ARPredictorRemoteService: 0x117e0c620>, will stop all data_providers.
ARRemoteService: remote object proxy failed with error: Error Domain=NSCocoaErrorDomain Code=4099 "The connection to service with pid 81 named com.apple.arkit.service.session was invalidated from this process." UserInfo={NSDebugDescription=The connection to service with pid 81 named com.apple.arkit.service.session was invalidated from this process.}
ARRemoteService: weak self released before invalidation
ARRemoteService: remote object proxy failed with error: Error Domain=NSCocoaErrorDomain Code=4099 "The connection to service with pid 81 named com.apple.arkit.service.prediction was invalidated from this process." UserInfo={NSDebugDescription=The connection to service with pid 81 named com.apple.arkit.service.prediction was invalidated from this process.}
ARRemoteService: weak self released before invalidation

If I switch it to a PlaneDetectionProvider() or a SceneReconstructionProvider() I get print statements in my terminal, but none if I replace it with a WorldTrackingProvider(). Any idea what could be causing this? The same code was working before a recent Xcode update, I believe.
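For anyone comparing setups, here is a minimal sketch of checking provider support and authorization before running the session. It only uses WorldTrackingProvider.isSupported and the provider's own requiredAuthorizations list, and it is not claimed to fix the error above.

```swift
import ARKit

// Minimal sketch: verify support and authorization, then run world tracking.
func runWorldTracking() async {
    guard WorldTrackingProvider.isSupported else {
        print("WorldTrackingProvider is not supported in this environment.")
        return
    }

    let session = ARKitSession()
    let worldTracking = WorldTrackingProvider()

    // Request only whatever this provider declares it needs.
    let statuses = await session.requestAuthorization(for: WorldTrackingProvider.requiredAuthorizations)
    print("Authorization statuses: \(statuses)")

    do {
        try await session.run([worldTracking])
        for await update in worldTracking.anchorUpdates {
            print("World anchor \(update.event): \(update.anchor.id)")
        }
    } catch {
        print("ARKitSession failed to run: \(error)")
    }
}
```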
Replies: 1 · Boosts: 0 · Views: 140 · Activity: 1w
Issue: ARKit Camera Frame Provider Not Authorized in visionOS App
I'm developing a visionOS app using EnterpriseKit, and I need access to the main camera for QR code detection. I'm using the ARKit CameraFrameProvider and ARKitSession to capture frames, but I'm encountering this error when trying to start the camera stream:

ar_camera_frame_provider_t: Failed to start camera stream with error: <ar_error_t Error Domain=com.apple.arkit Code=100 "App not authorized.">

Context:
- visionOS using EnterpriseKit for camera access and QR code scanning.
- My Info.plist includes the necessary permissions, like NSCameraUsageDescription and NSWorldSensingUsageDescription.
- I've added the com.apple.developer.arkit.main-camera-access.allow entitlement as per the official documentation here.
- My app is allowed camera access, as shown in the logs (Authorization status: [cameraAccess: allowed]), but the camera stream still fails to start with the "App not authorized" error.
- I followed Apple's WWDC 2024 sample code for accessing the main camera in visionOS from this session.

Sample of my code:

```swift
import ARKit
import Vision

class QRCodeScanner: ObservableObject {
    private var arKitSession = ARKitSession()
    private var cameraFrameProvider = CameraFrameProvider()
    private var pixelBuffer: CVPixelBuffer?

    init() {
        Task {
            await requestCameraAccess()
        }
    }

    private func requestCameraAccess() async {
        await arKitSession.queryAuthorization(for: [.cameraAccess])

        do {
            try await arKitSession.run([cameraFrameProvider])
        } catch {
            print("Failed to start ARKit session: \(error)")
            return
        }

        let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions: [.left])
        guard let cameraFrameUpdates = cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else { return }

        Task {
            for await cameraFrame in cameraFrameUpdates {
                guard let mainCameraSample = cameraFrame.sample(for: .left) else { continue }
                self.pixelBuffer = mainCameraSample.pixelBuffer
                // QR code detection code here
            }
        }
    }
}
```

Things I've tried:
- Verified entitlements in both Info.plist and .entitlements files. I have added the com.apple.developer.arkit.main-camera-access.allow entitlement.
- Confirmed camera permissions in the privacy settings.
- Followed the official documentation and WWDC 2024 sample code.
- Checked my provisioning profile to ensure it supports ARKit camera access.

Request: Has anyone encountered this "App not authorized" error when accessing the main camera via ARKit in visionOS using EnterpriseKit? Are there additional entitlements or provisioning profile configurations I might be missing? Any help would be greatly appreciated! I haven't seen any official examples using the new API for main camera access, and no open source examples either.
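As a side note on the snippet above, here is a minimal sketch of actually inspecting the result of queryAuthorization before running the session (the posted code discards it). This is ordinary ARKitSession API and does not, by itself, resolve a missing enterprise entitlement.

```swift
import ARKit

// Sketch: check the camera-access authorization result explicitly before
// starting the CameraFrameProvider.
func startCameraIfAuthorized(session: ARKitSession,
                             provider: CameraFrameProvider) async {
    let results = await session.queryAuthorization(for: [.cameraAccess])
    guard results[.cameraAccess] == .allowed else {
        print("Camera access not allowed: \(String(describing: results[.cameraAccess]))")
        return
    }
    do {
        try await session.run([provider])
    } catch {
        print("Failed to start ARKit session: \(error)")
    }
}
```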
Replies: 1 · Boosts: 1 · Views: 161 · Activity: 1w
Add new joint to ARKit skeleton
Hi everyone, I want to add a new joint in addition to the joints provided by ARKit. For example, extract the positions of the wrist and elbow, then add a new joint between them, in the middle of the arm. I can't find good documentation that explains ARKit very well. If there is other information I can use, please share it with me. Thanks.
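ARKit's skeletons are fixed, but a synthetic joint can be derived from existing ones. Below is a minimal sketch assuming this is about body tracking (ARBodyAnchor / ARSkeleton3D); the joint name strings "left_forearm_joint" and "left_hand_joint" are my assumption for the elbow and wrist joints, and the new "mid-arm" joint is simply the midpoint between them.

```swift
import ARKit
import simd

// Sketch (assumptions noted above): derive a synthetic "mid-arm" joint as the
// midpoint between the elbow and wrist model transforms of a tracked body.
func midArmTransform(for bodyAnchor: ARBodyAnchor) -> simd_float4x4? {
    let skeleton = bodyAnchor.skeleton
    guard
        let elbow = skeleton.modelTransform(for: ARSkeleton.JointName(rawValue: "left_forearm_joint")),
        let wrist = skeleton.modelTransform(for: ARSkeleton.JointName(rawValue: "left_hand_joint"))
    else { return nil }

    // Interpolate the translation halfway; reuse the elbow's rotation for simplicity.
    var mid = elbow
    mid.columns.3 = simd_mix(elbow.columns.3, wrist.columns.3, SIMD4<Float>(repeating: 0.5))

    // For a world-space pose, multiply by the body anchor's transform:
    // worldTransform = bodyAnchor.transform * mid
    return mid
}
```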
Replies: 0 · Boosts: 0 · Views: 163 · Activity: 2w
Multiple floor levels in one story
In lots of houses there are different levels that are still on the same floor. What I mean is that there are things like stairs at the entrance that only have a few steps and would basically count as the same story. RoomPlan already does a nice job recognizing them during the scanning, but after the StructureBuilder or the optimization step the result is not really satisfying. Has anyone managed to handle those cases? Or do you have to scan in a specific way to capture such small differences within a level?
Replies: 0 · Boosts: 0 · Views: 138 · Activity: 2w
Overwhelming shutter sound when capturing HD Image from ARFrame
Hello,

I'm using the ARSessionDelegate function

```swift
func session(_ session: ARSession, didUpdate frame: ARFrame)
```

to extract an HD image:

```swift
let hdframe = try? await session.captureHighResolutionFrame().capturedImage
```

which I later use to detect text on the image using Vision. I'm using the HD picture because the text bits I'm looking for can be very tiny.

```swift
let requestHandler = VNImageRequestHandler(cgImage: image) //, orientation: .up, options: [:])
let textRequest = VNRecognizeTextRequest()
let vnRequests = [textRequest]
try requestHandler.perform(vnRequests)
```

My issue is that each time a captured HD image is extracted from the AR scene, a shutter sound is played. I'm aware that shutter sounds are important for privacy, but I'm doing this at a very high frequency, which means that my app is currently unusable when not muted.

My two questions are:
- Is there any way to disable the sound in this case?
- Is there a better way to constantly scan the AR video stream for text than this approach?
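For completeness, here is a minimal sketch of reading the recognized strings back from a VNRecognizeTextRequest once perform(_:) returns; the function name is illustrative, and this does not change the shutter-sound behavior.

```swift
import Vision

// Sketch: run text recognition on a CGImage and collect the top candidate
// string of each observation.
func recognizeText(in image: CGImage) throws -> [String] {
    let textRequest = VNRecognizeTextRequest()
    textRequest.recognitionLevel = .accurate   // favor accuracy for tiny text

    let requestHandler = VNImageRequestHandler(cgImage: image)
    try requestHandler.perform([textRequest])

    let observations = textRequest.results ?? []
    return observations.compactMap { $0.topCandidates(1).first?.string }
}
```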
Replies: 0 · Boosts: 0 · Views: 120 · Activity: 2w