Hi, My camera access method is like this
func processCameraUpdates() async {
print("Process camera called")
let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions:[.left])
guard let cameraFrameUpdates =
cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else {
return
}
for await cameraFrame in cameraFrameUpdates {
guard let mainCameraSample = cameraFrame.sample(for: .left) else {
continue
}
pixelBuffer = mainCameraSample.pixelBuffer
print("Pixel buffer updated)")
}
}
In my ImmersiveSpace I am calling that method in this way
task {
// Main camera access
await placeManager.processCameraUpdates()
}
This works fine as long as the app is active / open / in the foreground. Once I close the app and re-open it, I can no longer capture any images. What am I missing here? Do I need to do something when the scene becomes active?
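One approach worth trying (a minimal sketch under assumptions, not a confirmed fix): data providers generally stop delivering frames once the app leaves the foreground, so observe scenePhase and re-run the session / restart the frame loop when the scene becomes active again. PlaceManager and restartCameraUpdates() below are hypothetical stand-ins for whatever your manager actually exposes.
import SwiftUI
import RealityKit

struct ImmersiveCameraView: View {
    @Environment(\.scenePhase) private var scenePhase
    let placeManager: PlaceManager   // hypothetical: your existing manager type

    var body: some View {
        RealityView { _ in }
            .task {
                // Initial start, same as the call shown above.
                await placeManager.processCameraUpdates()
            }
            .onChange(of: scenePhase) { _, newPhase in
                guard newPhase == .active else { return }
                Task {
                    // Hypothetical helper: re-run the ARKitSession with the
                    // CameraFrameProvider and start a fresh cameraFrameUpdates
                    // loop, since the old async sequence typically ends once
                    // the app has been backgrounded.
                    await placeManager.restartCameraUpdates()
                }
            }
    }
}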
ARKit: Integrate iOS device camera and motion features to produce augmented reality experiences in your app or game using ARKit.
In this code:
https://developer.apple.com/documentation/visionos/incorporating-real-world-surroundings-in-an-immersive-experience
This sample implements a physical collision reaction between virtual objects and the real world, which is achieved by building a mesh of the surroundings and giving it collision and physics components. However, I don't understand the information in the document very well. Can someone explain how this works? Thank you!
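For reference, a condensed sketch of my reading of that sample (details simplified): run a SceneReconstructionProvider, and for each MeshAnchor build a static collision shape and attach it to an invisible entity, so RealityKit's physics can collide virtual objects with the real-world mesh. The root entity and the error handling here are placeholders.
import ARKit
import RealityKit

let session = ARKitSession()
let sceneReconstruction = SceneReconstructionProvider()
var meshEntities: [UUID: ModelEntity] = [:]

func processReconstructionUpdates(root: Entity) async {
    do {
        try await session.run([sceneReconstruction])
    } catch {
        print("Failed to run scene reconstruction: \(error)")
        return
    }
    for await update in sceneReconstruction.anchorUpdates {
        let meshAnchor = update.anchor
        // Turn the real-world mesh into a static collision shape.
        guard let shape = try? await ShapeResource.generateStaticMesh(from: meshAnchor) else { continue }
        switch update.event {
        case .added:
            let entity = ModelEntity()
            entity.transform = Transform(matrix: meshAnchor.originFromAnchorTransform)
            entity.collision = CollisionComponent(shapes: [shape], isStatic: true)
            // A static physics body lets virtual objects bounce off or rest on real surfaces.
            entity.physicsBody = PhysicsBodyComponent(mode: .static)
            meshEntities[meshAnchor.id] = entity
            root.addChild(entity)
        case .updated:
            guard let entity = meshEntities[meshAnchor.id] else { continue }
            entity.transform = Transform(matrix: meshAnchor.originFromAnchorTransform)
            entity.collision = CollisionComponent(shapes: [shape], isStatic: true)
        case .removed:
            meshEntities[meshAnchor.id]?.removeFromParent()
            meshEntities.removeValue(forKey: meshAnchor.id)
        }
    }
}
Virtual objects then only need their own CollisionComponent and a dynamic PhysicsBodyComponent to react to these surfaces.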
The structure builder provides walls and floors for each captured story, but not a ceiling. In my case the scanned geometry needs to be closed, so that it becomes possible to place objects on the ceiling, for example; an estimated ceiling for the different rooms within a story is therefore important.
Is there any indication that Apple has something like this on the roadmap? I think it could open up opportunities, especially when thinking about industrial applications of the API.
If somebody has more insight on this topic, please share :)
Hi everyone,
I'm working on an AR application where I need to accurately locate the center of the pupil and measure anatomical distances between the pupil and eyelids. I’ve been using ARKit’s face tracking, but I’m having trouble pinpointing the exact center of the pupil.
My Questions:
Locating Pupil Center in ARKit: Is there a reliable way to detect the exact center of the pupil using ARKit? If so, how can I achieve this?
Framework Recommendation: Given the need for fine detail in measurements, would ARKit be sufficient, or would it be better to use the Vision framework for more accurate 2D facial landmark detection? Alternatively, would a hybrid approach, combining Vision for precision and ARKit for 3D tracking, be more effective?
What I've Tried:
Using ARKit’s ARFaceAnchor to detect face landmarks, but the results for the pupil position seem imprecise for my needs.
Considering Vision for 2D detection, but concerned about integrating it into a 3D AR experience.
Any insights, code snippets, or guidance would be greatly appreciated!
Thanks in advance!
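Regarding the Vision option mentioned under "What I've Tried", here is a minimal 2D sketch (an assumption about one workable approach, not a verified recommendation): Vision's face-landmark request exposes leftPupil/rightPupil regions, each containing a single point, which could then be lifted into 3D via ARKit's face geometry or depth data.
import Vision
import CoreGraphics
import CoreVideo

func pupilCenters(in pixelBuffer: CVPixelBuffer,
                  imageSize: CGSize) throws -> (left: CGPoint?, right: CGPoint?) {
    let request = VNDetectFaceLandmarksRequest()
    let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, options: [:])
    try handler.perform([request])

    guard let face = request.results?.first,
          let landmarks = face.landmarks else {
        return (nil, nil)
    }
    // leftPupil / rightPupil each hold a single normalized landmark point.
    let left = landmarks.leftPupil?.pointsInImage(imageSize: imageSize).first
    let right = landmarks.rightPupil?.pointsInImage(imageSize: imageSize).first
    return (left, right)
}
Keep in mind the landmark positions are estimates, so for fine anatomical measurements they may still not be precise enough on their own.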
I have noticed that the enterprise API for main camera access can only be used for development. I need to test via TestFlight and deliver through ABM. When will I be able to do this?
I am developing an app based on visionOS and need to utilize the main camera access provided by the Enterprise API. I have applied for an enterprise license and added the main camera access capability and the license file in Xcode. In my code, I used
await arKitSession.queryAuthorization(for: [.cameraAccess])
to request user permission for camera access. After obtaining the permission, I used arKitSession to run the cameraFrameProvider.
However, when running
for await cameraFrame in cameraFrameUpdates{
print("hello")
guard let mainCameraSample = cameraFrame.sample(for: .left) else {
continue
}
pixelBuffer = mainCameraSample.pixelBuffer
}
, I am unable to receive any frames from the camera, and even the print("hello") inside the loop never executes. The app does not crash or throw any errors.
Here is my full code:
import SwiftUI
import ARKit
struct cameraTestView: View {
@State var pixelBuffer: CVPixelBuffer?
var body: some View {
VStack{
Button(action:{
Task {
await loadCameraFeed()
}
}){
Text("test")
}
if let pixelBuffer = pixelBuffer {
let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
let context = CIContext(options: nil)
if let cgImage = context.createCGImage(ciImage, from: ciImage.extent) {
Image(uiImage: UIImage(cgImage: cgImage))
}
}else{
Image("exampleCase")
.resizable()
.scaledToFill()
.frame(width: 400,height: 400)
}
}
}
func loadCameraFeed() async {
// Main Camera Feed Access Example
let formats = CameraVideoFormat.supportedVideoFormats(for: .main, cameraPositions:[.left])
let cameraFrameProvider = CameraFrameProvider()
let arKitSession = ARKitSession()
// main camera feed access example
var cameraAuthorization = await arKitSession.queryAuthorization(for: [.cameraAccess])
guard cameraAuthorization == [ARKitSession.AuthorizationType.cameraAccess:ARKitSession.AuthorizationStatus.allowed] else {
return
}
do {
try await arKitSession.run([cameraFrameProvider])
} catch {
return
}
guard let cameraFrameUpdates = cameraFrameProvider.cameraFrameUpdates(for: formats[0]) else {
print("fail to get cameraFrameUpdates")
return
}
print("identify cameraFrameUpdates")
for await cameraFrame in cameraFrameUpdates {
print("hello")
guard let mainCameraSample = cameraFrame.sample(for: .left) else {
continue
}
pixelBuffer = mainCameraSample.pixelBuffer
}
}
}
#Preview(windowStyle: .automatic) {
cameraTestView()
}
When I click the button, the console prints:
identify cameraFrameUpdates
It seems to get stuck waiting for the first cameraFrame from cameraFrameUpdates.
This occurs on visionOS 2.0 beta (just updated) and Xcode 16 beta 6 (just updated).
Does anyone have a workaround for this? I would be grateful if anyone can help.
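A small diagnostic sketch that may help narrow this down (assuming the session and provider from loadCameraFeed are in scope, and that the session's events stream behaves as documented): log the session's events and the provider's state to see whether the provider ever reaches .running or reports an authorization problem.
Task {
    for await event in arKitSession.events {
        // Look for dataProviderStateChanged / authorizationChanged entries here.
        print("ARKitSession event: \(event)")
    }
}
print("CameraFrameProvider state: \(cameraFrameProvider.state)")
It is also worth confirming that the Enterprise main-camera-access entitlement and license are actually applied to the build, since without them the provider may never deliver frames.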
Hello,
I'm not able to get any 3D object to appear in ARView.
struct ARViewContainer: UIViewRepresentable {
var trackingState: ARCamera.TrackingState? = nil
func makeUIView(context: Context) -> ARView {
// Create the view.
let view = ARView(frame: .zero)
// Set the coordinator as the session delegate.
view.session.delegate = context.coordinator
let anchor = AnchorEntity(plane: .horizontal)
let box = ModelEntity(mesh: MeshResource.generateBox(size: 0.3), materials: [SimpleMaterial(color: .red, isMetallic: true)])
box.generateCollisionShapes(recursive: true)
anchor.addChild(box)
view.scene.addAnchor(anchor)
// Return the view.
return view
}
final class Coordinator: NSObject, ARSessionDelegate {
var parent: ARViewContainer
init(_ parent: ARViewContainer) {
self.parent = parent
}
func session(_ session: ARSession, cameraDidChangeTrackingState camera: ARCamera) {
print("Camera tracking state: \(camera.trackingState)")
parent.trackingState = camera.trackingState
}
}
func makeCoordinator() -> Coordinator {
Coordinator(self)
}
func updateUIView(_ uiView: ARView, context: Context) { }
}
The view loads correctly, but nothing appears. I also tried creating the 3D object in
func updateUIView(_ uiView: ARView, context: Context) {
let anchor = AnchorEntity(plane: .horizontal)
let box = ModelEntity(mesh: MeshResource.generateBox(size: 0.3), materials: [SimpleMaterial(color: .red, isMetallic: true)])
box.generateCollisionShapes(recursive: true)
anchor.addChild(box)
uiView.scene.addAnchor(anchor)
print("Added into the view")
}
The print statement is executed, but there is still no object in the ARView. Is this a bug, or what am I missing?
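A sketch of one thing to try (an assumption, not a confirmed fix), as a drop-in variation of the makeUIView(context:) above: disable ARView's automatic session configuration and run an explicit world-tracking configuration with horizontal plane detection, so AnchorEntity(plane: .horizontal) has a detected plane to attach to. The box only shows up once such a plane has been found, which also needs decent lighting and a textured floor or table in view.
import ARKit
import RealityKit

func makeUIView(context: Context) -> ARView {
    let view = ARView(frame: .zero)
    view.session.delegate = context.coordinator

    // Run the session manually with horizontal plane detection enabled.
    view.automaticallyConfigureSession = false
    let configuration = ARWorldTrackingConfiguration()
    configuration.planeDetection = [.horizontal]
    view.session.run(configuration)

    let anchor = AnchorEntity(plane: .horizontal)
    let box = ModelEntity(mesh: .generateBox(size: 0.3),
                          materials: [SimpleMaterial(color: .red, isMetallic: true)])
    box.generateCollisionShapes(recursive: true)
    anchor.addChild(box)
    view.scene.addAnchor(anchor)
    return view
}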
Hello All,
I'm desperate to find a solution and I need your help, please.
I've created a simple cube in visionOS. I can grab it (close my hand on it) and move it pretty much wherever I want. But I would like to throw it (like a basketball, for example): not push it, but hold it in my hand and throw it away from me with a velocity and direction given by my hand movement (opening my fingers to release it).
Please point me in the right direction to do that.
Cheers and thanks
Mathis
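A rough sketch of one way to do this, with assumptions spelled out: the cube already has InputTargetComponent, CollisionComponent and a PhysicsBodyComponent, the release velocity is estimated from the last two drag samples rather than from hand-tracking data, and the drag-gesture conversion follows the usual visionOS entity-gesture pattern.
import SwiftUI
import RealityKit

struct ThrowableView: View {
    @State private var lastPosition: SIMD3<Float>?
    @State private var lastTime: TimeInterval?
    @State private var releaseVelocity: SIMD3<Float> = .zero

    var body: some View {
        RealityView { content in
            // Add your cube (with physics, collision and input target components) here.
        }
        .gesture(
            DragGesture()
                .targetedToAnyEntity()
                .onChanged { value in
                    let entity = value.entity
                    // Drive the entity kinematically while it is held.
                    entity.components[PhysicsBodyComponent.self]?.mode = .kinematic
                    let newPosition = value.convert(value.location3D, from: .local, to: entity.parent!)
                    let now = Date.timeIntervalSinceReferenceDate
                    if let last = lastPosition, let lastTime, now > lastTime {
                        releaseVelocity = (newPosition - last) / Float(now - lastTime)
                    }
                    lastPosition = newPosition
                    lastTime = now
                    entity.position = newPosition
                }
                .onEnded { value in
                    let entity = value.entity
                    // Hand the entity back to the physics simulation with the
                    // estimated velocity so it flies away like a thrown ball.
                    entity.components[PhysicsBodyComponent.self]?.mode = .dynamic
                    entity.components.set(PhysicsMotionComponent(linearVelocity: releaseVelocity,
                                                                 angularVelocity: .zero))
                    lastPosition = nil
                    lastTime = nil
                }
        )
    }
}
For a more faithful throw, the release velocity could instead be derived from HandTrackingProvider data at the moment the fingers open.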
Steps to Reproduce:
Create a SwiftUI view that initializes an ARKit session and a camera frame provider.
Attempt to run the ARKit session and retrieve camera frames.
Extract the intrinsics and extrinsics matrices from the camera frame’s sample data.
Attempt to project a 3D point from the world space onto the 2D screen using the retrieved camera parameters.
Encounter issues due to lack of detailed documentation on the correct usage and structure of the intrinsics and extrinsics matrices.
struct CodeLevelSupportView: View {
@State
private var vm = CodeLevelSupportViewModel()
var body: some View {
RealityView { realityViewContent in }
.onAppear {
vm.receiveCamera()
}
}
}
@MainActor
@Observable
class CodeLevelSupportViewModel {
let cameraSession = CameraFrameProvider()
let arSession = ARKitSession()
init() {
Task {
await arSession.requestAuthorization(for: [.cameraAccess])
}
}
func receiveCamera() {
Task {
do {
try await arSession.run([cameraSession])
guard let sequence = cameraSession.cameraFrameUpdates(for: .supportedVideoFormats(for: .main, cameraPositions: [.left])[0]) else {
print("failed to get cameraAccess authorization")
return
}
for try await frame in sequence {
guard let sample = frame.sample(for: .left) else {
print("failed to get camera sample")
return
}
let leftEyeScreenImage:CVPixelBuffer = sample.pixelBuffer
let leftEyeViewportWidth:Int = CVPixelBufferGetWidth(leftEyeScreenImage)
let leftEyeViewportHeight:Int = CVPixelBufferGetHeight(leftEyeScreenImage)
let intrinsics = sample.parameters.intrinsics
let extrinsics = sample.parameters.extrinsics
let oneMeterInFront:SIMD3<Float> = .init(x: 0, y: 0, z: -1)
projectWorldLocationToLeftEyeScreen(worldLocation: oneMeterInFront, intrinsics: intrinsics, extrinsics: extrinsics, viewportSize: (leftEyeViewportWidth,leftEyeViewportHeight))
}
} catch {
}
}
}
//After the function implementation is completed, it should return a CGPoint?, representing the point of this worldLocation in the LeftEyeViewport. If this worldLocation is not visible in the LeftEyeViewport (out of bounds), return nil.
func projectWorldLocationToLeftEyeScreen(worldLocation:SIMD3<Float>,intrinsics:simd_float3x3,extrinsics:simd_float4x4,viewportSize:(width:Int,height:Int)) {
//The API documentation does not describe the structure of intrinsics and extrinsics, which makes it hard to implement this function.
}
}
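Since the documentation doesn't spell out the conventions, here is a sketch of the projection math under explicit assumptions: extrinsics is treated as the camera-to-device transform (if it is actually the inverse, swap the inversion below), a device pose (worldFromDevice, e.g. from a DeviceAnchor queried at the frame's timestamp) is added as an extra parameter, and the camera is assumed to follow the usual ARKit convention of -Z forward / +Y up with pixel coordinates originating at the top-left.
import simd
import CoreGraphics

func projectWorldLocationToLeftEyeScreen(worldLocation: SIMD3<Float>,
                                         intrinsics: simd_float3x3,
                                         extrinsics: simd_float4x4,
                                         worldFromDevice: simd_float4x4,
                                         viewportSize: (width: Int, height: Int)) -> CGPoint? {
    // world -> camera: invert (worldFromDevice * deviceFromCamera).
    let worldFromCamera = worldFromDevice * extrinsics
    let cameraFromWorld = worldFromCamera.inverse
    let p = cameraFromWorld * SIMD4<Float>(worldLocation, 1)

    // Convert ARKit camera axes (-Z forward, +Y up) to the pinhole convention (+Z forward, +Y down).
    let x = p.x
    let y = -p.y
    let z = -p.z
    guard z > 0 else { return nil }   // point is behind the camera

    // Pinhole projection: u = fx * x/z + cx, v = fy * y/z + cy (simd matrices are column-major).
    let fx = intrinsics[0][0], fy = intrinsics[1][1]
    let cx = intrinsics[2][0], cy = intrinsics[2][1]
    let u = fx * x / z + cx
    let v = fy * y / z + cy

    guard u >= 0, u < Float(viewportSize.width),
          v >= 0, v < Float(viewportSize.height) else { return nil }
    return CGPoint(x: CGFloat(u), y: CGFloat(v))
}
If the projected points come out mirrored or offset, the direction of extrinsics and the y-axis flip are the first assumptions to revisit.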
We are encountering an issue with our app on Vision Pro with OS version 1.3. The app runs perfectly in the visionOS simulator, but when tested on the actual device, no content is displayed.
Here's the expected behavior: when the app launches, a video should play in a window. Once the video ends, another information window should open. After a series of these information windows, we load into an immersive space to handle 3D elements.
We've set the "Preferred Default Scene Session Role" to "Window Application Session Role" in info.plist, but the issue persists.
Below is the code we're using. Any advice or suggestions would be greatly appreciated.
import SwiftUI
@main
struct myApp: App {
@StateObject var sharedData = SharedDataModel()
@State private var isFactoryEnabled = false
var body: some Scene {
WindowGroup(id: "LaunchScreen", content: {
LaunchScreen()
})
.windowStyle(.plain)
.environmentObject(sharedData)
WindowGroup(id: "LoginView", content: {
ZStack {
let _ = UserDefaults.standard.set(false, forKey: "_UIConstraintBasedLayoutLogUnsatisfiable")
let _ = print(FileManager.default.urls(for: .documentDirectory, in: .userDomainMask).first!.path)
LoginView()
}
}).windowStyle(.plain)
.environmentObject(sharedData)
WindowGroup(id: "TrainingSelection", content: {
if !sharedData.showNavigationHintView{
NavigationHintView()
.glassBackgroundEffect()
.cornerRadius(30)
}
else {
TrainingSelection()
}
}).windowStyle(.plain)
.environmentObject(sharedData)
WindowGroup(id: "Salutations", content: {
Salutations()
}).windowStyle(.plain)
.environmentObject(sharedData)
WindowGroup {
ContentView()
}
.environmentObject(sharedData)
ImmersiveSpace(id: "myImmersiveSpace") {
ImmersiveView(viewModel: .init())
}
.environmentObject(sharedData)
}
}
import SwiftUI
import AVFoundation
import RealityKit
import RealityKitContent
struct LaunchScreen: View {
@State private var player: AVPlayer?
@State private var navigateToContentView = false
@EnvironmentObject var audioPlayer: AudioPlayer
var body: some View {
ZStack {
ZStack {
if navigateToContentView {
WarningView()
.transition(.opacity)
.glassBackgroundEffect()
.cornerRadius(15)
} else {
if let player = player {
AVPlayerView(player: player)
.onAppear {
player.play()
addObserver()
}
.cornerRadius(30)
} else {
Text("Unable to Load the Video")
.foregroundColor(.white)
.onAppear {
loadVideo()
}
}
}
}
.edgesIgnoringSafeArea(.all)
.animation(.easeIn, value: 1)
}
.glassBackgroundEffect()
}
private func loadVideo() {
if let videoUrl = Bundle.main.url(forResource: "launchScreen", withExtension: "mp4") {
player = AVPlayer(url: videoUrl)
} else {
print("Unable to Load the Video")
}
}
private func addObserver() {
NotificationCenter.default.addObserver(
forName: .AVPlayerItemDidPlayToEndTime,
object: player?.currentItem,
queue: .main
) { _ in
self.navigateToContentView = true
}
}
}
Hello, I want to ask the visionOS devs inside Apple whether it is possible to extend or disable (toggle) the play space boundary, which is currently 1.5 meters / 10 feet. It is really a shame that with such a great display and computing power we can't run any room-scale VR. I'm currently working on an undergrad thesis for which I chose the AVP, but I didn't know about this boundary until I had built my room in Unity and put it on the device. Is it possible to cut us some slack regarding the boundary? Many thanks.
As the title says, I want to ask how to use this API: CameraFrameProvider.
I get the warning: Cannot find 'CameraFrameProvider' in scope.
Xcode 16.0 beta 4
imported ARKit
imported Vision
When I use RoomPlan, I notice performance issues in larger rooms or rooms with a lot of furniture. Is there a way to configure RoomPlan to focus only on detecting surfaces (windows, door openings, and walls) during scanning, possibly through an argument or setting? Filtering afterward is an option, but it doesn't address the slowdown during the scan.
Hi,
I was wondering, while developing for visionOS, why queryDeviceAnchor() with a WorldTrackingProvider(), called in the update(context: SceneUpdateContext) function after opening the immersive space, initially provides DeviceAnchor data every frame but stops at some point (about 5-10 seconds after pressing the button that opens the immersive space). After that it only updates sporadically, when I move my head abruptly to the left or right. The tracking doesn't seem to work as it should directly on the AVP device.
Any help would be greatly appreciated!
See my code down below:
ContentView.swift
import SwiftUI
struct ContentView: View {
@Environment(\.openImmersiveSpace) private var openImmersiveSpace
@Environment(\.scenePhase) private var scenePhase
var body: some View {
VStack {
Text("Head Tracking Prototype")
.font(.largeTitle)
Button("Start Head Tracking") {
Task {
await openImmersiveSpace(id: "appSpace")
}
}
}
.onChange(of: scenePhase) {_, newScenePhase in
switch newScenePhase {
case .active:
print("...")
case .inactive:
print("...")
case .background:
break
@unknown default:
print("...")
}
}
}
}
HeadTrackingApp.swift
import SwiftUI
@main
struct HeadTrackingApp: App {
init() {
HeadTrackingSystem.registerSystem()
}
var body: some Scene {
WindowGroup {
ContentView()
}
ImmersiveSpace(id: "appSpace") {
}
}
}
HeadTrackingSystem.swift
import SwiftUI
import ARKit
import RealityKit
class HeadTrackingSystem: System {
let arKitSession = ARKitSession()
let worldTrackingProvider = WorldTrackingProvider()
required public init(scene: RealityKit.Scene) {
setUpSession()
}
func setUpSession() {
Task {
do {
try await arKitSession.run([worldTrackingProvider])
} catch {
print("Error: \(error)")
}
}
}
public func update(context: SceneUpdateContext) {
guard worldTrackingProvider.state == .running else { return }
// queryDeviceAnchor can return nil, so avoid force-unwrapping the result.
guard let avp = worldTrackingProvider.queryDeviceAnchor(atTimestamp: CACurrentMediaTime()) else { return }
print(avp)
}
}
Hey guys,
I was wondering if anyone could help me. I'm currently trying to run an ARKitSession() with a WorldTrackingProvider() that makes use of DeviceAnchor. In the simulator everything seems to work fine and the WorldTrackingProvider runs, but when I try to run the app on my AVP, the WorldTrackingProvider pauses after initialization. I'm new to Apple development and would be thankful for any helpful input!
Below my current code:
HeadTrackingApp.swift
import SwiftUI
@main
struct HeadTrackingApp: App {
init() {
HeadTrackingSystem.registerSystem()
}
var body: some Scene {
WindowGroup {
ContentView()
}
}
}
ContentView.swift
import SwiftUI
struct ContentView: View {
var body: some View {
VStack {
Text("Head Tracking Prototype")
.font(.largeTitle)
}
}
}
HeadTrackingSystem.swift
import SwiftUI
import ARKit
import RealityKit
class HeadTrackingSystem: System {
let arKitSession = ARKitSession()
let worldTrackingProvider = WorldTrackingProvider()
var avp: DeviceAnchor?
required public init(scene: RealityKit.Scene) {
setUpSession()
}
func setUpSession() {
Task {
do {
print("Starting ARKit session...")
try await arKitSession.run([worldTrackingProvider])
print("Initial World Tracking Provider State: \(worldTrackingProvider.state)")
self.avp = worldTrackingProvider.queryDeviceAnchor(atTimestamp: CACurrentMediaTime())
if let avp = getAVPPositionOrientation() {
print("AVP data: \(avp)")
} else {
print("No AVP position and orientation available.")
}
} catch {
print("Error: \(error)")
}
}
}
func getAVPPositionOrientation() -> DeviceAnchor? {
return avp
}
}
I am running a modified RoomPlan app in my test environment, and I get two ARSessions active, sometimes more. It appears that the first one is created by SceneKit, because it is related to ARSCNView. Who controls that session, and what gets processed through it? I notice that I get a lot of session interruptions from sensor failures while doing world tracking, and the first one happens almost immediately.
Once the room-capture delegates fire, I start getting images via a second session that is collecting images. How do I tell, on the fly, which session is the SceneKit session and which is the RoomCapture session when a callback comes through the delegate? Is there something in the object descriptor I can use as a differentiator? Relying on the address of the ARSession object being different only works if you get your timing right. It wasn't clear from any of the documentation that there would be two or more ARSessions delivering data through the delegates. The books on ARKit are not much help in determining how responsibilities are partitioned between the two origins, and the developer documentation doesn't clearly delineate which data arrives through which delegate. Can someone give me some guidance here? Are there sources with clear documentation of what is delivered via which delegate for the various interfaces?
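In the meantime, a small sketch of one way to tell the callbacks apart (assuming the RoomCaptureView and ARSCNView instances are reachable from the delegate owner; roomCaptureView and arSCNView below are assumed properties): compare the session's object identity against the sessions those views own, rather than relying on buffer addresses.
func session(_ session: ARSession, didUpdate frame: ARFrame) {
    if session === roomCaptureView.captureSession.arSession {
        // Frame came from the RoomPlan capture session.
    } else if session === arSCNView.session {
        // Frame came from the ARSCNView / SceneKit-owned session.
    }
}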
It's my understanding that to use the CameraFrameProvider, which provides access to the Apple Vision Pro's front-facing camera feed, the enterprise main camera access entitlement "com.apple.developer.arkit.main-camera-access.allow" is required.
Is there a way to prototype apps that use the CameraFrameProvider on an Apple Vision Pro with developer mode enabled, without having the "com.apple.developer.arkit.main-camera-access.allow" entitlement?
When using image tracking in visionOS 2 beta, I add an AVPlayer to play an MP4 file when a certain picture is tracked. I never get a removed event in my "for await update in imageInfo.anchorUpdates" loop, so I can't stop or remove the player when the image disappears.
I then used the updated event and checked "if anchor.isTracked" to remove or re-add the player, and that worked.
Now, if I don't move my head and simply show or hide the picture, it works as expected. But if the picture stays put and I move my head away instead, I don't get any updated events, and the player keeps playing even though I can't see it. No updated event and no removed event for me.
Is this a bug?
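For reference, a short sketch of the workaround described above (the player(for:) helper is a placeholder for however the player is associated with an anchor, and imageTrackingProvider is the provider being iterated):
for await update in imageTrackingProvider.anchorUpdates {
    let anchor = update.anchor
    switch update.event {
    case .added:
        player(for: anchor)?.play()
    case .updated:
        // .removed often never arrives when the image merely leaves view,
        // so isTracked is used as the signal instead.
        if anchor.isTracked {
            player(for: anchor)?.play()
        } else {
            player(for: anchor)?.pause()
        }
    case .removed:
        player(for: anchor)?.pause()
    }
}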
Hello,
I am trying to use the new Enterprise API to capture main camera frames using the CameraFrameProvider. So far I have not been able to make it work. I followed the sample code provided in this thread (literally copy-pasted it): https://forums.developer.apple.com/forums/thread/758364.
When I run the application on the Vision Pro, no frame is captured. I get a message in Xcode's console that no entitlement is found. However, the entitlement is created and the license file is in the project. Besides, all authorization keys are added to the plist file.
What am I missing? How can I tell whether the license file is wrong?
Thank you.
ARKit to capture data
What we want to do: use ARKit to capture data (pictures) around an object. Is there a way to:
Increase the default number of pictures captured (120) to a higher number without increasing the time required to capture the data? We managed to increase the number of pictures to 1000, but the capture now lasts 20 minutes, which is too long. Is there a way to capture a video instead of pictures?
Capture IMU data: how can we use ARKit to capture IMU data around an object?
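On the IMU question, a sketch of one common approach (an assumption about the setup, since ARKit itself does not expose raw IMU samples): run CoreMotion alongside the ARSession and record device-motion samples while frames are being captured; video-style capture can likewise be done by throttling ARFrame.capturedImage into an AVAssetWriter instead of saving individual photos.
import ARKit
import CoreMotion

final class CaptureRecorder: NSObject, ARSessionDelegate {
    private let motionManager = CMMotionManager()
    private(set) var motionSamples: [CMDeviceMotion] = []

    func startIMU() {
        guard motionManager.isDeviceMotionAvailable else { return }
        motionManager.deviceMotionUpdateInterval = 1.0 / 100.0   // ~100 Hz
        motionManager.startDeviceMotionUpdates(to: .main) { [weak self] motion, _ in
            guard let motion else { return }
            // Attitude, rotation rate and user acceleration for each sample.
            self?.motionSamples.append(motion)
        }
    }

    func stopIMU() {
        motionManager.stopDeviceMotionUpdates()
    }

    // Camera frames arrive here at the session's native rate; throttle or encode as needed.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        _ = frame.capturedImage   // CVPixelBuffer; feed into an AVAssetWriter for video capture
    }
}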