In WWDC24, visionOS hand tracking has a new function that can make an entity track the hand faster (but at the expense of a certain degree of accuracy), and the video only explains how to implement ARKit, so please ask how to implement the anchorEntiy in the reality view.
General
RSS for tagDiscuss Spatial Computing on Apple Platforms.
Post
Replies
Boosts
Views
Activity
There is a flickering occurring on 3D assets when switching immersive spaces, which is not the nicest user experience. The flickering does occur either when loading the scenes directly from the RealityKitContent package, or from memory (pre-loaded assets).
Since we cannot upload a video illustrating the undesirable behaviour, I have to describe how to setup the project for you to observe it.
To replicate the issue, follow these steps:
Create a new visionOS app using Xcode template, see image.
Configure the project to launch directly into an immersive space (set Preferred Default Scene Session Role to Immersive Space Application Session Role in Info.plist), see image.
Replace all swift files with those you will find in the attached texts.
In the RealityKitContent package, create a scene named YellowSpheres as illustrated below.
In the RealityKitContent package, create a scene named RedSpheres as illustrated below.
Launch the app in debug mode via Xcode and onto the AVP device or simulator
Continuously switch immersive spaces by pressing on buttons Show RedSpheres and Show YellowSpheres.
Observe the 3d assets flicker upon opening of the immersive spaces.
AppModel
RedSpheresImmersiveView
YellowSpheresImmersiveView
TestFlickeringBetweenImmersiveSpacesApp
While trying to control the following two scenes in 1 ImmersiveSpace, we found the following memory leak when we background the app while a stereoscopic video is playing.
ImmersiveView's two scenes:
Scene 1 has 1 toggle button
Scene 2 has same toggle button with a 180 degree skysphere playing a stereoscopic video
Attached are the files and images of the memory leak as captured in Xcode.
To replicate this memory leak, follow these steps:
Create a new visionOS app using Xcode template as illustrated below.
Configure the project to launch directly into an immersive space (set Preferred Default Scene Session Role to Immersive Space Application Session Role in Info.plist.
Replace all swift files with those you will find in the attached texts.
In ImmersiveView, replace the stereoscopic video to play with a large 3d 180 degree video of your own bundled in your project.
Launch the app in debug mode via Xcode and onto the AVP device or simulator
Display the memory use by pressing on keys command+7 and selecting Memory in order to view the live memory graph
Press on the first immersive space's button "Open ImmersiveView"
Press on the second immersive space's button "Show Immersive Video"
Background the app
When the app tray appears, foreground the app by selecting it
The first immersive space should appear
Repeat steps 7, 8, 9, and 10 multiple times
Observe the memory use going up, the graph should look similar to the below illustration.
In ImmersiveView, upon backgrounding the app, I do:
a reset method to clear the video's memory
dismiss of the Immersive Space containing the video (even though upon execution, visionOS raises the purple warning "Unable to dismiss an Immersive Space since none is opened". It appears visionOS dismisses any ImmersiveSpace upon backgrounding, which makes sense..)
Am I not releasing the memory correctly?
Or, is there really a memory leak issue in either SwiftUI's ImmersiveSpace or in AVFoundation's AVPlayer upon background of an app?
App file TestVideoLeakOneImmersiveView
First ImmersiveSpace file InitialImmersiveView
Second ImmersiveSpace File ImmersiveView
Skysphere Model File Immersive180VideoViewModel
File AppModel
I have two Apple Vision Pros so that I can make and test a multi-player immersive reality game. But I am one developer, so I need to be able to take one Apple Vision Pro off, and put the other one on to see what the other device is seeing, and to ensure my game information is correctlly being sent over the network with multipeer connectivity.
But when I take one off, the Apple Vision Pro immediately goes to sleep. With Apple Vision Pro OS 1, I could put a piece of paper into the pro and they would stay on for hours, and I could take them on and off and debug my game. But now with VisionOS 2, even with the paper they soon go to sleep.
Is there a setting I can change or override as a developer to stop this auto sleep or auto lock?
I need to check things like:
when two devices are on the network, can I see them both so that players can select each other from a menu?
can i send object positions back and forth
Thank you.
Does anyone have experience of creating their own EntityActions?
Say for example I wanted one that faded up the opacity of an entity, then once it had completed set another property on one of the entity's components.
I understand that I could use the FromToByAction to control the opacity (and have this working), but I am interested to learn how to create my own dedicated EntityAction, and finding the documentation hard to fathom.
I got as far as creating a struct conforming to EntityAction protocol:
var animatedValueType: (any AnimatableData.Type)?
}
Subscribing to update events on this:
FadeUpAction.subscribe(to: .updated) { event in
guard let animationState = event.animationState else {
return
}
// My animation state is always nil, so I never get here!
let newValue = \\\Some Calc...
animationState.storeAnimatedValue(newValue)
}
And setting it up as an animation on an entity:
let action = FadeUpAction()
if let animation = try? AnimationResource.makeActionAnimation(
for:action,
duration: 2.0,
bindTarget: .opacity
) {
entity.playAnimation(animation)
}
...but haven't been able to understand how to extract the current timeDelta or set the value in the event handler.
Any pointers?
My name is Tom Shannon, a developer with Omnia (d.b.a Aequilibrium Inc.). We were recently approved for some of the Enterprise APIs for the Vision Pro.
You can reference the history through our Case-ID: 9237594
We are contacting you for assistance as we have downloaded the entitlement license provided and added it to our target for an application under the bundle id: com.omnia.spatialbrowser
Then under my project and with my developer account, which is under the Aequilibrium Inc. account (279PV9XKZ2), we tried to add the Barcode Scanner Enterprise API entitlement, but this does not show up as an option for us.
I am on XCode 16.1 beta (16B5001e) for reference! Any help would be greatly appreciated.
Best,
I have an app with a visionOS target, and I want to add an iOS target. Both are based on RealityKit.
I want to use a SpatialTapGesture to get the tap coordinate local to the entity tapped.
In visionOS this is easy:
SpatialTapGesture(coordinateSpace: .local)
.targetedToAnyEntity()
.onEnded { tap in
let entity = tap.entity
let localPoint3D = tap.convert(tap.location3D, from: .local, to: entity)
// …
}
However, according to the docs, the convert function seems to exist only in visionOS, not in iOS.
So how can I do this conversion in iOS?
PS: This was already posted on StackOverflow without success. There, I tried to find a workaround, but I failed.
Hi, I have a video player app that lost its audio spatialization since the VisionOS 2 update. I am using the VideoPlayerComponent (https://developer.apple.com/documentation/realitykit/videoplayercomponent), to implement my videos as entities, as I want a custom look and controls to my player.
In VisionOS 1, there was automatic audio spatialization. Depending where my video entity is, the app automatically enables head tracking audio spatialization. Since VisionOS 2 however, I cannot get my video entities to play Spatial Audio. I've looked into DestinationVideo and even set up AVAudioSessionSpatialExperience but Spatial Audio is still not working.
Appreciate any help. Thanks.
Hi,
I currently have Enterprise API access and have observed that the main camera API only provides RGB data. I am trying to access point cloud information from LIDAR, but it seems ARKit doesn't offer this directly via the standard APIs that iPad uses.
I wanted to ask if there are any possible options to access depth data or enhanced camera capabilities using the Enterprise API.
Specifically:
Does having Enterprise API access unlock any additional camera-related APIs in AVFoundation that could provide depth information or more advanced control over the camera?
Are there any workarounds or alternative methods to obtain depth data from the camera?
Hi,
When opening an ImmersiveSpace with the .mixed style, is it possible to keep the user's current selected system immersive environment?
Currently, the system immersive environment will be dismissed.
ImmersiveSpace(id: "some id") {
SomeRealityView()
}
.immersionStyle(selection: .constant(.mixed), in: .mixed)
Hi,
One of the great features introduced in WWDC24 is the Translation API. But unfortunately it's currently unavailable on visionOS.
My question is, does Apple have any plan to support it on visionOS as well? If so, what's the ETA for this feature?
I would really like to see it on visionOS, otherwise I'll have to pay Google to use their translation API.
I'm setting:
.immersionStyle(selection: .constant(.progressive(0.1...1.0, initialAmount: 0.1)), in: .progressive(0.1...1.0, initialAmount: 0.1))
In UnityVisionOSSettings.swift before build out in Xcode.
I'm having an issue where this only works on occasion. Seems random. I'll either get no immersion level available (crown dial is greyed out and no changes can be made) or it will only allow 0.5 - 1.0 immersion (dial will go below 0.5 but springs back to 0.5 when released).
With no changes to my setup or how I'm setting immersionStyle I've been able to get this to work as I would expect. Wondering if there is some bug that would be causing this to fail. I've tested a simple NativeSDK progressive immersion style with same code for custom setting and it works everytime, so it's something related to Unity.
Here is the entire UnityVisionOSSettings that, from as far as I can tell, are controlling this:
`// GENERATED BY BUILD
import Foundation
import SwiftUI
import PolySpatialRealityKit
import UnityFramework
let unityStartInBatchMode = false
extension UnityPolySpatialApp {
func initialWindowName() -> String { return "Unbounded" }
func getAllAvailableWindows() -> [String] { return ["Bounded-0.500x0.500x0.500", "Unbounded"] }
func getAvailableWindowsForMatch() -> [simd_float3] { return [] }
func displayProviderParameters() -> DisplayProviderParameters { return .init(
framebufferWidth: 1830,
framebufferHeight: 1600,
leftEyePose: .init(position: .init(x: 0, y: 0, z: 0),
rotation: .init(x: 0, y: 0, z: 0, w: 1)),
rightEyePose: .init(position: .init(x: 0, y: 0, z: 0),
rotation: .init(x: 0, y: 0, z: 0, w: 1)),
leftProjectionHalfAngles: .init(left: -1, right: 1, top: 1, bottom: -1),
rightProjectionHalfAngles: .init(left: -1, right: 1, top: 1, bottom: -1)
)
}
@SceneBuilder
var mainScenePart0: some Scene {
ImmersiveSpace(id: "Unbounded", for: UUID.self) { uuid in
PolySpatialContentViewWrapper(minSize: .init(1.000, 1.000, 1.000), maxSize: .init(1.000, 1.000, 1.000))
.environment(\.pslWindow, PolySpatialWindow(uuid.wrappedValue, "Unbounded", .init(1.000, 1.000, 1.000)))
.onImmersionChange() { oldContext, newContext in
PolySpatialWindowManagerAccess.onImmersionChange(oldContext.amount, newContext.amount)
}
KeyboardTextField().frame(width: 0, height: 0).modifier(LifeCycleHandlerModifier())
} defaultValue: { UUID() } .upperLimbVisibility(.automatic)
.immersionStyle(selection: .constant(.progressive(0.1...1.0, initialAmount: 0.1)), in: .progressive(0.1...1.0, initialAmount: 0.1))
WindowGroup(id: "Bounded-0.500x0.500x0.500", for: UUID.self) { uuid in
PolySpatialContentViewWrapper(minSize: .init(0.100, 0.100, 0.100), maxSize: .init(0.500, 0.500, 0.500))
.environment(\.pslWindow, PolySpatialWindow(uuid.wrappedValue, "Bounded-0.500x0.500x0.500", .init(0.500, 0.500, 0.500)))
KeyboardTextField().frame(width: 0, height: 0).modifier(LifeCycleHandlerModifier())
} defaultValue: { UUID() } .windowStyle(.volumetric).defaultSize(width: 0.500, height: 0.500, depth: 0.500, in: .meters).windowResizability(.contentSize) .upperLimbVisibility(.automatic) .volumeWorldAlignment(.gravityAligned)
}
@SceneBuilder
var mainScene: some Scene {
mainScenePart0
}
struct LifeCycleHandlerModifier: ViewModifier {
func body(content: Content) -> some View {
content
.onOpenURL(perform: { url in
UnityLibrary.instance?.setAbsoluteUrl(url.absoluteString)
})
}
}
}`
Hi,
My app allows users to share and view spatial photos.
For viewing spatial photos, I'm using a plane in a RealityView that has a camera index switch material node, which takes the stereo images as the inputs.
For sharing native spatial photos taken on the vision pro, prior to visionOS 2.0, I extract the stereo image pair and merge them into a single side-by-side image to upload to the app's backend.
However, since visionOS 2.0 introduced generating spatial photos from normal photos, I've been seeing some unexpected behaviours in my app, while on the other hand, they can be viewed correctly in the system Photos app:
Sometimes the extracted images have different size, the right image is smaller than the left image. See the first image in the google drive below, taken with iPhone 15 Pro.
Even if the image pair have the same size, when viewed in my app, it has some artefacts, especially around the edge of objects which are closer to the camera. See the second image in the google drive below, taken with iPhone 11.
Google drive link here:
https://drive.google.com/drive/folders/1UTfpxvO3-ChqshwfyzY5E_KCgk8VgUaa
I know that now Quicklook preview application can support viewing spatial photos now, but I would like to keep it the way I implemented in the app, for compatibility concerns.
Below is a code snippet that deals with the extraction. Please point out the correct way to extract stereo image pair from a generated spatial photo.
Happy to submit a code-level support request if more information is needed.
// the data is from photos picker item
let data = try await photo.loadTransferable(type: Data.self)
let source = CGImageSourceCreateWithData(data as CFData, nil)
let sbsImage = source.extractSpatialPhoto()
extension CGImageSource {
func extractSpatialPhoto() -> UIImage? {
guard let leftCGImage = extractSpatialImage(at: 0),
let rightCGImage = extractSpatialImage(at: 1)
else {
return nil
}
let leftImage = UIImage(ciImage: leftCGImage)
let rightImage = UIImage(ciImage: rightCGImage)
guard leftImage.size == rightImage.size else {
return nil
}
// merge left + right
let size = CGSize(width: leftImage.size.width * 2, height: leftImage.size.height)
UIGraphicsBeginImageContextWithOptions(size, true, 1.0)
leftImage.draw(in: CGRect(x: 0, y: 0, width: leftImage.size.width, height: leftImage.size.height))
rightImage.draw(in: CGRect(x: leftImage.size.width, y: 0, width: rightImage.size.width, height: rightImage.size.height))
let mergedImage = UIGraphicsGetImageFromCurrentImageContext()
UIGraphicsEndImageContext()
return mergedImage
}
// not sure if this actually works
func extractSpatialImage(at index: Int) -> CIImage? {
guard let cgImage = CGImageSourceCreateImageAtIndex(self, index, nil) else {
return nil
}
var ciImage = CIImage(cgImage: cgImage)
if let properties = CGImageSourceCopyPropertiesAtIndex(self, index, nil) as? [String: Any],
let heifDictionary = properties[kCGImagePropertyHEIFDictionary as String] as? [String: Any],
let extrinsics = heifDictionary[kIIOMetadata_CameraExtrinsicsKey as String] as? [String: Any],
let position = extrinsics[kIIOCameraExtrinsics_Position as String] as? [Double]
{
// Default baseline is 64mm (0 for left camera, 0.064m for right camera)
let standardBaseline = 0.064
// Check if it's the right image (should be at [0.064, 0, 0])
let isRightImage = (index == 1)
let expectedPosition = isRightImage ? standardBaseline : 0.0
// Calculate the translation needed to align to standard baseline
let positionDelta = position[0] - expectedPosition
// Apply translation only if there's a mismatch in position
if positionDelta != 0 {
let transform = CGAffineTransform(translationX: CGFloat(positionDelta), y: 0)
ciImage = ciImage.transformed(by: transform)
}
}
return ciImage
}
}
I'm working on a school project that allows users to open a .USDZ file (using Quick Look) on the webpage while using Apple Vision Pro to put the object in their physical envirnment, the project is deployed on Vercel. I'm testing the page with my apple vision pro, when I click open the .USDZ file, I'm seeing a triangle with an exclamation mark while it's trying to load, but it won't load. Does anybody know how to troubleshoot this issue?
Hello,
I'm currently working on a project that requires real-world object recognition and labeling. I understand that due to the security and privacy issues, we are unable to access the vision pro camera feed. Is there any other external way to solve this problem?
Thank you!
I have been implementing the LongPressGesture to have a menu come up upon a long press. I love the functionality and it is very close to being where I want it to be.
I don't know if this is a visionOS-specific thing, but I am hoping to control the corner radius of the pulled-out element behind my "button." I've wrangled hover effects in the past with overlays, but I'm not sure what to target in this case.
Worst case, I'll have to change the border radius on all of my tiles to match this LongPressGesture-controlled behavior, or I could possibly change the radius onLongPressGesture to match.
Is there a simpler solution? Thanks!
How to detect whether two entities collide in RealityView and execute some instructions after the collision. It is assumed that both entities have collision components, and one of them also has an anchor component (such as a binding hand).
How to make a specified entity in RealityView be captured by users:
This entity has physical and collision components, and the user will not change when he does not grasp the action. However, when the user makes a grab hand gesture and is very close to the entity (there can be a small deviation), an Anchor component will be enabled to bind the entity to the hand, but when the user lets go, he will fall along the y-axis of the current position (affected by the physical component).
I hope you can help me. Thank you.
Hi,
Following the wwdc24 video - What’s new in Quick Look for visionOS, I've managed to open a 3D model using PreviewApplication by calling
let previewItem = PreviewItem(url: modelURL, displayName: "Easter", editingMode: .disabled)
_ = PreviewApplication.open(items: [previewItem])
However, the "Save to Downloads" option is aways there(see attached screenshot). As the models are user generated content, and I don't want the download option to be available to all users. How to disable it?
I am having trouble with initializing the SharePlay. It works but we have to leave the game (click the close button) and rejoin it, sometimes several times, for it to establish the connection.
I am also having trouble sharing images over SharePlay with GroupSessionJournal. I am not able to get it to transfer any amount of data or even get recognition on the other participants in the SharePlay that an image is being sent. We have look at all the information we can find online and are not able to establish a connection. I am not sure if I am missing a step, or if I am incorrectly sending the data through the GroupSessionJournal.
Here are the steps I took take to replicate the issue I have:
FaceTime another person with the app.
Open the app and click the SharePlay button to SharePlay it with the other person.
Establish the SharePlay and by making sure that the board states are syncronized across participants. If its not click the close button and click open app again to rejoin the SharePlay. (This is one of the bugs that I would like to fix. This is just a work around we developed to establish the SharePlay. We would like it so that when you click SharePLay and they join the session it works.)
Once the SharePlay has been established, change the image by clicking change 1 image.
Select a jpg image.
The image that represents 1 should be not set. If you dont see the image click on any of the X in the squares and it will change to the image.
The image should appear on the other participant in the SharePlay. (This does not happen and is what we have not been able to figure out how to get working.)
Here are the classes for the example project I created:
Content View
Game Model Class
Activity Manager
Main Starter Class