Evolve your ARKit app for spatial experiences
Discover how you can bring your app's AR experience to visionOS. Learn how ARKit and RealityKit have evolved for spatial computing: We'll highlight conceptual and API changes for those coming from iPadOS and iOS and guide you to sessions with more details to help you bring your AR experience to this platform.
♪ Mellow instrumental hip-hop ♪
Omid Khalili: Hello! My name is Omid.
Oliver and I are engineers on the ARKit team, and we are thrilled to review the concepts -- some familiar and some new -- that you'll need to know about when bringing your iOS AR app to our new platform.
ARKit was introduced on iOS in 2017 and with it, we introduced three key concepts to building augmented reality applications.
With world tracking, ARKit is able to track your device's position in the world with six degrees of freedom.
This allows anchoring virtual content with a position and orientation to the real world.
Scene understanding provides insight about the real world around you.
Using the provided geometry and semantic knowledge, your content can be intelligently placed and realistically interact with surroundings.
Finally, rendering engines can correctly register and composite your virtual content over captured images utilizing camera transforms and intrinsics provided by ARKit.
Initially, we started with a SceneKit view to use ARKit's camera transforms and render 3D content on iOS.
We then introduced RealityKit, laying out the foundation for an engine capable of highly realistic physically-based rendering and accurate object simulation with your surroundings.
To enable spatial computing, ARKit and RealityKit have matured and are deeply integrated into the operating system.
For example, ARKit's tracking and Scene Understanding are now running as system services, backing everything from window placement to spatial audio.
The system takes on responsibilities that used to belong to applications.
Camera pass-through and matting of the user's hands are now built-in, so your application gets these capabilities for free.
Another built-in capability is that ARKit world maps are continuously persisted by a system service, so your application doesn't need to do it anymore.
We believe that this will free you up to focus on building the best application and content possible for this platform.
Here is an example demonstrating these capabilities, along with new ones introduced with this platform.
For example, ARKit now provides hand tracking to your app, which allows people to reach out and directly interact with virtual content that can then interact with their surroundings.
In order to take advantage of all the new capabilities and immersive experiences this new platform offers, you'll need to update your iOS ARKit-based experience.
This is a great opportunity to reimagine your app and AR experience for spatial computing.
As a part of this transition, you'll be using familiar concepts that we've introduced with ARKit and RealityKit.
We're going to cover how these concepts have carried over, how they've evolved, and how you can take advantage of them.
Let's get started! First, we'll explore some new ways you can present your app for spatial computing, and introduce new content tools available to you.
Next, we will talk about RealityKit, which is the engine to use to render and interact with your content.
We'll see how RealityView lets your app leverage spatial computing similar to ARView on iOS.
Then, we'll talk about the different ways your app can bring content into people's surroundings.
Raycasting is something many iOS applications use to place content.
We'll show an example of how to combine ARKit data and RealityKit to enable raycasting for spatial computing.
And finally, we'll review the updates to ARKit and see the new ways to utilize familiar concepts from iOS.
Let's get into preparing to migrate your experience for spatial computing.
Spatial computing allows you to take your iOS AR experience and expand it beyond the window.
This platform offers new ways to present your application that you'll want to consider as you bring your iOS experience over.
Here is an example from our Hello World sample app.
You can now display UI, including windows and three-dimensional content, anywhere around you.
By default, applications on this platform launch into the Shared Space.
The Shared Space is where apps exist side by side, much like multiple apps on a Mac desktop.
Inside a Shared Space, your app can open one or more windows to display content.
Additionally, your app can create a three-dimensional volume.
For example, now you can show a list of available board games in one window, the rules in another, and open the selected game in its own volume.
The game can be played while keeping a Safari window open to read up on winning strategies.
The content you add to the window and volume stays contained within its bounds to allow sharing the space with other applications.
In some cases, you may want your app to have more control over the level of immersion in your experience -- maybe to play a game that interacts with your room.
For this, your app can open a dedicated Full Space in which only your app's windows, volumes, and 3D objects appear.
Once in a Full Space, your application has access to more features.
Using RealityKit's anchor entities, you can target and attach objects to the surroundings like tables, floors, and even parts of your hands like the palm or wrist.
Anchor entities work without requiring user permission.
ARKit data is something else your app can only access in a Full Space.
With permission, ARKit will provide data about the real-world surfaces, scene geometry, and skeletal hand tracking, expanding your app's ability for realistic physics and natural interactions.
Windows, volumes, and spaces are all SwiftUI scene types.
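As a rough sketch, an app might declare all three scene types like this; the scene identifiers and the "BoardGame" asset are hypothetical:

```swift
import SwiftUI
import RealityKit

@main
struct BoardGameApp: App {
    var body: some Scene {
        // A regular 2D window that lives in the Shared Space.
        WindowGroup(id: "gameList") {
            Text("Available games")
        }

        // A volume: bounded 3D content that sits alongside other apps.
        WindowGroup(id: "gameBoard") {
            Model3D(named: "BoardGame") // hypothetical USD asset
        }
        .windowStyle(.volumetric)

        // A Full Space in which only this app's content appears.
        ImmersiveSpace(id: "gameRoom") {
            RealityView { _ in
                // Entities anchored to the surroundings go here.
            }
        }
    }
}
```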
There's so much more for you to learn about these.
For starters, you can go to the session mentioned here.
Next, let's review the main steps needed to prepare your content for bringing it to spatial computing.
Memorable AR experiences on iOS begin with great 3D content; the same is true for spatial experiences on this platform.
And when it comes to 3D content, it's great to rely on an open standard like Universal Scene Description, or USD for short.
USD is production proven, and scales from creators making single assets to large studios working on AAA games and films.
Apple was an early adopter of USD, adding it to our platforms in 2017 and growing support since.
Today, USD is at the heart of 3D content for spatial computing.
With USD assets ready, you can bring them into our new developer tool, Reality Composer Pro, to compose, edit, and preview your 3D content.
If you're using CustomMaterials for your 3D content on iOS, you will need to rebuild them using Reality Composer Pro's Shader Graph.
You also have the ability to edit your RealityKit components directly through the UI.
And finally, you can import your Reality Composer Pro project directly into Xcode, allowing you to easily bundle all of your USD assets, materials, and custom components into your Xcode project.
We have some great sessions to help you learn more about Reality Composer Pro and how to build your own custom materials for spatial computing.
Now that we've seen the different ways to present your application, let's learn more about the features RealityView offers as you bring your experience over.
We just saw how spatial computing allows apps to display content in your space.
One of the key differences coming from iOS is how different elements can be presented side by side.
Notice how your 3D content and 2D elements can appear and work alongside each other.
Coming from iOS, you're going to use familiar frameworks to create each of these.
You'll use SwiftUI to build the best 2D UI and get system gestures events like the ones on iOS.
And you'll use RealityKit to render your 3D content for spatial experiences.
The way to interface with both of these at the same time is through RealityView - a new SwiftUI view that we're introducing to cater to the unique needs of spatial computing.
RealityView truly bridges SwiftUI and RealityKit, allowing you to combine 2D and 3D elements and create a memorable spatial experience.
You'll be using RealityView to hold all the entities you wish to display and interact with.
You can get gesture events and connect them to the entities in your view to control them.
And with access to ARKit's scene understanding, you can enable realistic simulations with people's surroundings and even their hands using RealityKit's collision components.
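Here's a minimal sketch of a RealityView inside a SwiftUI view, asynchronously loading an entity into its content; the "Ship" asset name is hypothetical:

```swift
import SwiftUI
import RealityKit

struct ShipView: View {
    var body: some View {
        RealityView { content in
            // The make closure is asynchronous, so content can be loaded here.
            if let ship = try? await Entity(named: "Ship") {
                content.add(ship)
            }
        }
    }
}
```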
Before we look at how using RealityKit carries over from iOS, let's do a quick refresher on how to work with RealityKit's Entity Component System.
In the RealityKit Entity Component System, each entity is a container for 3D content.
Different components are added to an entity to define its look and behavior.
This can include a model component for how it should render; a collision component, for how it can collide with other entities; and many more.
You can use Reality Composer Pro to prepare RealityKit components like collision components and get them added to your entities.
Systems contain code to act on entities that have the required components.
For example, the system required for gesture support only operates on entities that have a CollisionComponent and InputTargetComponent.
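For instance, a short sketch of giving an entity the two components that the gesture system looks for:

```swift
import RealityKit

// A simple box entity that system gestures can target: it needs both a
// CollisionComponent and an InputTargetComponent.
let box = ModelEntity(mesh: .generateBox(size: 0.2))
box.components.set(CollisionComponent(shapes: [.generateBox(size: [0.2, 0.2, 0.2])]))
box.components.set(InputTargetComponent())
```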
A lot of the concepts used by RealityView for spatial computing carry over from those of ARView on iOS.
Let's see how these two stack up.
Both views are event-aware containers to hold the entities you wish to display in your app.
You can add Gesture Support to your views to enable selection and interaction with entities.
With SwiftUI for spatial computing, you can reach out to select or drag your entities.
Both ARView and RealityView provide a collection of your entities.
ARView uses a Scene for this.
RealityView has a Content to add your entities to.
You can add AnchorEntities to them, allowing you to anchor your content to the real world.
On both platforms, you create an entity to load your content model and an AnchorEntity to place it.
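Schematically, and assuming an existing `arView` and a previously created `tableAnchor`, the difference looks like this:

```swift
// iOS: ARView exposes a Scene that holds your anchors and entities.
arView.scene.addAnchor(tableAnchor)

// Spatial computing: RealityView hands you a content instance to add entities to.
RealityView { content in
    content.add(tableAnchor)
}
```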
One main difference between the platforms is in the behavior of anchor entities.
ARView on iOS uses an ARSession and your app must receive permission to run scene understanding algorithms necessary for anchor entities to work.
RealityView uses system services to enable anchor entities.
This means that spatial experiences can anchor content to your surroundings without requiring permissions.
Apps using this approach do not receive the underlying scene understanding data or transforms.
Not having transform data for your app to place content has some implications that Oliver will talk about later in his section.
As we've seen, there are many familiar concepts that carry over coming from iOS, but there are also new capabilities that RealityKit provides for spatial computing.
We've only scratched the surface of what's possible with RealityKit on this new platform, and you may want to check out the session below to follow up on more.
Now over to Oliver, who will talk more about RealityView and how to bring in your content from iOS.
Oliver Dunkley: Thanks, Omid! Let's continue by exploring the different ways you can bring in your existing content to spatial computing.
Let's start in the Shared Space.
We can add 3D content to a window or volume, and use system gestures to interact with it.
To display your assets, you just add them directly to RealityView's Content.
You do this by creating an entity to hold your model component and positioning it by setting its transform component.
You can also set up gesture support to modify the transform component.
Note that all entities added to the view's content exist in the same space relative to the space's origin and can therefore interact with one another.
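As a sketch of the gesture support mentioned above, a drag gesture can update the transform of whichever entity it targets (the entity needs the collision and input-target components described earlier):

```swift
import SwiftUI
import RealityKit

struct DraggableContentView: View {
    var body: some View {
        RealityView { content in
            // Entities with CollisionComponent and InputTargetComponent go here.
        }
        .gesture(
            DragGesture()
                .targetedToAnyEntity()
                .onChanged { value in
                    // Convert the gesture location into the entity's parent space
                    // and move the entity there.
                    guard let parent = value.entity.parent else { return }
                    value.entity.position = value.convert(value.location3D,
                                                          from: .local,
                                                          to: parent)
                }
        )
    }
}
```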
In the Shared Space, content cannot be anchored to your surroundings.
Let's consider our options if we transition our app to a Full Space.
One of the key differences coming from the Shared Space is that apps can now additionally anchor content to people's surroundings.
Anchoring your content here can happen in two ways.
Let's first look at using RealityKit's AnchorEntity to place content without requiring permission to use ARKit data in your app.
RealityKit's AnchorEntities allow you to specify a target for the system to find and automatically anchor your content to.
So for example, in order to place a 3D model on a table surface in front of you, you can use a RealityKit AnchorEntity with a target set to table.
Unlike on iOS, AnchorEntities can be used without having to prompt for user permission.
People's privacy is preserved by not sharing the underlying transforms of the AnchorEntity with your application.
Note: this implies children of different anchor entities are not aware of one another.
New to AnchorEntities, you can target hands, which opens up a whole new realm of interesting interaction opportunities.
For example, you could anchor content to a person's palm and have it follow their hands as they move them.
This is all done by the system, without telling your app where the person's hands actually are.
AnchorEntities provide a quick, privacy-friendly way for your app to anchor content to people's surroundings.
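As a sketch, both kinds of targets look like this; `shipEntity` and `toolEntity` stand in for content you've already loaded:

```swift
import RealityKit

// Anchor a model to the first suitable table surface the system finds.
let tableAnchor = AnchorEntity(.plane(.horizontal,
                                      classification: .table,
                                      minimumBounds: [0.3, 0.3]))
tableAnchor.addChild(shipEntity)

// Anchor content to the palm of the left hand; it follows the hand as it moves,
// without exposing the hand's transform to the app.
let palmAnchor = AnchorEntity(.hand(.left, location: .palm))
palmAnchor.addChild(toolEntity)
```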
Coming back to a Full Space, we can also leverage ARKit to incorporate system-level knowledge of people's surroundings.
This enables you to build your own custom placement logic.
Let's take a look at how this works.
Similar to iOS, your application receives anchor updates for scene understanding data.
You can integrate this anchor data into your app logic to achieve all sorts of amazing experiences.
For example, you could use the bounds of a plane to center and distribute your content onto.
Or, you could use planes and their classifications to find the corner of a room by looking for the intersection of two walls and a floor.
Once you've decided where to place your content, you add a world anchor for ARKit to track and use it to update your entity's transform component.
This not only allows your content to remain anchored to the real world, as the underlying world map is updated, but it also opens the door to anchor persistence, which we will explore shortly.
All the entities added to your space can interact with one another as well as with the surroundings.
This all works because scene understanding anchors are delivered with transforms relative to the space's origin.
User permission is required to use ARKit capabilities.
You just saw how integrating ARKit data into your app logic can enable more advanced features.
So far we have talked about letting your app place content.
Let's explore how we can let people guide placement.
On iOS, you can use raycasting to translate 2D input to a 3D position.
But with this new platform, we don't need this 2D-3D bridge anymore, as we can use hands to naturally interact with experiences directly.
Raycasting remains powerful; it lets people reach out beyond arm's length.
There are various ways to set up raycasting.
Fundamentally, you need to set up RealityKit's collision components to raycast against.
Collision components can also be created from ARKit's mesh anchors to raycast against people's surroundings.
Let's explore two examples of how to raycast for spatial computing: first using system gestures, and then using hand data.
After obtaining a position, we can place an ARKit worldAnchor to keep our content anchored.
Let's consider the following example.
Imagine our app revolves around placing inspirational 3D assets for modelers.
Maybe in this particular scenario, a person wants to use our app to place a virtual ship on their workbench for some modeling project.
Here is our workbench we want to place our ship on.
We'll start with an empty RealityView.
ARKit's scene understanding provides mesh anchors that we'll use to represent the surroundings.
They provide geometry and semantic information we can use.
Remember that meshes for scene reconstruction data are delivered as a series of chunks.
We'll create an entity to represent this mesh chunk, and we'll correctly place this entity in a Full Space using the mesh anchor's transform.
Our entity then needs a collision component to hit test against.
We'll use RealityKit's ShapeResource to generate a collision shape from the mesh anchor for our entity.
We'll then add our correctly placed entity which supports hit testing.
We'll build an entity and collision component for each mesh chunk we receive to represent all the surroundings.
As scene reconstruction is refined, we may get updates to meshes or have chunks removed.
We should be ready to update our entities on these changes as well.
We now have a collection of entities representing the surroundings.
All these entities have collision components and can support a raycast test.
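A sketch of how one of those mesh-chunk entities might be built from an ARKit mesh anchor:

```swift
import ARKit
import RealityKit

// Build an entity for one mesh chunk, positioned with the anchor's transform
// and carrying a collision shape generated from the anchor's geometry.
func makeMeshEntity(for meshAnchor: MeshAnchor) async throws -> Entity {
    let entity = Entity()
    entity.transform = Transform(matrix: meshAnchor.originFromAnchorTransform)

    let shape = try await ShapeResource.generateStaticMesh(from: meshAnchor)
    entity.components.set(CollisionComponent(shapes: [shape], isStatic: true))
    return entity
}
```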
Let's first explore raycasting using system gestures, and then continue the example using hand data.
We can raycast and get a position to place our ship using system gestures.
Gestures can only interact with entities that have both Collision and InputTarget components, so we add one to each of our mesh entities.
By adding a SpatialTapGesture to the RealityView, people can raycast by looking at entities and tapping.
This resulting event holds a position in world space representing the place people looked at when tapping.
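A sketch of that look-and-tap raycast; the mesh entities are assumed to already carry their collision and input-target components:

```swift
import SwiftUI
import RealityKit

struct PlacementView: View {
    var body: some View {
        RealityView { content in
            // Mesh entities are added here as scene reconstruction updates arrive.
        }
        .gesture(
            SpatialTapGesture()
                .targetedToAnyEntity()
                .onEnded { value in
                    // The tap location converted into the scene's coordinate space:
                    // the spot in the surroundings the person looked at when tapping.
                    let position = value.convert(value.location3D,
                                                 from: .local,
                                                 to: .scene)
                    print("Place content at \(position)")
                }
        )
    }
}
```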
Instead of using system gestures, we could also have used ARKit's hand anchors to build a ray.
Let's take a step back and explore this option.
To know where people point, we first need a representation of the person's hand.
ARKit's new hand anchors give us everything we need.
We can use finger joint information to build the origin and direction of the ray for our query.
Now that we have the origin and direction of our ray, we can do a raycast against the entities in our scene.
The resulting CollisionCastHit provides the entity that was hit, along with its position and a surface normal.
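A sketch of such a query; the ray origin and direction are assumed to come from ARKit hand-joint transforms, for example the index finger's knuckle and tip:

```swift
import RealityKit

// Cast a ray built from hand data against the collision shapes in the scene.
func raycastSurroundings(origin: SIMD3<Float>,
                         direction: SIMD3<Float>,
                         in scene: RealityKit.Scene) -> CollisionCastHit? {
    let hits = scene.raycast(origin: origin,
                             direction: direction,
                             length: 5.0,
                             query: .nearest)
    // Each hit reports the entity, the position, and the surface normal.
    return hits.first
}
```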
Once we identify a position in the world to place our content, we'll add a world anchor for ARKit to continuously track this position for us.
ARKit will update this world anchor's transform as the world map is refined.
We can create a new entity to load our ship's model, and set its transform using the world anchor update, positioning it where the user wanted.
Finally, we can add the entity to our content to render it over the workbench.
Whenever ARKit updates the world anchor we added, we update the transform component of our ship entity, making sure it stays anchored to the real world.
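A sketch of that flow, assuming a running WorldTrackingProvider and an already loaded ship entity:

```swift
import ARKit
import RealityKit

// Anchor the ship at the chosen position and keep its transform in sync
// as ARKit refines the world map.
func anchorShip(at position: SIMD3<Float>,
                ship: Entity,
                worldTracking: WorldTrackingProvider) async throws {
    let anchor = WorldAnchor(originFromAnchorTransform:
                                Transform(translation: position).matrix)
    try await worldTracking.addAnchor(anchor)

    for await update in worldTracking.anchorUpdates where update.anchor.id == anchor.id {
        ship.transform = Transform(matrix: update.anchor.originFromAnchorTransform)
    }
}
```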
And that's it! We used our hands to point to a location in our surroundings and placed content there.
Raycasting is not only useful for placing content, but also for interacting with it.
Let's see what it takes to raycast against our virtual ship.
RealityKit collision components are very powerful.
We can let the ship entity participate in collisions by simply adding an appropriate collision component to it, which Reality Composer Pro can help us with.
After enabling the ship's collision component and building a new ray from the latest hand joint positions, we can do another raycast and tell if the user is pointing at the ship or the table.
The previous examples demonstrated the power and versatility of combining RealityKit's features with ARKit's scene understanding to build truly compelling experiences.
Let's see how using ARKit has changed for spatial computing.
Fundamentally, just as on iOS, ARKit still works by running a session to receive anchor updates.
How you configure and run your session, receive anchors updates, and persist world anchors has changed on this new platform.
Let's take a look! On iOS, ARKit provides different configurations to choose from.
Each configuration bundles capabilities necessary for your experience.
Here, for example, we selected ARWorldTrackingConfiguration, and will enable sceneReconstruction for meshes and planeDetection for planes.
We can then create our ARSession and run it with the selected configuration.
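As a refresher, a sketch of that iOS-style setup:

```swift
import ARKit

// iOS: pick a configuration, enable the capabilities you need, then run it.
let configuration = ARWorldTrackingConfiguration()
configuration.sceneReconstruction = .meshWithClassification
configuration.planeDetection = [.horizontal, .vertical]

let session = ARSession()
session.run(configuration)
```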
On this new platform, ARKit now exposes a data provider for each scene understanding capability.
Hand tracking is a new capability offered by ARKit and gets its own provider as well.
Each data provider's initializer takes the parameters needed to configure that provider instance.
Now instead of choosing from a catalogue of preset configurations, you get an à la carte selection of the providers you need for your application.
Here for example, we choose a SceneReconstructionProvider to receive mesh anchors and a PlaneDetectionProvider to receive plane anchors.
We create the providers, specifying the mesh classifications and plane alignments we wish to receive.
Then we create an ARKitSession and run it with the instantiated providers.
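That à la carte setup might look like this:

```swift
import ARKit

// Spatial computing: create only the data providers you need, then run the session.
let sceneReconstruction = SceneReconstructionProvider(modes: [.classification])
let planeDetection = PlaneDetectionProvider(alignments: [.horizontal, .vertical])

let session = ARKitSession()
try await session.run([sceneReconstruction, planeDetection])
```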
Now that we have seen how configuring your session has been simplified, let's look at how these new data providers change the way your app actually receives ARKit data.
On iOS, a single delegate receives anchor and frame updates.
Anchors are aggregated and delivered with ARFrames to keep camera frames and anchors in sync.
Applications are responsible for displaying the camera pixel buffer, and using camera transforms to register and render tracked virtual content.
Mesh and plane anchors are delivered as base anchors, and it is up to you to disambiguate them to figure out which is which.
On our new platform, it is the data providers that deliver anchor updates.
Here are the providers we previously configured.
Once you run an ARKitSession, each provider will immediately begin asynchronously publishing anchor updates.
The SceneReconstructionProvider gives us mesh anchors, and the PlaneDetectionProvider gives us plane anchors.
No disambiguation is necessary! Anchor updates come as soon as they're available, and are decoupled from updates of other data providers.
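For example, a sketch of consuming plane anchors from the provider above; updatePlaneEntity and removePlaneEntity are hypothetical app helpers:

```swift
import ARKit

// Each provider publishes its own asynchronous stream of typed anchor updates.
func processPlaneUpdates(from planeDetection: PlaneDetectionProvider) async {
    for await update in planeDetection.anchorUpdates {
        switch update.event {
        case .added, .updated:
            updatePlaneEntity(for: update.anchor) // update.anchor is already a PlaneAnchor
        case .removed:
            removePlaneEntity(for: update.anchor)
        }
    }
}
```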
It is important to note that ARFrames are no longer provided.
Spatial computing applications do not need frame or camera data to display content, since this is now done automatically by the system.
Without having to package anchor updates with an ARFrame, ARKit can now deliver them immediately, reducing latency, allowing your application to quickly react to updates in the person's surroundings.
Next, let's talk about worldAnchor persistence.
You will love these changes! In our raycasting examples, we used world anchors to place and anchor virtual content to real-world positions.
Your app can persist these anchors, enabling it to automatically receive them again, when the device returns to the same surroundings.
Let's first quickly recap how persistence worked on iOS.
On iOS, it is the application's responsibility to handle world map and anchor persistence.
This included requesting and saving ARKit's world map with your added anchor, adding logic to reload the correct world map at the right time, then waiting for relocalization to finish before receiving previously persisted anchors and continuing the application experience.
On this new platform, the system continuously persists the world map in the background, seamlessly loading, unloading, creating, and relocalizing to existing maps as people move around.
Your application does not have to handle maps anymore; the system now does it for you! You simply focus on using world anchors to persist the locations of virtual content.
When placing content, you'll be using the new WorldTrackingProvider to add WorldAnchors to the world map.
The system will automatically save these for you.
The WorldTrackingProvider will update the tracking status and transforms of these world anchors.
You can use the WorldAnchor identifier to load or unload the corresponding virtual content.
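A sketch of that flow, assuming a running WorldTrackingProvider and a hypothetical dictionary that maps persisted anchor identifiers to their entities:

```swift
import ARKit
import Foundation
import RealityKit

// Re-attach persisted content as the system delivers previously saved WorldAnchors.
func restorePersistedContent(from worldTracking: WorldTrackingProvider,
                             entitiesByAnchorID: [UUID: Entity]) async {
    for await update in worldTracking.anchorUpdates {
        guard let entity = entitiesByAnchorID[update.anchor.id] else { continue }
        switch update.event {
        case .added, .updated:
            entity.transform = Transform(matrix: update.anchor.originFromAnchorTransform)
            entity.isEnabled = update.anchor.isTracked
        case .removed:
            entity.isEnabled = false
        }
    }
}
```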
We just highlighted a few updates to the ARKit principles that you knew from iOS, but there is so much more to explore! For a deeper dive, with code examples, we recommend you watch "Meet ARKit for spatial computing."
Let's conclude this session! In this session, we provided a high-level understanding of how ARKit and RealityKit concepts have evolved from iOS, the changes you need to consider, and which sessions to watch for more details.
This platform takes on many tasks your iOS app had to handle, allowing you to really focus on building beautiful content and experiences using frameworks and concepts you're already familiar with.
We are thrilled to see how you leverage spatial computing and all of its amazing capabilities to evolve your app! Thank you for watching! ♪