What’s new in DockKit
Discover how intelligent tracking in DockKit allows for smoother transitions between subjects. We will cover what intelligent tracking is, how it uses an ML model to select and track subjects, and how you can use it in your app.
Chapters
- 0:00 - Introduction
- 2:49 - Introduction to Intelligent Tracking
- 4:07 - How it works
- 7:08 - Custom control in your app
- 9:52 - Button controls for DockKit
- 14:03 - New camera modes
- 14:46 - Monitor accessory battery
Hello everyone. My name is Dhruv Samant, an engineer on the DockKit team, and today, I'm thrilled to share the exciting updates and innovations we've been working on for DockKit. Last year, we introduced DockKit, a groundbreaking framework that elevates the iPhone content creation and video calling experiences to new heights. I have great news: the first DockKit-powered stands are now available for purchase at Apple Stores. Let us take a quick look at how you can set up and start using your DockKit stand.
To get started, tap to pair your iPhone with your DockKit device.
A pairing card will provide step-by-step instructions to complete the pairing process.
Once paired, simply dock your iPhone.
And there you have it. Your DockKit device is ready to go. Now, I can launch the iOS camera app, and the dock will automatically keep me in the frame.
I can move around, and the dock follows me seamlessly.
DockKit tracking also works in FaceTime. In fact, any app that uses the camera will now track you and keep you in the frame. With a DockKit device and our user-friendly APIs, you can now design personalized and extraordinary experiences ranging from video capture and video conferencing to education and healthcare. With these devices, our customers can now focus on their content without worrying about being in frame.
This year we have an exciting update to DockKit: Intelligent Subject Tracking. Intelligent tracking uses Machine Learning to enhance the DockKit tracking experience. In this session we will talk about what Intelligent Tracking is, how it works, and how you can use it in your applications. Next, we'll shift our focus to button controls and unveil a new category of DockKit accessories. Then, we'll introduce new camera modes that now support DockKit functionality. Finally, we will look at how you can easily monitor the battery of your DockKit accessory. Great! Let's dive into it.
So, what is Intelligent Subject Tracking? Intelligent Subject Tracking aims to address the age-old question of who to focus on in a video scene. Imagine a scenario where some subjects are interacting with each other, or interacting with the camera, while others are in the background. It can be extremely challenging to determine the most relevant individuals to track and maintain focus on. For instance, in this simple scene, as a cameraperson you will likely want to focus on the two subjects in the front interacting with each other and ignore the person in the back. As the scene gets more complex, we need more sophisticated ways to make these decisions. This is where Intelligent Subject Tracking comes into the picture.
Using advanced algorithms and Machine Learning, Intelligent Subject Tracking analyzes a scene in real time. It identifies the main subjects, such as individual faces or objects, and then determines the most relevant person to track based on various factors like movement, speech, and proximity to the camera. This tracking is smooth and seamless, allowing you to focus on creating content without any manual intervention.
Now let's look at the underlying algorithms and frameworks that make this possible. Building upon the multi-person tracker in iOS 17, which used iPhone's image intelligence to estimate trajectories of multiple subjects in a scene, we've developed a brand new Intelligent Tracking Pipeline in iOS 18. This pipeline takes the data from the multi-person tracker and processes it through an advanced Subject Selection Machine Learning Model. This model analyzes various attributes like body pose, face pose, attention, and speaking confidence to determine the most relevant subject to focus on in a scene.
We then have a subject framing module that takes the chosen subjects as input and, using advanced algorithms, determines the most visually appealing way to frame them.
Once we have a final computed scene, we use the motor positional and velocity feedback to achieve the final actuator commands to send to the DockKit accessory.
While our Intelligent Tracking system can handle many scenarios remarkably well, we acknowledge the expertise of a human cameraman. Consequently, we've introduced Watch Control for all DockKit devices. With Watch Control, users can exert more precise control over tracking and framing in the iOS Camera app. They can also manually control the DockKit accessory to further refine their shot.
Now, instead of just talking about it, let's see it in action! I have a DockKit stand and my iPhone here with our new Intelligent Tracking Pipeline running.
Since I am the only person in the frame, it's an easy decision to track me. Now, I would like to invite my friend Steve to come and share the stage with me. Hey, Steve.
Steve is now also a person of interest and Intelligent Tracking will select and track both of us. I can now simply use my Apple Watch to tap on Steve's face, and DockKit will now only track Steve.
Steve, can you take a couple of steps in that direction? As you can see DockKit is now only tracking Steve. Steve is a good friend of mine but I would like to now continue with the demonstration by unselecting him. To do this, I can just swipe on my watch to manually move my DockKit accessory and tap on my face to start tracking me again.
These are some great changes we brought to DockKit system tracking, without you having to add any additional code to your app. However, we want to give you even more control. To that end, we are providing access to our Machine Learning signals in your app. This will enable you to create innovative and distinctive features that your customers will love. If your application is using DockKit intelligent tracking, you can get a summary of tracked subjects with important attributes.
We can query the tracking summary using the tracking states async sequence. The tracking state includes a timestamp for when the state was captured and a list of tracked subjects. A tracked subject can be either a person or an object.
A tracked subject will have an identifier, face rectangle, and saliency rank. If the subject is a person, we also provide the speaking confidence and looking at the camera confidence.
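The summary described here, and one way an app might consume it, can be sketched in plain Swift. Every type and function name below is a hypothetical stand-in chosen for illustration, not a DockKit declaration. The 0.8 threshold matches the 80% figure used in this session's active-speaker example, and sorting by saliency rank is an added assumption.

```swift
import Foundation

// Hypothetical stand-ins for the tracking summary; NOT the DockKit types.
struct Rect { var x, y, width, height: Double }

struct PersonSignals {
    var speakingConfidence: Double         // 0 = not speaking, 1 = speaking
    var lookingAtCameraConfidence: Double  // 0 = looking away, 1 = looking at camera
}

struct TrackedSubject {
    enum Kind {
        case person(PersonSignals)  // people carry the extra confidence signals
        case object
    }
    var identifier: UUID
    var faceRect: Rect
    var saliencyRank: Int           // 1 is the most salient subject
    var kind: Kind
}

struct TrackingSnapshot {
    var time: Date                  // when the state was captured
    var subjects: [TrackedSubject]
}

/// Returns the identifiers of people whose speaking confidence exceeds
/// `threshold`, ordered from most to least salient.
func activeSpeakerIDs(in snapshot: TrackingSnapshot, threshold: Double = 0.8) -> [UUID] {
    snapshot.subjects
        .filter { subject in
            if case .person(let signals) = subject.kind {
                return signals.speakingConfidence > threshold
            }
            return false            // objects never count as speakers
        }
        .sorted { $0.saliencyRank < $1.saliencyRank }
        .map(\.identifier)
}
```

In an app, a list like this would presumably be what gets handed to the selectSubjects API named in this session. The sketch stops short of that call so that it stays runnable without an accessory.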
The saliency rank represents our assessment of the most important subject in the scene. Rank starts from 1 and increases monotonically. A lower rank indicates higher importance for that particular subject. For instance, rank 1 is more salient than rank 2. The speaking confidence is the likelihood score of the person speaking. A confidence score of 0 indicates that the person is not speaking and a confidence score of 1 indicates that the person is speaking. Similarly, lookingAtCameraConfidence is the likelihood score of the person looking directly at the camera.
Now let us see how you can use these parameters in your app to create customized tracking features. Say I am writing an app to always track the active speakers. First, I query tracking state as an async sequence and save it in my variable trackingState. I update this variable whenever DockKit provides me with an update. Now to track the active speakers, I have a function that first gets the list of all tracked subjects that are persons and then filters them to get all persons speaking with a confidence greater than 80%. I can then pass this list to the selectSubjects API and just like that DockKit will now track all active speakers in the scene. This is great because it allows you to utilize our Machine Learning signals to determine what is most important for your users and design innovative and distinctive features effortlessly. So this is how you can benefit from intelligent tracking in your app.
We are also adding button support for DockKit accessories. Let’s dive into how buttons work for first-party and third-party applications. For Camera and FaceTime, DockKit supports three types of accessory events out of the box: shutter, flip, and zoom. With the shutter event, users can quickly capture a photo or video. The flip event allows users to switch between the front and back cameras seamlessly. The zoom event enables users to zoom in or out of the scene.
We also deliver these events to your app, allowing you to implement custom behaviors that will enhance your users' experience. Camera shutter and flip events are toggle events that do not have any value associated with them. The camera zoom event carries a relative factor. For example, a value of 2.0 should double the size of an image and halve the field of view. You can handle the factor as you please. The accessory can also send a custom button event, including an ID to identify the button and a boolean value indicating whether it was pressed or not.
This provides greater flexibility, enabling you to design custom behaviors that your customers will truly appreciate. Building on this, we're introducing a new class of DockKit accessories that will greatly benefit from button support: gimbals. Gimbals are a game-changer for action-packed sports and photography. They allow you to lock onto the athlete, eliminating the need for manual panning and tilting. These new DockKit gimbals will help stabilize camera movement to create smooth, professional-looking videos. Let us see gimbals in action! I happen to have a DockKit gimbal with me.
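As a rough illustration of the four event kinds just described, here is a self-contained sketch of how an app might fold them into its own camera state. The enum, the state struct, and the zoom limits are all hypothetical names and values invented for this example rather than DockKit declarations. Treating shutter as a recording toggle and multiplying the current zoom by the factor are assumptions about one reasonable interpretation.

```swift
// Hypothetical model of the accessory events; NOT the DockKit declarations.
enum AccessoryEventSketch {
    case cameraShutter                   // toggle event, no associated value
    case cameraFlip                      // toggle event, no associated value
    case cameraZoom(factor: Double)      // relative factor, e.g. 2.0 doubles the zoom
    case button(id: Int, pressed: Bool)  // custom button with press state
}

struct CameraStateSketch {
    // Assumed zoom limits for this example only.
    static let minZoom = 1.0
    static let maxZoom = 10.0

    var isRecording = false
    var usingFrontCamera = true
    var zoom = 1.0
    var heldButtons: Set<Int> = []

    mutating func apply(_ event: AccessoryEventSketch) {
        switch event {
        case .cameraShutter:
            isRecording.toggle()
        case .cameraFlip:
            usingFrontCamera.toggle()
        case .cameraZoom(let factor):
            // Ignore non-positive factors, then scale and clamp to the limits.
            guard factor > 0 else { return }
            zoom = min(max(zoom * factor, Self.minZoom), Self.maxZoom)
        case .button(let id, let pressed):
            if pressed {
                heldButtons.insert(id)
            } else {
                heldButtons.remove(id)
            }
        }
    }
}
```

Keeping the event handling in a plain value type like this makes the behavior easy to unit test without an accessory attached. The clamping is one possible way to "handle the factor as you please".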
I have already paired it with my iPhone. I can simply dock my phone, and it magically connects to the gimbal.
Now I can open the camera app, and the gimbal tracks me beautifully.
I can now do cool new things with this gimbal. I can hold it in my hand, and it still tracks me.
I can press the flip button on the gimbal to switch to the back camera and show you my beautiful room.
Look who we have here. It's Steve again. It’s nice to see you, Steve. I can now hit record to start recording a video and use the scroll wheel to zoom into Steve.
With gimbals, DockKit powers dynamic hand-held experiences. These new buttons help enable these experiences. Let us explore how you can take advantage of button controls in your app. Say I am writing a camera app to take panoramas. I have a DockKit gimbal with a custom button with ID 5 and I want to take advantage of the button to start and stop rotating my gimbal to take a panorama. First I write two functions, one to start my panorama rotation, and one to stop it. Then I subscribe to accessory events. When the user triggers the button 5 event on the dock, the event is communicated to my app. If button 5 is pressed, I start rotating the DockKit accessory with a constant velocity. I stop rotating when the button is unpressed. With just a few lines of code I can now design beautiful camera experiences with my DockKit accessory.
Building on our work in intelligent subject tracking and remote control, I am excited to announce that in iOS 18 we have expanded DockKit support to new camera modes in the iOS Camera app: photo, panorama, and cinematic mode. We can now track subjects in the Camera app for photo mode. You can use your Apple Watch or DockKit gimbals to capture subjects and sceneries. In Pano mode, just one button press allows you to take beautiful panoramas to capture expansive representations of objects and environments autonomously. In Cinematic mode, you can now track the person in focus cinematically.
In iOS 18, we've also added a feature that allows you to monitor the battery of your DockKit accessory within your app. You can utilize this data to implement custom behaviors and display relevant status messages to your users. You can subscribe to the async sequence battery states. A dock can report charging states for multiple batteries.
A battery is identified by a name, and it reports the current battery percentage and charging state. For instance, a dock connected to power might report battery level as 50% and charge state as charging.
We introduced Intelligent Tracking which acts as an AI cameraman to select and track subjects in a scene autonomously. You can also manually control the cameraman using an Apple Watch or direct the cameraman using APIs. We also introduced DockKit gimbals to support fast-paced sports photography. With these new APIs and accessories, I hope to unlock new exciting use cases for DockKit. The innovation potential with DockKit is enormous, and I'm excited to see where this journey takes us!
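To make the battery report concrete, the following sketch turns one reported state into a user-facing status message. The struct, the two-case charge enum, the 0 to 1 level scale, and the 20% low-battery threshold are assumptions made for this example and are not taken from the DockKit API.

```swift
// Hypothetical model of one reported battery; NOT the DockKit declarations.
enum ChargeStateSketch {
    case charging
    case notCharging
}

struct BatteryStateSketch {
    var name: String                    // identifies the battery on the dock
    var level: Double                   // assumed scale: 0.0 (empty) to 1.0 (full)
    var chargeState: ChargeStateSketch
}

/// Builds a short status line, flagging a low battery when it is not charging.
func statusMessage(for battery: BatteryStateSketch, lowThreshold: Double = 0.2) -> String {
    let percent = Int((battery.level * 100).rounded())
    switch battery.chargeState {
    case .charging:
        return "\(battery.name): \(percent)% (charging)"
    case .notCharging:
        let suffix = battery.level < lowThreshold ? ", low battery" : ""
        return "\(battery.name): \(percent)%\(suffix)"
    }
}
```

Because a dock can report several batteries, an app would likely map a function like this over every reported state and show one line per battery.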