Hello!
I would like to develop a visionOS application that tracks a single object in the user's environment. Skimming through the documentation, I found that this feature is currently unsupported in ARKit (it can only recognize images). However, it seems like it should be doable by combining the CoreML and Vision frameworks.
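To make the question concrete, here is a minimal sketch of the detection pipeline I have in mind. It assumes a hypothetical Core ML detection model (`MyObjectDetector` stands in for the auto-generated class of whatever model I end up training) and that the app can somehow obtain camera frames as `CVPixelBuffer`s:

```swift
import CoreML
import CoreVideo
import Vision

/// Runs a (hypothetical) Core ML object detector on individual camera frames.
final class ObjectDetector {
    private let request: VNCoreMLRequest

    init() throws {
        // "MyObjectDetector" is a placeholder for my trained model's generated class.
        let model = try MyObjectDetector(configuration: MLModelConfiguration()).model
        let visionModel = try VNCoreMLModel(for: model)
        request = VNCoreMLRequest(model: visionModel) { request, _ in
            guard let observations = request.results as? [VNRecognizedObjectObservation] else { return }
            for observation in observations {
                // boundingBox is normalized, with the origin in the bottom-left corner.
                let label = observation.labels.first?.identifier ?? "unknown"
                print(label, observation.boundingBox, observation.confidence)
            }
        }
        request.imageCropAndScaleOption = .scaleFill
    }

    /// I would call this for each incoming camera frame.
    func detect(in pixelBuffer: CVPixelBuffer) throws {
        let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer, orientation: .up)
        try handler.perform([request])
    }
}
```

So I have a few questions: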
- Is this the best approach, or is there a simpler solution?
- What is the best way to train a CoreML model without access to the device? Will videos recorded with an iPhone 15 be enough? (A rough sketch of the training setup I had in mind follows this list.)
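For the second question, this is roughly what I was planning, assuming I extract frames from the iPhone 15 videos, annotate the bounding boxes in Create ML's JSON annotation format, and train with CreateML's `MLObjectDetector` on a Mac (all paths below are placeholders):

```swift
import CreateML
import Foundation

// Frames extracted from the iPhone videos, alongside an annotations JSON file
// in Create ML's object detection format.
let trainingDirectory = URL(fileURLWithPath: "/path/to/extracted-frames")
let trainingData = MLObjectDetector.DataSource
    .directoryWithImagesAndJsonAnnotation(at: trainingDirectory)

// Train the detector and save it; the resulting model is what the
// Vision sketch above would load as MyObjectDetector.
let detector = try MLObjectDetector(trainingData: trainingData)
try detector.write(to: URL(fileURLWithPath: "/path/to/MyObjectDetector.mlmodel"))
```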
Thank you in advance for any answers.