Are there going to be any sessions on Image Playgrounds API for iOS?
"Explore machine learning on Apple platforms" mentions the writing and points to sessions, but only mentions Image Playground without pointing to sessions.
Explore the power of machine learning and Apple Intelligence within apps. Discuss integrating features, share best practices, and explore the possibilities for your app here.
Post
Replies
Boosts
Views
Activity
The What’s New in Create ML session in WWDC24 went into great depth with time-series forecasting models (beginning at: 15:14) and mentioned these new models, capabilities, and tools for iOS 18. So, far, all I can find is API documentation. I don’t see any other session in WWDC24 covering these new time-series forecasting Create ML features.
Is there more substance/documentation on how to use these with Create ML? Maybe I am looking in the wrong place but I am fairly new with ML.
Are there any food truck / donut shop demo/sample code like in the video?
It is of great interest to get ahead of the curve on this within business applications that may take advantage of this with inventory / ordering data.
After watching the What's new in App Intents session I'm attempting to create an intent conforming to URLRepresentableIntent. The video states that so long as my AppEntity conforms to URLRepresentableEntity I should not have to provide a perform method . My application will be launched automatically and passed the appropriate URL.
This seems to work in that my application is launched and is passed a URL, but the URL is in the form: FeatureEntity/{id}.
Am I missing something, or is there a trick that enables it to pass along the URL specified in the AppEntity itself?
struct MyExampleIntent: OpenIntent, URLRepresentableIntent {
static let title: LocalizedStringResource = "Open Feature"
static var parameterSummary: some ParameterSummary {
Summary("Open \(\.$target)")
}
@Parameter(title: "My feature", description: "The feature to open.")
var target: FeatureEntity
}
struct FeatureEntity: AppEntity {
// ...
}
extension FeatureEntity: URLRepresentableEntity {
static var urlRepresentation: URLRepresentation {
"https://myurl.com/\(.id)"
}
}
The Translation API introduced at Session 10117 is impressive, but limiting it to SwiftUI is restrictive.
This API works great in the demo, but for more complex apps, it lacks flexibility because it is bound to SwiftUI Views.
Please consider making it available in non-SwiftUI environments.
I created a model that classifies certain objects using yolov8. I noticed that the model is not working properly in my application. While the model works fine in Xcode preview, in the application it either returns the same result with 99% accuracy for each classification or does not provide any result.
In Preview it looks like this:
Predictions:
extension CameraVC : AVCapturePhotoCaptureDelegate {
func photoOutput(_ output: AVCapturePhotoOutput, didFinishProcessingPhoto photo: AVCapturePhoto, error: (any Error)?) {
guard let data = photo.fileDataRepresentation() else {
return
}
guard let image = UIImage(data: data) else {
return
}
guard let cgImage = image.cgImage else {
fatalError("Unable to create CIImage")
}
let handler = VNImageRequestHandler(cgImage: cgImage,orientation: CGImagePropertyOrientation(image.imageOrientation))
DispatchQueue.global(qos: .userInitiated).async {
do {
try handler.perform([self.viewModel.detectionRequest])
} catch {
fatalError("Failed to perform detection: \(error)")
}
}
lazy var detectionRequest: VNCoreMLRequest = {
do {
let model = try VNCoreMLModel(for: bestv720().model)
let request = VNCoreMLRequest(model: model) { [weak self] request, error in
self?.processDetections(for: request, error: error)
}
request.imageCropAndScaleOption = .centerCrop
return request
} catch {
fatalError("Failed to load Vision ML model: \(error)")
}
}()
This is where i print recognized objects:
func processDetections(for request: VNRequest, error: Error?) {
DispatchQueue.main.async {
guard let results = request.results as? [VNRecognizedObjectObservation] else {
return
}
var label = ""
var all_results = []
var all_confidence = []
var true_results = []
var true_confidence = []
for result in results {
for i in 0...results.count{
all_results.append(result.labels[i].identifier)
all_confidence.append(result.labels[i].confidence)
for confidence in all_confidence {
if confidence as! Float > 0.7 {
true_results.append(result.labels[i].identifier)
true_confidence.append(confidence)
}
}
}
label = result.labels[0].identifier
}
print("True Results " , true_results)
print("True Confidence ", true_confidence)
self.output?.updateView(label:label)
}
}
I converted the model like this:
from ultralytics import YOLO
model = YOLO(model_path)
model.export(format='coreml', nms=True, imgsz=[720,1280])
Heyy, on my Iphone 15 Pro, i cant use the new Siri. It shows me the old design of Siri. I got IOS 18.0 Developer Beta. I got no clue, why it dont works for me.
We are currently working on implementing a baby cry detection model in the frontend of our app but have encountered some challenges with the mel spectrogram transformation.
Our mel spectrogram class, developed in python, leverages librosa for generating mel spectrograms (librosa.feature.melspectrogram and librosa.power_to_db). While we have successfully exported the model to a .mlmodel file, the results we obtain in Swift differ significantly from those generated by our Python code.
Could this discrepancy be due to the use of librosa in Python, which might not be directly compatible with Swift? Or should the transformation process be inherently consistent once exported to a .mlmodel file?
My App has several resources that I'd like to spring open through App Intents. For example a series of Dictionaries. These resources however in the app are behind a log in (for security) and are entitlements that are purchased. They may own 4 of 7 dictionaries.
If I want to have an intent that says, "Open Dictionary: (Dict Name)" how do I best handle situations where the user may no longer be logged in or have the entitlement for that specific dictionary?
Thanks
I'm trying to convert a TensorFlow model that I didn't create and know approximately nothing about to CoreML so that I can use it in some functional tests. I can't tell you much about the model, but you can read about it on the blog from the team that created it: https://research.google/blog/improving-mobile-app-accessibility-with-icon-detection/
I can't convert this model to a TensorFlow Lite model because it uses a few full TensorFlow operations (which I could work around) and it exceeds the 4-tensor output limit (which I can't, AFAIK). So instead, I'm trying to convert the model to CoreML so that I can run it on-device.
The issue I'm running into is that every approach fails in different ways. If I load the model with tf.saved_model.load and pass that as the first parameter to the convert call, it says
NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got <tensorflow.python.trackable.autotrackable.AutoTrackable object at 0x30d90c250>
If I pass model.signatures['serving_default'] as the first parameter to convert, I get
NotImplementedError: Expected model format: [SavedModel | concrete_function | tf.keras.Model | .h5 | GraphDef], got ConcreteFunction [...a page or two of info about the function here...]
If I try to wrap it in a Keras layer using the instructions provided in the converter, it fails because a sequential model can't have multiple outputs.
If I try to use a tf.keras.layers.TFSMLayer to load the model, it fails because there are multiple tags, and there's no way to specify tags when constructing the layer. (It tells me that I need to add 'tags' to load the model, but if I do that, it tells me that tags isn't a valid parameter to the call.)
If I load the model with tf.saved_model.load and specify a single tag, then re-save it in a different location with tf.saved_model.save to generate a new model with only a single tag, then do
input_layer = tf.keras.Input(shape=(768, 768, 3), dtype="int8")
layer = tf.keras.layers.TFSMLayer("./serve_model", call_endpoint='serving_default')
outputs = layer(input_layer)
model = tf.keras.Model(input_layer, outputs)
I get
AttributeError: 'Functional' object has no attribute '_get_save_spec'
At one point, I also tried this:
class LayerFromSavedModel(tf.keras.layers.Layer):
def __init__(self):
super(LayerFromSavedModel, self).__init__()
self.vars = legacy_model.variables
def call(self, inputs):
return legacy_model.signatures['serving_default'](inputs)
input = tf.keras.Input(shape=(3000, 3000, 3))
model = tf.keras.Model(input, LayerFromSavedModel()(input))
and saw a similar failure.
I've run out of ideas here. Is there simply no support whatsoever in the converter for importing a TensorFlow 2 SavedModel into CoreML, or am I missing something fundamental?
Hello. Where can I find some examples on creating custom genmojis in Swift and reusing it in an App?
Hellooo,
I’m looking to implement an OpenAI assistant using APIs but I want to do this locally on a group of files.
I want to be able to train a GPT on the contents of a folder for example.
Does anyone have any experience in this?
It seems OpenAI needs a lot of uploading on each request if I were to do this with their API after playing around (but this feels like I’m missing something). It’s also quite costly to use.
I was hoping to use local machine learning and models but this is quite limited in what it can do (eg Lumachain)
Here is an App using CoreML API with ML package format, it works fine in iOS17, while it is crashed when calling [MLModel modelWithContentsOfURL ] to load model running in iOS18. It seems an exception is raised "Failed to set compute_device_types_mask E5RT: Cannot provide zero compute device types. (1)". Is it a bug of iOS18 beta version , and it will be fixed in the future?
The stack is as below:
Exception Codes: #0 at 0x1e9280254
Crashed Thread: 49
Application Specific Information:
*** Terminating app due to uncaught exception 'NSGenericException', reason: 'Failed to set compute_device_types_mask E5RT: Cannot provide zero compute device types. (1)'
Last Exception Backtrace:
0 CoreFoundation 0x0000000199466418 __exceptionPreprocess + 164
1 libobjc.A.dylib 0x00000001967cde88 objc_exception_throw + 76
2 CoreFoundation 0x0000000199560794 -[NSException initWithCoder:]
3 CoreML 0x00000001b4fcfa8c -[MLE5ProgramLibraryOnDeviceAOTCompilationImpl createProgramLibraryHandleWithRespecialization:error:] + 1584
4 CoreML 0x00000001b4fcf3cc -[MLE5ProgramLibrary _programLibraryHandleWithForceRespecialization:error:] + 96
5 CoreML 0x00000001b4fc23d8 __44-[MLE5ProgramLibrary prepareAndReturnError:]_block_invoke + 60
6 libdispatch.dylib 0x00000001a12e1160 _dispatch_client_callout + 20
7 libdispatch.dylib 0x00000001a12f07b8 _dispatch_lane_barrier_sync_invoke_and_complete + 56
8 CoreML 0x00000001b4fc3e98 -[MLE5ProgramLibrary prepareAndReturnError:] + 220
9 CoreML 0x00000001b4fc3bc0 -[MLE5Engine initWithContainer:configuration:error:] + 220
10 CoreML 0x00000001b4fc3888 +[MLE5Engine loadModelFromCompiledArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 344
11 CoreML 0x00000001b4faf53c +[MLLoader _loadModelWithClass:fromArchive:modelVersionInfo:compilerVersionInfo:configuration:error:] + 364
12 CoreML 0x00000001b4faedd4 +[MLLoader _loadModelFromArchive:configuration:modelVersion:compilerVersion:loaderEvent:useUpdatableModelLoaders:loadingClasses:error:] + 540
13 CoreML 0x00000001b4f9b900 +[MLLoader _loadWithModelLoaderFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 424
14 CoreML 0x00000001b4faaeac +[MLLoader _loadModelFromArchive:configuration:loaderEvent:useUpdatableModelLoaders:error:] + 460
15 CoreML 0x00000001b4fb0428 +[MLLoader _loadModelFromAssetAtURL:configuration:loaderEvent:error:] + 240
16 CoreML 0x00000001b4fb00c4 +[MLLoader loadModelFromAssetAtURL:configuration:error:] + 104
17 CoreML 0x00000001b5314118 -[MLModelAssetResourceFactoryOnDiskImpl modelWithConfiguration:error:] + 116
18 CoreML 0x00000001b5418cc0 __60-[MLModelAssetResourceFactory modelWithConfiguration:error:]_block_invoke + 72
19 libdispatch.dylib 0x00000001a12e1160 _dispatch_client_callout + 20
20 libdispatch.dylib 0x00000001a12f07b8 _dispatch_lane_barrier_sync_invoke_and_complete + 56
21 CoreML 0x00000001b5418b94 -[MLModelAssetResourceFactory modelWithConfiguration:error:] + 276
22 CoreML 0x00000001b542919c -[MLModelAssetModelVendor modelWithConfiguration:error:] + 152
23 CoreML 0x00000001b5380ce4 -[MLModelAsset modelWithConfiguration:error:] + 112
24 CoreML 0x00000001b4fb0b3c +[MLModel modelWithContentsOfURL:configuration:error:] + 168
Hei Sir is not working in ios beta 18
I consistently receive corrupted results from tf.signal.fft3d() when it is within a function that has a @tf.function decorator. The results are all zero (0.) for entries after a certain x index (see image). Surprisingly, the issue depends on the matrix size. For example, (1023, 1023, 287) works but (1023, 1023, 575) does not. The issue is problematic because it occurs silently and not for all matrix sizes, i.e. can easily slip through tests.
The error occurs only when tensorflow-metal is installed. The Tensorflow version is 2.16.1. My hardware is a Macbook Pro M3 Max with 40 GPU cores, 128 GB RAM running MacOS Sonoma version 14.5 (23F79). A Python environment to reproduce the bug can be created as follows:
conda create --name tfmetalbug python=3.11.9
conda activate tfmetalbug
pip install tensorflow tensorflow-metal
conda install matplotlib
The following code reproduces the issue:
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# Wrap fft3d with tf.function
@tf.function
def fft3d_wrapper_function(x):
return tf.signal.fft3d(x)
# Generate a 3D image
img = tf.random.normal(shape=(1023, 1023, 575), stddev=1., dtype=float) # generate random 3d image
img = tf.dtypes.cast(img, tf.complex64) # convert to complex values
# Compute the 3D FFT
img_fft = fft3d_wrapper_function(img)
# Visualize the 3D FFT
plt.imshow(np.real(img_fft)[:, img_fft.shape[1]//2+10, :], cmap="gray", vmin=-0.001, vmax=0.001)
plt.savefig("fft3d_wrapper_function.png")
For me, removing the @tf.function decorator has resolved the issue.
I am testing the new scaled dot product attention CoreML op on macOS 15 beta 1. Based on the session video I was expecting to see a speedup when running on GPU however I see roughly equivalent performance to the same model on macOS 14.
I ran tests with two models:
one that simply repeats y = sdpa(y, k, v) 50 times
gpt2 124M converted from nanoGPT (the only change is not returning loss from the forward method)
I converted both models using coremltools 8.0b1 with minimum deployment targets of macOS 14 and also macOS 15. In Xcode, I can see that the new op was used for the macOS 15 target. Running on macOS 15 both target models take the same time, and that time matches the runtime on macOS 14.
Should I be seeing performance improvements?
When in Safari, you can say something like, "Siri, text this link to mom" or "Siri, save this link to reminders" and it will do it with the currently viewed link. Shortcuts also has a "Get what's on screen" action that can be added. How do I expose the user's current context to my App Intent?
The DINO v1/v2 models are particularly interesting to me as they produce embeddings for the detected objects rather than ordinary classification indexes.
That makes them so much more useful than the CNN based models.
I would like to prepare some of the models posted on Huggingface to run on Apple Silicon, but it seems that the default conversion with TorchScript will not work. The other default conversions I've looked at so far also don't work. Conversion based on an example input doesn't capture enough of the model.
I know that some have managed to convert it as I have a demo with a coreml model that seems to work, but I would like to know how to do the conversion myself.
Has anyone managed to convert any of the DINOv2 models?
I'm playing with the new Vision API for iOS18, specifically with the new CalculateImageAestheticsScoresRequest API.
When I try to perform the image observation request I get this error:
internalError("Error Domain=NSOSStatusErrorDomain Code=-1 \"Failed to create espresso context.\" UserInfo={NSLocalizedDescription=Failed to create espresso context.}")
The code is pretty straightforward:
if let image = image {
let request = CalculateImageAestheticsScoresRequest()
Task {
do {
let cgImg = image.cgImage!
let observations = try await request.perform(on: cgImg)
let description = observations.description
let score = observations.overallScore
print(description)
print(score)
} catch {
print(error)
}
}
}
I'm running it on a M2 using the simulator.
Is it a bug? What's wrong?
I am very new to App Intents and I am trying to add them to my On Device LLM ChatBot app so my users can get answers to any questions anywhere in iOS.
I have the following code and it is working wonderfully in the Shortcuts app.
import AppIntents
struct AskAi: AppIntent {
static var openAppWhenRun: Bool = false
static let title: LocalizedStringResource = "Ask Ai About"
static let description = "Gets an answer from Ai for your question."
@Parameter(title: "Question")
var question: String
static var parameterSummary: some ParameterSummary {
Summary("Ask Ai About \(\.$question)")
}
@MainActor
func perform() async throws -> some IntentResult & ReturnsValue<String> {
let bot: Bot = Bot()
await bot.respond(to: self.question)
return .result(
value: bot.output
)
}
}
class AppShortcuts: AppShortcutsProvider {
static var appShortcuts: [AppShortcut] {
AppShortcut(
intent: AskAi(),
phrases: [
"Ask \(.applicationName) \(\.$question)",
"Get \(.applicationName) answer for \(\.$question)",
"Open \(\.$question) using \(.applicationName) ",
"Using \(.applicationName) get help with \(\.$question)"
],
shortTitle: "Ask Ai",
systemImageName: "sparkles"
)
}
}
I can create a shortcut for this AppIntent and that allows me say speak the response.
I can call my shortcut via iOS 18 Beta 1 by the Shortcut name I set in the Shortcuts app and that allows it to work.
It does not work at all by just Asking Siri any of the phrases I have defined.
The info.plist has an app name alias defined just to be sure.
I even added the Siri capability in Xcode-beta.
I also tried using the ProvidesDialog return type too.
Whatever I do the AppIntent is invisible to Siri.
Siri tries to search the web, looking for my app name in the contacts or have an error Apple Cash which has nothing to do with what I was talking about.
Is there anything else I am missing for setting up iOS AppIntents to work with Siri?
Hi, I'd like to use a MLX model already in the MLX Community in my App. I understand I first need to convert it to Core ML format.
Is there an easy way to do that considering MLX is an Apple project?
It would be good if it was easier then I'd be more motivated to use MLX to train my models.
Thanks
Richard