I need a simple text-to-speech avatar in my iOS app. iOS already has Memojis ready to go, but I cannot find anything in the developer docs on how to access Memojis for use as a tool in app development. Am I missing something? Also, can anyone point me to any resources besides the Apple docs for using AVSpeechSynthesis?
General
Explore the power of machine learning within apps. Discuss integrating machine learning features, share best practices, and explore the possibilities for your app.
I cannot find the bug, but this Python code runs slowly on the torch device mps:0 and quicker on cpu:0 or cpu:1. Where is the bug? Or does cpu:1 run it on the Neural Engine?
You need a setup like this:
#!/bin/bash
export HOMEBREW_BREW_GIT_REMOTE="https://github.com/Homebrew/brew" # put your Git mirror of Homebrew/brew here
export HOMEBREW_CORE_GIT_REMOTE="https://github.com/Homebrew/homebrew-core" # put your Git mirror of Homebrew/homebrew-core here
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
eval "$(/opt/homebrew/bin/brew shellenv)"
brew update --force --quiet
chmod -R go-w "$(brew --prefix)/share/zsh"
export OPENBLAS=$(/opt/homebrew/bin/brew --prefix openblas)
export CFLAGS="-falign-functions=8 ${CFLAGS}"
brew install wget
brew install unzip
conda init --all
conda create -n torch-gpu python=3.10
conda activate torch-gpu
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 -c pytorch
conda install -c conda-forge jupyter jupyterlab
python3 -m pip install --upgrade pip
python3 -m pip install insightface==0.2.1 onnx imageio scikit-learn scikit-image moviepy
python3 -m pip install googledrivedownloader
python3 -m pip install imageio==2.4.1
python3 -m pip install Cython
python3 -m pip install --no-use-pep517 numpy
python3 -m pip install torch
python3 -m pip install image
python3 -m pip install timm
python3 -m pip install Pillow  # PIL is distributed on PyPI as Pillow
python3 -m pip install h5py
for i in $(seq 1 6); do
    python3 test.py
done
conda deactivate
exit 0
test.py:
import torch
import math

# This checks that the current macOS version is at least 12.3.
print(torch.backends.mps.is_available())
# This checks that the current PyTorch installation was built with MPS activated.
print(torch.backends.mps.is_built())

dtype = torch.float
device = torch.device("cpu", 0)
#device = torch.device("cpu", 1)
#device = torch.device("mps", 0)

# Create random input and output data
x = torch.linspace(-math.pi, math.pi, 2000, device=device, dtype=dtype)
y = torch.sin(x)

# Randomly initialize weights
a = torch.randn((), device=device, dtype=dtype)
b = torch.randn((), device=device, dtype=dtype)
c = torch.randn((), device=device, dtype=dtype)
d = torch.randn((), device=device, dtype=dtype)

learning_rate = 1e-6
for t in range(2000):
    # Forward pass: compute predicted y
    y_pred = a + b * x + c * x ** 2 + d * x ** 3

    # Compute and print loss
    loss = (y_pred - y).pow(2).sum().item()
    if t % 100 == 99:
        print(t, loss)

    # Backprop to compute gradients of a, b, c, d with respect to loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_a = grad_y_pred.sum()
    grad_b = (grad_y_pred * x).sum()
    grad_c = (grad_y_pred * x ** 2).sum()
    grad_d = (grad_y_pred * x ** 3).sum()

    # Update weights using gradient descent
    a -= learning_rate * grad_a
    b -= learning_rate * grad_b
    c -= learning_rate * grad_c
    d -= learning_rate * grad_d

print(f'Result: y = {a.item()} + {b.item()} x + {c.item()} x^2 + {d.item()} x^3')
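One way to narrow this down is to time the same loop once per device. The sketch below is a minimal, stdlib-only timing harness; the names best_of and workload are hypothetical, and the pure-Python workload only stands in for the torch loop above. Keep in mind that for tensors as small as 2000 elements, the per-operation dispatch overhead of a GPU backend can outweigh the arithmetic itself, so a slower MPS run is not necessarily a bug.

```python
import time


def best_of(fn, repeats=3):
    """Return the fastest wall-clock time (in seconds) of `repeats` calls to fn()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def workload(n=2000, steps=200):
    """Stand-in for the training loop (assumption: replace the body with the
    torch code, run on the device you want to measure)."""
    total = 0.0
    for _ in range(steps):
        total += sum(i * 1e-6 for i in range(n))
    return total


if __name__ == "__main__":
    print(f"best of 3: {best_of(workload):.4f} s")
```

When timing a GPU backend, the timed function should end with something that waits for the device, such as reading a scalar back with .item(); otherwise the measurement may only cover the time needed to enqueue the work.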
Hello, I have been trying for a few days to create a scanner that reads a PDF417 barcode from the photo library, and I have come to a dead end. Every time I run my function on the photo, my array of observations comes back as []. In this example I am using an automatically generated image, because I think that if it works with this, it will work with a real screenshot. That said, I have already tried all sorts of images that aren't pre-generated, and they still fail. Code below:
Calling the function
createVisionRequest(image: generatePDF417Barcode(from: "71238-12481248-128035-40239431")!)
Creating the Barcode:
static func generatePDF417Barcode(from key: String) -> UIImage? {
    let data = key.data(using: .utf8)!
    let filter = CIFilter.pdf417BarcodeGenerator()
    filter.message = data
    filter.rows = 7
    let transform = CGAffineTransform(scaleX: 3, y: 4)
    if let outputImage = filter.outputImage?.transformed(by: transform) {
        let context = CIContext()
        if let cgImage = context.createCGImage(outputImage, from: outputImage.extent) {
            return UIImage(cgImage: cgImage)
        }
    }
    return nil
}
Main function for scanning the barcode:
static func desynthesizeIDScreenShot(from image: UIImage, completion: @escaping (String?) -> Void) {
    guard let ciImage = CIImage(image: image) else {
        print("Empty image")
        return
    }
    let imageRequestHandler = VNImageRequestHandler(ciImage: ciImage, orientation: .up)
    let request = VNDetectBarcodesRequest { (request,error) in
        guard error == nil else {
            completion(nil)
            return
        }
        guard let observations = request.results as? [VNDetectedObjectObservation] else {
            completion(nil)
            return
        }
        request.revision = VNDetectBarcodesRequestRevision2
        let result = (observations.first as? VNBarcodeObservation)?.payloadStringValue
        print("Observations", observations)
        if let result {
            completion(result)
            print()
            print(result)
        } else {
            print(error?.localizedDescription) //returns nil
            completion(nil)
            print()
            print(result)
            print()
        }
    }
    request.symbologies = [VNBarcodeSymbology.pdf417]
    try? imageRequestHandler.perform([request])
}
Thanks!
tensorflow-metal was working on my MacBook Pro (M3 Pro) until yesterday. Then my code started freezing. I ran the test script from https://developer.apple.com/metal/tensorflow-plugin/ and it now crashes. This used to work fine, but all of a sudden it does not. The results are shown below. Has anyone seen anything like this? Could this be a hardware problem?
MacBook-Pro-3: carl$ python mac_tensorflow_test.py
Epoch 1/5
1/782 [..............................] - ETA: 51:53 - loss: 6.0044 - accuracy: 0.0312Error: command buffer exited with error status.
The Metal Performance Shaders operations encoded on it may not have completed.
Error:
(null)
Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored)
<AGXG15XFamilyCommandBuffer: 0x1172515e0>
label = <none>
device = <AGXG15SDevice: 0x1588e6000>
name = Apple M3 Pro
commandQueue = <AGXG15XFamilyCommandQueue: 0x17427e400>
label = <none>
device = <AGXG15SDevice: 0x1588e6000>
name = Apple M3 Pro
retainedReferences = 1
Error: command buffer exited with error status.
The Metal Performance Shaders operations encoded on it may not have completed.
Error:
(null)
Ignored (for causing prior/excessive GPU errors) (00000004:kIOGPUCommandBufferCallbackErrorSubmissionsIgnored)
<AGXG15XFamilyCommandBuffer: 0x117257b40>
label = <none>
device = <AGXG15SDevice: 0x1588e6000>
name = Apple M3 Pro
commandQueue = <AGXG15XFamilyCommandQueue: 0x17427e400>
label = <none>
device = <AGXG15SDevice: 0x1588e6000>
name = Apple M3 Pro
retainedReferences = 1
Many more rows of similar printouts follow.
Hello,
I can see many apps that analyze sound from the microphone in real time. Is there another library like AudioKit, or are all of them made with AudioKit?
Thanks
Hi, I am trying to set up tensorflow-metal as instructed by https://developer.apple.com/metal/tensorflow-plugin/
When running the line (python -m pip install tensorflow-metal), I get the following error:
ERROR: Could not find a version that satisfies the requirement tensorflow-metal (from versions: none)
ERROR: No matching distribution found for tensorflow-metal
According to the troubleshooting section: "Check that the Python version used in the environment is supported (Python 3.8, Python 3.9, Python 3.10)." My current version is Python 3.9.12.
Any insight would be great!
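The message "from versions: none" means pip found no wheel whose tags match the running interpreter, so the Python version alone does not settle it. A minimal sketch for checking what the interpreter reports is shown below; describe_interpreter is a hypothetical helper name. One possible cause, stated here as an assumption rather than a diagnosis, is a Python build that runs as x86_64 under Rosetta instead of natively as arm64.

```python
import platform
import sys


def describe_interpreter():
    """Collect some of the facts that determine which wheels pip will accept."""
    return {
        "python": "{}.{}.{}".format(*sys.version_info[:3]),
        "machine": platform.machine(),  # e.g. 'arm64' or 'x86_64'
        "system": platform.system(),    # e.g. 'Darwin' on macOS
        "executable": sys.executable,   # which Python is actually running
    }


if __name__ == "__main__":
    info = describe_interpreter()
    for key, value in info.items():
        print(f"{key}: {value}")
    if info["system"] == "Darwin" and info["machine"] != "arm64":
        print("Note: this interpreter is not running natively as arm64.")
```

Running it with the same python that is used for the pip command shows whether the environment is the one you expect.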
Hi,
I am looking for a routine to perform complex-valued linear algebra on the GPU in python for scientific programming, in particular quantum physics simulations.
At the moment I am looking for a routine for complex-valued matrix multiplication. I found that MLX has a routine for float matrix multiplication, but it does not work directly for complex-valued matrices. I figured out a workaround by splitting the complex-valued matrix into its real and imaginary parts and working with the pair, but that makes it cumbersome to integrate with the rest of the code. I was hoping for a library-based implementation similar to CuPy.
I also tried the TensorFlow linear algebra routines, but so far I couldn't get them to run on the GPU. Specifically, a test file with a tensorflow.keras.applications.ResNet50 routine runs on the GPU, but the routines from tensorflow.linalg and tensorflow.math that I tested (matmul, expm, eigh) did not run on the GPU.
Any advice on how to make linear algebra calculations on mac GPUs work is highly appreciated! For my application the unified memory might be especially beneficial.
Thank you!
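For reference, the real/imaginary split described above follows from (A + iB)(C + iD) = (AC - BD) + i(AD + BC), so one complex product costs four real ones. The sketch below illustrates this in pure Python; matmul here is a nested-list stand-in for whatever real-valued GPU routine is available, and all function names are hypothetical.

```python
def matmul(a, b):
    """Real-valued matrix product of nested lists (stand-in for a GPU matmul)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b, sign=1.0):
    """Element-wise a + sign * b for nested lists."""
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m, part):
    """Extract the 'real' or 'imag' part of a nested list of complex numbers."""
    return [[getattr(z, part) for z in row] for row in m]


def complex_matmul(a_re, a_im, b_re, b_im):
    """Complex product from four real products; returns (real part, imaginary part)."""
    c_re = add(matmul(a_re, b_re), matmul(a_im, b_im), sign=-1.0)
    c_im = add(matmul(a_re, b_im), matmul(a_im, b_re))
    return c_re, c_im


if __name__ == "__main__":
    a = [[1 + 2j, 3 - 1j], [0 + 1j, 2 + 0j]]
    b = [[2 - 1j, 0 + 0j], [1 + 1j, -1 + 3j]]
    c_re, c_im = complex_matmul(split(a, "real"), split(a, "imag"),
                                split(b, "real"), split(b, "imag"))
    print(c_re)
    print(c_im)
```

Wrapping the pair in a small class that implements the matrix-multiplication operator would keep the split out of the rest of the code, at the cost of maintaining that wrapper yourself.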
Can you drag a Transferable view from one WindowGroup and drop it into an ImmersiveSpace that contains a RealityView?
I can drag, but the drop event isn't captured when the target is a RealityView.
var body: some View {
    let droppable = Droppable( model: model )
    RealityView { content in
        // Add the initial RealityKit content
        content.add(floorEntity)
    }
    .onDrop( of: ...
    // or
    .dropDestination( for: ... {}
    // or
    .gesture( DragGesture()
        .targetedToAnyEntity()
        .onChanged({ value in
None of them triggers the drop.
I'm working with MLSoundClassifier to try to look for 2 different sounds in a live audio stream. I have been debating with the team whether it is better to train 2 separate models, one for each sound, or 1 model on both sounds. Has anyone had any experience with this? Some of us believe that we have received better results with the separate models and some with 1 single model trained on both sounds. Thank you!
I've only been using this late 2021 MBP 16 for nearly 2 years, and now the speaker is producing a crackling sound. Upon inquiring about repairs, customer service informed me that it would cost $728 to replace the speaker, which is a third of the price of the laptop itself. It's absolutely absurd that a $2200 laptop's speaker would fail within such a short period without any external damage. The repair cost being a third of the laptop's price is outrageous. I intend to initiate a petition in the US, hoping to connect with others experiencing the same problem. This is indicative of a subpar product, and customers shouldn't bear the burden of Apple's shortcomings. I plan to share my grievances on various social media platforms and if the issue persists, I will escalate it to the media for further exposure.
InvalidArgumentError: Cannot assign a device for operation don_nn/model_2/branch_hidden0/MatMul/ReadVariableOp: Could not satisfy explicit device specification '' because the node {{colocation_node don_nn/model_2/branch_hidden0/MatMul/ReadVariableOp}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0].
Problem
I am trying to use the jax.numpy.einsum function (https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.einsum.html). However, for some subscripts, this seems to fail.
Hardware
Apple M1 Max, 32GB RAM
Steps to Reproduce
follow installation steps from https://developer.apple.com/metal/jax/
conda create -n 'jax_metal_demo' python=3.11
conda activate jax_metal_demo
python -m pip install numpy wheel ml-dtypes==0.2.0
python -m pip install jax-metal
Save the following code in a file called minimal_example.py
import numpy as np
from jax import device_put
import jax.numpy as jnp
np.random.seed(0)
a = np.random.rand(11, 12, 13, 11, 12)
b = np.random.rand(11, 12, 13)
subscripts = 'ijklm,ijk->lmk'
# intended result
print(np.einsum(subscripts, a, b))
# will cause crash
a, b = device_put(a), device_put(b)
print(jnp.einsum(subscripts, a, b))
run the code
python minimal_example.py
Output
I was expecting the same result as np.einsum. Instead I get:
Platform 'METAL' is experimental and not all JAX functionality may be correctly supported!
2024-02-12 16:45:34.684973: W pjrt_plugin/src/mps_client.cc:563] WARNING: JAX Apple GPU support is experimental and not all JAX functionality is correctly supported!
Metal device set to: Apple M1 Max
systemMemory: 32.00 GB
maxCacheSize: 10.67 GB
Traceback (most recent call last):
File "/Users/linus/workspace/minimal_example.py", line 15, in <module>
print(jnp.einsum(subscripts, a, b))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/linus/miniforge3/envs/jax_metal_demo/lib/python3.11/site-packages/jax/_src/numpy/lax_numpy.py", line 3369, in einsum
return _einsum_computation(operands, contractions, precision, # type: ignore[operator]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/linus/miniforge3/envs/jax_metal_demo/lib/python3.11/contextlib.py", line 81, in inner
return func(*args, **kwds)
^^^^^^^^^^^^^^^^^^^
jaxlib.xla_extension.XlaRuntimeError: UNKNOWN: /Users/linus/workspace/minimal_example.py:15:6: error: failed to legalize operation 'mhlo.dot_general'
print(jnp.einsum(subscripts, a, b))
^
/Users/linus/workspace/minimal_example.py:15:6: note: see current operation: %0 = "mhlo.dot_general"(%arg1, %arg0) {dot_dimension_numbers = #mhlo.dot<lhs_batching_dimensions = [2], rhs_batching_dimensions = [2], lhs_contracting_dimensions = [0, 1], rhs_contracting_dimensions = [0, 1]>, precision_config = [#mhlo<precision DEFAULT>, #mhlo<precision DEFAULT>]} : (tensor<11x12x13xf32>, tensor<11x12x13x11x12xf32>) -> tensor<13x11x12xf32>
--------------------
For simplicity, JAX has removed its internal frames from the traceback of the following exception. Set JAX_TRACEBACK_FILTERING=off to include these.
Conclusion
I would greatly appreciate any ideas for workarounds.
MacBook Pro M2 Max / 64 GB / macOS 13.2.1 (22D68)
import tensorflow as tf

def runMnist(device = '/device:CPU:0'):
    with tf.device(device):
        #tf.config.set_default_device(device)
        mnist = tf.keras.datasets.mnist
        (x_train, y_train), (x_test, y_test) = mnist.load_data()
        x_train, x_test = x_train / 255.0, x_test / 255.0
        model = tf.keras.models.Sequential([
            tf.keras.layers.Flatten(input_shape=(28, 28)),
            tf.keras.layers.Dense(128, activation='relu'),
            tf.keras.layers.Dropout(0.2),
            tf.keras.layers.Dense(10)
        ])
        loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
        model.compile(optimizer='adam',
                      loss=loss_fn,
                      metrics=['accuracy'])
        model.fit(x_train, y_train, epochs=10)

runMnist(device = '/device:CPU:0')
runMnist(device = '/device:GPU:0')
I want to use Core ML to process video data. The ML model will take multiple frames as input. How should I get multiple frames on iOS and process them?
Thanks in advance for any suggestions.
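Since video frames usually arrive one at a time, one common pattern is to keep a sliding window of the most recent frames and hand the whole window to the model once it is full. The sketch below shows only that buffering logic in Python; FrameWindow, size and stride are hypothetical names, and how the window is packed into the model's input depends on the model's input shape, which the question does not state.

```python
from collections import deque


class FrameWindow:
    """Keep the most recent `size` frames and emit them as one multi-frame input."""

    def __init__(self, size, stride=1):
        self.size = size          # number of frames the model expects
        self.stride = stride      # how many new frames to wait between windows
        self._frames = deque(maxlen=size)
        self._since_last = 0

    def push(self, frame):
        """Add a frame; return a list of `size` frames when a window is due, else None."""
        self._frames.append(frame)
        self._since_last += 1
        if len(self._frames) == self.size and self._since_last >= self.stride:
            self._since_last = 0
            return list(self._frames)
        return None


if __name__ == "__main__":
    window = FrameWindow(size=3, stride=2)
    for index in range(6):
        ready = window.push(index)  # integers stand in for real frames
        if ready is not None:
            print("run model on frames", ready)
```

A larger stride reduces how often the model runs, which matters if inference is slower than the frame rate.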
I'm exploring my Vision Pro and finding it unclear whether I can even achieve things like body pose detection etc.
https://developer.apple.com/videos/play/wwdc2023/111241/
It's clear that I can apply it to self-provided images, but what about the data coming from the visionOS SDKs?
All I can find is this mesh data from ARKit, https://developer.apple.com/documentation/arkit/arkit_in_visionos - am I missing something or do we not yet have good APIs for this?
Appreciate any guidance! Thanks.
After migrating my Ionic Cordova app to Ionic Capacitor, I am encountering a persistent white screen on a particular page. Along with this, I have observed the following error messages in the console:
Error Message: [com.apple.VisionKit.RemoveBackground] Request to remove background on an unsupported device. Error Domain=com.apple.VisionKit.RemoveBackground Code=-8 "(null)"
Error Message: [UILog] Called -[UIContextMenuInteraction updateVisibleMenuWithBlock:] while no context menu is visible. This won't do anything.
The actual page becomes visible after clicking on that white screen.
The same code works fine in the Android build; the issue occurs only on iOS.
I'm using DataScannerViewController with SwiftUI to scan text and barcodes from a card. I would like the user to be able to hold the card in front of the device, but I am not finding a way to select the front camera with DataScannerViewController.
Does anyone know of a way to select the front camera?
I haven't used the GPU implementation for over a year now due to constant issues (I use tf.config.set_visible_devices([], 'GPU') to use the CPU only).
I have also had a couple of issues with model convergence using the GPU; however, this issue seems more prominent, and possibly unrelated.
Here is an example of code that causes a memory leak using the GPU (I cannot link the dataset, but it is called Text classification documentation, by TANISHQ DUBLISH on Kaggle):
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
df = pd.read_csv('df_file.csv')
df.head()
train_df = df.sample(frac=0.7, random_state=42)
val_df = df.drop(train_df.index).sample(frac=0.5, random_state=42)
test_df = df.drop(train_df.index).drop(val_df.index)
train_dataset = tf.data.Dataset.from_tensor_slices((train_df['Text'].values, train_df['Label'].values)).batch(32).prefetch(tf.data.AUTOTUNE)
val_dataset = tf.data.Dataset.from_tensor_slices((val_df['Text'].values, val_df['Label'].values)).batch(32).prefetch(tf.data.AUTOTUNE)
test_dataset = tf.data.Dataset.from_tensor_slices((test_df['Text'].values, test_df['Label'].values)).batch(32).prefetch(tf.data.AUTOTUNE)
text_vectorizer = tf.keras.layers.TextVectorization(max_tokens=100_000, output_mode='int', output_sequence_length=1000, pad_to_max_tokens=True)
text_vectorizer.adapt(train_df['Text'].values)
embedding = tf.keras.layers.Embedding(input_dim=len(text_vectorizer.get_vocabulary()), output_dim=128, input_length=1000)
inputs = tf.keras.layers.Input(shape=[], dtype=tf.string)
x = text_vectorizer(inputs)
x = embedding(x)
x = tf.keras.layers.LSTM(64)(x)
outputs = tf.keras.layers.Dense(5, activation='softmax')(x)
model_2 = tf.keras.Model(inputs, outputs, name='model_2_lstm')
model_2.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(), optimizer=tf.keras.optimizers.legacy.Adam(), metrics=['accuracy'])
model_2_history = model_2.fit(train_dataset, epochs=50, validation_data=val_dataset, callbacks=[
    tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True),
    tf.keras.callbacks.ModelCheckpoint(model_2.name, save_best_only=True),
    tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', patience=5, verbose=1)
])
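To put numbers on the leak, it can help to log the process's peak memory after every epoch. The following is a minimal sketch using only the standard library; peak_rss_mb and MemoryLog are hypothetical names. Note the assumption that resident memory is a useful proxy here: peak RSS never decreases and may not reflect every GPU allocation, so treat it as a rough indicator only.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process in MiB.

    ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


class MemoryLog:
    """Record peak memory after each epoch so growth is easy to see."""

    def __init__(self):
        self.samples = []

    def record(self, epoch):
        """Store and return (epoch, peak memory in MiB)."""
        self.samples.append((epoch, peak_rss_mb()))
        return self.samples[-1]


if __name__ == "__main__":
    log = MemoryLog()
    print(log.record(0))
```

With Keras, record could be hooked into training through tf.keras.callbacks.LambdaCallback(on_epoch_end=lambda epoch, logs: log.record(epoch)), and the same run repeated with the GPU hidden for comparison.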
I am using SFSpeechRecognizer to perform speech recognition, but I am getting the following error.
[SpeechFramework] -[SFSpeechRecognitionTask localSpeechRecognitionClient:speechRecordingDidFail:]_block_invoke Ignoring subsequent local speech recording error: Error Domain=kAFAssistantErrorDomain Code=1101 "(null)"
Setting requiresOnDeviceRecognition to False works correctly, but previously it worked with True with no error.
The value of supportsOnDeviceRecognition was True, so the device reports that it supports on-device speech recognition.
Device: iPad Pro 11-inch, iOS 16.5.
Is this expected behavior?
Hello,
My understanding of the paper below is that iOS ships with a MobileNetV3-based ML model backbone, which then uses different heads for specific tasks in iOS.
I understand that this backbone is accessible for various uses through the Vision framework, but I was wondering if it is also accessible for on-device fine-tuning for other purposes. Just as an example, if I want to have a model to detect some unique object in a photo, can I use the built-in backbone, or do I have to include my own in the app?
Thanks very much for any advice and apologies if I didn't understand something correctly.
Source: https://machinelearning.apple.com/research/on-device-scene-analysis