I am trying to use the speech synthesizer to speak the pronunciation of a word in British English, rather than playing a local audio file as I did before. However, I keep getting this in the debugger:
#FactoryInstall Unable to query results, error: 5 Unable to list voice folder Unable to list voice folder Unable to list voice folder IPCAUClient.cpp:129 IPCAUClient: bundle display name is nil Unable to list voice folder
Here is my code. Any suggestions?
func playSampleAudio() {
    let speechSynthesizer = AVSpeechSynthesizer()
    let speechUtterance = AVSpeechUtterance(string: currentWord)

    // Search for a voice with a British English accent.
    let voices = AVSpeechSynthesisVoice.speechVoices()
    var foundBritishVoice = false
    for voice in voices {
        if voice.language == "en-GB" {
            speechUtterance.voice = voice
            foundBritishVoice = true
            break
        }
    }
    if !foundBritishVoice {
        print("British English voice not found. Using default voice.")
    }

    // Configure the utterance's properties as needed.
    speechUtterance.rate = AVSpeechUtteranceDefaultSpeechRate
    speechUtterance.pitchMultiplier = 1.0
    speechUtterance.volume = 1.0

    // Speak the word.
    speechSynthesizer.speak(speechUtterance)
}
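For comparison, here is a minimal alternative sketch, not the app's actual code: it assumes the synthesizer is kept as a stored property (a synthesizer that only exists as a local variable can be deallocated before it finishes speaking) and passes the locale to AVSpeechSynthesisVoice(language:) directly. The WordSpeaker type name is hypothetical.

import AVFoundation

final class WordSpeaker {
    // Keep a strong reference; a synthesizer that only exists as a local variable
    // can be deallocated before the utterance finishes speaking.
    private let synthesizer = AVSpeechSynthesizer()

    func speak(_ word: String) {
        let utterance = AVSpeechUtterance(string: word)
        // Returns an en-GB voice if one is installed; if nil, the system default voice is used.
        utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
        utterance.rate = AVSpeechUtteranceDefaultSpeechRate
        synthesizer.speak(utterance)
    }
}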
I have an app that allows users to send messages to each other via a link. I'm receiving many requests from my users to add songs. Can I use the Apple Music API to add 30-second song previews along with the user's message? If users want to listen to the full song, there will be a link to the full song on Apple Music. My app contains ads and subscriptions.
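For what it's worth, a minimal sketch of fetching a catalog song's short preview URL with MusicKit, assuming MusicKit authorization and a known catalog song ID; the helper function name is hypothetical, and this says nothing about the licensing side of the question.

import MusicKit

// Hypothetical helper: returns the URL of the short preview clip for a catalog song, if any.
func previewURL(forSongID id: MusicItemID) async throws -> URL? {
    let request = MusicCatalogResourceRequest<Song>(matching: \.id, equalTo: id)
    let response = try await request.response()
    return response.items.first?.previewAssets?.first?.url
}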
Is there a way to retrieve a list of all the favorited artists for a given user via the Apple Music API? I see endpoints for playback history and resources added to the library. But I want to retrieve the full list of favorited artists for the given user and I don't see any obvious choices in the documentation.
Thanks in advance!
My project uses an AVAudioEngine with a very simple setup: a speech recognizer running on a tap on the engine's input, with separate AVAudioPlayerNodes handling playback.
try session.setCategory(.playAndRecord, mode: .default, options: [])
try session.setActive(true, options: .notifyOthersOnDeactivation)
try session.setAllowHapticsAndSystemSoundsDuringRecording(true)
filePlayerNode ---> engine.mainMixerNode
bufferPlayerNode --> engine.mainMixerNode
engine.mainMixerNode --> engine.outputNode
//bufferPlayer.scheduleBuffer() is called on its own queue
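For context, a minimal sketch of how that graph might be wired in code; the node names and formats are assumptions based on the description above, not the project's actual code.

import AVFoundation

func makeEngine() throws -> AVAudioEngine {
    let engine = AVAudioEngine()
    let filePlayerNode = AVAudioPlayerNode()
    let bufferPlayerNode = AVAudioPlayerNode()

    engine.attach(filePlayerNode)
    engine.attach(bufferPlayerNode)

    // Both players feed the main mixer, which feeds the output.
    engine.connect(filePlayerNode, to: engine.mainMixerNode, format: nil)
    engine.connect(bufferPlayerNode, to: engine.mainMixerNode, format: nil)
    engine.connect(engine.mainMixerNode, to: engine.outputNode, format: nil)

    // Tap on the input; buffers go to the speech recognizer and/or are scheduled
    // on bufferPlayerNode from a separate queue.
    let inputFormat = engine.inputNode.outputFormat(forBus: 0)
    engine.inputNode.installTap(onBus: 0, bufferSize: 1024, format: inputFormat) { buffer, _ in
        // hand `buffer` to the recognizer / schedule it on bufferPlayerNode
    }

    try engine.start()
    return engine
}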
The input works fine, since the buffers can be collected into a file and played back correctly, and also because the recognizer works fine; but when I try to play the live audio by sending the buffers to the bufferPlayer on this or another device, the buffer audio plays at a very low volume, sometimes with severe distortions. If I lower the sample rate via AVAudioConverter, the distortions get worse.
I've tried experimenting with the AVAudioSession category options, having separate AVAudioEngines, and much, much more, yet I still haven't figured this out. It's gotten to the point where I've fixed almost all the arcane and minor issues in my audio system, yet I still can't play back my voice properly.
The ability to both play and record simultaneously is a basic feature of phones; when on speaker mode, a phone doesn't need to behave like a walkie-talkie. In my mind, it's inconceivable that the relatively new AVAudioEngine doesn't have an implementation for this, since the main issue (feedback loops) can be dealt with via a simple primitive circuit. Live video chat apps like FaceTime wouldn't be possible without this, yet to my surprise I found no answers online (what I did find were articles explaining how to write a file while playback is occurring).
Is there truly no way to do this on AVAudioEngine? Am I missing something fundamental? Any pointers would be greatly appreciated
My app uses AVFoundation to pronounce some words. Running the app from Xcode, either in a simulator or on a device, I frequently get this crash at start-up:
AXSpeech (13): EXC_BAD_ACCESS (code=EXC_I386_GPFLT).
It seems to occur randomly, maybe 20%-30% of the time I launch the app. When it does not crash, audio works as expected. When launched from a device, it never crashes (at least, so far).
Here's the code that outputs speech:
Declared at the top level of the View struct
@State var synth = AVSpeechSynthesizer()
In the View, as part of a Button's closure:
let utterance = AVSpeechUtterance(string: answer)
utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
synth.speak(utterance)
Any idea on how to stop this? It doesn't stop development, but it sure slows it down, often requiring multiple app launches.
We are aiming to apply key rotation to the FairPlay Streaming (FPS) service for our content streaming service.
It is not clear from the specification ("FairPlay Streaming Programming Guide.pdf") how to implement this feature and have the FPS DRM key server (KSM) send multiple keys at a time in one license (CKC) at the key rotation moment, without using the rental and lease key features.
Note that we don't use offline playback capabilities for our streams.
Does anyone have any knowledge or experience with Apple's fisheye projection type? I'm guessing that it's as the name implies: a circular capture in a square frame (encoded in MV-HEVC) that is de-warped during playback.
It'd be nice to be able to experiment with this format without guessing/speculating on what to produce.
Running in a Mac (Catalyst) target or Apple Silicon (designed for iPad).
Just accessing the playbackStoreID from the MPMediaItem shows this error in the console:
-[ITMediaItem valueForMPMediaEntityProperty:]: Unhandled MPMediaEntityProperty subscriptionStoreItemAdamID.
The value returned is always “”.
This works as expected on iOS and iPadOS, returning a valid playbackStoreID.
import SwiftUI
import MediaPlayer

@main
struct PSIDDemoApp: App {
    var body: some Scene {
        WindowGroup {
            Text("playbackStoreID demo")
                .task {
                    let authResult = await MPMediaLibrary.requestAuthorization()
                    if authResult == .authorized {
                        if let item = MPMediaQuery.songs().items?.first {
                            let persistentID = item.persistentID
                            let playbackStoreID = item.playbackStoreID // <--- Here
                            print("Item \(persistentID), \(playbackStoreID)")
                        }
                    }
                }
        }
    }
}
Xcode 15.1, also tested with Xcode 15.3 beta 2.
macOS Sonoma 14.3.1
FB13607631
HELP! How can I play a spatial video in my own Apple Vision Pro app the way the official Photos app does? Following the official developer documentation, I've used the AVKit API to play a spatial video in the Xcode visionOS simulator. The video plays, but it looks different from how it looks when played through Photos: in Photos the edges of the video appear soft and fuzzy, while in my own app the edges are sharp.
How can I play the spatial video in my own app with the same effect as in Photos?
Hello,
I'm playing with ScreenCaptureKit. My scenario is very basic: I capture frames and assign them to a CALayer to display in a view (for previewing), basically what Apple's sample app does. I have two screens, and I capture only one of them, which has no app windows on it. My app also excludes itself from capture.

When I place my app on the screen that is not being captured, I observe that most didOutputSampleBuffer calls receive frames with Idle status (which is expected for a screen that is not updating). However, if I bring my capturing app (which, to remind, is excluded from capture) to the captured screen, the frames start coming with Complete status (i.e. holding pixel buffers). This is not what I expect: from the capture perspective the screen is still not updating, since the app is excluded. The preview the app displays proves that, showing an empty desktop.

So it seems that updating a CALayer triggers a screen-capture frame regardless of whether the app is included or excluded. This looks like a bug to me, and it can lead to a noticeable waste of resources and battery when frames are not just displayed on screen but also processed somehow and/or sent over a network. I'm also observing another issue due to this behavior, where capture hangs when queueDepth is set to 3 in this same scenario (but I'll describe that separately).
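For reference, a minimal sketch of checking the frame status attachment in the SCStreamOutput callback; the class name is an assumption for illustration, not the project's actual code.

import ScreenCaptureKit
import CoreMedia

final class CaptureOutput: NSObject, SCStreamOutput {
    func stream(_ stream: SCStream,
                didOutputSampleBuffer sampleBuffer: CMSampleBuffer,
                of type: SCStreamOutputType) {
        guard type == .screen,
              let attachments = CMSampleBufferGetSampleAttachmentsArray(sampleBuffer,
                                                                        createIfNecessary: false)
                as? [[SCStreamFrameInfo: Any]],
              let statusRawValue = attachments.first?[.status] as? Int,
              let status = SCFrameStatus(rawValue: statusRawValue) else { return }

        switch status {
        case .complete:
            // New pixel data arrived: update the preview layer / process the frame.
            break
        case .idle:
            // No screen change since the last frame; nothing to process.
            break
        default:
            break
        }
    }
}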
Please, advise if I should file a bug somewhere, or maybe there is a rational explanation of this behavior.
Thank you
How do I add AudioToolbox in Xcode 15.2?
I don't know when these were posted, but I noticed them in the AVFoundation documentation last night. There have been a lot of questions about working with this format, and these are useful. They also include code samples.
Reading multiview 3D video files: https://developer.apple.com/documentation/avfoundation/media_reading_and_writing/reading_multiview_3d_video_files
Converting side-by-side 3D video to multiview HEVC: https://developer.apple.com/documentation/avfoundation/media_reading_and_writing/converting_side-by-side_3d_video_to_multiview_hevc
I haven't found any really thorough documentation or guidance on the use of CIRAWFilter.linearSpaceFilter. The API documentation calls it
An optional filter you can apply to the RAW image while it’s in linear space.
Can someone provide insight into what this means and what the linear space filter is useful for? When would we use this linear space filter instead of a filter on the output of CIRAWFilter?
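For concreteness, a minimal sketch showing where linearSpaceFilter sits relative to a filter applied to the developed output; the exposure adjustment here is just an arbitrary example, not a recommendation.

import CoreImage
import CoreImage.CIFilterBuiltins

func processRAW(at url: URL) -> CIImage? {
    guard let rawFilter = CIRAWFilter(imageURL: url) else { return nil }

    // Applied while the image is still in scene-referred linear space,
    // before the RAW pipeline's tone mapping and output color conversion.
    let linearAdjust = CIFilter.exposureAdjust()
    linearAdjust.ev = 0.5
    rawFilter.linearSpaceFilter = linearAdjust

    guard let developed = rawFilter.outputImage else { return nil }

    // By contrast, a filter applied here operates on the already-developed,
    // display-referred output image.
    let postAdjust = CIFilter.exposureAdjust()
    postAdjust.ev = 0.5
    postAdjust.inputImage = developed
    return postAdjust.outputImage
}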
Thank you.
Does Video Toolbox's compression session yield data I can decompress on a different device that doesn't have Apple's decompression stack, i.e. so I can send the data over the network to devices that aren't necessarily Apple's?
Or is the format proprietary, rather than just regular H.264 (for example)?
If it can be decompressed without Video Toolbox, could you point me to some examples of how to do this using cross-platform APIs? Maybe FFmpeg has something?
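For context, a minimal sketch of creating a compression session that targets plain H.264; the dimensions are placeholders. The resulting CMSampleBuffers carry standard H.264 data in AVCC (length-prefixed) form; converting the NAL units to Annex B and prepending the parameter sets from the format description yields a stream that non-Apple decoders such as FFmpeg can handle.

import VideoToolbox

// Sketch: create a session that encodes to standard H.264 (not a proprietary format).
func makeH264Session(width: Int32, height: Int32) -> VTCompressionSession? {
    var session: VTCompressionSession?
    let status = VTCompressionSessionCreate(
        allocator: kCFAllocatorDefault,
        width: width,
        height: height,
        codecType: kCMVideoCodecType_H264,
        encoderSpecification: nil,
        imageBufferAttributes: nil,
        compressedDataAllocator: nil,
        outputCallback: nil, // frames can be encoded with the output-handler variant of EncodeFrame
        refcon: nil,
        compressionSessionOut: &session
    )
    return status == noErr ? session : nil
}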
Is it possible to get the camera intrinsic matrix for a captured single photo on iOS?
I know that one can get the cameraCalibrationData from an AVCapturePhoto, which also contains the intrinsicMatrix. However, this is only provided when using a constituent (i.e. multi-camera) capture device and setting virtualDeviceConstituentPhotoDeliveryEnabledDevices to multiple devices (or enabling isDualCameraDualPhotoDeliveryEnabled on older iOS versions). Then photoOutput(_:didFinishProcessingPhoto:) is called multiple times, delivering one photo for each camera specified. Those then contain the calibration data.
As far as I know, there is no way to get the calibration data for a normal, single-camera photo capture.
I also found that one can set isCameraIntrinsicMatrixDeliveryEnabled on a capture connection that leads to an AVCaptureVideoDataOutput. The buffers that arrive at the delegate of that output then contain the intrinsic matrix via the kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix metadata. However, this requires adding another output to the capture session, which feels quite wasteful just for getting this piece of metadata. Also, I would somehow need to figure out which buffer was temporally closest to when the actual photo was taken.
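For reference, a minimal sketch of that video-data-output route; the function names are hypothetical, and `session` is assumed to be an already-configured AVCaptureSession with a camera input.

import AVFoundation
import simd

// Assumes `session` is an already-configured AVCaptureSession with a camera input added.
func addIntrinsicDeliveryOutput(to session: AVCaptureSession,
                                delegate: AVCaptureVideoDataOutputSampleBufferDelegate) {
    let videoOutput = AVCaptureVideoDataOutput()
    videoOutput.setSampleBufferDelegate(delegate, queue: DispatchQueue(label: "intrinsics.queue"))
    guard session.canAddOutput(videoOutput) else { return }
    session.addOutput(videoOutput)

    if let connection = videoOutput.connection(with: .video),
       connection.isCameraIntrinsicMatrixDeliverySupported {
        connection.isCameraIntrinsicMatrixDeliveryEnabled = true
    }
}

// In the delegate callback, the matrix arrives as a CFData attachment on each sample buffer.
func intrinsicMatrix(from sampleBuffer: CMSampleBuffer) -> matrix_float3x3? {
    guard let data = CMGetAttachment(sampleBuffer,
                                     key: kCMSampleBufferAttachmentKey_CameraIntrinsicMatrix,
                                     attachmentModeOut: nil) as? Data else { return nil }
    var matrix = matrix_float3x3()
    _ = withUnsafeMutableBytes(of: &matrix) { data.copyBytes(to: $0) }
    return matrix
}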
Is there a better, simpler way for getting the camera intrinsic matrix for a single photo capture?
If not, is there a way to calculate the matrix based on the image's metadata?
There is a CustomPlayer class that uses an MTAudioProcessingTap internally to modify the audio buffer.
Say there are two instances of the CustomPlayer class, A and B, running at the same time.
On iOS 17.1, when A finishes its operation and its instance is terminated, B's MTAudioProcessingTap process is stopped and its finalize callback is invoked, even though B still has parts left to process.
The same code in the same project does not behave this way on iOS 17.0 or lower: there, when A is terminated, B completes its task without any impact.
What change in iOS 17.1 is producing these results? I'd appreciate an answer on how to avoid this issue.
let audioMix = AVMutableAudioMix()
var audioMixParameters: [AVMutableAudioMixInputParameters] = []

try composition.tracks(withMediaType: .audio).forEach { track in
    let inputParameter = AVMutableAudioMixInputParameters(track: track)
    inputParameter.trackID = track.trackID

    var callbacks = MTAudioProcessingTapCallbacks(
        version: kMTAudioProcessingTapCallbacksVersion_0,
        clientInfo: UnsafeMutableRawPointer(
            Unmanaged.passRetained(clientInfo).toOpaque()
        ),
        init: { tap, clientInfo, tapStorageOut in
            tapStorageOut.pointee = clientInfo
        },
        finalize: { tap in
            Unmanaged<ClientInfo>.fromOpaque(MTAudioProcessingTapGetStorage(tap)).release()
        },
        prepare: nil,
        unprepare: nil,
        process: { tap, numberFrames, flags, bufferListInOut, numberFramesOut, flagsOut in
            var timeRange = CMTimeRange.zero
            let status = MTAudioProcessingTapGetSourceAudio(tap,
                                                            numberFrames,
                                                            bufferListInOut,
                                                            flagsOut,
                                                            &timeRange,
                                                            numberFramesOut)
            if noErr == status {
                ....
            }
        })

    var tap: Unmanaged<MTAudioProcessingTap>?
    let status = MTAudioProcessingTapCreate(kCFAllocatorDefault,
                                            &callbacks,
                                            kMTAudioProcessingTapCreationFlag_PostEffects,
                                            &tap)
    guard noErr == status else {
        return
    }

    inputParameter.audioTapProcessor = tap?.takeUnretainedValue()
    audioMixParameters.append(inputParameter)
    tap?.release()
}

audioMix.inputParameters = audioMixParameters
return audioMix
I have a 3D image, but when I insert it into my project it comes in without colors. Does anyone know why?
Hi,
I have an idea for an audio application. It makes use of HRTFs in a different way, so I would like to get the HRTF that was created for the user and use it in the application.
Is that possible?
Hi
I was wondering if there are any D Function implementations in Python to derive dASK?
Thank you