I am developing a visionOS app that captions speech in real environments. Currently, I am using Apple's built-in speech recognizer. However, when I tested the app on a Vision Pro, the device seemed to pick up only the wearer's voice. For example, when the speech recognition task is running and another person in front of me is talking, the system does not pick up their speech well.
I tried to set the AVAudioSession to be equally sensitive to all directions:
private func configureAudioSession() {
    do {
        try audioSession.setCategory(.record, mode: .measurement)
        try audioSession.setActive(true)
        if #available(visionOS 1.0, *) {
            let availableDataSources = audioSession.availableInputs?.first?.dataSources
            if let omniDirectionalSource = availableDataSources?.first(where: { $0.preferredPolarPattern == .omnidirectional }) {
                try audioSession.setInputDataSource(omniDirectionalSource)
            }
        }
    } catch {
        print("Failed to set up audio session: \(error)")
    }
}
And here is how I set up the speech recognition and configure the microphone inputs:
private func startSpeechRecognition(completion: @escaping (String) -> Void) {
    do {
        // Cancel the previous task if it's running.
        if let recognitionTask = recognitionTask {
            recognitionTask.cancel()
            self.recognitionTask = nil
        }
        // The AudioSession is already active, creating input node.
        let inputNode = audioEngine.inputNode
        try inputNode.setVoiceProcessingEnabled(false)
        // Create and configure the speech recognition request
        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
        guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create a recognition request") }
        recognitionRequest.shouldReportPartialResults = true
        // Keep speech recognition data on device
        if #available(iOS 13, *) {
            recognitionRequest.requiresOnDeviceRecognition = true
        }
        // Create a recognition task for speech recognition session.
        // Keep a reference to the task so that it can be canceled.
        recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest) { result, error in
            // var isFinal = false
            if let result = result {
                // Update the recognizedText
                completion(result.bestTranscription.formattedString)
            } else if let error = error {
                completion("Recognition error: \(error.localizedDescription)")
            }
            if error != nil || result?.isFinal == true {
                // Stop recognizing speech if there is a problem
                self.audioEngine.stop()
                inputNode.removeTap(onBus: 0)
                self.recognitionRequest = nil
                self.recognitionTask = nil
            }
        }
        // Configure the microphone input
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
            self.recognitionRequest?.append(buffer)
        }
        audioEngine.prepare()
        try audioEngine.start()
    } catch {
        completion("Audio engine could not start: \(error.localizedDescription)")
    }
}
Audio
Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.
Description:
I am developing a recording-only application that supports background recording using AVAudioEngine. The app segments the recording into 60-second files for further processing. For example, a 10-minute recording results in ten 60-second files.
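The segmentation arithmetic described above can be sketched as follows. This is only an illustration, not the poster's code: the function names, the `.caf` file naming scheme, and the choice to give a trailing partial segment its own file are all assumptions.

```swift
import Foundation

// From the post: recordings are split into 60-second files.
let segmentLength: TimeInterval = 60

/// Number of files produced by a recording lasting `duration` seconds.
/// Assumption: a trailing partial segment still gets its own file.
func segmentCount(forDuration duration: TimeInterval) -> Int {
    guard duration > 0 else { return 0 }
    return Int((duration / segmentLength).rounded(.up))
}

/// Hypothetical naming scheme for the file that contains the sample
/// recorded `elapsed` seconds after the start.
func segmentFileName(atElapsed elapsed: TimeInterval) -> String {
    let index = Int(elapsed / segmentLength)
    return "segment-\(index).caf"
}

print(segmentCount(forDuration: 600))   // 10, matching the 10-minute example
print(segmentFileName(atElapsed: 125))  // segment-2.caf
```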
Problem:
The application functions as expected in the background. However, after the app receives an interruption (such as a phone call) and the interruption ends, I can successfully restart the recording. The problem arises when the app then transitions to the background; it fails to restart the recording. Specifically, after ending the call and transitioning the app to the background, the app encounters an error and is unable to restart AVAudioSession and AVAudioEngine. The only resolution is to close and restart the app, which is not ideal for user experience.
Steps to Reproduce:
1. Start recording using AVAudioEngine.
2. The app records and saves 60-second segments.
3. Receive an interruption (e.g., an incoming phone call).
4. End the call.
5. Transition the app to the background.
6. Transition the app to the foreground and the session will be activated again.
7. Attempt to restart the recording.
Expected Behavior:
The app should resume recording seamlessly after the interruption and background transition.
Actual Behavior:
The app fails to restart AVAudioSession and AVAudioEngine, resulting in a continuous error. The recording cannot be resumed without closing and reopening the app.
How I’m Starting the Recording:
Configuration:
internal func setAudioSessionCategory() {
    do {
        try audioSession.setCategory(
            .playAndRecord,
            mode: .default,
            options: [.defaultToSpeaker, .mixWithOthers, .allowBluetooth]
        )
    } catch {
        debugPrint(error)
    }
}
internal func setAudioSessionActivation() {
    if UIApplication.shared.applicationState == .active {
        do {
            try audioSession.setPrefersNoInterruptionsFromSystemAlerts(true)
            try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
            if audioSession.isInputGainSettable {
                try audioSession.setInputGain(1.0)
            }
            try audioSession.setPreferredIOBufferDuration(0.01)
            try setBuiltInPreferredInput()
        } catch {
            debugPrint(error)
        }
    }
}
Starting AVAudioEngine:
internal func setupEngine() {
    if callObserver.onCall() { return }
    inputNode = audioEngine.inputNode
    audioEngine.attach(audioMixer)
    audioEngine.connect(inputNode, to: audioMixer, format: AVAudioFormat.validInputAudioFormat(inputNode))
}
internal func beginRecordingEngine() {
    audioMixer.removeTap(onBus: 0)
    audioMixer.installTap(onBus: 0, bufferSize: 1024, format: AVAudioFormat.validInputAudioFormat(inputNode)) { [weak self] buffer, _ in
        guard let self = self, let file = self.audioFile else { return }
        write(file, buffer: buffer)
    }
    audioEngine.prepare()
    do {
        try audioEngine.start()
        recordingTimer = Timer.scheduledTimer(withTimeInterval: recordingInterval, repeats: true) { [weak self] _ in
            self?.handleRecordingInterval()
        }
    } catch {
        debugPrint(error)
    }
}
On the try audioEngine.start() call, I receive error code 561145187 in the catch block.
Logs/Error Messages:
• Error code: 561145187
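Many Core Audio and AVAudioSession status codes are four ASCII characters packed into a 32-bit integer, so decoding the number can make a log line like the one above easier to search for. The helper below is a minimal sketch; `fourCharCode(from:)` is a hypothetical name, not a system API. For 561145187 it yields `!rec`, which, as far as I know, is the value of `AVAudioSession.ErrorCode.cannotStartRecording`.

```swift
/// Interprets a 32-bit status code as four ASCII characters.
/// Returns nil if any of the four bytes is not printable ASCII,
/// in which case the value is probably a plain numeric code.
func fourCharCode(from status: Int32) -> String? {
    let value = UInt32(bitPattern: status)
    let shifts: [UInt32] = [24, 16, 8, 0]
    // Extract the bytes from most significant to least significant.
    let bytes = shifts.map { UInt8((value >> $0) & 0xFF) }
    guard bytes.allSatisfy({ (0x20...0x7E).contains($0) }) else { return nil }
    return String(decoding: bytes, as: UTF8.self)
}

print(fourCharCode(from: 561145187) ?? "not a four-character code")  // !rec
```

In an app, the number would typically come from `(error as NSError).code` inside the catch block.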
Request:
I would appreciate any guidance or solutions to ensure the app can resume recording after interruptions and background transitions without requiring a restart.
Thank you for your assistance.
When listening to music in CarPlay, the sound quality is very bad and I can't turn down the volume of the music. It behaves as if the voice assistant were active instead of music playback: turning the volume down and up adjusts the voice assistant's volume when it should adjust the music volume. This was previously reported in iOS 18 Beta 2.
I have a 2024 Honda Ridgeline Black Edition.
If a queue (ApplicationMusicPlayer.Queue) is set with both library and non-library (catalog) items, the queue will play only one kind of item (library or non-library) or will just stop playing when the next item is of a different kind.
Using both Xcode 16 beta 4 and Xcode 15.4.
The issue was present in iOS 17 and is not resolved as of iOS 18 beta 4.
FB14491999
I would really like to know whether it is possible to get the state of the ringer switch on iOS. In my application I want to replicate Instagram's audio rules for video playback.
The video plays with sound. When the ringer switches to mute status the video's audio should be muted too. When you press the volume up or down button the audio should be unmuted.
I found how to catch volume up/down buttons with AVAudioSession.sharedInstance().observe(\.outputVolume) but I couldn't find anything that could help me with the ringer state. AVAudioSession.Category can't achieve this effect.
It is also possible to check the ringer state with the Darwin notify library, like this:
var token = NOTIFY_TOKEN_INVALID
notify_register_dispatch(
    "com.apple.springboard.ringerstate",
    &token,
    .main
) { token in
    var state: UInt64 = 0
    notify_get_state(token, &state)
    print("Changed to", state == 1 ? "ON" : "OFF")
}
but I'm not sure whether this would get the application rejected. I don't know whether it counts as private API usage.
I would be grateful for any advice or suggestions. Thanks.
Hello!
I am working on an app connected to an external streamer.
I would like to display the currently playing song on the Lock Screen.
I tried to update the information in MPNowPlayingInfoCenter, but I need to play a sound on my iPhone for the controls to be displayed.
Is there a way to do it without playing a sound?
If not, is playing a silent sound the only solution, and is that approved by Apple? :-/
Thank you
Frederic
I'm building an app that will allow users to record voice notes. All of that functionality is working great; I'm now trying to implement changes to the AudioSession to manage possible audio streams from other apps. If audio is playing from a different app when the user opens my app, I want that audio to keep playing. When we start recording, any third-party app audio should stop, and it can then resume when we stop recording.
This is my main audio setup code:
private var audioEngine: AVAudioEngine!
private var inputNode: AVAudioInputNode!
func setupAudioEngine() {
    audioEngine = AVAudioEngine()
    inputNode = audioEngine.inputNode
    audioPlayerNode = AVAudioPlayerNode()
    audioEngine.attach(audioPlayerNode)
    let format = AVAudioFormat(standardFormatWithSampleRate: AUDIO_SESSION_SAMPLE_RATE, channels: 1)
    audioEngine.connect(audioPlayerNode, to: audioEngine.mainMixerNode, format: format)
}
private func setupAudioSession() {
    let audioSession = AVAudioSession.sharedInstance()
    do {
        try audioSession.setCategory(.playAndRecord, mode: .default, options: [.defaultToSpeaker, .allowBluetooth])
        try audioSession.setPreferredSampleRate(AUDIO_SESSION_SAMPLE_RATE)
        try audioSession.setPreferredIOBufferDuration(0.005) // 5ms buffer for lower latency
        try audioSession.setActive(true)
        // Add observers
        setupInterruptionObserver()
    } catch {
        audioErrorMessage = "Failed to set up audio session: \(error)"
    }
}
This is all called upon app startup so we're ready to record whenever the user presses the record button.
However, currently when this happens, any outside audio stops playing.
I isolated the issue to this line: inputNode = audioEngine.inputNode
When that's commented out, the audio will play -- but I obviously need this for recording functionality.
Is this a bug? Expected behavior?
I tried to play music on my iPhone and it keeps skipping over all of the songs and not playing any music.
I’m having issues with volume after installing iOS 18 beta 4. The volume control in Control Center is turned up to full volume and is disabled. It only gets enabled after playing music. As soon as I pause the music, it gets disabled and my iPhone goes mute. I am also having an issue when playing music on AirPods: as soon as I turn my screen on, the music pauses. It happens every time I do this.
Hi there community,
First and foremost, a big thank you to everyone who takes the time to read this.
TL;DR: How, if even possible, can I record multiple audio streams simultaneously on an iOS application (iPad/iPhone)?
I'm working on a recorder for the iPad to gather data for a machine learning project focused on speech recognition. Our goal is to capture extensive speech data, which requires recording from multiple microphones. Specifically, I need to record from all mics connected to our Scarlett 4i4 audio interface and, most importantly, also record from the built-in mic on the iPad or iPhone at the same time.
As a newcomer to Swift development, I initially explored AVAudioRecorder. However, I quickly realized that it only supports one active audio node at a time, making multi-channel recording impossible. (Perhaps you can prove me wrong; that would make my day.) Next, I transitioned to using AVAudioEngine, but encountered the same limitation: I couldn't manage to get input nodes for both the built-in mic and the Scarlett interface channels simultaneously. The application started behaving oddly, often resulting in identical audio data being recorded across all files.
Determined to find a solution, I delved deeper into the Core Audio framework, specifically using Audio Toolbox. My approach involved creating and configuring multiple Audio Units, each corresponding to a different audio input device. Here's a brief overview of my current implementation:
Listing Available Input Devices: I used AVAudioSession to enumerate all available input devices.
Creating Audio Units: For each device, I created an Audio Unit and attempted to configure it for recording.
Setting Up Callbacks: I set up input and output callbacks to handle the audio processing.
Despite my efforts over the last few days, I haven't had much success. The callbacks for the Audio Units don't seem to be invoked correctly, and I'm struggling to achieve simultaneous multi-channel recording. Below is a snippet of my latest attempt:
let audioUnitCallback: AURenderCallback = { (
    inRefCon: UnsafeMutableRawPointer,
    ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
    inTimeStamp: UnsafePointer<AudioTimeStamp>,
    inBusNumber: UInt32,
    inNumberFrames: UInt32,
    ioData: UnsafeMutablePointer<AudioBufferList>?
) -> OSStatus in
    guard let ioData = ioData else {
        return noErr
    }
    print("Input callback invoked")
    let audioUnit = inRefCon.assumingMemoryBound(to: AudioUnit.self).pointee
    var bufferList = AudioBufferList(
        mNumberBuffers: 1,
        mBuffers: AudioBuffer(
            mNumberChannels: 1,
            mDataByteSize: 0,
            mData: nil
        )
    )
    let status = AudioUnitRender(audioUnit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, &bufferList)
    if status != noErr {
        print("AudioUnitRender failed: \(status)")
        return status
    }
    // Copy rendered data to output buffer.
    // Write through the list pointer: a copied AudioBuffer held in a `let` cannot be mutated.
    let buffers = UnsafeMutableAudioBufferListPointer(ioData)
    buffers[0].mData?.copyMemory(from: bufferList.mBuffers.mData!, byteCount: Int(bufferList.mBuffers.mDataByteSize))
    buffers[0].mDataByteSize = bufferList.mBuffers.mDataByteSize
    print("Rendered audio data")
    return noErr
}
let outputCallback: AURenderCallback = { (
    inRefCon: UnsafeMutableRawPointer,
    ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
    inTimeStamp: UnsafePointer<AudioTimeStamp>,
    inBusNumber: UInt32,
    inNumberFrames: UInt32,
    ioData: UnsafeMutablePointer<AudioBufferList>?
) -> OSStatus in
    guard let ioData = ioData else {
        return noErr
    }
    print("Output callback invoked")
    // Process the output data if needed
    return noErr
}
In essence, I'm stuck and in need of guidance. Has anyone here successfully implemented multi-channel recording on iOS, especially involving both built-in microphones and external audio interfaces? Any shared experiences, insights, or suggestions on how to proceed would be immensely appreciated.
Thank you once again for your time and assistance!
Some iOS apps, depending on their signature or bundle ID, receive the AVAudioSessionInterruptionTypeBegan callback when headphones are disconnected but never receive the AVAudioSessionInterruptionTypeEnded callback. Not all bundle IDs trigger this issue.
Why do different bundle IDs behave differently, and which settings tied to a bundle ID affect AVAudioSessionInterruptionType notifications?
After integrating MusicKit, I have an issue with the watchdog. The crash log points to this stack trace:
ProcessState: Running
WatchdogEvent: scene-update
WatchdogVisibility: Background
WatchdogCPUStatistics: (
"Elapsed total CPU time (seconds): 72.560 (user 49.970, system 22.590), 39% CPU",
"Elapsed application CPU time (seconds): 11.270, 6% CPU"
) reportType:CrashLog maxTerminationResistance:Interactive>
Triggered by Thread: 0
Thread 0 Crashed:
0 libsystem_kernel.dylib 0x1dfa74808 mach_msg2_trap + 8
1 libsystem_kernel.dylib 0x1dfa78008 mach_msg2_internal + 80
2 libsystem_kernel.dylib 0x1dfa77f20 mach_msg_overwrite + 436
3 libsystem_kernel.dylib 0x1dfa77d60 mach_msg + 24
4 libdispatch.dylib 0x19e884b18 _dispatch_mach_send_and_wait_for_reply + 544
5 libdispatch.dylib 0x19e884eb8 dispatch_mach_send_with_result_and_wait_for_reply + 60
6 libxpc.dylib 0x1f386bac8 xpc_connection_send_message_with_reply_sync + 264
7 Foundation 0x195853998 __NSXPCCONNECTION_IS_WAITING_FOR_A_SYNCHRONOUS_REPLY__ + 16
8 Foundation 0x195850004 -[NSXPCConnection _sendInvocation:orArguments:count:methodSignature:selector:withProxy:] + 2160
9 Foundation 0x1958c820c -[NSXPCConnection _sendSelector:withProxy:arg1:] + 116
10 Foundation 0x1958c7e80 _NSXPCDistantObjectSimpleMessageSend1 + 60
11 MediaPlayer 0x1a8c0ff24 -[MPMusicPlayerController _validateServer] + 128
12 MediaPlayer 0x1a8c3f4f8 -[MPMusicPlayerApplicationController _establishConnectionIfNeeded] + 2144
13 MediaPlayer 0x1a8c0fbb8 -[MPMusicPlayerController onServer:] + 52
14 MediaPlayer 0x1a8c0ec94 -[MPMusicPlayerController _nowPlaying] + 372
15 MediaPlayer 0x1a8c161a4 -[MPMusicPlayerController nowPlayingItem] + 24
16 MusicKit 0x213253e78 -[MusicKit_SoftLinking_MPMusicPlayerController nowPlayingItem] + 24
17 MusicKit 0x2136ec1bc 0x2131b9000 + 5452220
18 MusicKit 0x2136ec70c 0x2131b9000 + 5453580
19 MusicKit 0x2136ed839 0x2131b9000 + 5457977
20 MusicKit 0x213221c65 0x2131b9000 + 429157
21 MusicKit 0x21354b741 0x2131b9000 + 3745601
22 libswift_Concurrency.dylib 0x1a1d0e775 completeTaskWithClosure(swift::AsyncContext*, swift::SwiftError*) + 1
According to the log, the app is in the background and the stack trace shows only MusicKit activity. How can we disable or avoid this activity to prevent the watchdog termination?
We develop a music app and have run into a scenario where we cannot resume music playback, so we would like to ask how this can be implemented.
For example, when the user plays a video in another app, we pause our music; when the video is closed, we should resume the music.
Our implementation listens for AVAudioSessionInterruptionNotification. When we receive the notification and see AVAudioSessionInterruptionOptionShouldResume, we try to play the music again, but error 560557684 (AVAudioSessionErrorCodeCannotInterruptOthers) is reported. We are very confused.
NSError *error = nil;
AVAudioSession *audioSession = [AVAudioSession sharedInstance];
[audioSession setCategory:AVAudioSessionCategoryPlayback withOptions:0 error:&error];
[audioSession setActive:YES error:&error];
We compared our app with the Apple Music app and found that Apple Music can resume playing.
Here is a video of our app's behavior:
https://drive.google.com/file/d/1J94S2kxkEpNvG536yzCnKmE7IN3cGzIJ/view?usp=sharing
Here is a video of Apple Music's behavior:
https://drive.google.com/file/d/1c1Kdgkn2nhy8SdDvRJAFF2sPvqJ8fL48/view?usp=sharing
We want to improve our user experience. How can we do that?
I am looking for a way to know how much of the text is remaining (i.e., to drive a progress bar) when synthesizer.speak is called. I looked at this but it does not seem to provide any progress. Is there any way to get the progress?
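One possible approach, offered as a sketch rather than a confirmed answer: AVSpeechSynthesizerDelegate declares speechSynthesizer(_:willSpeakRangeOfSpeechString:utterance:), which reports the character range about to be spoken. A progress fraction can be derived from that range. The helper name `spokenFraction` and the `progressBar` in the comment are hypothetical.

```swift
import Foundation

/// Fraction of the utterance reached once the synthesizer announces that it
/// is about to speak `range`. `totalLength` is the UTF-16 length of the
/// utterance text, which matches the units used by NSRange.
func spokenFraction(upTo range: NSRange, totalLength: Int) -> Double {
    guard totalLength > 0 else { return 0 }
    let reached = min(max(range.location + range.length, 0), totalLength)
    return Double(reached) / Double(totalLength)
}

// Hypothetical wiring inside an AVSpeechSynthesizerDelegate (not run here):
// func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer,
//                        willSpeakRangeOfSpeechString characterRange: NSRange,
//                        utterance: AVSpeechUtterance) {
//     let total = utterance.speechString.utf16.count
//     progressBar.progress = Float(spokenFraction(upTo: characterRange, totalLength: total))
// }

let text = "The quick brown fox"
let quick = NSRange(location: 4, length: 5)  // the word "quick"
print(spokenFraction(upTo: quick, totalLength: text.utf16.count))
```

The fraction measures characters, not time, so it only approximates how much speaking time remains.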
I have this code:
class SpeechSynthesizerDelegate: NSObject, AVSpeechSynthesizerDelegate {
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        print("Speech finished.")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didCancel utterance: AVSpeechUtterance) {
        print("Speech canceled.")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didStart utterance: AVSpeechUtterance) {
        print("Speech started.")
    }
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didPause utterance: AVSpeechUtterance) {
        print("Speech paused.")
...
that I try to use like this
let synthesizer = AVSpeechSynthesizer()
let delegate = SpeechSynthesizerDelegate()
synthesizer.delegate = delegate
but when I call
synthesizer.speak(utterance)
the delegate methods are not being called. I am running this on macOS Ventura. How can I fix this?
I am running the code sample here https://developer.apple.com/documentation/avfoundation/speech_synthesis/ in a REPL on macOS Ventura
import AVFoundation
// Create an utterance.
let utterance = AVSpeechUtterance(string: "The quick brown fox jumped over the lazy dog.")
// Configure the utterance.
utterance.rate = 0.57
utterance.pitchMultiplier = 0.8
utterance.postUtteranceDelay = 0.2
utterance.volume = 0.8
// Retrieve the British English voice.
let voice = AVSpeechSynthesisVoice(language: "en-GB")
// Assign the voice to the utterance.
utterance.voice = voice
// Create a speech synthesizer.
let synthesizer = AVSpeechSynthesizer()
// Tell the synthesizer to speak the utterance.
synthesizer.speak(utterance)
It runs without errors but I don't hear any sound and the call to
synthesizer.speak
returns immediately. How can I fix this? Note that I am running in a REPL, so synthesizer is not going out of scope and being deallocated.
I have a user who is reporting an error and has been kind enough to share screen recordings to help diagnose it. I am not experiencing this error, nor am I able to replicate it on the other devices I've tried, so I'm stuck trying to fix it. His device and the other devices tested were all running iOS 17.5.1. Any details on the cause of this error, or potential workarounds I could use to resolve it, would be greatly appreciated.
try await ApplicationMusicPlayer.shared.play()
throws:
The operation couldn't be completed (MPMusicPlayerControllerErrorDomain error 6.)
MusicAuthorization.currentStatus is .authorized
ApplicationMusicPlayer.shared.isPreparedToPlay is false
ApplicationMusicPlayer.shared.queue.currentEntry is nil (I've noticed this to be the case even when I am able to successfully play as well)
Queue was loaded using ApplicationMusicPlayer.shared.queue = [album] but I also tried ApplicationMusicPlayer.shared.queue = ApplicationMusicPlayer.Queue(album:startingAt:) and it made no difference. album.playParameters are correct. He experiences the error when attempting to play any album.
Any and all help is truly appreciated. The report I filed in Feedback Assistant has gone unanswered.
Hi,
I'm developing an iOS app that accepts an audio signal as input with the goal of analyzing the signal.
For my experiment I purchased a cheap ADC-DAC made by Sabrent.
It works well, but its sampling rate is 44.1 kHz and I need a higher rate (at least 96 kHz).
I have been looking around, but I mostly find DACs meant for connecting headphones.
Can any of you suggest an ADC-DAC, preferably not too expensive, with a sampling rate of at least 96 kHz that works with iPhones?
MPMusicPlayerController's nowPlayingItem no longer seems to be able to change the song. The code used to work but seems to be broken on iOS 16, 17, and now the iOS 18 beta.
When newSong is triggered, the song restarts but it does not change songs. Instead I get the following error: Failed to set now playing item error=<MPMusicPlayerControllerErrorDomain.5 "Unable to play item <MPConcreteMediaItem: 0x9e9f0ef70> 206357861099970620" {}>.
The documentation seems to indicate I’m doing things correctly.
class MusicPlayer {
    var songTwo: MPMediaItem?
    let player = MPMusicPlayerController.applicationMusicPlayer
    func start() async {
        await MPMediaLibrary.requestAuthorization()
        let myPlaylistsQuery = MPMediaQuery.playlists()
        let playlists = myPlaylistsQuery.collections!.filter { $0.items.count > 2 }
        let playlist = playlists.first!
        let songOne = playlist.items.first!
        songTwo = playlist.items[1]
        player.setQueue(with: playlist)
        play(songOne)
    }
    func newSong() {
        guard let songTwo else { return }
        play(songTwo)
    }
    private func play(_ song: MPMediaItem) {
        player.stop()
        player.nowPlayingItem = song
        player.prepareToPlay()
        player.play()
    }
}
Music app stops playing when switching to the background
In apps that play music or music files, if you move to the home screen or run another app while the app is running, the music playback stops.
Our app does not have the code to stop playing when switching to the background.
We are guessing that some people experience this and others do not.
We usually guide users to reboot their devices and try again.
How can this behavior be addressed in code?
Or is this a bug or error in the OS?