Has anyone figured this one out? Pasting them gives us detritus problems, but I'm sure there must be some way of doing this? Thank you.
Audio
Dive into the technical aspects of audio on your device, including codecs, format support, and customization options.
I'm having trouble using SFSpeechRecognizer and SFSpeechRecognitionTask to transcribe the words from an audio file. I found a solution on Stack Overflow that suggests splitting the audio file into smaller chunks. How would I do that programmatically using Swift in a macOS app Xcode project?
I would prefer not to split the file into smaller files; I will submit another post with more information about that.
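For reference, here is a rough sketch of one way to split an audio file into fixed-length segments with AVFoundation. The 60-second length, output directory, file names, and the M4A preset are illustrative assumptions, not something from your project:

import AVFoundation

// Sketch only: exports consecutive time ranges of the source asset as
// separate M4A files using AVAssetExportSession.
func splitAudioFile(at sourceURL: URL, into directory: URL, segmentSeconds: Double = 60) {
    let asset = AVURLAsset(url: sourceURL)
    let total = asset.duration
    var start = CMTime.zero
    var index = 0
    while start < total {
        let remaining = CMTimeSubtract(total, start)
        let length = CMTimeMinimum(CMTime(seconds: segmentSeconds, preferredTimescale: 600), remaining)
        guard let export = AVAssetExportSession(asset: asset, presetName: AVAssetExportPresetAppleM4A) else { return }
        export.outputURL = directory.appendingPathComponent("segment-\(index).m4a")
        export.outputFileType = .m4a
        export.timeRange = CMTimeRange(start: start, duration: length)
        export.exportAsynchronously { [i = index] in
            if export.status != .completed {
                print("Segment \(i) failed: \(String(describing: export.error))")
            }
        }
        start = CMTimeAdd(start, length)
        index += 1
    }
}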
Dear Sirs,
I've written an audio driver based on AudioDriverKit.
In my audio callback function I receive calls with the IO operations IOUserAudioIOOperationWriteEnd and IOUserAudioIOOperationBeginRead as expected: I see IOUserAudioIOOperationWriteEnd operations during playback from an application like VLC or the browser, and I see IOUserAudioIOOperationBeginRead when recording in Audacity etc.
But when I open System Settings, go to Sound, and select my driver as input, I also see calls with IOUserAudioIOOperationWriteEnd that seem to contain the input data that was just read. I can also observe this when starting up Teams. I think the purpose is to feed the (mic) input back to the output so you have the chance to listen to yourself.
Nevertheless, I'd like to fully avoid this, but I don't see a way to distinguish between the playback audio data and the input audio data inside this callback. How could I do this?
Or, even better, is there a switch that would completely turn off these callbacks that forward the input to the output?
Thanks and best regards,
Johannes
Dear Sirs,
When writing an AudioServerPlugin I can use the host's WriteToStorage/CopyFromStorage functions to save and restore custom properties across restarts of the machine. Are there corresponding functions for an audio driver based on AudioDriverKit? What would be the recommended way to save and restore properties so that they are available again after a reboot in an audio driver based on AudioDriverKit?
Thanks and best regards,
Johannes
iOS Audio Lockscreen Problem in PWA
Description
When running a PWA on iOS, playing audio from the lock screen works as expected until you leave the audio paused for 30 seconds. After this, the audio will cease to function until you return the PWA to the foreground.
Reproduction
1. In a PWA, create an HTML5 audio element.
2. Load an audio file into it.
3. Set navigator.mediaSession data and action handlers for play and pause.
4. Everything is in working order and your audio plays and pauses from the lock screen.
5. Pause your audio and wait for 30 seconds.
6. Now, press the play button. Your audio will no longer function.
At this point, the only way to get the audio to function again is to open the PWA into the foreground. Once you do this, the audio will be in working order.
What is expected
In step number 6, when you press the play button, the audio should play. The lock screen audio should not enter a non-functional state or there should be some way to "wake up" the PWA.
Closing
If you follow these steps exactly on Android, you will see that the problem does not exist on those devices.
Whenever I have any Bluetooth devices connected (radio, car, earphones) and want to record a voice message, the phone assumes I am recording from those devices, both in the Messages app and any other app. Half of the devices I own don't even have a microphone, so no message gets recorded. Can you implement a choice of microphone to be used when recording something? Some apps don't even have the option to pick the audio output, which is annoying, but having to disable Bluetooth to record something is definitely worse.
I have developed and operated a music player app, but when I installed the iOS 18 public beta version on my device and checked the app's operation, I found that the seek bar stops immediately after starting playback, and I cannot change the playback position on the seek bar.
Checking the logs, the following error is output when the seek bar stops:
ERROR AudioQueueCreateTimeline status=1953330284
This is a value I have never seen before, and this issue did not occur in iOS 17 or earlier. I would like to know if this issue can be resolved, and if not, how I should handle it.
When using AVSpeechSynthesizer(), I get an error after a couple of seconds: "IPCAUClient.cpp:139 IPCAUClient: can't connect to server (-66748) <0x104309130>", and then it speaks the text.
The second time I call speak, there is no delay or error and it speaks immediately.
Where does this error and delay come from and how can I resolve it?
Initialization code:
self.audioSession = AVAudioSession.sharedInstance() // 2) handle audio session first, before trying to read the text
do {
    try audioSession.setCategory(.playback, mode: .voicePrompt, options: .duckOthers)
    try audioSession.setActive(false)
} catch let error {
    Logger.model.debug("❓\(error.localizedDescription)")
}
speechSynthesizer = AVSpeechSynthesizer()
speechSynthesizer.usesApplicationAudioSession = true
Speak code:
let utterance = AVSpeechUtterance(string: text)
utterance.preUtteranceDelay = 0.1
utterance.rate = 0.5
utterance.pitchMultiplier = 0.75
utterance.prefersAssistiveTechnologySettings = false
self.speechSynthesizer.speak(utterance)
The last statement gives this error message!
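A workaround that is sometimes suggested for the first-call delay (an assumption, not an official fix) is to warm up the synthesizer once at startup with a silent utterance, so the connection to the speech/audio server is already established before the first real text is spoken:

import AVFoundation

// Hypothetical warm-up: speak an inaudible utterance once at launch so the
// first real call to speak(_:) does not pay the connection cost.
func warmUpSpeechSynthesizer(_ synthesizer: AVSpeechSynthesizer) {
    let warmUp = AVSpeechUtterance(string: " ")
    warmUp.volume = 0                                  // silent
    warmUp.rate = AVSpeechUtteranceMaximumSpeechRate   // finish as fast as possible
    synthesizer.speak(warmUp)
}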
Description:
I am developing a recording-only application that supports background recording using AVAudioEngine. The app segments the recording into 60-second files for further processing. For example, a 10-minute recording results in ten 60-second files.
Problem:
The application functions as expected in the background. However, after the app receives an interruption (such as a phone call) and the interruption ends, I can successfully restart the recording. The problem arises when the app then transitions to the background; it fails to restart the recording. Specifically, after ending the call and transitioning the app to the background, the app encounters an error and is unable to restart AVAudioSession and AVAudioEngine. The only resolution is to close and restart the app, which is not ideal for user experience.
Steps to Reproduce:
1. Start recording using AVAudioEngine.
2. The app records and saves 60-second segments.
3. Receive an interruption (e.g., an incoming phone call).
4. End the call.
5. Transition the app to the background.
6. Transition the app to the foreground and the session will be activated again.
7. Attempt to restart the recording.
Expected Behavior:
The app should resume recording seamlessly after the interruption and background transition.
Actual Behavior:
The app fails to restart AVAudioSession and AVAudioEngine, resulting in a continuous error. The recording cannot be resumed without closing and reopening the app.
How I’m Starting the Recording:
Configuration:
internal func setAudioSessionCategory() {
    do {
        try audioSession.setCategory(
            .playAndRecord,
            mode: .default,
            options: [.defaultToSpeaker, .mixWithOthers, .allowBluetooth]
        )
    } catch {
        debugPrint(error)
    }
}

internal func setAudioSessionActivation() {
    if UIApplication.shared.applicationState == .active {
        do {
            try audioSession.setPrefersNoInterruptionsFromSystemAlerts(true)
            try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
            if audioSession.isInputGainSettable {
                try audioSession.setInputGain(1.0)
            }
            try audioSession.setPreferredIOBufferDuration(0.01)
            try setBuiltInPreferredInput()
        } catch {
            debugPrint(error)
        }
    }
}
Starting AVAudioEngine:
internal func setupEngine() {
    if callObserver.onCall() { return }
    inputNode = audioEngine.inputNode
    audioEngine.attach(audioMixer)
    audioEngine.connect(inputNode, to: audioMixer, format: AVAudioFormat.validInputAudioFormat(inputNode))
}

internal func beginRecordingEngine() {
    audioMixer.removeTap(onBus: 0)
    audioMixer.installTap(onBus: 0, bufferSize: 1024, format: AVAudioFormat.validInputAudioFormat(inputNode)) { [weak self] buffer, _ in
        guard let self = self, let file = self.audioFile else { return }
        write(file, buffer: buffer)
    }
    audioEngine.prepare()
    do {
        try audioEngine.start()
        recordingTimer = Timer.scheduledTimer(withTimeInterval: recordingInterval, repeats: true) { [weak self] _ in
            self?.handleRecordingInterval()
        }
    } catch {
        debugPrint(error)
    }
}
On the try audioEngine.start() call, I receive error code 561145187 in the catch block.
Logs/Error Messages:
• Error code: 561145187
Request:
I would appreciate any guidance or solutions to ensure the app can resume recording after interruptions and background transitions without requiring a restart.
Thank you for your assistance.
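As a side note, 561145187 decodes to the FourCC '!rec', i.e. AVAudioSession.ErrorCode.cannotStartRecording. One thing that stands out in the posted code is that setAudioSessionActivation() only activates the session when applicationState == .active, which might be part of why the restart fails after the background transition. Below is a sketch (names such as restartRecording() are placeholders, not your code) of handling the interruption lifecycle via AVAudioSession.interruptionNotification and reactivating when the interruption ends:

import AVFoundation

// Sketch: reactivate the session and restart the engine when the
// interruption ends, rather than only when the app is in the foreground.
final class InterruptionHandler {
    private let session = AVAudioSession.sharedInstance()

    init() {
        NotificationCenter.default.addObserver(
            self,
            selector: #selector(handleInterruption(_:)),
            name: AVAudioSession.interruptionNotification,
            object: session
        )
    }

    @objc private func handleInterruption(_ notification: Notification) {
        guard let info = notification.userInfo,
              let rawType = info[AVAudioSessionInterruptionTypeKey] as? UInt,
              let type = AVAudioSession.InterruptionType(rawValue: rawType) else { return }

        switch type {
        case .began:
            // The system has already stopped the engine at this point.
            break
        case .ended:
            let rawOptions = info[AVAudioSessionInterruptionOptionKey] as? UInt ?? 0
            let options = AVAudioSession.InterruptionOptions(rawValue: rawOptions)
            if options.contains(.shouldResume) {
                do {
                    try session.setActive(true, options: .notifyOthersOnDeactivation)
                    restartRecording() // placeholder: re-install the tap and call audioEngine.start()
                } catch {
                    debugPrint("Could not reactivate after interruption: \(error)")
                }
            }
        @unknown default:
            break
        }
    }

    private func restartRecording() {
        // placeholder for the app's own restart logic
    }
}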
Hi! I'm developing a music player app that interchanges between ApplicationMusicPlayer and AVAudioEngine. I'm facing an issue when switching from playback via ApplicationMusicPlayer to AVAudioEngine while the app is in the background. Based on testing, the issue seems to be an inability to take audio focus in the background, causing the error AVAudioSessionErrorCodeCannotInterruptOthers.
I would like to check whether ApplicationMusicPlayer has its own audio focus, separate from the app's own audio focus. If it does, is there anything I can do to ensure that ApplicationMusicPlayer returns focus to the app?
(I notice that the issue does not occur when moving playback from AVAudioEngine to ApplicationMusicPlayer. I'm not sure why the opposite direction does not work.)
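In case it helps while you wait for an answer, a pattern sometimes used to avoid AVAudioSessionErrorCodeCannotInterruptOthers from the background (an assumption about your setup, not a confirmed fix) is to activate the session as mixable before starting the engine, since a non-mixable session generally cannot be activated while the app is in the background. The function name here is illustrative:

import AVFoundation

// Sketch: activate a mixable playback session before starting AVAudioEngine.
// Trade-off: with .mixWithOthers the app does not take exclusive audio focus.
func handPlaybackToEngine(_ audioEngine: AVAudioEngine) {
    let session = AVAudioSession.sharedInstance()
    do {
        try session.setCategory(.playback, options: [.mixWithOthers])
        try session.setActive(true)
        try audioEngine.start()
    } catch {
        print("Failed to switch playback to AVAudioEngine: \(error)")
    }
}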
I’m using AVAudioEngine to get a stream of AVAudioPCMBuffers from the device’s microphone using the usual installTap(onBus:) setup.
To distribute the audio stream to other parts of the program, I’m sending the buffers to a Combine publisher similar to the following:
private let publisher = PassthroughSubject<AVAudioPCMBuffer, Never>()
I’m starting to suspect I have some kind of concurrency or memory management issue with the buffers, because when consuming the buffers elsewhere I’m getting a range of crashes that suggest some internal pointer in a buffer is NULL (specifically, I’m seeing crashes in vDSP.convertElements(of:to:) when I try to read samples from the buffer).
These crashes are in production and fairly rare — I can’t reproduce them locally.
I never modify the audio buffers, only read them for analysis.
My question is: should it be possible to put AVAudioPCMBuffers into a Combine pipeline? Does the AVAudioPCMBuffer class not retain/release the underlying AudioBufferList’s memory the way I’m assuming? Is this a fundamentally flawed approach?
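For what it's worth, if the engine reuses the tap's buffer after the block returns (worth assuming defensively, since the crashes point at the underlying AudioBufferList), publishing a copy rather than the original buffer sidesteps the question entirely. A minimal sketch, assuming the common deinterleaved Float32 format of an input-node tap:

import AVFoundation

// Defensive sketch: publish an independent copy of each tap buffer so no
// downstream subscriber ever touches memory the engine might reuse.
func copyPCMBuffer(_ source: AVAudioPCMBuffer) -> AVAudioPCMBuffer? {
    guard let copy = AVAudioPCMBuffer(pcmFormat: source.format,
                                      frameCapacity: source.frameCapacity),
          let src = source.floatChannelData,
          let dst = copy.floatChannelData else { return nil }
    copy.frameLength = source.frameLength
    for channel in 0..<Int(source.format.channelCount) {
        dst[channel].update(from: src[channel], count: Int(source.frameLength))
    }
    return copy
}

// Usage inside the tap block:
// inputNode.installTap(onBus: 0, bufferSize: 1024, format: nil) { buffer, _ in
//     if let copy = copyPCMBuffer(buffer) { publisher.send(copy) }
// }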
Media Player album artwork images do not release memory after loading so memory accumulates, eventually impacting performance and causing crashes.
Steps To Reproduce:
Download example project and run on device
Grant access to library of 100+ albums
Scroll through albums on any tab screen
Observe steady memory increase in Xcode debug navigator and crash as you scroll
Observe same memory accumulation on other screens that use different methods to get album artwork images
Observations:
Various methods of obtaining album artwork fail to release memory adequately and impact app performance.
1 - Artwork Image sometimes releases memory if the images are small and you scroll slowly, but the app will still accumulate memory, drop frames when scrolling, and eventually crash.
2 - Value For Property behaves similarly to Artwork Image, even when using the implicit size of the artwork bounds.
3 - The Image From Disk problem seems related to the perform() method, since you can remove UIImage and memory still accumulates.
All 3 methods result in higher retain counts than expected for artwork objects, which could be preventing memory from releasing.
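As a mitigation sketch only (it does not address the underlying retain-count behavior): request the artwork at the small size you actually display and wrap the call in an autoreleasepool so temporary image data can be released per item. The cellSize parameter is an assumed target size for illustration:

import MediaPlayer
import UIKit

// Returns a thumbnail-sized artwork image, releasing any temporary
// autoreleased objects created during decoding before returning.
func thumbnail(for item: MPMediaItem, cellSize: CGSize) -> UIImage? {
    autoreleasepool {
        item.artwork?.image(at: cellSize)
    }
}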
Hi,
I am currently developing an app whose core functionality relies on detecting user laughter in the background. In our early stages we noticed Apple's built-in Sound Recognition feature. At its core, I am guessing that Sound Recognition requires permission from the user to access the microphone 24/7. Currently, using the conventional avenue of background audio recording, a yellow indicator is shown at the top of the iPhone screen to indicate recording; this is not the case for Sound Recognition. If all sound processing/recognition is kept on-device, is there any way to avoid the yellow dot and achieve laughter detection in a way similar to how Apple's Sound Recognition does it?
From the Sound Recognition settings accessible to the user in the Settings app, the only detectable "people" sounds are baby crying, coughing, and shouting. Is it also possible to add laughter to this list somehow?
Thank you in advance.
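For the on-device detection part (not the indicator, which appears whenever a third-party app captures audio), here is a sketch using the SoundAnalysis framework's built-in classifier. The "laughter" label used here is an assumption; check request.knownClassifications at runtime for the exact identifier:

import AVFoundation
import SoundAnalysis

// Sketch: classify microphone audio on-device with the built-in classifier
// and react when a laughter-like classification is confident enough.
final class LaughterDetector: NSObject, SNResultsObserving {
    private let engine = AVAudioEngine()
    private var analyzer: SNAudioStreamAnalyzer?

    func start() throws {
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
        let analyzer = SNAudioStreamAnalyzer(format: format)
        let request = try SNClassifySoundRequest(classifierIdentifier: .version1)
        try analyzer.add(request, withObserver: self)
        self.analyzer = analyzer

        input.installTap(onBus: 0, bufferSize: 8192, format: format) { buffer, when in
            analyzer.analyze(buffer, atAudioFramePosition: when.sampleTime)
        }
        try engine.start()
    }

    func request(_ request: SNRequest, didProduce result: SNResult) {
        guard let top = (result as? SNClassificationResult)?.classifications.first else { return }
        if top.identifier == "laughter", top.confidence > 0.8 {
            print("Laughter detected (confidence \(top.confidence))")
        }
    }
}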
MPMusicPlayerController's nowPlayingItem no longer seems to be able to change a song. The code used to work but seems to be broken on iOS 16, 17, and now the iOS 18 beta.
When newSong is triggered, the song restarts but it does not change songs. Instead I get the following error: Failed to set now playing item error=<MPMusicPlayerControllerErrorDomain.5 "Unable to play item <MPConcreteMediaItem: 0x9e9f0ef70> 206357861099970620" {}>.
The documentation seems to indicate I’m doing things correctly.
class MusicPlayer {
    var songTwo: MPMediaItem?
    let player = MPMusicPlayerController.applicationMusicPlayer

    func start() async {
        await MPMediaLibrary.requestAuthorization()
        let myPlaylistsQuery = MPMediaQuery.playlists()
        let playlists = myPlaylistsQuery.collections!.filter { $0.items.count > 2 }
        let playlist = playlists.first!
        let songOne = playlist.items.first!
        songTwo = playlist.items[1]
        player.setQueue(with: playlist)
        play(songOne)
    }

    func newSong() {
        guard let songTwo else { return }
        play(songTwo)
    }

    private func play(_ song: MPMediaItem) {
        player.stop()
        player.nowPlayingItem = song
        player.prepareToPlay()
        player.play()
    }
}
Audio is getting disabled and I'm not able to control it. When opening the Music player, audio works, but not on Instagram or any other apps.
The audio button in the notification bar is greyed out, as if disabled.
I need to find a way to allow recording from the mic while outputting two different sound streams to two different devices (speaker and headphones).
I've done a fair bit of reading around using AVAudioSession.Category.multiroute but haven't found any modern examples. @theanalogkid posted a nice example using obj-C nine years ago, but others have noted that the code isn't readily translatable to Swift.
To make matters worse, this is one of the very few examples on how to properly use multirouting. The official documentation is lacking, to say the least, and the WWDC 2012 session is, well, old enough to attend middle school and be a Taylor Swift fan, but definitely not in Swift. The few relevant forum posts here are spread over this middle schooler's life span and likely outdated, with most having no responses other than the poster's own plightful echo. They don't paint a pretty picture of .multiroute's health, with a recent poster noting that volume buttons don't work in this mode, contacting DTS and finding that there's no fix; another finding that it just doesn't work for certain devices, etc.
Audio is giving me enough of a headache so I'd like to avoid slogging through this if possible. .multiroute feels like the developer mode of AVAudioSession, but without documentation.
tl;dr - Without using .multiroute, is there a way to allow an app to output to two different devices while simultaneously recording audio? If .multiroute is the only way to achieve this, can someone give me a quick rundown of how this category works?
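Not an authoritative answer, but for context, here is a minimal sketch of what activating the category looks like and how to see which output ports the session exposes. Actually sending different streams to each port then comes down to assigning output channels per port in your playback API, which is the part the older examples cover and which is not shown here:

import AVFoundation

// Sketch: activate .multiRoute and list the output ports currently in the
// route (e.g. Speaker plus Headphones when both are attached).
func inspectMultiRoute() {
    let session = AVAudioSession.sharedInstance()
    do {
        try session.setCategory(.multiRoute)
        try session.setActive(true)
        for output in session.currentRoute.outputs {
            let channels = output.channels?.map { $0.channelName } ?? []
            print(output.portType.rawValue, output.portName, channels)
        }
    } catch {
        print("multiRoute setup failed: \(error)")
    }
}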
I am developing a visionOS app that captions speech in real environments. Currently, I am using Apple's built-in speech recognizer. However, when I was testing the app with a Vision Pro, the device seemed to only pick up the user's voice (in other words, the voices of the wearer of the Vision Pro device). For example, when the speech recognition task is running, and another person in front of me is talking, the system does not pick up the speech well.
I tried to set the AVAudioSession to be equally sensitive to all directions:
private func configureAudioSession() {
    do {
        try audioSession.setCategory(.record, mode: .measurement)
        try audioSession.setActive(true)
        if #available(visionOS 1.0, *) {
            let availableDataSources = audioSession.availableInputs?.first?.dataSources
            if let omniDirectionalSource = availableDataSources?.first(where: { $0.preferredPolarPattern == .omnidirectional }) {
                try audioSession.setInputDataSource(omniDirectionalSource)
            }
        }
    } catch {
        print("Failed to set up audio session: \(error)")
    }
}
And here is how I set up the speech recognition and configure the microphone inputs:
private func startSpeechRecognition(completion: @escaping (String) -> Void) {
    do {
        // Cancel the previous task if it's running.
        if let recognitionTask = recognitionTask {
            recognitionTask.cancel()
            self.recognitionTask = nil
        }

        // The AudioSession is already active, creating input node.
        let inputNode = audioEngine.inputNode
        try inputNode.setVoiceProcessingEnabled(false)

        // Create and configure the speech recognition request
        recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
        guard let recognitionRequest = recognitionRequest else { fatalError("Unable to create a recognition request") }
        recognitionRequest.shouldReportPartialResults = true

        // Keep speech recognition data on device
        if #available(iOS 13, *) {
            recognitionRequest.requiresOnDeviceRecognition = true
        }

        // Create a recognition task for speech recognition session.
        // Keep a reference to the task so that it can be canceled.
        recognitionTask = speechRecognizer?.recognitionTask(with: recognitionRequest) { result, error in
            // var isFinal = false
            if let result = result {
                // Update the recognizedText
                completion(result.bestTranscription.formattedString)
            } else if let error = error {
                completion("Recognition error: \(error.localizedDescription)")
            }
            if error != nil || result?.isFinal == true {
                // Stop recognizing speech if there is a problem
                self.audioEngine.stop()
                inputNode.removeTap(onBus: 0)
                self.recognitionRequest = nil
                self.recognitionTask = nil
            }
        }

        // Configure the microphone input
        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer, when) in
            self.recognitionRequest?.append(buffer)
        }

        audioEngine.prepare()
        try audioEngine.start()
    } catch {
        completion("Audio engine could not start: \(error.localizedDescription)")
    }
}
When listening to music in CarPlay, the sound quality is very bad and I can't turn down the volume of the music. It behaves like a voice assistant instead of music: when turning the volume down and up, it's as if I'm adjusting the voice assistant's volume when it should be adjusting the music's volume. So instead of music, it acts as though the voice assistant is activated. This was previously reported in iOS 18 Beta 2.
I have a 2024 Honda Ridgeline Black Edition.
I would really like to know if it is possible to get the state of the ringer switch on iOS. In my application I want to replicate the audio behavior Instagram uses while watching a video.
The video plays with sound. When the ringer switches to mute, the video's audio should be muted too. When you press the volume up or down button, the audio should be unmuted.
I found how to catch volume up/down button presses with AVAudioSession.sharedInstance().observe(\.outputVolume), but I couldn't find anything that could help me with the ringer state. AVAudioSession.Category can't achieve this effect.
Also, there is a possibility to check the ringer state with the Darwin notify library, like this:
var token = NOTIFY_TOKEN_INVALID
notify_register_dispatch(
    "com.apple.springboard.ringerstate",
    &token,
    .main
) { token in
    var state: UInt64 = 0
    notify_get_state(token, &state)
    print("Changed to", state == 1 ? "ON" : "OFF")
}
but I'm not sure that this won't lead to the application being rejected. I don't know whether this counts as private API usage or not.
I will be glad of any advice and suggestions. Thanks.
I tried to play music on my iPhone and it keeps skipping over all of the songs and not playing any music.