AVSpeechSynthesizer doesn't notifyOthersOnDeactivation

Question

Created Jul ’24

Hello, I am building a new iOS app that uses AVSpeechSynthesizer and should mix audio nicely with audio from other apps. AVSpeechSynthesizer seems to handle setting the AVAudioSession to active on its own, but does not deactivate the audio session. This leads to issues, namely that other audio sources remain "ducked" after AVSpeechSynthesizer is done speaking.

I have implemented deactivating the audio session myself, which "works", in that it allows other audio sources to become "un-ducked", but it throws this error each time even though the deactivation appears to succeed.

Error Domain=NSOSStatusErrorDomain Code=560030580 "Session deactivation failed" UserInfo={NSLocalizedDescription=Session deactivation failed}

It appears to be a bug with how AVSpeechSynthesizer handles activating/deactivating the audio session.
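
For readers trying to interpret the numeric code in the error above: OSStatus values from the audio frameworks are often four-character codes packed into an integer. The sketch below decodes 560030580; the helper name fourCharCode(from:) is made up for illustration and is not part of any Apple API.

```swift
import Foundation

/// Converts an OSStatus-style integer into its four-character code.
/// Returns nil if any byte is not printable ASCII.
func fourCharCode(from status: Int32) -> String? {
    let value = UInt32(bitPattern: status)
    // Extract the four bytes, most significant first
    let bytes = [24, 16, 8, 0].map { UInt8((value >> UInt32($0)) & 0xFF) }
    guard bytes.allSatisfy({ (0x20...0x7E).contains($0) }) else { return nil }
    return String(bytes: bytes, encoding: .ascii)
}

if let code = fourCharCode(from: 560030580) {
    print(code) // prints "!act"
}
```

The value '!act' appears to correspond to AVAudioSession.ErrorCode.isBusy, which would suggest the session is being deactivated while the system still considers audio I/O to be running.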

Below is a minimal example which illustrates the problem. It has two buttons: one manually deactivates the audio session, which throws the error but otherwise works, and the other leaves audio session management to the AVSpeechSynthesizer but does not "un-duck" other audio.

If you play some audio from another app (e.g. Music), you'll see that the button which throws/catches the error successfully ducks/un-ducks the audio, while the one that does not attempt to deactivate the session ducks but does not un-duck the audio.

import AVFoundation

struct ContentView: View {
    let workingSynthesizer = UnduckingSpeechSynthesizer()
    let brokenSynthesizer = BrokenSpeechSynthesizer()
    
    init() {
        let audioSession = AVAudioSession.sharedInstance()
        do {
            try audioSession.setCategory(.playback, mode: .voicePrompt, options: [.duckOthers])
        } catch {
            print("Setup error info: \(error)")
        }
    }
    
    var body: some View {
        VStack {
            Button("Works Correctly"){
                workingSynthesizer.speak(text: "Hello planet")
            }
            Text("-------")
            Button("Does not work"){
                brokenSynthesizer.speak(text: "Hello planet")
            }
        }
        .padding()
    }
}

class UnduckingSpeechSynthesizer: NSObject {
    var synth = AVSpeechSynthesizer()
    let audioSession = AVAudioSession.sharedInstance()

    override init(){
        super.init()
        synth.delegate = self
    }
    
    func speak(text: String) {
        let utterance = AVSpeechUtterance(string: text)
        synth.speak(utterance)
    }
}

extension UnduckingSpeechSynthesizer: AVSpeechSynthesizerDelegate {
    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        do {
            try audioSession.setActive(false, options: .notifyOthersOnDeactivation)
        }
        catch {
            // always throws an error
            // Error Domain=NSOSStatusErrorDomain Code=560030580 "Session deactivation failed" UserInfo={NSLocalizedDescription=Session deactivation failed}
            print("Deactivate error info: \(error)")
        }
    }
}

class BrokenSpeechSynthesizer {
    var synth = AVSpeechSynthesizer()
    let audioSession = AVAudioSession.sharedInstance()

    func speak(text: String) {
        let utterance = AVSpeechUtterance(string: text)
        synth.speak(utterance)
    }
}

(I have a separate issue where the first speech attempt takes a few seconds but I don't think it's related)

Answer 1

zerooverride OP

Jul ’24

FYI my copy/paste left import SwiftUI out from the top of the file.

Answer 2

zerooverride OP

Jul ’24

I have uploaded a minimal sample project which demonstrates the issues here: https://github.com/zerooverride/MinimalSpeechAudioTest

It also contains a video in the README which demonstrates the issue.

Answer 3

Engineer

Apple

Jul ’24

Hello @zerooverride, thank you for sharing a test project. I can reproduce both issues. For the BrokenSpeechSynthesizer issue, please use Feedback Assistant to submit a bug report, and please paste here the ID generated by Feedback Assistant. For the UnduckingSpeechSynthesizer issue, I set usesApplicationAudioSession to false in the UnduckingSpeechSynthesizer initializer, and de-activating the session succeeded after that.
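
A minimal sketch of the change described above, assuming it is applied to the UnduckingSpeechSynthesizer class from the question. The exact placement inside the initializer is a guess; the reply only says the property was set there.

```swift
import AVFoundation

class UnduckingSpeechSynthesizer: NSObject, AVSpeechSynthesizerDelegate {
    let synth = AVSpeechSynthesizer()
    let audioSession = AVAudioSession.sharedInstance()

    override init() {
        super.init()
        // The synthesizer stops using the app's audio session for its output
        synth.usesApplicationAudioSession = false
        synth.delegate = self
    }

    func speak(text: String) {
        synth.speak(AVSpeechUtterance(string: text))
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        do {
            try audioSession.setActive(false, options: .notifyOthersOnDeactivation)
        } catch {
            print("Deactivate error info: \(error)")
        }
    }
}
```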

Answer 4

zerooverride OP

Jul ’24

Hello, thank you for the response. Configuring the usesApplicationAudioSession property to false for either AVSpeechSynthesizer in my sample appears to do the same thing. (You can add an init to BrokenSpeechSynthesizer which does so, and it will then function the same as UnduckingSpeechSynthesizer). They will then de-activate the session.

But doing so has the clear drawback of not actually allowing control of the audio session by the app. Options such as those set via AVAudioSession.setCategory() are no longer applied. As a concrete example, in my actual project I set the AVAudioSession.CategoryOptions to include both .duckOthers and .interruptSpokenAudioAndMixWithOthers because this is the experience my users want. When setting usesApplicationAudioSession to false, spoken audio is not interrupted, it is ducked. There may be other settings that are undesirable as well that I have not discovered yet. (I am happy to update the minimal example to include this if you'd like but it didn't seem "minimal" on my first go.)

So, I can set usesApplicationAudioSession to false to provide a degraded user experience, but I assume the intention is for a developer to be able to use AVSpeechSynthesizer and have full control of the related audio session. I have submitted feedback through Feedback Assistant and the ID is FB14444620.
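
For reference, the category configuration described in this reply might look roughly like the sketch below. The ducking option in AVAudioSession.CategoryOptions is .duckOthers. The .playback category and .voicePrompt mode are carried over from the minimal example in the question and are only assumptions about the real project; configureAudioSession() is a made-up helper name.

```swift
import AVFoundation

// Sketch only: category and mode are assumed from the minimal example.
func configureAudioSession() throws {
    let audioSession = AVAudioSession.sharedInstance()
    try audioSession.setCategory(
        .playback,
        mode: .voicePrompt,
        // Duck music, but pause spoken content such as podcasts
        options: [.duckOthers, .interruptSpokenAudioAndMixWithOthers]
    )
}
```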

Answer 5

zerooverride OP

Jul ’24

Something that could potentially be considered a workaround would be to set the synthesizer's usesApplicationAudioSession to false, while also having the app itself activate/deactivate the audio session before and after using the AVSpeechSynthesizer. This essentially wraps all usage of the synthesizer, which no longer uses the app's audio session, within an explicit app audio session. In my testing this appears to have the desired AVAudioSession settings kick in and no errors are thrown.

BUT, that then causes the UI to pause updates both when activating and when de-activating the session, which is a worse outcome (in the current example it only pauses when de-activating). If I have missed a way to keep activating/deactivating the audio session from pausing UI updates, which you could point me to, I could use this workaround.
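
One possible way to approach the UI pauses, sketched below, is to keep the wrapping idea but move the setActive calls onto a background serial queue, since setActive is a synchronous call. This is an untested assumption, not a confirmed fix. WrappedSpeechSynthesizer and the queue label are hypothetical names.

```swift
import AVFoundation

final class WrappedSpeechSynthesizer: NSObject, AVSpeechSynthesizerDelegate {
    private let synth = AVSpeechSynthesizer()
    private let audioSession = AVAudioSession.sharedInstance()
    // Hypothetical serial queue: keeps session calls ordered and off the main thread
    private let sessionQueue = DispatchQueue(label: "audio-session-queue")

    override init() {
        super.init()
        synth.usesApplicationAudioSession = false
        synth.delegate = self
    }

    func speak(text: String) {
        sessionQueue.async {
            do {
                // Activate the app's session before speaking
                try self.audioSession.setActive(true)
            } catch {
                print("Activate error info: \(error)")
            }
            // Hand the utterance back to the main queue once activation returns
            DispatchQueue.main.async {
                self.synth.speak(AVSpeechUtterance(string: text))
            }
        }
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        sessionQueue.async {
            do {
                try self.audioSession.setActive(false, options: .notifyOthersOnDeactivation)
            } catch {
                print("Deactivate error info: \(error)")
            }
        }
    }
}
```

Whether this avoids the pauses would need to be checked on a device; it only ensures the blocking calls do not run on the main thread.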

Answer 6

ZStacker

Aug ’24

Hey @zerooverride, I am having the same problem. One fix that might be suitable in your case is delaying the setActive call by 0.5 s, e.g. with DispatchQueue.global().asyncAfter or Task.sleep:

func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
  // Wait 0.5 s after the utterance finishes before deactivating the session
  DispatchQueue.global().asyncAfter(deadline: .now() + 0.5) {
    do {
      try AVAudioSession.sharedInstance().setActive(false)
    } catch {
      print("Error: \(error)")
    }
  }
}

This delay should get rid of the errors. However, it does not seem to work if you have overlapping synthesizer.speak(utterance) calls.
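
One way the overlapping case might be handled is to count utterances that are still queued or speaking and only deactivate once the count reaches zero and stays there through the delay. This is an untested sketch, not a confirmed solution. CountingSpeechSynthesizer and pendingUtterances are hypothetical names, and speak(text:) is assumed to be called from the main thread.

```swift
import AVFoundation

final class CountingSpeechSynthesizer: NSObject, AVSpeechSynthesizerDelegate {
    private let synth = AVSpeechSynthesizer()
    // Utterances queued or speaking; only touched on the main queue
    private(set) var pendingUtterances = 0

    override init() {
        super.init()
        synth.delegate = self
    }

    func speak(text: String) {
        pendingUtterances += 1
        synth.speak(AVSpeechUtterance(string: text))
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didFinish utterance: AVSpeechUtterance) {
        utteranceEnded()
    }

    func speechSynthesizer(_ synthesizer: AVSpeechSynthesizer, didCancel utterance: AVSpeechUtterance) {
        utteranceEnded()
    }

    private func utteranceEnded() {
        DispatchQueue.main.async {
            self.pendingUtterances = max(0, self.pendingUtterances - 1)
            guard self.pendingUtterances == 0 else { return }
            // Same 0.5 s delay as above, then re-check before deactivating
            DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) {
                // Skip if new speech was queued during the delay
                guard self.pendingUtterances == 0 else { return }
                do {
                    try AVAudioSession.sharedInstance()
                        .setActive(false, options: .notifyOthersOnDeactivation)
                } catch {
                    print("Error: \(error)")
                }
            }
        }
    }
}
```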

If you have new information on how to really solve that problem, let me know :)
