Hello,
Using ShazamKit, based on a Shazam catalog result, would it be possible to detect the speed at which the audio was recorded (its "FPS")?
I'm thinking that a Shazam catalog created from an audio file could be used to compare against the speed of live-recorded audio.
Thank you!
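One possibly relevant API: SHMatchedMediaItem exposes a frequencySkew value describing how far the matched audio's frequency deviates from the catalog reference, which can act as a proxy for a speed change when that change also shifts pitch (plain resampling does; pitch-preserving time stretch would not). A minimal sketch, where handleMatch and the surrounding session setup are assumptions:

import ShazamKit

// Minimal sketch: inspect the frequency skew of a catalog match.
// Assumes an SHCustomCatalog built from the reference audio and a
// session producing SHMatch values; `handleMatch` is illustrative.
func handleMatch(_ match: SHMatch) {
    for item in match.mediaItems {
        // frequencySkew is 0 when the query matches the reference pitch;
        // e.g. roughly 0.05 for a 5% upward shift. A speed-up done by
        // simple resampling shows up here; a pitch-preserving time
        // stretch would not.
        print("Frequency skew: \(item.frequencySkew)")
    }
}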
AudioToolbox
Record or play audio, convert formats, parse audio streams, and configure your audio session using AudioToolbox.
Posts under AudioToolbox tag
33 Posts
I'm running into an issue where in some cases, when the AUHostingServiceXPC_arrow process is shut down by Logic, the process is terminated abruptly without calling AP_Close on all of the plugins hosted in the process. In our case, we have filesystem resources we need to clean up, and having stale files around from the last run can cause issues in new sessions, so this leak is having some pretty gnarly effects.
I can reproduce the issue using only Apple sample plugins, and it seems to be triggered by a timeout. If I have two different AU plugins in the session, and I add a 1 second sleep to the destructor of one of the sample plugins, Logic will force terminate the process and the remaining destructors are not called (even for the plugins without the 1 second sleep).
Is there a way to avoid this behavior? Or to safely clean up our plugin even if other plugins in the session take a second to tear down?
Hello, I have a question regarding the voice and sound recognition features on the iPhone 15 Pro.
The iPhone 15 Pro is equipped with four microphones, and I understand that for features like Apple’s sound recognition and when invoking Siri, the microphone(s) must always be active. My question is whether the device uses a single microphone (mono channel) for these functions or if multiple microphones are activated simultaneously.
I would appreciate clarification on how the microphones are utilized in sound and voice recognition features.
Thank you for your assistance.
Best regards.
We have been testing the iOS AAC-LC encoder using the AudioToolbox framework. No matter whether we set mManufacturer to kAppleHardwareAudioCodecManufacturer or kAppleSoftwareAudioCodecManufacturer, the encoder always runs on the CPU.
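For reference, a sketch of requesting a specific codec implementation via AudioConverterNewSpecific and an AudioClassDescription; the input/output formats below are placeholder assumptions, and it may simply be that current devices only provide a software AAC-LC encoder, in which case the manufacturer hint has no effect:

import AudioToolbox

// Sketch: ask explicitly for a codec by manufacturer when creating an
// AAC encoder. Formats (44.1 kHz stereo Float32 in, AAC-LC out) are
// placeholder assumptions.
var inputFormat = AudioStreamBasicDescription(
    mSampleRate: 44_100,
    mFormatID: kAudioFormatLinearPCM,
    mFormatFlags: kAudioFormatFlagIsFloat | kAudioFormatFlagIsPacked,
    mBytesPerPacket: 8, mFramesPerPacket: 1, mBytesPerFrame: 8,
    mChannelsPerFrame: 2, mBitsPerChannel: 32, mReserved: 0)

var outputFormat = AudioStreamBasicDescription(
    mSampleRate: 44_100,
    mFormatID: kAudioFormatMPEG4AAC,
    mFormatFlags: 0,
    mBytesPerPacket: 0,          // variable-size packets for AAC
    mFramesPerPacket: 1024, mBytesPerFrame: 0,
    mChannelsPerFrame: 2, mBitsPerChannel: 0, mReserved: 0)

// Request the hardware-backed encoder class; if creation fails, the
// software class (kAppleSoftwareAudioCodecManufacturer) can be tried.
var hardwareClass = AudioClassDescription(
    mType: kAudioEncoderComponentType,
    mSubType: kAudioFormatMPEG4AAC,
    mManufacturer: kAppleHardwareAudioCodecManufacturer)

var converter: AudioConverterRef?
let status = AudioConverterNewSpecific(&inputFormat, &outputFormat,
                                       1, &hardwareClass, &converter)
print("AudioConverterNewSpecific status: \(status)")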
Hello,
As explained in this link, the AVAssetReaderTrackOutput.copyNextSampleBuffer() returns a CMSampleBuffer in linear PCM audio format.
I want to place this audio buffer into an AVAssetWriterInput of type kAudioFormatMPEG4AAC, but I can't manage the conversion.
Could you help me by providing an extension that returns a CMSampleBuffer converted from linear PCM audio format to kAudioFormatMPEG4AAC?
Example:
extension CMSampleBuffer {
func fromPCMToAAC() -> CMSampleBuffer? {
// Here, get a new AudioStreamBasicDescription, create a CMSampleBuffer and a CMBlockBuffer
}
}
I've tried multiple times but without success.
Software: iOS 18.1
Xcode: 16.0
Thank you!
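One approach that may sidestep the manual conversion entirely: configure the AVAssetWriterInput with AAC output settings and append the linear PCM sample buffers directly, letting AVAssetWriter perform the encode. A sketch, where the sample rate, channel count, and bit rate are assumptions:

import AVFoundation

// Sketch: let AVAssetWriter do the PCM-to-AAC encode by giving the
// writer input AAC output settings and appending the linear PCM
// sample buffers from AVAssetReaderTrackOutput as-is.
let aacSettings: [String: Any] = [
    AVFormatIDKey: kAudioFormatMPEG4AAC,
    AVSampleRateKey: 44_100,
    AVNumberOfChannelsKey: 2,
    AVEncoderBitRateKey: 128_000
]

let writerInput = AVAssetWriterInput(mediaType: .audio, outputSettings: aacSettings)
writerInput.expectsMediaDataInRealTime = false

// Inside the writing loop, append the PCM buffers directly; the writer
// compresses them to AAC on the way out:
// while writerInput.isReadyForMoreMediaData,
//       let sampleBuffer = trackOutput.copyNextSampleBuffer() {
//     writerInput.append(sampleBuffer)
// }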
Since upgrading to tvOS 18, AudioConverterFillComplexBuffer (linked below) isn't working for me when converting a stream with these formats. It does work for decoding AAC, however.
https://developer.apple.com/documentation/audiotoolbox/1503098-audioconverterfillcomplexbuffer?language=objc
I pass a valid ioOutputDataPacketSize in, but it always comes out as zero.
Has anyone else observed this too?
I wonder if this is related to the issue being discussed widely about 5.1 sound being broken for many people after upgrading to tvOS 18?
https://discussions.apple.com/thread/255769102?login=true&sortBy=rank
EDIT: further information; the callback gets called once, asking for 1 packet (which is ok). I give it one packet and return noErr. However, after this, the callback is never invoked again. Must be a bug?
EDIT2: the same code continues to work correctly on macOS in decoding the same audio stream.
AudioQueueObject.cpp:1580 BuildConverter: AudioConverterNew returned -50
from: 0 ch, 16000 Hz, .... (0x00000000) 0 bits/channel, 0 bytes/packet, 0 frames/packet, 0 bytes/frame
to: 2 ch, 16000 Hz, Int16, interleaved
AQMEIO_HAL.cpp:2773 iOSSimulatorAudioDevice-15111-0: Abandoning I/O cycle because reconfig pending (1).
HALC_ProxySystem.cpp:163 HALC_ProxySystem::GetObjectInfo: got an error from the server, Error: 560947818 (!obj)
HALC_ShellObject.mm:213 HALC_ShellObject::HasProperty: there is no proxy object
AudioHardware-mac-imp.cpp:1224 AudioObjectRemovePropertyListener: no object with given ID 160
HALSystem.cpp:2216 AudioObjectPropertiesChanged: no such object
Why? I can't record on iOS 17; recording worked normally before, on iOS 16.
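The "from" format in the log has zero channels and zero bits per channel, which may indicate the queue never got a usable source format (for example, no active input device in the Simulator, or missing microphone permission), rather than a problem with the requested client format. For comparison, a sketch of a fully specified client format matching the "to" format in the log; the callback body and buffer handling are omitted, and the names are assumptions:

import AudioToolbox

// Sketch of a fully specified client format for AudioQueueNewInput
// (16 kHz, 2 channels, Int16, interleaved, as in the "to" line above).
var format = AudioStreamBasicDescription(
    mSampleRate: 16_000,
    mFormatID: kAudioFormatLinearPCM,
    mFormatFlags: kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked,
    mBytesPerPacket: 4,          // 2 channels * 2 bytes
    mFramesPerPacket: 1,
    mBytesPerFrame: 4,
    mChannelsPerFrame: 2,
    mBitsPerChannel: 16,
    mReserved: 0)

let callback: AudioQueueInputCallback = { _, _, _, _, _, _ in
    // Consume the filled buffer here.
}

var queue: AudioQueueRef?
let status = AudioQueueNewInput(&format, callback, nil, nil, nil, 0, &queue)
print("AudioQueueNewInput: \(status)")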
Calls to ExtAudioFileRead are throwing OSStatus 561145203 (AVAudioSessionErrorCodeResourceNotAvailable) on iOS and iPadOS 18 -- earlier versions of iOS have not exhibited this behavior. This is a longstanding code path that has seen a spike of these error codes since iOS 18's release.
The following is also printed to the Xcode 16 console:
Hello! The new lower latency support for AirPods in Game Mode is impressive, but I'm not sure of the best way to handle the transition into/out of Game Mode while audio is playing. In order to lower the latency, the system appears to drop some number of samples, with the result being a good deal less latency. My use case is macOS where it's easier to switch in/out of the fullscreen game (a simple swipe left), thus causing more issues for Game Mode since the audio is playing the entire time. It would be nice if offscreen games could remain in game mode, but I understand not wanting to give developers that control.
Are there any best practices for avoiding or masking the audio glitch caused by this skip-ahead? Is there a system event I can receive to know when Game Mode is about to be enabled or disabled, where I could perhaps fade out the audio? My callback checks the inTimestamp->mSampleTime value to detect gaps, but it only rarely detects a Game Mode gap, even though the audio skip-ahead always happens.
BTW, I am currently only developing on macOS (15.0) and I'm working at a low level with AudioUnit callbacks and a SpatialMixer. I am not currently using any higher-level audio APIs.
And here are a few questions I don't necessarily expect answers to, but it doesn't hurt to ask: Are there any additional technical details about how this latency reduction works, or exactly how much of a reduction is achieved (or, said another way, how many samples are dropped)? How much does this affect AirPods battery life? And finally, is there a way to query the actual latency value? I check the value for kAudioDevicePropertyLatency, but it seems to always report 160ms for AirPods. Thanks!
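For reference, a small sketch of the mSampleTime continuity check described above; GapDetector and the one-sample tolerance are illustrative, and, as noted, the Game Mode skip-ahead does not always surface as a gap here:

import AudioToolbox

// Sketch: compare each render callback's mSampleTime against the value
// predicted from the previous callback to detect timeline jumps.
final class GapDetector {
    private var expectedSampleTime: Float64?

    func check(_ timestamp: AudioTimeStamp, frameCount: UInt32) {
        guard timestamp.mFlags.contains(.sampleTimeValid) else { return }
        if let expected = expectedSampleTime,
           abs(timestamp.mSampleTime - expected) > 1.0 {
            let jump = timestamp.mSampleTime - expected
            print("Timeline discontinuity: \(jump) samples")
            // A short fade could be triggered here, if the jump is detected.
        }
        expectedSampleTime = timestamp.mSampleTime + Float64(frameCount)
    }
}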
I recently upgraded my iPhone 13 to iOS 18, and I'm facing two issues.
I didn't get the call recording feature. When I make a call and the person picks up, the call recording icon is not showing.
The option to take notes during a call has disappeared. Earlier, the notes option used to be available on the call screen itself.
Please help fix these two issues, or let me know if it's possible to resolve them from my end.
1. I saw that NullAudio defines the custom property selector static const AudioObjectPropertySelector kPlugIn_CustomPropertyID = 'PCst';, but I don't know how to use it from a project.
2. What is the difference between the PlugIn's custom properties and the Device's custom properties?
3. When I try to add a custom property selector for the device (adding it to kAudioObjectPropertyCustomPropertyInfoList and handling it in the NullAudio_HasDeviceProperty method), then rebuild and restart the Core Audio service, the virtual device no longer shows up.
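For question 1, a minimal client-side sketch of how such a custom property might be read; it assumes you already have the AudioObjectID of the NullAudio plug-in or device, and that the property carries CFString data (kAudioObjectPropertyCustomPropertyInfoList reports the real data and qualifier types):

import CoreAudio

// Sketch: read a custom property declared by a HAL plug-in from a client
// app. `pluginObjectID` is assumed to be obtained elsewhere.
func fourCC(_ s: String) -> AudioObjectPropertySelector {
    s.utf8.reduce(0) { ($0 << 8) | AudioObjectPropertySelector($1) }
}

func readCustomProperty(from pluginObjectID: AudioObjectID) -> CFString? {
    var address = AudioObjectPropertyAddress(
        mSelector: fourCC("PCst"),
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain)

    var value: CFString? = nil
    var size = UInt32(MemoryLayout<CFString?>.size)
    let status = withUnsafeMutablePointer(to: &value) { ptr -> OSStatus in
        AudioObjectGetPropertyData(pluginObjectID, &address, 0, nil, &size, ptr)
    }
    return status == noErr ? value : nil
}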
Hello everyone,
I'm new to Core Audio and still haven't found my footing. I'm learning how to capture audio from the default device using Audio Units. On my MacBook, the default audio input is mono. But when I write a piece of code to capture audio using AUHAL, I'm discovering that I need to provide an AudioBufferList with two channels, not one. Also, when I try to capture audio from an audio interface with 20 audio inputs, I must provide an AudioBufferList with two channels, not with 20. To investigate the issue, I wrote a small diagnostic program, which opens the default audio device and probes it for the number of channels. Depending on which way I probe, I get different results. When I probe the stream format, I'm told there is 1 channel. But when I probe the input audio unit, I'm told there are 2 input channels.
Here's my program to demonstrate the issue:
// InputDeviceChannels.m
// Compile with:
// clang -framework CoreAudio -framework AudioToolbox -framework CoreFoundation -framework AudioUnit -o InputDeviceChannels InputDeviceChannels.m
//
// On my system, this prints:
// Device Name: MacBook Pro Microphone
// Number of Channels (Stream Format): 1
// Number of Elements (Element Count): 2
#import <AudioToolbox/AudioToolbox.h>
#import <AudioUnit/AudioUnit.h>
#import <CoreAudio/CoreAudio.h>
#import <Foundation/Foundation.h>
void printDeviceInfo(AudioUnit audioUnit) {
UInt32 size;
OSStatus err;
AudioStreamBasicDescription streamFormat;
size = sizeof(streamFormat);
err = AudioUnitGetProperty(audioUnit, kAudioUnitProperty_StreamFormat, kAudioUnitScope_Input, 1,
&streamFormat, &size);
if (err != noErr) {
printf("Error getting stream format\n");
exit(1);
}
int numChannels = streamFormat.mChannelsPerFrame;
UInt32 elementCount;
size = sizeof(elementCount);
err = AudioUnitGetProperty(audioUnit, kAudioUnitProperty_ElementCount, kAudioUnitScope_Input, 0,
&elementCount, &size);
if (err != noErr) {
printf("Error getting element count\n");
exit(1);
}
printf("Number of Channels (Stream Format): %d\n", numChannels);
printf("Number of Elements (Element Count): %d\n", elementCount);
}
void printDeviceName(AudioDeviceID deviceID) {
UInt32 size;
OSStatus err;
CFStringRef deviceName = NULL;
size = sizeof(deviceName);
err = AudioObjectGetPropertyData(
deviceID,
&(AudioObjectPropertyAddress){kAudioDevicePropertyDeviceNameCFString,
kAudioObjectPropertyScopeGlobal,
kAudioObjectPropertyElementMain},
0, NULL, &size, &deviceName);
if (err != noErr) {
printf("Error getting device name\n");
exit(1);
}
char deviceNameStr[256];
if (!CFStringGetCString(deviceName, deviceNameStr, sizeof(deviceNameStr),
kCFStringEncodingUTF8)) {
printf("Error converting device name to C string\n");
exit(1);
}
CFRelease(deviceName);
printf("Device Name: %s\n", deviceNameStr);
}
int main(int argc, const char *argv[]) {
@autoreleasepool {
OSStatus err;
// Get the default input device ID
AudioDeviceID input_device_id = kAudioObjectUnknown;
{
UInt32 property_size = sizeof(input_device_id);
AudioObjectPropertyAddress input_device_property = {
kAudioHardwarePropertyDefaultInputDevice,
kAudioObjectPropertyScopeGlobal,
kAudioObjectPropertyElementMain,
};
err = AudioObjectGetPropertyData(kAudioObjectSystemObject, &input_device_property, 0, NULL,
&property_size, &input_device_id);
if (err != noErr || input_device_id == kAudioObjectUnknown) {
printf("Error getting default input device ID\n");
exit(1);
}
}
// Print the device name using the input device ID
printDeviceName(input_device_id);
// Open audio unit for the input device
AudioComponentDescription desc = {kAudioUnitType_Output, kAudioUnitSubType_HALOutput,
kAudioUnitManufacturer_Apple, 0, 0};
AudioComponent component = AudioComponentFindNext(NULL, &desc);
AudioUnit audioUnit;
err = AudioComponentInstanceNew(component, &audioUnit);
if (err != noErr) {
printf("Error creating AudioUnit\n");
exit(1);
}
// Enable IO for input on the AudioUnit and disable output
UInt32 enableInput = 1;
UInt32 disableOutput = 0;
err = AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Input,
1, &enableInput, sizeof(enableInput));
if (err != noErr) {
printf("Error enabling input on AudioUnit\n");
exit(1);
}
err = AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_EnableIO, kAudioUnitScope_Output,
0, &disableOutput, sizeof(disableOutput));
if (err != noErr) {
printf("Error disabling output on AudioUnit\n");
exit(1);
}
// Set the current device to the input device
err =
AudioUnitSetProperty(audioUnit, kAudioOutputUnitProperty_CurrentDevice,
kAudioUnitScope_Global, 0, &input_device_id, sizeof(input_device_id));
if (err != noErr) {
printf("Error setting device for AudioUnit\n");
exit(1);
}
// Initialize AudioUnit
err = AudioUnitInitialize(audioUnit);
if (err != noErr) {
printf("Error initializing AudioUnit\n");
exit(1);
}
// Print device info
printDeviceInfo(audioUnit);
// Clean up
AudioUnitUninitialize(audioUnit);
AudioComponentInstanceDispose(audioUnit);
}
return 0;
}
It prints:
Device Name: MacBook Pro Microphone
Number of Channels (Stream Format): 1
Number of Elements (Element Count): 2
I tried to set the number of channels to 1 on the input unit, but it didn’t change anything. After calling setNumberOfChannels(1, audioUnit), I’m still getting the same output.
Note 1: I know that I can ignore one channel, etc, etc. My purpose here is not to "somehow get it to work", I already did that. My purpose is to understand the API, so that I'll be able to write code that handles any number of audio inputs.
Note 2: I already read a bunch of documentation, especially this here: https://developer.apple.com/library/archive/technotes/tn2091/ - perhaps the channel map could help here, but I can’t make sense of it - I tried to use it based on my understanding but I only got the -50 OSStatus.
How should I understand this? Is it that the audio unit is an abstraction layer that automatically converts mono input into stereo? Can I ask AUHAL to provide the same number of input channels as the audio device has?
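One reading of TN2091: the element count of 2 refers to AUHAL's two elements (element 0 for output, element 1 for input) rather than a channel count, and the format delivered to the input callback is whatever is set on the output scope of element 1. Below is a sketch of matching that client-side format to the device's channel count; audioUnit is assumed to be the configured, not-yet-initialized AUHAL instance from the program above, and the float format chosen is an assumption:

import AudioToolbox

// Sketch: read the device-side format of the input element and ask AUHAL
// to deliver the same channel count on the client side.
func matchClientFormatToDevice(_ audioUnit: AudioUnit) -> OSStatus {
    var deviceFormat = AudioStreamBasicDescription()
    var size = UInt32(MemoryLayout<AudioStreamBasicDescription>.size)

    // Input scope of element 1 = the device side of the input element.
    var status = AudioUnitGetProperty(audioUnit,
                                      kAudioUnitProperty_StreamFormat,
                                      kAudioUnitScope_Input, 1,
                                      &deviceFormat, &size)
    if status != noErr { return status }

    // Output scope of element 1 = the format AUHAL hands to the callback.
    var clientFormat = deviceFormat
    clientFormat.mFormatID = kAudioFormatLinearPCM
    clientFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsNonInterleaved
    clientFormat.mBitsPerChannel = 32
    clientFormat.mBytesPerFrame = 4
    clientFormat.mBytesPerPacket = 4
    clientFormat.mFramesPerPacket = 1

    status = AudioUnitSetProperty(audioUnit,
                                  kAudioUnitProperty_StreamFormat,
                                  kAudioUnitScope_Output, 1,
                                  &clientFormat,
                                  UInt32(MemoryLayout<AudioStreamBasicDescription>.size))
    return status
}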
I'm having trouble using SFSpeechRecognizer & SFSpeechRecognitionTask to show me the words from an audio file. I found a solution on stackoverflow to separate the audio file into smaller sizes. How would I do that programmatically using Swift for a macOS app Xcode project?
I would prefer not to separate the file into smaller files. I will submit another post with more information for that.
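For anyone who does want to chunk a file, a sketch using AVAudioFile to read fixed-length segments and write each one to its own file; the 60-second chunk length, output naming, and CAF output format are assumptions:

import AVFoundation

// Sketch: split a long audio file into fixed-length PCM chunks.
func splitAudioFile(at url: URL, chunkSeconds: Double = 60) throws -> [URL] {
    let source = try AVAudioFile(forReading: url)
    let format = source.processingFormat
    let framesPerChunk = Int64(chunkSeconds * format.sampleRate)
    var chunkURLs: [URL] = []
    var index = 0

    while source.framePosition < source.length {
        let remaining = source.length - source.framePosition
        let frames = AVAudioFrameCount(min(framesPerChunk, remaining))
        guard let buffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frames) else { break }
        try source.read(into: buffer, frameCount: frames)

        // Write this chunk as a PCM .caf next to the original file.
        let baseName = url.deletingPathExtension().lastPathComponent
        let chunkURL = url.deletingLastPathComponent()
            .appendingPathComponent("\(baseName)-chunk\(index).caf")
        let output = try AVAudioFile(forWriting: chunkURL, settings: format.settings)
        try output.write(from: buffer)

        chunkURLs.append(chunkURL)
        index += 1
    }
    return chunkURLs
}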
Is there any way we can add volume controls to adjust the volume settings from our phone? I've noticed we have to rely on the radio head unit to control the volume when it comes to navigation and music. We should have full access to control it from our phone as well. Any thoughts?
Audio is getting disabled and I am not able to control it. When I open the music player, audio works, but not in Instagram or any other apps.
The audio button in the notification bar is greyed out, as if disabled.
Hi there community,
First and foremost, a big thank you to everyone who takes the time to read this.
TL;DR: How, if even possible, can I record multiple audio streams simultaneously on an iOS application (iPad/iPhone)?
I'm working on a recorder for the iPad to gather data for a machine learning project focused on speech recognition. Our goal is to capture extensive speech data, which requires recording from multiple microphones. Specifically, I need to record from all mics connected to our Scarlett 4i4 audio interface and, most importantly, also record from the built-in mic on the iPad or iPhone at the same time.
As a newcomer to Swift development, I initially explored AVAudioRecorder. However, I quickly realized that it only supports one active audio node at a time, making multi-channel recording impossible (perhaps you can prove me wrong, that would make my day). Next, I transitioned to using AVAudioEngine, but encountered the same limitation: I couldn't manage to get input nodes for both the built-in mic and the Scarlett interface channels simultaneously. The application started behaving oddly, often resulting in identical audio data being recorded across all files.
Determined to find a solution, I delved deeper into the Core Audio framework, specifically using Audio Toolbox. My approach involved creating and configuring multiple Audio Units, each corresponding to a different audio input device. Here's a brief overview of my current implementation:
Listing Available Input Devices: I used AVAudioSession to enumerate all available input devices.
Creating Audio Units: For each device, I created an Audio Unit and attempted to configure it for recording.
Setting Up Callbacks: I set up input and output callbacks to handle the audio processing.
Despite my efforts over the last few days, I haven't had much success. The callbacks for the Audio Units don't seem to be invoked correctly, and I'm struggling to achieve simultaneous multi-channel recording. Below is a snippet of my latest attempt:
let audioUnitCallback: AURenderCallback = { (
inRefCon: UnsafeMutableRawPointer,
ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
inTimeStamp: UnsafePointer<AudioTimeStamp>,
inBusNumber: UInt32,
inNumberFrames: UInt32,
ioData: UnsafeMutablePointer<AudioBufferList>?
) -> OSStatus in
guard let ioData = ioData else {
return noErr
}
print("Input callback invoked")
let audioUnit = inRefCon.assumingMemoryBound(to: AudioUnit.self).pointee
var bufferList = AudioBufferList(
mNumberBuffers: 1,
mBuffers: AudioBuffer(
mNumberChannels: 1,
mDataByteSize: 0,
mData: nil
)
)
let status = AudioUnitRender(audioUnit, ioActionFlags, inTimeStamp, inBusNumber, inNumberFrames, &bufferList)
if status != noErr {
print("AudioUnitRender failed: \(status)")
return status
}
// Copy rendered data to output buffer
let buffer = UnsafeMutableAudioBufferListPointer(ioData)[0]
buffer.mData?.copyMemory(from: bufferList.mBuffers.mData!, byteCount: Int(bufferList.mBuffers.mDataByteSize))
buffer.mDataByteSize = bufferList.mBuffers.mDataByteSize
print("Rendered audio data")
return noErr
}
let outputCallback: AURenderCallback = { (
inRefCon: UnsafeMutableRawPointer,
ioActionFlags: UnsafeMutablePointer<AudioUnitRenderActionFlags>,
inTimeStamp: UnsafePointer<AudioTimeStamp>,
inBusNumber: UInt32,
inNumberFrames: UInt32,
ioData: UnsafeMutablePointer<AudioBufferList>?
) -> OSStatus in
guard let ioData = ioData else {
return noErr
}
print("Output callback invoked")
// Process the output data if needed
return noErr
}
In essence, I'm stuck and in need of guidance. Has anyone here successfully implemented multi-channel recording on iOS, especially involving both built-in microphones and external audio interfaces? Any shared experiences, insights, or suggestions on how to proceed would be immensely appreciated.
Thank you once again for your time and assistance!
I have a React website that uses getUserMedia to capture user audio. I'm displaying this website in an iOS mobile app using UIWebView in SwiftUI. The audio is correctly captured when the app is in focus. However, when the app goes to the home screen and runs in the background, the microphone audio gets cut off. This issue does not occur when the website is opened in iOS Safari.
Here's my Info.plist and .entitlements file. I granted most, if not all, permissions for both files in an attempt to get it to work, but it still doesn't resolve the issue.
Info.plist
audio
bluetooth-central
bluetooth-peripheral
external-accessory
fetch
location
nearby-interaction
processing
push-to-talk
remote-notification
voip
Entitlements.plist
com.apple.developer.push-to-talk: true
com.apple.developer.spatial-audio.profile-access: true
inter-app-audio: true
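One thing that may be missing on the native side: an active AVAudioSession configured for recording, which is usually required (together with the audio background mode) for the microphone to keep running in the background; whether WebKit's own policies still suspend getUserMedia capture in a web view is a separate question. A sketch, with the category, mode, and options as assumptions:

import AVFoundation

// Sketch: configure and activate the audio session for recording before
// the page starts capturing, so iOS treats the app as an audio app.
func configureAudioSessionForBackgroundCapture() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord,
                            mode: .voiceChat,
                            options: [.allowBluetooth, .defaultToSpeaker])
    try session.setActive(true)
}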
Hello Apple Community,
I am developing an iOS app and would like to add a feature that allows users to play and organize Audible.com files within the app. Does Audible or the App Store provide any API or SDK for third-party apps to access and manage Audible content? If so, could you please provide some guidance on how to integrate it into my app?
Thank you for your assistance!
Best regards,
Yes it labs
With the AudioServicesPlaySystemSound function in AudioToolbox, you can pass a SystemSoundID to play some of the sound effects that come with the system. However, I can't be sure which sound effect each number corresponds to, so I want to know all the sound effects in visionOS and their corresponding SystemSoundIDs.
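There is no public list of SystemSoundIDs, and IDs observed on iOS are not guaranteed to exist or sound the same on visionOS. For reference, a sketch of playing a sound by numeric ID and of registering your own sound file, where 1104 and the ping.caf resource name are illustrative assumptions:

import AudioToolbox

// Sketch: play a built-in system sound by numeric ID. 1104 is a commonly
// cited iOS keyboard-tap ID, used here only for illustration.
let soundID: SystemSoundID = 1104
AudioServicesPlaySystemSound(soundID)

// For your own bundled sound files, the supported route is to register
// the file and play the returned ID.
if let url = Bundle.main.url(forResource: "ping", withExtension: "caf") {
    var customID: SystemSoundID = 0
    AudioServicesCreateSystemSoundID(url as CFURL, &customID)
    AudioServicesPlaySystemSound(customID)
}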