I have an AVAudioPlayerNode connected to an AVAudioEngine. Sample buffers are supplied to the playerNode via the scheduleBuffer() method.

However, the playerNode appears to be distorting the audio. Instead of simply "passing through" the buffers, the output is distorted and contains static (though it remains mostly audible).

Relevant code:
let myBufferFormat = AVAudioFormat(standardFormatWithSampleRate: 48000, channels: 2)

// Configure player node
let playerNode = AVAudioPlayerNode()
audioEngine.attach(playerNode)
audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: myBufferFormat)

// Provide audio buffers to playerNode
for await buffer in mySource.streamAudio() {
    await playerNode.scheduleBuffer(buffer)
}
In the example above, mySource.streamAudio() supplies audio in real time from a ScreenCaptureKit SCStreamDelegate. The audio buffers arrive as CMSampleBuffers, are converted to AVAudioPCMBuffers, and are then passed through an AsyncStream to the audio engine above. I have verified that the converted buffers are valid.
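For context, the bridge looks roughly like this. This is a sketch only: AudioSource and the exact plumbing are assumptions (mySource is not shown), and createPCMBuffer(from:) refers to the conversion helper shown in the full example below.

import AVFoundation
import ScreenCaptureKit

// Hypothetical sketch of how mySource.streamAudio() might bridge the
// SCStreamOutput callback into an AsyncStream of PCM buffers.
final class AudioSource: NSObject, SCStreamOutput {
    private var continuation: AsyncStream<AVAudioPCMBuffer>.Continuation?

    func streamAudio() -> AsyncStream<AVAudioPCMBuffer> {
        AsyncStream { continuation in
            self.continuation = continuation
        }
    }

    func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer,
                of type: SCStreamOutputType) {
        // Convert each audio sample buffer and push it into the stream.
        guard type == .audio, let pcmBuffer = createPCMBuffer(from: sampleBuffer) else { return }
        continuation?.yield(pcmBuffer)
    }
}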
Perhaps the buffers aren't arriving fast enough? This chart of roughly 25,000 frames suggests that the inputNode is periodically inserting stretches of "zero" frames:

[chart: waveform of ~25,000 captured frames showing periodic runs of zero-valued samples]

The distortion appears to be the result of these empty frames.
Edit: the distortion persists even if we remove the AsyncStream from the pipeline and process the buffers immediately inside the ScreenCaptureKit callback. Below is an end-to-end example that can be run as-is (the important part is didOutputSampleBuffer):
class Recorder: NSObject, SCStreamOutput {
    private let audioEngine = AVAudioEngine()
    private let playerNode = AVAudioPlayerNode()
    private var stream: SCStream?
    private let queue = DispatchQueue(label: "sampleQueue", qos: .userInitiated)

    func setupEngine() {
        let format = AVAudioFormat(standardFormatWithSampleRate: 48000, channels: 2)
        audioEngine.attach(playerNode)
        // playerNode --> mainMixerNode --> outputNode --> speakers
        audioEngine.connect(playerNode, to: audioEngine.mainMixerNode, format: format)
        audioEngine.prepare()
        try? audioEngine.start()
        playerNode.play()
    }

    func startCapture() async {
        // Capture audio from Safari
        let availableContent = try! await SCShareableContent.excludingDesktopWindows(true, onScreenWindowsOnly: false)
        let display = availableContent.displays.first!
        let app = availableContent.applications.first(where: { $0.applicationName == "Safari" })!
        let filter = SCContentFilter(display: display, including: [app], exceptingWindows: [])
        let config = SCStreamConfiguration()
        config.capturesAudio = true
        config.sampleRate = 48000
        config.channelCount = 2
        stream = SCStream(filter: filter, configuration: config, delegate: nil)
        try! stream!.addStreamOutput(self, type: .audio, sampleHandlerQueue: queue)
        try! stream!.addStreamOutput(self, type: .screen, sampleHandlerQueue: queue) // To prevent warnings
        try! await stream!.startCapture()
    }

    func stream(_ stream: SCStream, didOutputSampleBuffer sampleBuffer: CMSampleBuffer, of type: SCStreamOutputType) {
        switch type {
        case .audio:
            let pcmBuffer = createPCMBuffer(from: sampleBuffer)!
            playerNode.scheduleBuffer(pcmBuffer, completionHandler: nil)
        default:
            break // Ignore video frames
        }
    }

    func createPCMBuffer(from sampleBuffer: CMSampleBuffer) -> AVAudioPCMBuffer? {
        var ablPointer: UnsafePointer<AudioBufferList>?
        try? sampleBuffer.withAudioBufferList { audioBufferList, blockBuffer in
            ablPointer = audioBufferList.unsafePointer
        }
        guard let audioBufferList = ablPointer,
              let absd = sampleBuffer.formatDescription?.audioStreamBasicDescription,
              let format = AVAudioFormat(standardFormatWithSampleRate: absd.mSampleRate, channels: absd.mChannelsPerFrame) else { return nil }
        return AVAudioPCMBuffer(pcmFormat: format, bufferListNoCopy: audioBufferList)
    }
}
let recorder = Recorder()
recorder.setupEngine()
Task {
    await recorder.startCapture()
}
2 Answers

Answer 1
Your "write buffer to file: distorted!" block is almost certainly doing something slow and blocking (namely, writing a file). It is called once every ~170 ms (8192 / 48k). The tap block had better not take longer than that to execute, or you will fall behind and drop buffers.

Writing the file can stay synchronous, but it depends on how you do it. If you do something very inefficient (such as reopening and flushing the file for every buffer), you may simply not keep up.

If this theory is correct, the live speaker output should be free of static; only your output file would be affected.
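To illustrate the "keep the tap cheap" advice, a minimal sketch along these lines opens the file once and leaves each callback with a single append. The function name and parameters are assumptions, not from the question:

import AVFoundation

// Hypothetical helper: open the output file once, outside the tap.
func startWritingOutput(of engine: AVAudioEngine, to url: URL) throws {
    let format = engine.mainMixerNode.outputFormat(forBus: 0)
    let file = try AVAudioFile(forWriting: url, settings: format.settings)

    engine.mainMixerNode.installTap(onBus: 0, bufferSize: 8192, format: format) { buffer, _ in
        do {
            // One append per callback; AVAudioFile buffers internally, so
            // this should fit comfortably inside the ~170 ms budget per
            // 8192-frame buffer at 48 kHz.
            try file.write(from: buffer)
        } catch {
            print("File write failed: \(error)")
        }
    }
}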
Answer 2

The culprit was the createPCMBuffer() function. Replacing it with a version that copies the sample data made everything run smoothly. The original function in my question was taken directly from Apple's ScreenCaptureKit sample project. It technically works, and the audio sounds fine when written to a file, but apparently it is not fast enough for real-time audio.
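A copy-based conversion might look like the following sketch (an assumption about the shape of the fix, not a verbatim reproduction of the replacement; CMSampleBufferCopyPCMDataIntoAudioBufferList performs the copy):

import AVFoundation
import CoreMedia

// Sketch of a copy-based conversion: the samples are copied into memory
// owned by the AVAudioPCMBuffer, so they remain valid after the
// CMSampleBuffer is released.
func createPCMBuffer(from sampleBuffer: CMSampleBuffer) -> AVAudioPCMBuffer? {
    guard let formatDescription = sampleBuffer.formatDescription else { return nil }
    let format = AVAudioFormat(cmAudioFormatDescription: formatDescription)
    let frameCount = AVAudioFrameCount(sampleBuffer.numSamples)
    guard let pcmBuffer = AVAudioPCMBuffer(pcmFormat: format, frameCapacity: frameCount) else { return nil }
    pcmBuffer.frameLength = frameCount
    // Copy the PCM data out of the sample buffer into the new buffer.
    let status = CMSampleBufferCopyPCMDataIntoAudioBufferList(
        sampleBuffer, at: 0, frameCount: Int32(frameCount), into: pcmBuffer.mutableAudioBufferList)
    return status == noErr ? pcmBuffer : nil
}

Because the samples now live in memory the AVAudioPCMBuffer owns, the buffer stays valid no matter how much later the engine renders it.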
**Edit:** Actually, this may have nothing to do with speed, since the new function averages 2-3x slower because it copies the data. The likely cause is instead that the underlying data gets deallocated when the AVAudioPCMBuffer is created from a raw pointer.
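Annotating the original helper makes the suspected hazard concrete (the annotations are mine, a reading of the failure rather than a confirmed diagnosis; the function is renamed here for clarity):

import AVFoundation
import CoreMedia

// The original no-copy helper, annotated with the suspected lifetime hazard.
func createPCMBufferNoCopy(from sampleBuffer: CMSampleBuffer) -> AVAudioPCMBuffer? {
    var ablPointer: UnsafePointer<AudioBufferList>?
    try? sampleBuffer.withAudioBufferList { audioBufferList, blockBuffer in
        // The buffer list is backed by `blockBuffer` and is only guaranteed
        // valid while this closure runs...
        ablPointer = audioBufferList.unsafePointer // ...yet the pointer escapes here.
    }
    guard let audioBufferList = ablPointer,
          let absd = sampleBuffer.formatDescription?.audioStreamBasicDescription,
          let format = AVAudioFormat(standardFormatWithSampleRate: absd.mSampleRate,
                                     channels: absd.mChannelsPerFrame) else { return nil }
    // bufferListNoCopy wraps that memory without copying it. Once the
    // CMSampleBuffer is released after the delegate callback returns, the
    // engine may render from freed memory: static and runs of zeroed frames.
    return AVAudioPCMBuffer(pcmFormat: format, bufferListNoCopy: audioBufferList)
}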