Swift AVAudioEngine: converting a multichannel non-interleaved signal to mono

Problem description

I am using AVAudioEngine for measurement. I play a stimulus sound out of the interface and record the returning signal with a mic tap.

I am now looking at supporting audio interfaces that run in a variety of formats. I convert the inputNode's format through a mixer for two distinct reasons:

  1. to downsample from the interface's preferred sampleRate to the sampleRate my application runs at

  2. to convert the incoming channel count down to a single mono channel
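Conceptually, the mono conversion in step 2 is just an average across channels. A minimal pure-Swift sketch of that downmix (the helper `downmixToMono` is my own illustration, not an AVFoundation API):

```swift
// Average N non-interleaved (planar) channel buffers into one mono buffer.
// All channels are assumed to contain the same number of frames.
func downmixToMono(channels: [[Float]]) -> [Float] {
    guard let frameCount = channels.first?.count else { return [] }
    var mono = [Float](repeating: 0, count: frameCount)
    for channel in channels {
        for i in 0..<frameCount {
            mono[i] += channel[i]
        }
    }
    let scale = 1 / Float(channels.count)
    return mono.map { $0 * scale }
}
```

This is what a mixer node effectively does for you when the connection formats are negotiated correctly.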

I tried this, but it does not always seem to work as expected. If my interface runs at 96k and my app runs at 48k, the format change through the mixer ends up with the following: [image: waveform of the recorded signal]

It looks as if I am only getting one side of a stereo interleaved channel. Below is my audio engine code:

func initializeEngine(inputSweep:SweepFilter)  {
    buf1current = 0
    buf2current = 0
    in1StartTime = 0
    in2startTime = 0
    in1firstRun = true
    in2firstRun = true
    in1Buf = Array(repeating:0, count:1000000)
    in2Buf = Array(repeating:0, count:1000000)
    engine.stop()
    engine.reset()
    engine = AVAudioEngine()
    numberOfSamples = 0

    var time:Int = 0
    do {
        try AVAudioSession.sharedInstance().setCategory(.playAndRecord)
        try AVAudioSession.sharedInstance()
        .setPreferredSampleRate(Double(sampleRate))    
    } catch {
        assertionFailure("AVAudioSession setup failed")
    }

    let format = engine.outputNode.inputFormat(forBus: 0)
    let stimulusFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
        sampleRate: Double(sampleRate),
        channels: 1,
        interleaved: false)

    let outputFormat = engine.outputNode.inputFormat(forBus: 0)
    let inputFormat = engine.inputNode.outputFormat(forBus: 0)

    let srcNode = AVAudioSourceNode { _, timeStamp, frameCount, audioBufferList -> OSStatus in
        let ablPointer = UnsafeMutableAudioBufferListPointer(audioBufferList)
        if self.in2firstRun {
            self.in2startTime = CACurrentMediaTime()
            self.in2firstRun = false
        }

        if Int(frameCount) + time >= inputSweep.stimulus.count {
            self.running = false
            print("AUDIO ENGINE STOPPED")
        }

        if (Int(frameCount) + time) <= inputSweep.stimulus.count {
            for frame in 0..<Int(frameCount) {
                let value = inputSweep.stimulus[frame + time] * Float(outputVolume)
                for buffer in ablPointer {
                    let buf: UnsafeMutableBufferPointer<Float> = UnsafeMutableBufferPointer(buffer)
                    buf[frame] = value
                }
            }

            time += Int(frameCount)
        } else {
            // Stimulus exhausted: fill the remaining frames with silence.
            for frame in 0..<Int(frameCount) {
                for buffer in ablPointer {
                    let buf: UnsafeMutableBufferPointer<Float> = UnsafeMutableBufferPointer(buffer)
                    buf[frame] = 0
                }
            }
        }
    return noErr
    }

    engine.attach(srcNode)
    engine.connect(srcNode, to: engine.mainMixerNode, format: stimulusFormat)
    engine.connect(engine.mainMixerNode, to: engine.outputNode, format: format)

    let requiredFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
        sampleRate: Double(sampleRate),
        channels: 1,
        interleaved: false)  

    let formatMixer = AVAudioMixerNode()
    engine.attach(formatMixer)
    engine.connect(engine.inputNode, to: formatMixer, format: inputFormat)

    let MicSinkNode = AVAudioSinkNode() { (timeStamp, frames, audioBufferList) -> OSStatus in
        if self.in1firstRun {
            self.in1StartTime = CACurrentMediaTime()
            self.in1firstRun = false
        }

        let ptr = audioBufferList.pointee.mBuffers.mData?.assumingMemoryBound(to: Float.self)
        var monoSamples = [Float]()
        monoSamples.append(contentsOf: UnsafeBufferPointer(start: ptr, count: Int(frames)))
        if self.buf1current >= 100000 {
            self.running = false
        }
        for frame in 0..<Int(frames) {
            self.in1Buf[self.buf1current + frame] = monoSamples[frame]
        }
        self.buf1current += Int(frames)

        return noErr
    }

    engine.attach(MicSinkNode)
    engine.connect(formatMixer, to: MicSinkNode, format: requiredFormat)

    engine.prepare()
    // Note: engine.inputNode is non-optional, so asserting it != nil is meaningless.
    running = true
    try! engine.start()
}

My sourceNode is an array of floats synthesised using the stimulusFormat. If I listen to this audioEngine with the interface running at 96k, the output stimulus sounds perfectly clean. It is the signal coming from the mic tap that is broken. Physically, the interface's output is routed straight back into its input, so the signal does not pass through any other device.
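For what it's worth, the symptom is consistent with an interleaved stereo stream being read as if it were mono: every other sample then belongs to the other channel. A pure-Swift illustration of the layout (the `deinterleave` helper is hypothetical, not an AVFoundation call):

```swift
// Split an interleaved stereo buffer [L0, R0, L1, R1, ...] back into its
// two planar channels. Reading such a buffer as mono picks up samples from
// both channels alternately.
func deinterleave(_ interleaved: [Float]) -> (left: [Float], right: [Float]) {
    var left: [Float] = []
    var right: [Float] = []
    for (i, sample) in interleaved.enumerated() {
        if i % 2 == 0 { left.append(sample) } else { right.append(sample) }
    }
    return (left, right)
}
```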

In addition, I have the following function, which records my arrays to a WAV file so I can inspect them visually in a DAW:

func writetoFile(buff: [Float], name: String) {
    let SAMPLE_RATE = sampleRate

    let outputFormatSettings = [
        AVFormatIDKey: kAudioFormatLinearPCM,
        AVLinearPCMBitDepthKey: 32,
        AVLinearPCMIsFloatKey: true,
        AVLinearPCMIsBigEndianKey: true,
        AVSampleRateKey: SAMPLE_RATE,
        AVNumberOfChannelsKey: 1
    ] as [String: Any]

    let fileName = name
    let DocumentDirURL = try! FileManager.default.url(for: .documentDirectory, in: .userDomainMask, appropriateFor: nil, create: true)

    let url = DocumentDirURL.appendingPathComponent(fileName).appendingPathExtension("wav")
    //print("FilePath: \(url.path)")

    let audioFile = try? AVAudioFile(forWriting: url, settings: outputFormatSettings, commonFormat: AVAudioCommonFormat.pcmFormatFloat32, interleaved: false)

    let bufferFormat = AVAudioFormat(settings: outputFormatSettings)

    let outputBuffer = AVAudioPCMBuffer(pcmFormat: bufferFormat!, frameCapacity: AVAudioFrameCount(buff.count))

    for i in 0..<buff.count {
        outputBuffer?.floatChannelData!.pointee[i] = buff[i]
    }
    outputBuffer!.frameLength = AVAudioFrameCount(buff.count)

    do {
        try audioFile?.write(from: outputBuffer!)
    } catch let error as NSError {
        print("error:", error.localizedDescription)
    }
}
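As a sanity check on the files this produces: 32-bit float mono means 4 bytes per frame, so the PCM data size should be frames × 4. A small helper I use for checking this (my own, not part of AVFoundation):

```swift
// Expected PCM data size in bytes for a linear-PCM file.
// If the file on disk is roughly twice this, the writer has likely treated
// the buffer as having twice the channels (or the wrong interleaving).
func expectedDataBytes(frames: Int, channels: Int, bitDepth: Int = 32) -> Int {
    return frames * channels * (bitDepth / 8)
}
```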

If I set my interface to 48k while my app also runs at 48k, and I compare my reference signal against my measured signal, I get the following:

[image: reference signal vs. measured signal]

The measured signal is clearly much longer than the original stimulus. The physical file size is the same, since it is initialised as an empty array of fixed size, but at some point the format conversion goes wrong. If I set the interface to 44.1k while my app runs at 48k, I can see regular "glitches" in the audio. So the format conversion is not working correctly here.
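For reference, what the converter has to do between two rates like 96k and 48k, or 44.1k and 48k, is a fractional-ratio resample. A naive linear-interpolation version in pure Swift (illustration only; Core Audio's converter is far higher quality) shows the basic idea:

```swift
// Naive linear-interpolation sample-rate converter. Each output sample is
// interpolated between the two nearest input samples at the mapped position.
func resample(_ input: [Float], from srcRate: Double, to dstRate: Double) -> [Float] {
    guard !input.isEmpty, srcRate > 0, dstRate > 0 else { return [] }
    let ratio = srcRate / dstRate
    let outCount = Int(Double(input.count) / ratio)
    var output = [Float](repeating: 0, count: outCount)
    for i in 0..<outCount {
        let pos = Double(i) * ratio
        let idx = Int(pos)
        let frac = Float(pos - Double(idx))
        let next = min(idx + 1, input.count - 1)
        output[i] = input[idx] * (1 - frac) + input[next] * frac
    }
    return output
}
```

If the engine instead misreads the stream layout (channel count or interleaving) while resampling, the output length and continuity both break, which would match both symptoms above.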

Can anyone spot an obvious mistake?

Tags: swift, audio, measurement, avaudioengine, audioformat

Solution


Put the non-interleaved option AVLinearPCMIsNonInterleaved into the format settings:

let outputFormatSettings = [
    AVLinearPCMIsNonInterleaved: 0,
    AVFormatIDKey: kAudioFormatLinearPCM,
    AVLinearPCMBitDepthKey: 32,
    AVLinearPCMIsFloatKey: true,
    AVLinearPCMIsBigEndianKey: true,
    AVSampleRateKey: SAMPLE_RATE,
    AVNumberOfChannelsKey: 1
] as [String: Any]

It worked for me; let me know.
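My reading of why the flag matters (an assumption on my part, not from the documentation): interleaved and non-interleaved buffers hold the same frames in different memory layouts, so when the writer's idea of the layout disagrees with the buffer's actual layout, samples land in the wrong positions. A toy pure-Swift illustration of the two layouts:

```swift
// The same stereo frames in interleaved layout [L0, R0, L1, R1, ...].
// A non-interleaved (planar) representation keeps the two channel arrays
// separate; misinterpreting one layout as the other scrambles the samples.
func interleave(left: [Float], right: [Float]) -> [Float] {
    var out: [Float] = []
    for i in 0..<min(left.count, right.count) {
        out.append(left[i])
        out.append(right[i])
    }
    return out
}
```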

