Android, C++: How to convert the audio sample rate using Oboe's resampler

Problem description

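I am using Oboe to play sound files on Android. I have both 44.1 kHz and 48 kHz files that I want to play on the same audio stream, so I need to resample.

Decoding and playing the files works fine, but since I have two different sample rates, I need to resample (44.1 kHz to 48 kHz is what I am currently attempting, since my audio stream runs at 48 kHz).

So I am trying to resample using Oboe's resampler, but I can't quite figure out how to do it. Following the README guide for converting a fixed number of input frames (which I think is what I need to do?), I tried to implement it as follows. The first part of the code decodes the file and returns if the sample rates are equal (this part works as expected); the second part is where I try to resample when necessary: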

StorageDataSource *StorageDataSource::newFromStorageAsset(AMediaExtractor &extractor,
                                                          const char *fileName,
                                                          AudioProperties targetProperties) {

    std::ifstream stream;
    stream.open(fileName, std::ifstream::in | std::ifstream::binary);
    stream.seekg(0, std::ios::end);
    long size = stream.tellg();
    stream.close();

    constexpr int kMaxCompressionRatio{12};
    const long maximumDataSizeInBytes =
            kMaxCompressionRatio * (size) * sizeof(int16_t);
    auto decodedData = new uint8_t[maximumDataSizeInBytes];

    int32_t rate = NDKExtractor::getSampleRate(extractor);
    int32_t *inputSampleRate = &rate;

    int64_t bytesDecoded = NDKExtractor::decode(extractor, decodedData, targetProperties);
    auto numSamples = bytesDecoded / sizeof(int16_t);

    auto outputBuffer = std::make_unique<float[]>(numSamples);

    // The NDK decoder can only decode to int16, we need to convert to floats
    oboe::convertPcm16ToFloat(
            reinterpret_cast<int16_t *>(decodedData),
            outputBuffer.get(),
            bytesDecoded / sizeof(int16_t));

    if (*inputSampleRate == targetProperties.sampleRate) {
        return new StorageDataSource(std::move(outputBuffer),
                                     numSamples,
                                     targetProperties);
    } else {

        // this is where I try to convert the sample rate

        float *inputBuffer;
        inputBuffer = reinterpret_cast<float *>(decodedData); // is this correct?

        float *outputBuffer2;    // multi-channel buffer to be filled, TODO improve name
        int numInputFrames;  // number of frames of input

        // TODO is this correct?
        numInputFrames = numSamples / 2;

        int numOutputFrames = 0;
        int channelCount = 2;  

        resampler::MultiChannelResampler *mResampler = resampler::MultiChannelResampler::make(
                2, // channel count
                44100, // input sampleRate
                48000, // output sampleRate
                resampler::MultiChannelResampler::Quality::Best); // conversion quality

        int inputFramesLeft = numInputFrames;

        while (inputFramesLeft > 0) {

            if (mResampler->isWriteNeeded()) {
                mResampler->writeNextFrame(inputBuffer);
                inputBuffer += channelCount;
                inputFramesLeft--;
            } else {
                mResampler->readNextFrame(outputBuffer2);
                outputBuffer2 += channelCount;
                numOutputFrames++;
            }
        }
        delete mResampler;

        // return is missing!
    }

    // returning the original data since above code doesn't work properly yet
    return new StorageDataSource(std::move(outputBuffer),
                                 numSamples,
                                 targetProperties);
}

The resampling crashes with SIGSEGV:

A: signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x7fe69c7000
A:     x0  0000007c0e3d1e00  x1  0000007fe69c7000  x2  0000007bb77dd198  x3  0000007bf5432140
A:     x4  0000000000000021  x5  8080800000000000  x6  fefeff7b976e0667  x7  7f7f7f7fff7f7f7f
A:     x8  0000000000000660  x9  0000000000000660  x10 0000000000000000  x11 0000007bf5435840
A:     x12 0000007bb77dd118  x13 0000000000000008  x14 0000007bf54321c0  x15 0000000000000008
A:     x16 0000007bf5432200  x17 0000000000000000  x18 0000007fe69bf7ba  x19 0000007c14e14c00
A:     x20 0000000000000000  x21 0000007c14e14c00  x22 0000007fe69c0d70  x23 0000007bfc6e5dc7
A:     x24 0000000000000008  x25 0000007c9b7705f8  x26 0000007c14e14ca0  x27 0000000000000002
A:     x28 0000007fe69c0aa0  x29 0000007fe69c0420
A:     sp  0000007fe69c0400  lr  0000007bf94f61f0  pc  0000007bf9501b5c
A: backtrace:
A:     #00 pc 0000000000078b5c  /data/app/myapp-G-GmPWmPgOGfffk-qHsQxw==/lib/arm64/libnative-lib.so (resampler::PolyphaseResamplerStereo::readFrame(float*)+684)
A:     #01 pc 000000000006d1ec  /data/app/myapp-G-GmPWmPgOGfffk-qHsQxw==/lib/arm64/libnative-lib.so (resampler::MultiChannelResampler::readNextFrame(float*)+44)
A:     #02 pc 000000000006c84c  /data/app/myapp-G-GmPWmPgOGfffk-qHsQxw==/lib/arm64/libnative-lib.so (StorageDataSource::newFromStorageAsset(AMediaExtractor&, char const*, AudioProperties)+1316)
A:     #03 pc 78bbcdd7f9b20dbe  <unknown>

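Here are my main questions. First, how do I correctly get the number of input frames? How exactly do frames relate to the audio data? I have researched this, but I'm still not sure I understand it. Is it a constant? How is the frame count calculated, and how does it relate to samples, sample rate, and bit rate?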

Second, am I using the correct input data? I am using my decodedData value, since that is what I get from the decoder, and simply reinterpret_cast it to float*.
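For reference, decodedData holds 16-bit integer PCM, so a reinterpret_cast to float* only relabels the bytes and does not convert the values. The float data the resampler needs is what oboe::convertPcm16ToFloat already wrote into outputBuffer. The sketch below shows what such a conversion does; pcm16ToFloat is a hypothetical helper, and the 1/32768 scaling is an assumption about the usual normalization, not a statement about Oboe's internals.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: converts 16-bit PCM samples to floats in [-1.0, 1.0).
// Each sample is converted by value; nothing is reinterpreted in place.
std::vector<float> pcm16ToFloat(const int16_t *source, std::size_t numSamples) {
    constexpr float kScale = 1.0f / 32768.0f;  // assumed normalization factor
    std::vector<float> result(numSamples);
    for (std::size_t i = 0; i < numSamples; ++i) {
        result[i] = static_cast<float>(source[i]) * kScale;
    }
    return result;
}
```

In the code above, this means the resampler input would be outputBuffer.get() rather than a cast of decodedData.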

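Since I am fairly inexperienced with C++, I am not sure whether what I am doing is correct, and I may have introduced several bugs in this code.

Edit: Since I am trying to resample my decoded output, I assume this information about PCM explains what a frame means here: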

For encodings like PCM, a frame consists of the set of samples for all channels at a given point in time, and so the size of a frame (in bytes) is always equal to the size of a sample (in bytes) times the number of channels.

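Is this correct for my case? Would that mean I can derive the number of frames from the number of samples, the sample size in bits, and the channel count?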
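Following the PCM definition quoted above, the frame arithmetic can be sketched as below. The function names are hypothetical and only illustrate the relationship. For interleaved PCM the frame count depends on the buffer size, the sample size, and the channel count; the sample rate only determines how long those frames last (duration = frames / sample rate) and how many frames to expect after resampling.

```cpp
#include <cstdint>

// One frame = one sample per channel, so
// frames = bytes / (bytesPerSample * channelCount).
constexpr int64_t framesFromBytes(int64_t numBytes, int bytesPerSample, int channelCount) {
    return numBytes / (static_cast<int64_t>(bytesPerSample) * channelCount);
}

// The same relationship when the sample count is already known.
constexpr int64_t framesFromSamples(int64_t numSamples, int channelCount) {
    return numSamples / channelCount;
}

// Approximate number of frames after resampling, rounded up.
constexpr int64_t resampledFrameCount(int64_t inputFrames, int32_t inputRate, int32_t outputRate) {
    return (inputFrames * outputRate + inputRate - 1) / inputRate;
}
```

For example, one second of 16-bit stereo audio at 44.1 kHz occupies 176400 bytes, which is 88200 samples or 44100 frames, and resampling it to 48 kHz yields about 48000 frames.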

Tags: android, c++, audio, resampling, oboe

Solution
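A likely cause of the crash is that outputBuffer2 is never allocated, so readNextFrame writes through an uninitialized pointer; the backtrace ending in readFrame(float*) is consistent with that. The input is also a reinterpret_cast of the int16 data instead of the converted float buffer. The following is a minimal sketch of the README loop with an allocated output buffer, not a definitive implementation. resampleAll and HoldResampler are hypothetical names: the template accepts anything exposing isWriteNeeded(), writeNextFrame() and readNextFrame() as used in the question, and HoldResampler is only a crude stand-in so the loop can be run without Oboe.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Feeds interleaved float frames through a resampler and collects the output.
// Hypothetical helper; Resampler must provide isWriteNeeded(),
// writeNextFrame(const float*) and readNextFrame(float*).
template <typename Resampler>
std::vector<float> resampleAll(Resampler &resampler, const float *input,
                               int32_t numInputFrames, int32_t channelCount,
                               int32_t inputRate, int32_t outputRate) {
    // Upper bound on output frames: rounded-up ratio plus one frame of slack.
    const int64_t maxOutputFrames =
            (static_cast<int64_t>(numInputFrames) * outputRate + inputRate - 1) / inputRate + 1;
    std::vector<float> output(static_cast<std::size_t>(maxOutputFrames) * channelCount);

    float *out = output.data();  // allocated storage, unlike outputBuffer2
    int64_t numOutputFrames = 0;
    int32_t inputFramesLeft = numInputFrames;

    while (inputFramesLeft > 0 && numOutputFrames < maxOutputFrames) {
        if (resampler.isWriteNeeded()) {
            resampler.writeNextFrame(input);
            input += channelCount;
            --inputFramesLeft;
        } else {
            resampler.readNextFrame(out);
            out += channelCount;
            ++numOutputFrames;
        }
    }
    // Trim to the frames actually produced.
    output.resize(static_cast<std::size_t>(numOutputFrames) * channelCount);
    return output;
}

// Hypothetical stand-in (zero-order hold), only for exercising the loop.
class HoldResampler {
public:
    HoldResampler(int32_t channelCount, int32_t inputRate, int32_t outputRate)
            : mFrame(static_cast<std::size_t>(channelCount), 0.0f),
              mInputRate(inputRate), mOutputRate(outputRate), mPhase(outputRate) {}

    bool isWriteNeeded() const { return mPhase >= mOutputRate; }

    void writeNextFrame(const float *frame) {
        std::copy(frame, frame + mFrame.size(), mFrame.begin());
        mPhase -= mOutputRate;
    }

    void readNextFrame(float *frame) {
        std::copy(mFrame.begin(), mFrame.end(), frame);
        mPhase += mInputRate;
    }

private:
    std::vector<float> mFrame;  // most recently written frame
    int32_t mInputRate;
    int32_t mOutputRate;
    int64_t mPhase;             // write is needed once this reaches mOutputRate
};
```

With Oboe, the same function would presumably be called with the object returned by resampler::MultiChannelResampler::make(channelCount, inputRate, outputRate, resampler::MultiChannelResampler::Quality::Best), passing outputBuffer.get() as the input, numSamples / channelCount as the frame count, and the extractor's sample rate instead of the hard-coded 44100. The namespace may differ between Oboe versions. The resulting vector's size, not the original numSamples, is the sample count to hand to StorageDataSource.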

