Android: how to convert UDP-received audio into a WAV file

Problem description

I am trying to receive streaming audio in my application.

Below is the code I use to receive the audio stream:

public class ClientListen implements Runnable {
    private Context context;

    public ClientListen(Context context) {
        this.context = context;
    }

    @Override
    public void run() {
        boolean run = true;
        try {
            DatagramSocket udpSocket = new DatagramSocket(8765);
            InetAddress serverAddr = null;
            try {
                serverAddr = InetAddress.getByName("127.0.0.1");
            } catch (UnknownHostException e) {
                e.printStackTrace();
            }

            while (run) {
                try {
                    byte[] message = new byte[8000];
                    DatagramPacket packet = new DatagramPacket(message, message.length);
                    Log.i("UDP client: ", "about to wait to receive");
                    udpSocket.setSoTimeout(10000);
                    udpSocket.receive(packet);

                    String text = new String(packet.getData(), 0, packet.getLength());
                    Log.d("Received text", text);
                } catch (IOException e) {
                    Log.e("UDP client", "error: ", e);
                    run = false;
                    udpSocket.close();
                }
            }
        } catch (SocketException e) {
            Log.e("Socket Open:", "Error:", e);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

In the "Received text" log I can see data like the following:

D/Received text: �������n�����������q�9�$�0�/�G�{�������s�����JiH&������d�����Z���������d�����E������C�+
    ��l��y�����������v���9����������u��f�j�������$�����K���������F��~R�2�����T��������������L�����!��G��8������s�;�"�,�R�����(��{�����*_��Z�������5������������\������x���j~������������/��=�����%�������

How can I store this data into a WAV file?

Tags: android, udp, audio-streaming

Solution


  1. What you see is the string representation of a single UDP packet, printed right after the packet arrived and the blocking `receive` call was released. It is only a small piece of the sound you want to convert to a WAV file. Soon the while loop will iterate, you will receive another packet, and so on. You need to collect all the packets in a buffer, and then, when you decide the stream is complete, convert them into a WAV file.
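
The accumulation described in point 1 can be sketched in plain Java (no Android APIs) like this; the class and method names are illustrative, not part of the original code:

```java
import java.io.ByteArrayOutputStream;

// Minimal sketch of collecting UDP packets into one growing buffer
// instead of handling each packet in isolation.
public class PacketAccumulator {
    private final ByteArrayOutputStream speech = new ByteArrayOutputStream();

    // Call once per received DatagramPacket with (packet.getData(), packet.getLength());
    // only the valid part of the backing array is copied.
    public void addPacket(byte[] data, int length) {
        speech.write(data, 0, length);
    }

    // When you decide the stream is complete, take the whole buffer at once.
    public byte[] takeAll() {
        byte[] all = speech.toByteArray();
        speech.reset();
        return all;
    }
}
```

The same `ByteArrayOutputStream` approach can replace the manual `concat` used in the worker thread below, at the cost of one extra copy on `takeAll`.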

  2. Keep in mind that a WAV file is not just the raw sound bytes you get from UDP: you also have to prepend a 44-byte header so that players recognize the file.
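
The 44-byte header from point 2 has a fixed layout (the standard RIFF/WAVE format for plain PCM). A standalone sketch, independent of the `Wave` class shown later (the helper name is mine):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Builds the canonical 44-byte header for an uncompressed PCM WAV file.
// All multi-byte fields are little-endian.
public class WavHeader {
    public static byte[] build(int sampleRate, short channels, short bitsPerSample, int dataLen) {
        int byteRate = sampleRate * channels * bitsPerSample / 8;
        short blockAlign = (short) (channels * bitsPerSample / 8);
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes());   // chunk id
        b.putInt(36 + dataLen);     // chunk size = total file size minus 8
        b.put("WAVE".getBytes());   // format
        b.put("fmt ".getBytes());   // sub-chunk 1 id (note the trailing space)
        b.putInt(16);               // sub-chunk 1 size: 16 for PCM
        b.putShort((short) 1);      // audio format: 1 = PCM
        b.putShort(channels);
        b.putInt(sampleRate);
        b.putInt(byteRate);
        b.putShort(blockAlign);
        b.putShort(bitsPerSample);  // e.g. 16
        b.put("data".getBytes());   // sub-chunk 2 id
        b.putInt(dataLen);          // number of sound bytes that follow
        return b.array();
    }
}
```

Writing this header followed by the raw PCM bytes gives a playable file; the `Wave` class in section D does the same thing field by field.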

  3. Also, if the UDP stream arrives in another encoding format, such as G.711, you must first decode those bytes to PCM. If you don't, you will hear loud noise in the resulting WAV file or in the played stream.
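
For reference, a single G.711 u-law byte can be decoded to a 16-bit PCM sample with the classic bias-and-shift formula; this is equivalent in spirit to the table-based codec shown in section C (the class and method names here are illustrative):

```java
// Formula-based G.711 u-law decode of one byte to a linear 16-bit sample.
public class ULaw {
    public static short decode(byte uLawByte) {
        int u = ~uLawByte & 0xFF;                  // u-law bytes are stored complemented
        int exponent = (u >> 4) & 0x07;            // 3-bit exponent
        int mantissa = u & 0x0F;                   // 4-bit mantissa
        int magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84; // 0x84 is the u-law bias
        return (short) ((u & 0x80) != 0 ? -magnitude : magnitude);
    }
}
```

For example, the byte 0xFF (u-law silence) decodes to 0, and 0x00 decodes to the largest negative sample, -32124.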

  4. The buffer size must be exact. If it is too large (with many empty bytes at the end of the array), you will hear a helicopter-like sound. If you know the exact size of each packet, you can write it to an `AudioTrack` to play the stream, or accumulate the packets and convert them to a WAV file whenever you see fit. If you are not sure about the size, you can use this answer to obtain a buffer and then write that buffer to an `AudioTrack`: Android AudioRecord to Server over UDP Playback Issues. That answer uses Javax because it is quite old, but you only need `AudioTrack` for streaming. That is out of scope here, so I will only show the `AudioTrack` streaming replacement for the Javax `SourceDataLine`:

         final int SAMPLE_RATE = 8000; // Hertz
         final int STREAM_TYPE = AudioManager.STREAM_NOTIFICATION;
         int channelConfig = AudioFormat.CHANNEL_OUT_MONO;
         int encodingFormat = AudioFormat.ENCODING_PCM_16BIT;

         AudioTrack track = new AudioTrack(STREAM_TYPE, SAMPLE_RATE, channelConfig,
               encodingFormat, BUF_SIZE, AudioTrack.MODE_STREAM);
         track.play();
         // .. then, after receiving UDP packets, when the buffer is full:
         if(track != null && packet != null){
             track.write(audioStreamBuffer, 0, audioStreamBuffer.length);
         }
    
  5. You cannot do this on the UI thread (I assume you already know that).

  6. In the code I will show you, I am receiving UDP audio logs from a PTT radio. The audio is encoded as G.711 u-law. Each packet is exactly 172 bytes: the first 12 bytes are the RTP header, which I need to offset (strip) to remove a small noise; the remaining 160 bytes are 20 ms of sound.
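
The per-packet slicing from point 6 (drop the 12-byte RTP header, keep the 160 payload bytes, which at 8 kHz u-law with one byte per sample is 20 ms of audio) can be sketched in plain Java; the class name is illustrative:

```java
import java.util.Arrays;

// Strips the fixed-size RTP header from a received packet and
// returns only the audio payload bytes.
public class RtpStrip {
    static final int RTP_HEADER = 12;

    // Call with (packet.getData(), packet.getLength()).
    // copyOfRange is end-exclusive, so this keeps bytes 12..packetLength-1.
    public static byte[] payload(byte[] packetData, int packetLength) {
        return Arrays.copyOfRange(packetData, RTP_HEADER, packetLength);
    }
}
```

This is exactly what the `Arrays.copyOfRange(packet.getData(), 12, packet.getLength())` line in the worker thread below does inline.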

  7. I have to decode the G.711 u-law bytes into an array of PCM shorts, then take that short array and build a WAV file out of it. I flush the buffer once more than one second has passed without receiving a packet (that tells me the speech has ended; a newly released packet belongs to a new speech, so I can take the old speech and turn it into a WAV file). You may decide on a different buffering strategy depending on what you are doing.

  8. It works well, and the decoded WAV sound is very good. If your UDP stream is already PCM, you do not need the G.711 decoding step: just skip that part.

  9. Finally, I want to mention that I saw many old answers using `javax.sound.sampled`, which looks great because it can easily convert an audio file or stream to WAV format via `AudioFileFormat`, and convert G.711 to PCM via `AudioFormat` operations. Unfortunately, it is not part of current Java for Android. We have to rely on Android's `AudioTrack` instead (and `AudioRecord` if we want to capture sound from the microphone), but `AudioTrack` plays only PCM and does not support G.711, so streaming G.711 through `AudioTrack` produces terrible noise; we must decode it in code before writing it to the track. We also cannot use `AudioInputStream` to convert to a WAV file. I tried to do it the easy way with the `javax.sound.sampled` jar added to my app, but Android kept throwing errors, such as the WAV format not being supported.

A. Add to the manifest:

         <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"/>
         <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"/>
         <uses-permission android:name="android.permission.INTERNET"/>

B. In the worker thread:

    @Override
    public void run(){
        Log.i(TAG, "ClientListen thread started. Thread id: " + Thread.currentThread().getId());
        try{
            udpSocket = new DatagramSocket(port);
        }catch(SocketException e){
            e.printStackTrace();
        }
        byte[] messageBuf = new byte[BUF_SIZE];
        Log.i(TAG, "waiting to receive packet in port: " + port);
        if(udpSocket != null){
            // here you can create a new AudioTrack and call track.play()
            byte[] pttSession = null;
            while (running){
                packet = new DatagramPacket(messageBuf, 0, messageBuf.length);
                Log.d(TAG, "inside while running loop");
                try{
                    Log.d(TAG, "receive block: waiting for user to press the speaker (listening inside udpSocket for a DatagramPacket..)");

                    // stay inside the blocking receive until a packet arrives through this socket
                    long timeBeforeBlock = System.currentTimeMillis();
                    udpSocket.receive(packet);
                    Log.d(TAG, "client received a packet, receive block stopped");
                    // send a msg to the UI thread through its handler (you may skip this)
                    sendState("getting UDP packets...");

                    /* if the receive call was blocked for more than one second, this
                       packet belongs to a new speech, so copy the previous speech
                       to a wave file and empty the buffer */
                    if(System.currentTimeMillis() - timeBeforeBlock > 1000 && pttSession != null){
                        convertBytesToFile(pttSession);
                        pttSession = null;
                    }
                    /* take the packet that was just released and either start a new
                       speech with it or append it to the ongoing one
                       (the first 12 bytes are the RTP header, so strip them) */
                    byte[] slice = Arrays.copyOfRange(packet.getData(), 12, packet.getLength());
                    if(null == pttSession){
                        pttSession = slice;
                    }else{
                        pttSession = concat(pttSession, slice);
                        Log.d(TAG, "pttSession:" + Arrays.toString(pttSession));
                    }
                }catch(IOException e){
                    Log.e(TAG, "UDP client IOException - error: ", e);
                    running = false;
                }
            }
            // take the latest speech and make a last wave file out of it.
            if(pttSession != null){
                convertBytesToFile(pttSession);
                pttSession = null;
            }
            // running == false, so stop listening.
            udpSocket.close();
            handler.sendEmptyMessage(MainActivity.UdpClientHandler.UPDATE_END);
        }else{
            sendState("cannot bind datagram socket to the specified port: " + port);
        }
    }


    private void convertBytesToFile(byte[] byteArray){

        // decode the bytes from G711 u-law to PCM (the outcome is a short array)
        G711UCodec decoder = new G711UCodec();
        int size = byteArray.length;
        short[] shortArray = new short[size];
        decoder.decode(shortArray, byteArray, size, 0);
        String newFileName = "speech_" + System.currentTimeMillis() + ".wav";
        // convert the short array to WAV (prepend the 44-byte header) and save it as a .wav file
        Wave wave = new Wave(SAMPLE_RATE, (short) 1, shortArray, 0, shortArray.length - 1);
        if(wave.writeToFile(Environment.getExternalStoragePublicDirectory(Environment.DIRECTORY_DOWNLOADS), newFileName)){
            Log.d(TAG, "wave.writeToFile successful!");
            sendState("create file: " + newFileName);
        }else{
            Log.w(TAG, "wave.writeToFile failed");
        }
    }

C. The G.711 u-law encode/decode class, taken from: https://github.com/thinktube-kobe/airtube/blob/master/JavaLibrary/src/com/thinktube/audio/G711UCodec.java

 /**
 * G.711 codec. This class provides u-law conversion.
 */
 public class G711UCodec {
 // s00000001wxyz...s000wxyz
 // s0000001wxyza...s001wxyz
 // s000001wxyzab...s010wxyz
 // s00001wxyzabc...s011wxyz
 // s0001wxyzabcd...s100wxyz
 // s001wxyzabcde...s101wxyz
 // s01wxyzabcdef...s110wxyz
 // s1wxyzabcdefg...s111wxyz

private static byte[] table13to8 = new byte[8192];
private static short[] table8to16 = new short[256];

static {
    // b13 --> b8
    for (int p = 1, q = 0; p <= 0x80; p <<= 1, q += 0x10) {
        for (int i = 0, j = (p << 4) - 0x10; i < 16; i++, j += p) {
            int v = (i + q) ^ 0x7F;
            byte value1 = (byte) v;
            byte value2 = (byte) (v + 128);
            for (int m = j, e = j + p; m < e; m++) {
                table13to8[m] = value1;
                table13to8[8191 - m] = value2;
            }
        }
    }

    // b8 --> b16
    for (int q = 0; q <= 7; q++) {
        for (int i = 0, m = (q << 4); i < 16; i++, m++) {
            int v = (((i + 0x10) << q) - 0x10) << 3;
            table8to16[m ^ 0x7F] = (short) v;
            table8to16[(m ^ 0x7F) + 128] = (short) (65536 - v);
        }
    }
}

public int decode(short[] b16, byte[] b8, int count, int offset) {
    for (int i = 0, j = offset; i < count; i++, j++) {
        b16[i] = table8to16[b8[j] & 0xFF];
    }
    return count;
}

public int encode(short[] b16, int count, byte[] b8, int offset) {

    for (int i = 0, j = offset; i < count; i++, j++) {
        b8[j] = table13to8[(b16[i] >> 4) & 0x1FFF];
    }
    return count;
}

 public int getSampleCount(int frameSize) {
    return frameSize;
 }
}

D. Converting to a WAV file, taken from here: https://github.com/google/oboe/issues/320

 import java.io.File;
 import java.io.FileNotFoundException;
 import java.io.FileOutputStream;
 import java.io.IOException;


 public class Wave
 {
        private final int LONGINT = 4;
        private final int SMALLINT = 2;
        private final int INTEGER = 4;
        private final int ID_STRING_SIZE = 4;
        private final int WAV_RIFF_SIZE = LONGINT+ID_STRING_SIZE;
        private final int WAV_FMT_SIZE = (4*SMALLINT)+(INTEGER*2)+LONGINT+ID_STRING_SIZE;
        private final int WAV_DATA_SIZE = ID_STRING_SIZE+LONGINT;
        private final int WAV_HDR_SIZE = WAV_RIFF_SIZE+ID_STRING_SIZE+WAV_FMT_SIZE+WAV_DATA_SIZE;
        private final short PCM = 1;
        private final int SAMPLE_SIZE = 2;
        int cursor, nSamples;
        byte[] output;


public Wave(int sampleRate, short nChannels, short[] data, int start, int end)
{
    nSamples=end-start+1;
    cursor=0;
    output=new byte[nSamples*SMALLINT+WAV_HDR_SIZE];
    buildHeader(sampleRate,nChannels);
    writeData(data,start,end);
}

/*
 by Udi for using byteArray directly
*/
public Wave(int sampleRate, short nChannels, byte[] data, int start, int end)
{
    int size = data.length;
    short[] shortArray = new short[size];
    for (int index = 0; index < size; index++){
        shortArray[index] = (short) data[index];
    }
    nSamples=end-start+1;
    cursor=0;
    output=new byte[nSamples*SMALLINT+WAV_HDR_SIZE];
    buildHeader(sampleRate,nChannels);
    writeData(shortArray,start,end);
}



// ------------------------------------------------------------
private void buildHeader(int sampleRate, short nChannels)
{
    write("RIFF");
    write(output.length - 8);   // RIFF chunk size = total file size minus the 8 bytes of the "RIFF" id and this size field
    write("WAVE");
    writeFormat(sampleRate, nChannels);
}
// ------------------------------------------------------------
public void writeFormat(int sampleRate, short nChannels)
{
    write("fmt ");
    write(WAV_FMT_SIZE-WAV_DATA_SIZE);
    write(PCM);
    write(nChannels);
    write(sampleRate);
    write(nChannels * sampleRate * SAMPLE_SIZE);
    write((short)(nChannels * SAMPLE_SIZE));
    write((short)16);
}
// ------------------------------------------------------------
public void writeData(short[] data, int start, int end)
{
    write("data");
    write(nSamples*SMALLINT);
    for(int i=start; i<=end; write(data[i++]));
}
// ------------------------------------------------------------
private void write(byte b)
{
    output[cursor++]=b;
}
// ------------------------------------------------------------
private void write(String id)
{
    if(id.length() == ID_STRING_SIZE){
        for(int i=0; i<ID_STRING_SIZE; ++i) write((byte)id.charAt(i));
    }
}
// ------------------------------------------------------------
private void write(int i)
{
    write((byte) (i&0xFF)); i>>=8;
    write((byte) (i&0xFF)); i>>=8;
    write((byte) (i&0xFF)); i>>=8;
    write((byte) (i&0xFF));
}
// ------------------------------------------------------------
private void write(short i)
{
    write((byte) (i&0xFF)); i>>=8;
    write((byte) (i&0xFF));
}
// ------------------------------------------------------------
public boolean writeToFile(File fileParent, String filename)
{
    boolean ok = false;

    try {
        File path = new File(fileParent, filename);
        FileOutputStream outFile = new FileOutputStream(path);
        outFile.write(output);
        outFile.close();
        ok = true;
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return ok;
}


/**
 * by Udi, for testing: writes the file under a temp name, so if you write many packets each
 * packet goes into a new file instead of overwriting the previous one. (mainly for debugging)
 * @param fileParent the directory to write into
 * @param filename prefix for the temp file name
 * @return true on success
 */
public boolean writeToTmpFile(File fileParent , String filename)
{
    boolean ok=false;

    try {
        File outputFile = File.createTempFile(filename, ".wav",fileParent);
        FileOutputStream fileoutputstream = new FileOutputStream(outputFile);
        fileoutputstream.write(output);
        fileoutputstream.close();
        ok=true;
    } catch (FileNotFoundException e) {
        e.printStackTrace();
        ok=false;
    } catch (IOException e) {
        ok=false;
        e.printStackTrace();
    }
    return ok;
 }
}
