What is the best way to check whether the audio embedded in a video is Ambisonic?

Question

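We are building a 360-degree video player for VR headsets in Unity, and we are trying to implement spatial audio. The simplest case is when the video and the audio are in separate files, but we have now decided to also support videos with embedded audio. Before loading a video, I need to know whether its audio is Ambisonic. I am looking for a simple way to determine this, so that I can separate the audio from the video and convert it to a .tbe file, which is what our application currently supports.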

I tried using ffmpeg:

$ ./ffmpeg.exe -i ~/Videos/video.mp4

and got:

ffmpeg version 4.1.3 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8.3.1 (GCC) 20190414
  configuration: --enable-gpl --enable-version3 --enable-sdl2 --enable-fontconfig --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libfreetype --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libtheora --enable-libtwolame --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libzimg --enable-lzma --enable-zlib --enable-gmp --enable-libvidstab --enable-libvorbis --enable-libvo-amrwbenc --enable-libmysofa --enable-libspeex --enable-libxvid --enable-libaom --enable-libmfx --enable-amf --enable-ffnvcodec --enable-cuvid --enable-d3d11va --enable-nvenc --enable-nvdec --enable-dxva2 --enable-avisynth
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C:/Users/Medion/Videos/video.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.56.101
  Duration: 00:11:39.40, start: 0.000000, bitrate: 17290 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 3840x2160 [SAR 1:1 DAR 16:9], 16497 kb/s, 29.97 fps, 29.97 tbr, 30k tbn, 59.94 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Side data:
      stereo3d: top and bottom
      spherical: equirectangular (0.000000/0.000000/0.000000)
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 4.0, fltp, 778 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
    Stream #0:2(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 4.0, fltp, 4 kb/s
    Metadata:
      handler_name    : SoundHandler
    Stream #0:3(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 2 kb/s
    Metadata:
      handler_name    : SoundHandler
At least one output file must be specified

But I don't see any line indicating that the audio is Ambisonic. I can see that streams #0:1 and #0:2 have 4 channels, but I bet that alone is not enough.

I also tried MediaInfo, but the result is the same:

General
CompleteName                     : C:\Users\Medion\Videos\video.mp4
Format/String                    : MPEG-4
Format_Profile                   : Base Media
CodecID/String                   : isom (isom/iso2/avc1/mp41)
FileSize/String                  : 1.41 GiB
Duration/String                  : 11 min 39 s
OverallBitRate_Mode/String       : Variable
OverallBitRate/String            : 17.3 Mb/s
Encoded_Application/String       : Lavf57.56.101

Video
ID/String                        : 1
Format/String                    : AVC
Format/Info                      : Advanced Video Codec
Format_Profile                   : High@L5.2
Format_Settings                  : CABAC / 3 Ref Frames
Format_Settings_CABAC/String     : Yes
Format_Settings_RefFrames/String : 3 frames
Format_Settings_GOP              : M=3, N=29
CodecID                          : avc1
CodecID/Info                     : Advanced Video Coding
Duration/String                  : 11 min 39 s
BitRate_Mode/String              : Variable
BitRate/String                   : 16.5 Mb/s
BitRate_Maximum/String           : 20.0 Mb/s
Width/String                     : 3 840 pixels
Height/String                    : 2 160 pixels
DisplayAspectRatio/String        : 16:9
FrameRate_Mode/String            : Constant
FrameRate/String                 : 29.970 (30000/1001) FPS
Standard                         : NTSC
ColorSpace                       : YUV
ChromaSubsampling/String         : 4:2:0
BitDepth/String                  : 8 bits
ScanType/String                  : Progressive
Bits-(Pixel*Frame)               : 0.066
StreamSize/String                : 1.34 GiB (95%)
Language/String                  : English
Tagged_Date                      : UTC 2017-06-13 17:37:51
colour_range                     : Limited
colour_primaries                 : BT.709
transfer_characteristics         : BT.709
matrix_coefficients              : BT.709
Codec configuration box          : avcC

Audio #1
ID/String                        : 2
Format/String                    : AAC LC
Format/Info                      : Advanced Audio Codec Low Complexity
CodecID                          : mp4a-40-2
Duration/String                  : 11 min 39 s
Source_Duration/String           : 11 min 39 s
BitRate_Mode/String              : Constant
BitRate/String                   : 779 kb/s
Channel(s)/String                : 4 channels
ChannelLayout                    : C L R Cb
SamplingRate/String              : 48.0 kHz
FrameRate/String                 : 46.875 FPS (1024 SPF)
Compression_Mode/String          : Lossy
StreamSize/String                : 64.9 MiB (5%)
Source_StreamSize/String         : 64.9 MiB (5%)
Default/String                   : Yes
AlternateGroup/String            : 1
Tagged_Date                      : UTC 2017-06-13 17:37:51

Audio #2
ID/String                        : 3
Format/String                    : AAC LC
Format/Info                      : Advanced Audio Codec Low Complexity
CodecID                          : mp4a-40-2
Duration/String                  : 11 min 39 s
Source_Duration/String           : 11 min 39 s
BitRate_Mode/String              : Variable
BitRate/String                   : 4 900 b/s
BitRate_Maximum/String           : 266 kb/s
Channel(s)/String                : 4 channels
ChannelLayout                    : C L R Cb
SamplingRate/String              : 48.0 kHz
FrameRate/String                 : 46.875 FPS (1024 SPF)
Compression_Mode/String          : Lossy
StreamSize/String                : 418 KiB (0%)
Source_StreamSize/String         : 418 KiB (0%)
Default/String                   : No
AlternateGroup/String            : 1
Tagged_Date                      : UTC 2017-06-13 17:37:51

Audio #3
ID/String                        : 4
Format/String                    : AAC LC
Format/Info                      : Advanced Audio Codec Low Complexity
CodecID                          : mp4a-40-2
Duration/String                  : 11 min 39 s
Source_Duration/String           : 11 min 39 s
BitRate_Mode/String              : Variable
BitRate/String                   : 2 275 b/s
BitRate_Maximum/String           : 128 kb/s
Channel(s)/String                : 2 channels
ChannelLayout                    : L R
SamplingRate/String              : 48.0 kHz
FrameRate/String                 : 46.875 FPS (1024 SPF)
Compression_Mode/String          : Lossy
StreamSize/String                : 194 KiB (0%)
Source_StreamSize/String         : 194 KiB (0%)
Default/String                   : No
AlternateGroup/String            : 1
Tagged_Date                      : UTC 2017-06-13 17:37:51

I assume I just don't know what to look for in these outputs. Thanks in advance.

Tags: audio, ffmpeg, vlc, mediainfo

Answer


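The MediaInfo development snapshot (mini 2190520-2) now supports Spatial Audio RFC (draft) metadata (the file you provided as an example contains such metadata), and it displays the information this way: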

Channel layout                           : Ambisonics (W X Y Z)
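
If you want to automate this check before loading a video, one option is to run the MediaInfo command-line tool and look for a channel-layout line that mentions Ambisonics. The following is only a minimal sketch under a few assumptions: a MediaInfo build with the Spatial Audio support described above is installed and reachable as `mediainfo` on the PATH, its default text report is used, and the helper names `ambisonic_layouts` and `has_ambisonic_audio` are hypothetical, made up for illustration. The parser accepts both the "Channel layout" and "ChannelLayout" spellings, since both appear in the outputs quoted in this thread.

```python
import subprocess


def ambisonic_layouts(mediainfo_text):
    """Return the channel-layout values in a MediaInfo report that mention Ambisonics."""
    layouts = []
    for line in mediainfo_text.splitlines():
        # Split on the first colon only, so values such as Windows paths stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        # Treat "Channel layout" and "ChannelLayout" as the same field name.
        normalized = key.strip().lower().replace(" ", "")
        if normalized == "channellayout" and "ambisonics" in value.lower():
            layouts.append(value.strip())
    return layouts


def has_ambisonic_audio(path, mediainfo_exe="mediainfo"):
    """Run the MediaInfo CLI on `path` and report whether any track looks Ambisonic.

    Assumption: `mediainfo_exe` points at a MediaInfo build that reads the
    Spatial Audio metadata; older builds would only report e.g. "C L R Cb".
    """
    result = subprocess.run(
        [mediainfo_exe, path],
        capture_output=True,
        text=True,
        check=True,
    )
    return bool(ambisonic_layouts(result.stdout))
```

With an older MediaInfo build this check would simply return False for the file above, because the report only contains the plain 4-channel layout, so a negative result is not proof that the audio is not Ambisonic.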
