How do I convert an OGG_OPUS input audio stream into a byte stream format that the Google Speech-to-Text API accepts?

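Problem description

Context: I have the URL of a recorded audio stream whose original format is OGG_OPUS. I download the audio from that URL and convert it into a byte stream, as the Google API requires (https://cloud.google.com/speech-to-text/docs/reference/rest/v1p1beta1/RecognitionAudio).

When I pass this byte stream to the Google Speech-to-Text API, I get an empty response.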

Questions:

  1. Why does the Google API return an empty (null) response here?
  2. Does the Google API actually support audio input in OGG_OPUS format?

Code block

import time
from urllib.request import urlopen
from io import BytesIO
import requests
import base64

# Imports the Google Cloud client library
from google.cloud import speech

# Instantiates a client
client = speech.SpeechClient()
url = "https://drive.google.com/file/d/1zlJaptJYJe0ge_SkpB52N6uRTsEKUGG4/view?usp=sharing"

response  = requests.get(url,stream=True)
output = base64.b64encode(BytesIO(response.content).read())

audio = speech.RecognitionAudio(content=output)

config = speech.RecognitionConfig(
   encoding=speech.RecognitionConfig.AudioEncoding.OGG_OPUS,
   sample_rate_hertz=16000,
   language_code="en-US",
)

# Detects speech in the audio file
response = client.recognize(config=config, audio=audio)
print("response: ", response)

Tags: python, audio, ogg, google-speech-to-text-api

Solution
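No answer was recorded here, so the following is only a hedged sketch of things worth checking, not a confirmed diagnosis. Three likely suspects in the code above: (1) a Google Drive link ending in /view?usp=sharing normally serves an HTML viewer page rather than the file itself, so response.content may not be audio at all; (2) the Python client's RecognitionAudio(content=...) takes raw bytes, and base64 encoding is only needed in the JSON REST representation, so b64encode probably corrupts the payload; (3) sample_rate_hertz has to match the file, and for OGG_OPUS the documentation lists 8000, 12000, 16000, 24000 and 48000 Hz as valid values. The helper names looks_like_ogg_opus, describe_payload and transcribe are hypothetical and introduced only for illustration.

```python
import base64

OGG_MAGIC = b"OggS"        # capture pattern that starts every Ogg page
OPUS_MAGIC = b"OpusHead"   # identification header of an Opus stream


def looks_like_ogg_opus(data: bytes) -> bool:
    """Return True if data starts like an Ogg container carrying Opus."""
    return data[:4] == OGG_MAGIC and OPUS_MAGIC in data[:64]


def describe_payload(data: bytes) -> str:
    """Give a rough label for what was actually downloaded."""
    if looks_like_ogg_opus(data):
        return "ogg_opus"
    head = data[:256].lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    # base64 of any bytes starting with "OggS" begins with "T2dnU"
    if data[:5] == b"T2dnU":
        return "base64_ogg"
    return "unknown"


def transcribe(data: bytes, sample_rate_hertz: int = 16000):
    """Send raw Ogg/Opus bytes to Speech-to-Text (needs credentials)."""
    # Imported lazily so the offline checks work without google-cloud-speech.
    from google.cloud import speech

    if not looks_like_ogg_opus(data):
        raise ValueError(f"expected Ogg/Opus bytes, got: {describe_payload(data)}")
    client = speech.SpeechClient()
    audio = speech.RecognitionAudio(content=data)  # raw bytes, no base64
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.OGG_OPUS,
        sample_rate_hertz=sample_rate_hertz,  # assumption: must match the file
        language_code="en-US",
    )
    return client.recognize(config=config, audio=audio)


# Offline demo on a synthetic first Ogg page (not real audio).
fake_page = b"OggS" + bytes(22) + b"\x01\x13" + b"OpusHead" + bytes(11)
print(describe_payload(fake_page))
print(describe_payload(base64.b64encode(fake_page)))
print(describe_payload(b"<!DOCTYPE html><html></html>"))
```

In the question's code, calling describe_payload(response.content) right after requests.get would show whether the download is an HTML page or real Ogg/Opus data before any request is sent to the API. If it reports "html", the URL needs to point at the file itself; if it reports "ogg_opus", pass response.content to transcribe unchanged.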

