python - How to resolve the "Request contains an invalid argument" error when transcribing an audio file with Google Cloud Speech
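Problem description
We are able to create a unique bucket on every run of the program; however, it hits a snag when it reaches the transcribe_gcs function. We want the program to transcribe the audio file uploaded to the bucket, but the transcription step does not work properly.
We changed the gcs_uri directory to "gs://". This allows a unique bucket to be created each time.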
def transcribe_gcs(gcs_uri):
    """Asynchronously transcribes the audio file specified by the gcs_uri."""
    #from google.cloud import speech
    from google.cloud.speech import enums
    from google.cloud.speech import types
    from google.cloud.speech_v1p1beta1 import enums
    from google.cloud.speech_v1p1beta1 import types
    audio = types.RecognitionAudio(uri=gcs_uri)
    config = types.RecognitionConfig(
        encoding='LINEAR16',
        sample_rate_hertz=44100,
        language_code='en-US',
        enable_speaker_diarization=True,
        diarization_speaker_count=2)
    client = speech.SpeechClient()
    ##response = client.recognize(config, audio)
    operation = client.long_running_recognize(config, audio)
    print('Waiting for operation to complete...')
    response = operation.result(timeout=3000)
    result = response.results[-1]
    words_info = result.alternatives[0].words
    tag = 1
    speaker = ""
    for word_info in words_info:
        if word_info.speaker_tag == tag:
            speaker = speaker + " " + word_info.word #need to adjust how speakers are actually separated
        else:
            print("Speaker {}: {}".format(tag, speaker)) #get program to print entire transcript through here
            tag = word_info.speaker_tag
            speaker = "" + word_info.word #make sentiment analysis work on each individual line
    # Each result is for a consecutive portion of the audio. Iterate through
    # them to get the transcripts for the entire audio file.
    for result in response.results:
        # The first alternative is the most likely one for this portion.
        print(u'Transcript: {}'.format(result.alternatives[0].transcript)) #this should be removed eventually but should be used somehow to modify the speaker portion
        transcribedSpeechFile = open('speechToAnalyze.txt', 'a+') # this is where a text file is made with the transcribed speech
        transcribedSpeechFile.write(format(result.alternatives[0].transcript))
        transcribedSpeechFile.close()
        confidencePercentage = result.alternatives[0].confidence
        confidencePercentage = confidencePercentage * 100
        print("Confidence level of transcription: {}%".format(round(confidencePercentage, 2)))
# [END speech_transcribe_async_gcs]
if __name__ == '__main__':
    transcribe_gcs(gcs_uri)
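A side note on the speaker loop above: it only prints when the speaker tag changes, so the words of the last speaker are never printed. One possible way to separate the speakers is sketched below. It is not part of the original code; group_by_speaker is a hypothetical helper, and plain (speaker_tag, word) tuples stand in for the word objects returned by the API.

```python
from itertools import groupby


def group_by_speaker(tagged_words):
    """Collapse consecutive (speaker_tag, word) pairs into (speaker_tag, text) turns.

    Hypothetical helper: the input is a list of plain tuples, not API objects.
    """
    turns = []
    # groupby starts a new group every time the speaker tag changes,
    # so the final speaker's words are included as well.
    for tag, run in groupby(tagged_words, key=lambda pair: pair[0]):
        turns.append((tag, " ".join(word for _, word in run)))
    return turns


# Made-up sample data, for illustration only.
sample = [(1, "hello"), (1, "there"), (2, "hi"), (1, "bye")]
for tag, text in group_by_speaker(sample):
    print("Speaker {}: {}".format(tag, text))
```

With a real response, the input could presumably be built as [(w.speaker_tag, w.word) for w in words_info].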
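Expected result: the audio file uploaded to the unique bucket is transcribed.
Actual result: a bucket is created, but nothing more happens.
Error: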
Traceback (most recent call last):
File "C:\Users\Dave\AppData\Roaming\Python\Python37\site-packages\google\api_core\grpc_helpers.py", line 57, in error_remapped_callable
return callable_(*args, **kwargs)
File "C:\Users\Dave\AppData\Roaming\Python\Python37\site-packages\grpc\_channel.py", line 565, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "C:\Users\Dave\AppData\Roaming\Python\Python37\site-packages\grpc\_channel.py", line 467, in _end_unary_response_blocking
raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Request contains an invalid argument."
debug_error_string = "{"created":"@1564207941.288000000","description":"Error received from peer ipv6:[2607:f8b0:4000:80e::200a]:443","file":"src/core/lib/surface/call.cc","file_line":1052,"grpc_message":"Request contains an invalid argument.","grpc_status":3}"
>
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:/Users/Dave/Desktop/mizu/test.py", line 120, in <module>
transcribe_gcs(gcs_uri)
File "C:/Users/Dave/Desktop/mizu/test.py", line 80, in transcribe_gcs
operation = client.long_running_recognize(config, audio)
File "C:\Users\Dave\AppData\Local\Programs\Python\Python37\lib\site-packages\google\cloud\speech_v1p1beta1\gapic\speech_client.py", line 326, in long_running_recognize
request, retry=retry, timeout=timeout, metadata=metadata
File "C:\Users\Dave\AppData\Roaming\Python\Python37\site-packages\google\api_core\gapic_v1\method.py", line 143, in __call__
return wrapped_func(*args, **kwargs)
File "C:\Users\Dave\AppData\Roaming\Python\Python37\site-packages\google\api_core\retry.py", line 273, in retry_wrapped_func
on_error=on_error,
File "C:\Users\Dave\AppData\Roaming\Python\Python37\site-packages\google\api_core\retry.py", line 182, in retry_target
return target()
File "C:\Users\Dave\AppData\Roaming\Python\Python37\site-packages\google\api_core\timeout.py", line 214, in func_with_timeout
return func(*args, **kwargs)
File "C:\Users\Dave\AppData\Roaming\Python\Python37\site-packages\google\api_core\grpc_helpers.py", line 59, in error_remapped_callable
six.raise_from(exceptions.from_grpc_error(exc), exc)
File "<string>", line 3, in raise_from
google.api_core.exceptions.InvalidArgument: 400 Request contains an invalid argument
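Solution
After making some changes to your code, as @siamsot suggested in his comment, I was able to reproduce the error you are getting. It occurs only when you do not pass a valid gcs_uri. It should be of type string and in the format: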
gs://[BUCKET_NAME]/[PATH_TO_FILE]/[FILENAME]
as in the Google sample that @Huy Nguyen posted in their answer:
gs://gcs-test-data/vr.flac
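To catch a malformed URI before the request is sent, a small check along these lines might help. This is only a sketch: build_gcs_uri and split_gcs_uri are hypothetical helper names, and the pattern checks nothing beyond the gs://bucket/object shape described above (it does not verify that the bucket or the file exists).

```python
import re

# Checks only the gs://[BUCKET_NAME]/[PATH_TO_FILE] shape, not whether the object exists.
_GCS_URI = re.compile(r"^gs://(?P<bucket>[^/]+)/(?P<blob>.+)$")


def build_gcs_uri(bucket_name, blob_name):
    """Join a bucket name and an object path into a gs:// URI (hypothetical helper)."""
    return "gs://{}/{}".format(bucket_name.strip("/"), blob_name.lstrip("/"))


def split_gcs_uri(gcs_uri):
    """Return (bucket, blob), or raise ValueError if the URI is incomplete."""
    if not isinstance(gcs_uri, str):
        raise ValueError("gcs_uri must be a string, got {!r}".format(type(gcs_uri)))
    match = _GCS_URI.match(gcs_uri)
    if match is None:
        raise ValueError(
            "expected gs://[BUCKET_NAME]/[PATH_TO_FILE], got {!r}".format(gcs_uri))
    return match.group("bucket"), match.group("blob")
```

Calling split_gcs_uri("gs://") raises ValueError, while split_gcs_uri("gs://gcs-test-data/vr.flac") returns the bucket and file name.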
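I suspect you do not have gs:// in your gcs_uri. I managed to transcribe the sample file above with your code. If you want to test it, change the imports to: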
from google.cloud import speech_v1p1beta1 as speech
#from google.cloud.speech import enums
#from google.cloud.speech import types
#from google.cloud.speech_v1p1beta1 import enums
from google.cloud.speech_v1p1beta1 import types
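and pass 'gs://gcs-test-data/vr.flac' as gcs_uri to the transcribe_gcs function.
Since this file differs from the one your code expects, you should change the encoding and sample_rate_hertz attributes of RecognitionConfig to 'FLAC' and 16000, respectively.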
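If the encoding needs to change depending on the file, the settings could be chosen in one place, for example as sketched here. The names recognition_settings and ENCODING_BY_EXTENSION are hypothetical, and the extension mapping is an assumption: '.wav' maps to 'LINEAR16' only for 16-bit PCM WAV files. The sample rate cannot be read from the file name, so it is passed in explicitly.

```python
import os

# Assumed mapping from file extension to a RecognitionConfig encoding name.
# '.wav' -> 'LINEAR16' only holds for 16-bit PCM WAV files.
ENCODING_BY_EXTENSION = {".flac": "FLAC", ".wav": "LINEAR16"}


def recognition_settings(gcs_uri, sample_rate_hertz, language_code="en-US"):
    """Return keyword arguments for RecognitionConfig as a plain dict (hypothetical helper)."""
    extension = os.path.splitext(gcs_uri)[1].lower()
    if extension not in ENCODING_BY_EXTENSION:
        raise ValueError("no encoding known for extension {!r}".format(extension))
    return {
        "encoding": ENCODING_BY_EXTENSION[extension],
        "sample_rate_hertz": sample_rate_hertz,
        "language_code": language_code,
    }


print(recognition_settings("gs://gcs-test-data/vr.flac", 16000))
```

The resulting dict could then be unpacked into the config, e.g. types.RecognitionConfig(**settings), with the encoding passed as a string name the same way your code already does.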