c# - Speaker diarization always returns zero for long audio in C#
Problem description
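I am currently using the Google Cloud Speech-to-Text SDK for C#. The NuGet package I am using is Google.Cloud.Speech.V1P1Beta1. I am trying to take advantage of speaker diarization on longer recordings, but no matter how many speakers are in the audio, every word comes back with a speaker tag of 0. Here is my code: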
var longOperation = speech.LongRunningRecognize(new RecognitionConfig()
{
    Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
    DiarizationSpeakerCount = 2,
    EnableSpeakerDiarization = true,
    SampleRateHertz = 16000,
    LanguageCode = "en",
}, RecognitionAudio.FromFile("testRecording.wav"));
longOperation = longOperation.PollUntilCompleted();
var response = longOperation.Result;
Console.WriteLine("Response received successfully.");
foreach (var result in response.Results)
{
    foreach (var alternative in result.Alternatives)
    {
        foreach (var word in alternative.Words)
        {
            Console.WriteLine($"{word.Word}: {word.SpeakerTag}");
        }
    }
}