Speaker Diarization always returns zero for long audio in C#

Problem description

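I'm currently using the Google Cloud Speech-to-Text SDK for C#. The NuGet package I'm using is Google.Cloud.Speech.V1P1Beta1. I'm trying to use speaker diarization on long recordings, but no matter how many speakers are in the audio, it always returns 0 in the speaker tag for every word. Below is my code: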

var longOperation = speech.LongRunningRecognize(new RecognitionConfig()
{
    Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
    DiarizationSpeakerCount = 2,
    EnableSpeakerDiarization = true,
    SampleRateHertz = 16000,
    LanguageCode = "en",
}, RecognitionAudio.FromFile("testRecording.wav"));

longOperation = longOperation.PollUntilCompleted();
var response = longOperation.Result;
Console.WriteLine("Response received successfully.");

foreach (var result in response.Results)
{
    foreach (var alternative in result.Alternatives)
    {
        foreach (var word in alternative.Words)
        {
            Console.WriteLine($"{word.Word}: {word.SpeakerTag}");
        }
    }
}

Tags: c#, google-speech-api

Solution
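
One direction worth trying, offered as a sketch under stated assumptions rather than a confirmed fix. As far as I know, the top-level EnableSpeakerDiarization and DiarizationSpeakerCount fields on RecognitionConfig are marked deprecated in V1P1Beta1 in favor of a nested DiarizationConfig (type SpeakerDiarizationConfig). The API documentation also states that for non-streaming requests the diarization results are provided only in the top alternative of the final result, so the sketch below reads speaker tags from there instead of from every alternative of every result. The class name DiarizationSketch is made up for illustration, the file name and settings are carried over from the question, and setting both MinSpeakerCount and MaxSpeakerCount to 2 is an assumption meant to mirror the original DiarizationSpeakerCount = 2.

```csharp
using System;
using System.Linq;
using Google.Cloud.Speech.V1P1Beta1;

class DiarizationSketch // hypothetical class name, for illustration only
{
    static void Main()
    {
        var speech = SpeechClient.Create();

        var config = new RecognitionConfig
        {
            Encoding = RecognitionConfig.Types.AudioEncoding.Linear16,
            SampleRateHertz = 16000,
            LanguageCode = "en",
            // Nested diarization settings instead of the top-level fields.
            DiarizationConfig = new SpeakerDiarizationConfig
            {
                EnableSpeakerDiarization = true,
                MinSpeakerCount = 2, // assumption: mirrors DiarizationSpeakerCount = 2
                MaxSpeakerCount = 2,
            },
        };

        var operation = speech.LongRunningRecognize(
            config, RecognitionAudio.FromFile("testRecording.wav"));
        var response = operation.PollUntilCompleted().Result;

        // Per the API docs, speaker tags for non-streaming requests appear
        // only in the top alternative of the final result.
        var topAlternative = response.Results.LastOrDefault()
            ?.Alternatives.FirstOrDefault();
        if (topAlternative == null)
        {
            Console.WriteLine("No recognition results were returned.");
            return;
        }

        foreach (var word in topAlternative.Words)
        {
            Console.WriteLine($"{word.Word}: {word.SpeakerTag}");
        }
    }
}
```

If the tags are still all 0 with this configuration, that would suggest the cause lies elsewhere (for example in the audio itself or the selected model), which this sketch does not attempt to diagnose.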

