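How can I stop the microphone from listening and break out of the processing loop if the user does not respond within a given number of seconds in GCP Speech-to-Text?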

Problem description

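I am using the sample from the GCP documentation to convert speech to text through the API. When the user does not respond within 2 to 3 seconds, I need to stop the loop. How can I do that? I don't have any function that controls this. I tried the time functions, but they didn't help.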

The sample is at https://cloud.google.com/speech-to-text/docs/samples/speech-transcribe-streaming-mic

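    import os
    import re
    import sys

    from google.cloud import speech

    # Audio recording parameters
    RATE = 16000
    CHUNK = int(RATE / 10)  # 100 ms of audio per chunk

    class MicrophoneStream(object):
        """Opens a recording stream as a generator yielding the audio chunks."""
        # Class body omitted in the question; see the linked sample.

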
    def listen_print_loop(responses):
       
        num_chars_printed = 0
        for response in responses:
            if not response.results:
                continue
            result = response.results[0]
            if not result.alternatives:
                continue
            # Display the transcription of the top alternative.
            transcript = result.alternatives[0].transcript
    
            overwrite_chars = " " * (num_chars_printed - len(transcript))
    
            if not result.is_final:
                sys.stdout.write(transcript + overwrite_chars + "\r")
                sys.stdout.flush()
    
                num_chars_printed = len(transcript)
            #elif time idle
    
            else:
                print(transcript + overwrite_chars)
    
                # Exit recognition if any of the transcribed phrases could be
                # one of our keywords.
                if re.search(r"\b(exit|quit)\b", transcript, re.I):
                    print("Exiting..")
                    break
    
                num_chars_printed = 0
    
    
    def main():
    
        # credentials_STT() is the asker's own helper; its definition is not shown.
        os.environ = credentials_STT()

        language_code = "en-IN"
        client = speech.SpeechClient()
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=RATE,
            language_code=language_code,
        )
    
        streaming_config = speech.StreamingRecognitionConfig(
            config=config, interim_results=True
        )
    
        with MicrophoneStream(RATE, CHUNK) as stream:
            audio_generator = stream.generator()
            requests = (
                speech.StreamingRecognizeRequest(audio_content=content)
                for content in audio_generator)
    
            responses = client.streaming_recognize(streaming_config, requests)
    
            # Now, put the transcription responses to use.
            listen_print_loop(responses) # function called
            print("end")

Tags: python, google-cloud-platform

Solution


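Add the option single_utterance=True so that recognition ends automatically when no more speech is detected. Note, however, that this option is generally intended for short utterances. For more information, see the documentation for this option.

You can check its usage in the Speech-to-Text Python reference.

The streaming_config should look like this: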

        streaming_config = speech.StreamingRecognitionConfig(
            config=config, interim_results=True, single_utterance=True
        )
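
If a fixed 2 to 3 second limit is needed rather than the API's own end-of-utterance detection, one possible client-side approach is to wrap the response iterator in an idle timeout. The following is only a minimal sketch using the standard library, not part of the GCP sample: the helper name iter_with_idle_timeout is hypothetical, and it assumes that no responses arrive while the user is silent.

```python
import queue
import threading


def iter_with_idle_timeout(source, idle_seconds):
    """Yield items from `source`; stop once none arrives for `idle_seconds`.

    Hypothetical helper, not part of google-cloud-speech.
    """
    items = queue.Queue()
    done = object()  # sentinel: `source` finished on its own

    def pump():
        # Runs in a background thread so the consumer can wait with a timeout.
        try:
            for item in source:
                items.put(item)
        finally:
            items.put(done)

    threading.Thread(target=pump, daemon=True).start()

    while True:
        try:
            item = items.get(timeout=idle_seconds)
        except queue.Empty:
            return  # nothing arrived in time: treat the user as idle
        if item is done:
            return
        yield item
```

In the question's main(), this would presumably be used as listen_print_loop(iter_with_idle_timeout(responses, 3)). Leaving the with MicrophoneStream(...) block afterwards should close the microphone, though the background thread keeps waiting on the underlying stream until that stream ends, and any exception raised by the stream is not re-raised to the caller in this sketch.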
