Running a voice application written in Python with docker run fails

Problem description

I am trying to run a chatbot script from inside a Docker container, but it fails with the following error:

Traceback (most recent call last):
  File "script.py", line 16, in <module>
    with sr.Microphone() as source:
  File "/home/datamastery/.local/lib/python3.8/site-packages/speech_recognition/__init__.py", line 86, in __init__
    device_info = audio.get_device_info_by_index(device_index) if device_index is not None else audio.get_default_input_device_info()
  File "/usr/local/lib/python3.8/dist-packages/pyaudio.py", line 949, in get_default_input_device_info
    device_index = pa.get_default_input_device()
OSError: No Default Input Device Available

Dockerfile:

FROM python:3.6-stretch

RUN pip install --upgrade pip
RUN apt-get update && apt-get install -y espeak
RUN apt-get install portaudio19-dev -y

RUN useradd -rm -d /home/datamastery -s /bin/bash -g root -G sudo -u 1001 datamastery
USER datamastery

WORKDIR /home/datamastery

COPY script.py ./script.py
COPY requirements.txt ./requirements.txt


RUN pip install -r requirements.txt

CMD ["python", "script.py"]

requirements.txt:

pyttsx3==2.90
transformers==4.6.1
SpeechRecognition==3.8.1
torch==1.8.1
PyAudio==0.2.11

script.py:

# import library
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import pyttsx3
import speech_recognition as sr

engineio = pyttsx3.init()
voices = engineio.getProperty("voices")
engineio.setProperty("rate", 130)  # Here you can set the speech rate
engineio.setProperty("voice", voices[0].id)
tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

r = sr.Recognizer()

with sr.Microphone() as source:
    for step in range(5):
        r.adjust_for_ambient_noise(source)
        print("Sprich...")
        audio = r.listen(source, timeout=3)
        print("Danke!")
        audio_text = r.recognize_google(audio)

        new_user_input_ids = tokenizer.encode(
            audio_text + tokenizer.eos_token, return_tensors="pt"
        )
        bot_input_ids = (
            torch.cat([chat_history_ids, new_user_input_ids], dim=-1)
            if step > 0
            else new_user_input_ids
        )
        chat_history_ids = model.generate(
            bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id
        )
        print(chat_history_ids.shape)
        print(type(chat_history_ids))
        new_text = tokenizer.decode(
            chat_history_ids[:, bot_input_ids.shape[-1] :][0], skip_special_tokens=True
        )

        print(new_text)
        # recognize_google() will raise a request error if the API is unreachable, hence the exception handling

        try:
            # using google speech recognition
            engineio.say(new_text)
            engineio.runAndWait()
        except:
            engineio.say("Sorry, did not understand you")

I tried the solution from this link: OSError: No Default Input Device Available, but it gave me an index out-of-range error (I added device_index=0).

  File "/home/datamastery/.local/lib/python3.6/site-packages/speech_recognition/__init__.py", line 84, in __init__
    assert 0 <= device_index < count, "Device index out of range ({} devices available; device index should be between 0 and {} inclusive)".format(count, count - 1)
AssertionError: Device index out of range (0 devices available; device index should be between 0 and -1 inclusive)

Could the cause be that Ubuntu does not recognize my microphone? If so, do I need to install a library or set something in my docker run command?
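
To check whether the microphone is visible at all from inside the container, a small diagnostic along these lines may help. It is only a sketch: `list_input_devices` and `main` are hypothetical helper names, and it relies only on the PyAudio calls `get_device_count()` and `get_device_info_by_index()` plus the `maxInputChannels` field of the returned device info.

```python
def list_input_devices(audio):
    """Return (index, name) for every device that reports at least one input channel.

    `audio` is expected to behave like a pyaudio.PyAudio instance.
    """
    devices = []
    for index in range(audio.get_device_count()):
        info = audio.get_device_info_by_index(index)
        if info.get("maxInputChannels", 0) > 0:
            devices.append((index, info.get("name", "unknown")))
    return devices


def main():
    # Imported here so the helper above can be used without PyAudio installed.
    import pyaudio

    audio = pyaudio.PyAudio()
    try:
        inputs = list_input_devices(audio)
    finally:
        audio.terminate()

    if not inputs:
        print("No input devices are visible to PortAudio in this environment.")
    for index, name in inputs:
        print(f"device_index={index}: {name}")
```

Calling `main()` inside the container and again on the host shows whether the problem is specific to the container. If the list is empty in the container, no value of `device_index` can work, which would match the "0 devices available" assertion above.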

Tags: python, docker, speech-recognition

Solution
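
One plausible explanation, consistent with the "0 devices available" message, is that the container cannot see the host's sound hardware: by default a container does not get access to host device nodes. On a Linux host using ALSA, a commonly used approach is to pass the sound devices through with `docker run --device /dev/snd ...`. Whether that is enough for this setup is an assumption (the user may also need permission to open the devices, and it does not carry over unchanged to Docker Desktop on macOS or Windows). The sketch below, with hypothetical helper names, checks from inside the container whether any ALSA capture nodes are exposed.

```python
import os


def visible_sound_nodes(dev_dir="/dev/snd"):
    """Return the sorted entries under dev_dir, or [] if the directory does not exist."""
    if not os.path.isdir(dev_dir):
        return []
    return sorted(os.listdir(dev_dir))


def describe_audio_access(dev_dir="/dev/snd"):
    """Summarize whether ALSA capture device nodes are visible under dev_dir."""
    nodes = visible_sound_nodes(dev_dir)
    if not nodes:
        return f"{dev_dir} is missing or empty: no ALSA devices are exposed here."
    # ALSA names capture nodes like pcmC0D0c and playback nodes like pcmC0D0p.
    capture = [n for n in nodes if n.startswith("pcm") and n.endswith("c")]
    if not capture:
        return f"{dev_dir} has {len(nodes)} entries but no capture (pcm*c) nodes."
    return f"Capture nodes found: {', '.join(capture)}"
```

Running `print(describe_audio_access())` in the container with and without the `--device` option shows whether the device nodes actually reach the container. If they do and PyAudio still reports no input device, the cause lies elsewhere, for example in permissions or the ALSA configuration.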

