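python - Running a voice application written in Python fails under docker run
Problem description
I am trying to run a chatbot script from inside a Docker container, but it fails with the following error: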
Traceback (most recent call last):
  File "script.py", line 16, in <module>
    with sr.Microphone() as source:
  File "/home/datamastery/.local/lib/python3.8/site-packages/speech_recognition/__init__.py", line 86, in __init__
    device_info = audio.get_device_info_by_index(device_index) if device_index is not None else audio.get_default_input_device_info()
  File "/usr/local/lib/python3.8/dist-packages/pyaudio.py", line 949, in get_default_input_device_info
    device_index = pa.get_default_input_device()
OSError: No Default Input Device Available
Dockerfile:
FROM python:3.6-stretch
RUN pip install --upgrade pip
RUN apt-get update && apt-get install -y espeak
RUN apt-get install portaudio19-dev -y
RUN useradd -rm -d /home/datamastery -s /bin/bash -g root -G sudo -u 1001 datamastery
USER datamastery
WORKDIR /home/datamastery
COPY script.py ./script.py
COPY requirements.txt ./requirements.txt
RUN pip install -r requirements.txt
CMD ["python", "script.py"]
requirements.txt:
pyttsx3==2.90
transformers==4.6.1
SpeechRecognition==3.8.1
torch==1.8.1
PyAudio==0.2.11
script.py:
# import libraries
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import pyttsx3
import speech_recognition as sr

# text-to-speech engine
engineio = pyttsx3.init()
voices = engineio.getProperty("voices")
engineio.setProperty("rate", 130)  # the speech rate can be set here
engineio.setProperty("voice", voices[0].id)

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

r = sr.Recognizer()
with sr.Microphone() as source:
    for step in range(5):
        r.adjust_for_ambient_noise(source)
        print("Sprich...")
        audio = r.listen(source, timeout=3)
        print("Danke!")
        audio_text = r.recognize_google(audio)
        # encode the recognized text and append the end-of-sequence token
        new_user_input_ids = tokenizer.encode(
            audio_text + tokenizer.eos_token, return_tensors="pt"
        )
        # append the new input to the chat history (empty on the first turn)
        bot_input_ids = (
            torch.cat([chat_history_ids, new_user_input_ids], dim=-1)
            if step > 0
            else new_user_input_ids
        )
        chat_history_ids = model.generate(
            bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id
        )
        print(chat_history_ids.shape)
        print(type(chat_history_ids))
        # decode only the newly generated tokens
        new_text = tokenizer.decode(
            chat_history_ids[:, bot_input_ids.shape[-1] :][0], skip_special_tokens=True
        )
        print(new_text)
        # the recognize_*() methods raise a RequestError if the API is unreachable, hence the exception handling
        try:
            # speak the generated reply
            engineio.say(new_text)
            engineio.runAndWait()
        except Exception:
            engineio.say("Sorry, did not understand you")
I tried the solution from this link: OSError: No Default Input Device Available, but after adding device_index=0 I only got an index-out-of-range error:
  File "/home/datamastery/.local/lib/python3.6/site-packages/speech_recognition/__init__.py", line 84, in __init__
    assert 0 <= device_index < count, "Device index out of range ({} devices available; device index should be between 0 and {} inclusive)".format(count, count - 1)
AssertionError: Device index out of range (0 devices available; device index should be between 0 and -1 inclusive)
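The assertion above says that 0 devices are available, so no value of device_index can work until PortAudio sees at least one input device. A minimal sketch for checking this from inside the container, assuming PyAudio is installed as listed in requirements.txt. The helper name list_input_devices is hypothetical, and it returns None when PyAudio cannot be imported:

```python
def list_input_devices():
    """Return (index, name) pairs for devices that PortAudio reports as inputs.

    Returns None if PyAudio is not importable in this environment.
    """
    try:
        import pyaudio
    except ImportError:
        return None

    pa = pyaudio.PyAudio()
    try:
        devices = []
        for index in range(pa.get_device_count()):
            info = pa.get_device_info_by_index(index)
            # Only devices with at least one input channel can act as a microphone.
            if info.get("maxInputChannels", 0) > 0:
                devices.append((index, info["name"]))
        return devices
    finally:
        pa.terminate()


if __name__ == "__main__":
    found = list_input_devices()
    if found is None:
        print("PyAudio is not installed")
    elif not found:
        print("No input devices visible to PortAudio")
    else:
        for index, name in found:
            print(index, name)
```

An empty list here would match the "0 devices available" message and suggests the problem lies in what the container can see rather than in the script.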
Could the cause be that Ubuntu does not recognize my microphone? If so, do I need to install a library or set something in my docker run command?
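One way to narrow this down is to check whether any sound device nodes are visible inside the container at all. The sketch below assumes a Linux host using ALSA, where sound devices appear under /dev/snd. A container normally sees host devices only if they are passed in explicitly, for example with the --device option of docker run. The helper name alsa_device_nodes is hypothetical, and this is a diagnostic aid under those assumptions, not a confirmed fix for the error above:

```python
import os


def alsa_device_nodes(snd_dir="/dev/snd"):
    """Return the sorted entries under snd_dir, or [] if the directory is missing."""
    if not os.path.isdir(snd_dir):
        return []
    return sorted(os.listdir(snd_dir))


if __name__ == "__main__":
    nodes = alsa_device_nodes()
    if nodes:
        print("Sound device nodes visible here:", nodes)
    else:
        # Assumption: on an ALSA-based Linux host this usually means the
        # container was started without access to the host's /dev/snd.
        print("No entries under /dev/snd; no ALSA devices are visible")
```

Running it once on the host and once inside the container shows whether the devices are missing only in the container.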