Chrome extension: accessing the microphone via speech recognition calls

Question

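Before you read on: this may be related to "How can a Chrome extension get the user's permission to use the user's computer's microphone?" In case it helps, I have added an answer below that includes the code and my manifest.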

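I'm writing a minimal Chrome extension (using Chrome 75.0.3770.90 on MacOS 10.14.5) to implement a "Listen" button for my accessibility project. I have written an HTML version whose JavaScript works with the microphone.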

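However, when I lift that code into the extension's background.js file, text-to-speech works but speech-to-text does not. The code runs, but the flashing microphone indicator never appears in the tab.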

The code that works is:


    <!DOCTYPE html>
    <html>
        <body>
        <h2>All-in-one JavaScript Example</h2>
        <button onclick="myCode();">Listen</button> 
        <script>
            window.SpeechRecognition = window.webkitSpeechRecognition 
               || window.SpeechRecognition;

            function myCode() {
                recognition = new SpeechRecognition();
                recognition.start();
                recognition.onresult = function(event) {
                    if (event.results[0].isFinal) {
                        response = event.results[0][0].transcript;
                        synth = window.speechSynthesis;
                        synth.speak( new SpeechSynthesisUtterance(
                            "i don't understand, "+response
                        ));
                    }
                };
                alert( "all-in-one: we're done!" );
            }
        </script>
        </body>
    </html>

A minimal reproducible example. My manifest is:


    {
        "name": "myName",
        "description": "Press to talk",
        "version": "0.97",
        "manifest_version": 2,
        "background": {
            "scripts": ["background.js"],
            "persistent": false
        },
        "permissions": ["contentSettings","desktopCapture","*://*/*","tabCapture","tabs","tts","ttsEngine"],
        "browser_action": {
            "default_icon": "images/myIcon512.png",
            "default_title": "Press Ctrl(Win)/Command(Mac)+Shift+ Down to speak"
        },
        "commands": {
            "myCommand": {
                "suggested_key": {
                    "default": "Ctrl+Shift+Down",
                    "mac": "Command+Shift+Down"
                },
                "description": "Start speaking"
            }
        },
        "icons": {
            "512": "images/myIcon512.png"
        }
    }

My background JavaScript is:

    window.SpeechRecognition = window.webkitSpeechRecognition || window.SpeechRecognition;

    function myCode() {
        var recognition = new SpeechRecognition();
        recognition.onresult = function(event) {
            if (event.results[0].isFinal) {
                var synth = window.speechSynthesis;
                synth.speak( new SpeechSynthesisUtterance(
                        "sorry, I don't understand."
                    )
                );
            }   
        }
        recognition.start();
        alert( "extension: we're done!" );
    }
    chrome.commands.onCommand.addListener(function(command) {
        if (command === 'myCommand')
            myCode();
    });

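I have also noticed that the extension code only runs once: in the HTML version I can keep clicking the Listen button, but the extension command only runs once (an alert placed at the start of the function is only shown the first time).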

My browser's default setting is that it should ask (once) for microphone access, which is what it did for the HTML version.
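One way to see why recognition never starts is to log the recognizer's lifecycle events. The sketch below is only a debugging aid and is not part of my extension: `attachDiagnostics` and its `log` parameter are hypothetical names introduced for illustration, while `onstart`, `onerror` and `onend` are standard `SpeechRecognition` event handlers.

```javascript
// Hypothetical debugging helper (not in the original extension):
// reports the recognizer's lifecycle through the supplied log function.
function attachDiagnostics(recognition, log) {
    recognition.onstart = function () {
        log('recognition started');
    };
    recognition.onerror = function (event) {
        // event.error is a short code such as 'not-allowed' or 'no-speech'
        log('recognition error: ' + event.error);
    };
    recognition.onend = function () {
        log('recognition ended');
    };
    return recognition;
}

// Possible use in background.js (assumes a browser environment):
//   var recognition = attachDiagnostics(
//       new SpeechRecognition(),
//       function (msg) { console.log(msg); }
//   );
//   recognition.start();
```

If the log shows an error event right after `start()`, that would suggest the call is being refused rather than silently ignored.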

Thanks for reading this far! I have given an answer, with code, below.

Tags: javascript, google-chrome-extension, speech-recognition

Answer


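The problem I had is that the microphone appears to be handled as a background task, whereas I am trying to interact with the tab content. I don't think this is usual, and I have ended up with a generic "matches" value (`*://*/*`) in my manifest (shown in full):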

    { "name": "Enguage(TM) - Let's all talk to the Web",
      "short_name" : "Enguage",
      "description": "A vocal Web interface",
      "version": "0.98",
      "manifest_version": 2,
      "content_security_policy": "script-src 'self'; object-src 'self'",
      "background": {
        "scripts": ["kbTx.js"],
        "persistent": false
      },
      "content_scripts": [
        { "matches" : ["*://*/*"],
          "js": ["tabRx.js", "interp.js"]
      } ],
      "permissions": [
        "activeTab",
        "contentSettings",
        "desktopCapture",
        "tabCapture",
        "tabs",
        "tts"
      ],
      "browser_action": {
        "default_icon": "images/lbolt512.png",
        "default_title": "Press Ctrl(Win)/Command(Mac)+Shift+ Space and speak"
      },
      "commands": {
        "enguage": {
          "suggested_key": {
            "default": "Ctrl+Shift+Space",
            "mac": "Command+Shift+Space"
          },
          "description": "single utterance"
      } },
      "icons": {
        "16": "images/lbolt16.png",
        "48": "images/lbolt48.png",
        "128": "images/lbolt128.png",
        "512": "images/lbolt512.png"
    } }

I think Google might not like this! Anyway, I have put a keyboard listener in my background code (kbTx.js):


    chrome.commands.onCommand.addListener(function(command) {
        if (command === 'enguage') {
            // Find the active tab in the current window...
            chrome.tabs.query({active: true, currentWindow: true}, function(tabs) {
                // ...and send it a message. The query matches at most one
                // tab, so index 0 is the active tab when the list is non-empty.
                if (!tabs || tabs.length === 0) return;
                chrome.tabs.sendMessage(
                    tabs[0].id,
                    null,  // message payload: none required
                    null   // response callback: none expected
                );
            });
        }
    });
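A possible variation on this listener, sketched under a few assumptions: it sends an explicit message object instead of `null`, and it takes the `chrome` API object as a parameter so the logic can be tried outside the browser. The names `START_LISTENING` and `makeCommandHandler` are hypothetical and not part of my extension; a content script that ignores the message contents would still react to this message.

```javascript
// Hypothetical message type, introduced for illustration only.
const START_LISTENING = 'enguage:start-listening';

// Builds the command handler around an injected chrome-like API object,
// so the same function can be exercised with a fake in a test.
function makeCommandHandler(chromeApi) {
    return function onCommand(command) {
        if (command !== 'enguage') return;
        chromeApi.tabs.query({ active: true, currentWindow: true }, function (tabs) {
            // No active tab (for example, a devtools window has focus).
            if (!tabs || tabs.length === 0) return;
            chromeApi.tabs.sendMessage(tabs[0].id, { type: START_LISTENING });
        });
    };
}

// Register only when running inside an extension background page.
if (typeof chrome !== 'undefined' && chrome.commands) {
    chrome.commands.onCommand.addListener(makeCommandHandler(chrome));
}
```

Naming the message makes it easier to add further commands later without the content script reacting to every message it receives.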

I have put in a content script to listen for this message (tabRx.js):


    window.SpeechRecognition = window.webkitSpeechRecognition ||
                                window.SpeechRecognition;

    chrome.runtime.onMessage.addListener(
        function(request, sender, sendResponse) {
            var recognition = new SpeechRecognition();
            // Configure the recognizer before starting it.
            recognition.continuous = false;
            recognition.onresult = function(event) {
                if (event.results[0].isFinal) {
                    window.speechSynthesis.speak(
                        new SpeechSynthesisUtterance(
                            // interp() is expected to come from interp.js
                            interp( event.results[0][0].transcript )
                        )
                    );
                }
            };
            recognition.start();
        }
    );
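Because the recognition logic lives inside the message listener, it is hard to try out without a browser and a microphone. One option, sketched here as an assumption rather than as what my extension does, is to move it into a function that receives its dependencies as parameters. `listenOnce` and its parameter names are hypothetical; `interp` is the function the listener above already calls.

```javascript
// Hypothetical refactoring: the recognizer constructor, the speak function
// and the interpreter are passed in, so fakes can stand in for them.
function listenOnce(Recognition, speak, interpret) {
    var recognition = new Recognition();
    recognition.continuous = false;
    recognition.onresult = function (event) {
        var result = event.results[0];
        if (result.isFinal) {
            speak(interpret(result[0].transcript));
        }
    };
    recognition.onerror = function (event) {
        // Surface failures such as a refused microphone permission.
        console.warn('speech recognition error:', event.error);
    };
    recognition.start();
    return recognition;
}

// Wire it up only when running as a content script.
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
    chrome.runtime.onMessage.addListener(function () {
        listenOnce(
            window.SpeechRecognition || window.webkitSpeechRecognition,
            function (text) {
                window.speechSynthesis.speak(new SpeechSynthesisUtterance(text));
            },
            interp
        );
    });
}
```

With this shape, a fake constructor whose `start()` does nothing is enough to check that a final result is interpreted and spoken, and that interim results are ignored.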

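The message listener essentially contains the code from the allInOne.html example above. There may be other ways of doing this, but this works and seems reasonably lightweight. I hope this helps.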

If you think I can improve my code, please feel free to add a comment!

