Why does every Express GET trigger a request for robots.txt?

Problem description

I am following the Express "Getting started" tutorial on the expressjs.com website.

I am running my application (on Windows) with:

> set DEBUG=express:* & npm start

When I make a request to the server (in this case http://localhost:3000/), I see the following in the console:

  express:router dispatching GET / +24s
  express:router query  : / +3ms
  express:router expressInit  : / +2ms
  express:router logger  : / +3ms
  express:router jsonParser  : / +3ms
  express:router urlencodedParser  : / +4ms
  express:router cookieParser  : / +3ms
  express:router serveStatic  : / +1ms
  express:router router  : / +5ms
  express:router dispatching GET / +2ms
  express:view require "jade" +3ms
  express:view lookup "index.jade" +467ms
  express:view stat "C:\mysites\app\views\index.jade" +3ms
  express:view render "C:\mysites\app\views\index.jade" +3ms
GET / 304 545.682 ms - -
  express:router dispatching GET /robots.txt +58ms
  express:router query  : /robots.txt +2ms
  express:router expressInit  : /robots.txt +3ms
  express:router logger  : /robots.txt +3ms
  express:router jsonParser  : /robots.txt +3ms
  express:router urlencodedParser  : /robots.txt +8ms
  express:router cookieParser  : /robots.txt +1ms
  express:router serveStatic  : /robots.txt +2ms
  express:router dispatching GET /stylesheets/style.css +19ms
  express:router query  : /stylesheets/style.css +2ms
  express:router expressInit  : /stylesheets/style.css +1ms
  express:router logger  : /stylesheets/style.css +4ms
  express:router jsonParser  : /stylesheets/style.css +7ms
  express:router urlencodedParser  : /stylesheets/style.css +4ms
  express:router cookieParser  : /stylesheets/style.css +2ms
  express:router serveStatic  : /stylesheets/style.css +2ms
  express:router router  : /robots.txt +1ms
  express:router dispatching GET /robots.txt +1ms
GET /robots.txt 304 55.631 ms - -

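I am trying to understand who triggers the GET request for robots.txt every time I request any page or resource. I do not believe it is the browser, because robots.txt is not referenced in the rendered page: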

<!DOCTYPE html><html><head><title>Express</title><link rel="stylesheet" href="/stylesheets/style.css"></head><body><h1>Express</h1><p>Welcome to Express</p></body></html>

Is the robots.txt request generated internally by node.js/Express, perhaps only in this debug mode? If so, why?

My package.json file, in case it matters:

{
  "name": "app",
  "version": "0.0.0",
  "private": true,
  "scripts": {
    "start": "node ./bin/www"
  },
  "dependencies": {
    "cookie-parser": "~1.4.3",
    "debug": "~2.6.9",
    "express": "~4.16.0",
    "http-errors": "~1.6.2",
    "jade": "~1.11.0",
    "morgan": "~1.9.0"
  }
}

Tags: node.js, express, robots.txt

Solution


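This has nothing to do with node.js or Express. I had recently installed the Wappalyzer Chrome extension, and this was the first localhost site I had worked on since then. When I turned Wappalyzer off, the requests for robots.txt stopped appearing. It makes sense that Wappalyzer would make this request, and it does not show up as a normal request in the Chrome debugger.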

Leaving this here so that others do not run into the same confusion when debugging an Express application.
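
For anyone who wants to check where such a request comes from before disabling extensions one by one, here is a minimal sketch of an Express-style middleware that logs the headers of robots.txt requests. The names describeRequester and logRobotsRequests are hypothetical and not part of Express; the sketch only assumes the standard req.method, req.url and req.headers properties. A request made by a browser extension normally carries the browser's own User-Agent, so this can show that the request comes from the browser rather than from node.js, but it will probably not name the extension.

```javascript
// Hypothetical helper (not an Express API): collect the request details
// that hint at who sent a request.
function describeRequester(req) {
  const headers = req.headers || {};
  return {
    method: req.method,
    url: req.url,
    // Node lower-cases incoming header names.
    userAgent: headers['user-agent'] || '(none)',
    referer: headers['referer'] || '(none)',
  };
}

// Hypothetical Express-style middleware: log only robots.txt requests,
// then pass control to the next middleware unchanged.
function logRobotsRequests(req, res, next) {
  const path = String(req.url).split('?')[0]; // ignore any query string
  if (path === '/robots.txt') {
    console.log('robots.txt requested by:', describeRequester(req));
  }
  next();
}
```

It could be registered with app.use(logRobotsRequests) ahead of the other middleware, so that it sees the request before serveStatic or the router responds.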

