robots.txt: stopping URL parameters from being indexed on a custom-built CMS

I would like Google to ignore URLs like this:

https://www.example.com/blog/category/web-development?page=2

My links are being indexed by Google in this form, and I need to stop that. What should I put in robots.txt to keep them from being indexed?

Here is my current robots.txt file:

Disallow: /cgi-bin/
Disallow: /scripts/
Disallow: /privacy
Disallow: /404.html
Disallow: /500.html
Disallow: /tweets
Disallow: /tweet/

Can I use this to disallow them?

Disallow: /blog/category/*?*

Tags: robots.txt

With robots.txt you can prevent crawling, but not necessarily indexing.

If you want to stop Google from crawling these URLs, you can use this:

Disallow: /blog/category/*?

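You don't need another * at the end, because a Disallow value always represents the start of the URL (beginning from the path), so anything may follow it.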
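To see why the trailing * is redundant, here is a minimal Python sketch of wildcard-aware matching as crawlers such as Googlebot are documented to apply it. It is an illustration only, not any crawler's actual code: the function name rule_matches is hypothetical, and it assumes * matches any run of characters, a trailing $ anchors the end, and everything else is a prefix match.

```python
import re


def rule_matches(rule: str, url_path: str) -> bool:
    """Return True if a Disallow value matches a URL path plus query string.

    Hypothetical helper: '*' matches any run of characters, a trailing '$'
    anchors the end, and otherwise the rule only has to match a prefix.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += "$"
    # re.match anchors only at the start, which gives prefix semantics.
    return re.match(pattern, url_path) is not None


url = "/blog/category/web-development?page=2"
print(rule_matches("/blog/category/*?", url))   # True
print(rule_matches("/blog/category/*?*", url))  # True, the extra '*' changes nothing
print(rule_matches("/blog/category/*?", "/blog/category/web-development"))  # False
```

Under these assumptions both forms of the rule block the paginated URL, while the category page without a query string stays crawlable.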

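But note that not all bots support the * wildcard. In the original robots.txt specification, * has no special meaning in a Disallow value. Bots that conform only to that specification would interpret the line above literally (with * as part of the path). If you want to rely only on the rules of the original specification, you would have to list every occurrence: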

Disallow: /blog/category/c1?
Disallow: /blog/category/c2?
Disallow: /blog/category/c3?
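The literal interpretation, and the per-category list it forces, can be sketched in Python as follows. This is a rough illustration under stated assumptions: literal_match and literal_disallow_lines are hypothetical helper names, and the slugs c1, c2, c3 are the placeholders from the list above, which a custom CMS would replace with its real category slugs.

```python
def literal_match(rule: str, url_path: str) -> bool:
    """Prefix match with no wildcard support, as in the original specification."""
    return url_path.startswith(rule)


def literal_disallow_lines(category_slugs):
    """Build one Disallow line per category so that no wildcard is needed."""
    return [f"Disallow: /blog/category/{slug}?" for slug in category_slugs]


url = "/blog/category/web-development?page=2"

# Read literally, the wildcard rule never matches a real URL.
print(literal_match("/blog/category/*?", url))                # False

# An explicit per-category rule does match.
print(literal_match("/blog/category/web-development?", url))  # True

for line in literal_disallow_lines(["c1", "c2", "c3"]):
    print(line)
```

Generating the lines from the CMS's category list keeps robots.txt in step with the site whenever a category is added or renamed.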