Is there a way to bypass Google's bot detection with Selenium and Python?

Problem description

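I am running Selenium from Python to retrieve data from Google's default web search, but after my scraper has been running for a while I get a CAPTCHA screen to solve. Is there a way to avoid this Google detection? Or is there a reliable way to solve the CAPTCHA in code?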

I am currently running Selenium with Google Chrome, picking the user agent at random, waiting 3 seconds between queries, and running with headless = False:

import random

from selenium.webdriver.chrome.options import Options

options = Options()
options.headless = False  # headed mode, i.e. a visible browser window
# Pool of user-agent strings; one is picked at random when the options are built.
user_agents = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36",
    "Mozilla/5.0 (Windows NT 6.3; Win64; x64; Trident/7.0; Touch; rv:11.0) like Gecko",
    "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.85 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:40.0) Gecko/20100101 Firefox/40.0",
]
user_agent = random.choice(user_agents)
# No shell parses this argument, so the value is passed without extra quotes.
options.add_argument(f"--user-agent={user_agent}")
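
The 3-second wait between queries mentioned above is not part of the snippet. A minimal sketch of how that pacing might look is given here; the helper name `paced` and its parameters are hypothetical and only illustrate the idea, they are not taken from the original script.

```python
import time
from typing import Callable, Iterable, Iterator


def paced(queries: Iterable[str], delay_s: float = 3.0,
          sleep: Callable[[float], None] = time.sleep) -> Iterator[str]:
    """Yield each query, pausing delay_s seconds between consecutive ones.

    Hypothetical helper: the sleep function is injectable so the pacing
    can be checked without actually waiting.
    """
    for i, query in enumerate(queries):
        if i > 0:
            sleep(delay_s)  # no pause before the very first query
        yield query
```

The scraping loop would then presumably iterate with something like `for query in paced(search_terms):` and issue one search per iteration.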

(Screenshot: Google CAPTCHA screen)

Tags: python, selenium, web-scraping, bots

Solution

