How to scrape the HTML of a Cloudflare-protected website using requests

Problem description

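I have been stuck on this problem for a long time: requesting the HTML of a certain website in Python. I have tried most of the usual methods and could not get any of them to work, so I am asking here whether anyone can help.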

Code 1 [requests]:

import requests

r = requests.get("https://livechart.me")
print(r.text)

Response 1

        <div class="cf-columns two">
          <div class="cf-column">
            <h2 data-translate="why_captcha_headline">Why do I have to complete a CAPTCHA?</h2>

            <p data-translate="why_captcha_detail">Completing the CAPTCHA proves you are a human and gives you temporary access to the web property.</p>
          </div>

          <div class="cf-column">
            <h2 data-translate="resolve_captcha_headline">What can I do to prevent this in the future?</h2>


            <p data-translate="resolve_captcha_antivirus">If you are on a personal connection, like at home, you can run an anti-virus scan on your device to make sure it is not infected with malware.</p>

            <p data-translate="resolve_captcha_network">If you are at an office or shared network, you can ask the network administrator to run a scan across the network looking for misconfigured or infected devices.</p>....
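
For reference, a minimal sketch of how a script could recognise that it received this CAPTCHA page rather than the real content. The marker strings are copied from the excerpt above; the names `CHALLENGE_MARKERS` and `looks_like_cloudflare_challenge` are hypothetical, and Cloudflare may change its markup, so treat this as a heuristic under those assumptions and not as a reliable detector.

```python
# Markers copied from the challenge-page excerpt above (assumption: they are
# specific enough to tell the CAPTCHA page apart from the real site).
CHALLENGE_MARKERS = (
    'data-translate="why_captcha_headline"',
    'data-translate="resolve_captcha_headline"',
    'class="cf-column"',
)


def looks_like_cloudflare_challenge(html: str) -> bool:
    """Return True if the HTML contains any marker seen on the CAPTCHA page."""
    return any(marker in html for marker in CHALLENGE_MARKERS)


# Small self-contained check using a fragment of the response shown above.
sample = (
    '<div class="cf-column">'
    '<h2 data-translate="why_captcha_headline">'
    "Why do I have to complete a CAPTCHA?</h2></div>"
)
print(looks_like_cloudflare_challenge(sample))
print(looks_like_cloudflare_challenge("<html><body>Hello</body></html>"))
```

With Code 1, the same check could be applied to `r.text` before trying to parse the page.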

Code 2 [cfscrape]:

import cfscrape

scraper = cfscrape.create_scraper()
print(scraper.get("http://livechart.me"))

Response 2

ValueError: Unable to identify Cloudflare IUAM Javascript on website. Cloudflare may have changed their technique, or there may be a bug in the script.

I even tried Selenium, but that did not work well either. If any of you know a solution, I would be glad to hear it :)

Tags: python, python-requests

Solution
