Unable to read a web page in Python

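Problem description

I am trying to read a number shown at the top of this website, https://roobet.com/, using page = requests.get('https://roobet.com/'). I don't understand why the number is missing from the response, or what I have to do to get it. The number I want is labelled "Wagers: XXXXXXX", but nothing like that appears in what requests.get() returns.

PS: When I use View Source on the page, I still don't see that number or text. How can I read and import that number?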

import requests
page=requests.get("https://roobet.com")
text_page=page.text
print(text_page)

Out:

<!DOCTYPE html>\n<html lang="en">\n\n  <head>\n    <!-- Google Tag Manager -->\n    <script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({\'gtm.start\':\n    new Date().getTime(),event:\'gtm.js\'});var f=d.getElementsByTagName(s)[0],\n    j=d.createElement(s),dl=l!=\'dataLayer\'?\'&l=\'+l:\'\';j.async=true;j.src=\n    \'https://www.googletagmanager.com/gtm.js?id=\'+i+dl;f.parentNode.insertBefore(j,f);\n  })(window,document,\'script\',\'dataLayer\',\'GTM-563FCQS\');</script>\n    <!-- End Google Tag Manager -->\n    <meta charset="UTF-8">\n    <meta name="viewport" content="width=device-width, initial-scale=1">\n    <link rel="preconnect" href="https://fonts.googleapis.com/" crossorigin>\n    <title>Roobet | Crypto\'s Fastest Growing Casino</title>\n    <meta name="description" content="Roobet, crypto\'s fastest growing casino. Hop on in, chat to others and play exciting games - Come and join the fun!">\n    <base href="/">\n    <meta name="theme-color" content="#191b31" />\n    <link rel="icon" type="image/png" href="images/favicon.png">\n    <link rel="manifest" href="/manifest.json" />\n    <script src="https://cdn.onesignal.com/sdks/OneSignalSDK.js" async ></script>\n    <script src="https://maps.googleapis.com/maps/api/js?key=AIzaSyCXI19SE-ZWv_ZyW7gGMzCTf4TGfOA3Sdk&libraries=places"></script>\n    <script src="https://tekhou5-dk2.pragmaticplay.net/gs2c/common/js/lobby/GameLib.js" />\n    <script>\n      var OneSignal = window.OneSignal || [];\n      OneSignal.push(function() {\n        OneSignal.init({\n          appId: "29c72f64-e7e6-408c-99b2-d86a84c6a9cb",\n          notifyButton: {\n            enable: false,\n            autoResubscribe: true,\n          },\n          welcomeNotification: {\n            disable: true\n          }\n        });\n      });\n    </script>\n  <link href="0.20c4e82d288213005850.css" rel="stylesheet"><link href="app.20c4e82d288213005850.css" rel="stylesheet"></head>\n  <body>\n    <!-- Google Tag Manager (noscript) -->\n    
<noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-563FCQS"\n    height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript>\n    <!-- End Google Tag Manager (noscript) -->\n    <div id="root"></div>\n    <div id="modalRoot"></div>\n    <div id="loader">\n      <div class="loaderLogo">\n        <img src="/images/logo.svg" />\n      </div>\n    </div>\n  <script type="text/javascript" src="vendors.bundle.js?v=1272961ec29bf316a891"></script><script type="text/javascript" src="locale.bundle.js?v=f09f53a5cbf99ec0cac6"></script><script type="text/javascript" src="app.bundle.js?v=9f19f2ed821de8c93f9c"></script></body>\n  <script>(function(){var w=window;var ic=w.Intercom;if(typeof ic==="function"){ic(\'reattach_activator\');ic(\'update\',intercomSettings);}else{var d=document;var i=function(){i.c(arguments)};i.q=[];i.c=function(args){i.q.push(args)};w.Intercom=i;function l(){var s=d.createElement(\'script\');s.type=\'text/javascript\';s.async=true;s.src=\'https://widget.intercom.io/widget/gcr7bzde\';var x=d.getElementsByTagName(\'script\')[0];x.parentNode.insertBefore(s,x);}if(w.attachEvent){w.attachEvent(\'onload\',l);}else{w.addEventListener(\'load\',l,false);}}})()</script>\n  <script src="https://intaggr.softswiss.net/public/sg.js"></script>\n  <script type="text/javascript" src="https://www.google.com/recaptcha/api.js?render=6LdG97YUAAAAAHMcbX2hlyxQiHsWu5bY8_tU-2Y_"></script>\n  <script type="text/javascript">\n    if (typeof window.grecaptcha !== \'undefined\') {\n      grecaptcha.ready(function() {\n        grecaptcha.execute(\'6LdG97YUAAAAAHMcbX2hlyxQiHsWu5bY8_tU-2Y_\', {action: \'homepage\'});\n      })\n    }\n  </script>\n</html>\n'

On the live page, you can see that "Wagers All Time" and other values are displayed.

Tags: python, web-scraping, python-requests

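Solution

requests.get() only downloads the HTML that the server sends; it does not run JavaScript. In the output above, <div id="root"></div> is empty, and the visible content is filled in by the script bundles (such as app.bundle.js) once the page loads in a browser. That is also why View Source does not show the number. To read it, you need a tool that renders the page in a real browser. Since you already know Python, Selenium is one option.

Here is a small snippet to help you get started.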

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
# Open the webpage
driver.get('https://roobet.com/')

elem_xpath = '//div[contains(text(), "Wagers All Time")]/following-sibling::div'

try:
    # Wait till the element is located
    elem = WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.XPATH, elem_xpath)))
    print(elem.text)
finally:
    driver.quit()
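
Before reaching for a browser, it can help to confirm that the number is missing because the page is rendered by JavaScript. The sketch below is only a rough heuristic, not a definitive test: it uses the standard library's html.parser to check whether the raw HTML contains an empty <div id="root"> together with at least one external script, which is the pattern visible in the output above. The names RootProbe and looks_client_rendered, and the sample string, are made up for this illustration.

```python
from html.parser import HTMLParser


class RootProbe(HTMLParser):
    """Track whether the root div has any content and count external scripts."""

    def __init__(self, root_id):
        super().__init__()
        self.root_id = root_id
        self.depth_in_root = 0        # > 0 while parsing inside the root div
        self.root_seen = False
        self.root_has_content = False
        self.script_count = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.script_count += 1
        if self.depth_in_root:
            # Any element nested in the root div counts as content.
            self.root_has_content = True
            if tag == "div":
                self.depth_in_root += 1
        elif tag == "div" and attrs.get("id") == self.root_id:
            self.root_seen = True
            self.depth_in_root = 1

    def handle_endtag(self, tag):
        if self.depth_in_root and tag == "div":
            self.depth_in_root -= 1

    def handle_data(self, data):
        if self.depth_in_root and data.strip():
            self.root_has_content = True


def looks_client_rendered(html, root_id="root"):
    """Heuristic: an empty root div plus external scripts suggests JS rendering."""
    probe = RootProbe(root_id)
    probe.feed(html)
    return probe.root_seen and not probe.root_has_content and probe.script_count > 0


# Hypothetical sample that mimics the structure of the output above.
sample = (
    '<html><body><div id="root"></div>'
    '<script type="text/javascript" src="app.bundle.js"></script>'
    '</body></html>'
)
print(looks_client_rendered(sample))
```

Passing page.text from the question's requests.get() call to looks_client_rendered() should give the same kind of answer for the real page, assuming the site still serves the HTML shown above. If the check returns True, a browser-driven approach such as the Selenium snippet is a reasonable next step.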
