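python - Requests and lxml - Logging in and scraping data - No content displayed
Problem description
I can log in successfully and get a 200 status response. However, when I try to scrape the data with lxml, there is nothing inside the main HTML tag: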
<main role="main" sp-main></main>
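If I log in through a browser, all of the content, including the data I want to extract, is inside main. The content does take a while to load, roughly 5 seconds. I did try adding a time.sleep(x) before pulling the dashboard content, but when I fetch the page with the script, main is still not populated.
(Much more content is loaded; I just didn't want to paste all of it.)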
<main role="main" sp-main=""><button class="sp-fixed-btn sp-feedback" ui-sref="feedback"
ng-show="isAuthenticated && !kiosk">
<div class="icon-chat-outline"></div>
<div translate="CHROME_FEEDBACK">feedback</div>
</button> <button class="sp-fixed-btn sp-help" ui-sref="help" ng-hide="kiosk">
<div class="icon-question"></div>
<div translate="CHROME_GETHELP">help</div>
</button>
<div class="sp-view sp-view-on" ng-class="{ 'sp-view-on': isAuthenticated }">
<div class="sp-loader animate-loader" ng-class="{ 'animate-loader': !showLoader }">
<div class="sp-loader-dots searching-ellipsis remove-dots" ng-class="{ 'remove-dots': !showLoader }">
<span>•</span> <span>•</span> <span>•</span></div>
<!---->
</div>
<!---->
<div ui-view="" class="sp-animate">
<article page-spinner="dashboard">
<div class="sp-dash-container" style="width: 965px;">
<!---->
<div class="sp-widget-item">
<div sp-current-production-gauge="">
<sp-widget-container heading="CURRENT_PRODUCTION" show-menu="true"
widgetcolor="#d9dbdc">
<!---->
<div class="widget-status-bar" ng-if="!!widgetColor"
ng-style="{'background': widgetColor}" style="background: rgb(217, 219, 220);">
</div>
<!---->
<div class="widget-with-header">
<div class="widget-content" ng-mouseleave="showOptions=false">
<div class="sp-widget-header-container"
ng-class="{'make-blue': showOptions}" ng-show="showHeader">
<h5 class="sp-title" translate="CURRENT_PRODUCTION"
ng-class="{'make-white': showOptions}">Current Power</h5>
<!---->
<div class="sp-dots" id="sp-dots"
ng-hide="showOptions || showMenu!=='true'"
ng-click="showOptions=true"><span>•</span> <span>•</span>
<span>•</span></div>
<div class="sp-widget-header-icons ng-hide" ng-show="showOptions">
<div class="sp-widget-title-contents"><span
class="widget-settings-icons icon-info" title="info"
ng-click="showCurrentPowerHelpModal()"></span></div>
</div>
</div>
<hr class="sp-settings-title-hr" ng-class="{'hide-title-hr': showOptions}">
<div class="sp-widget-body" ng-show="showBody">
<div class="sp-widget-body-contents"><button
analytics-category="Dashboard"
analytics-event="Click_On_Current_Power" analytics-on="click"
class="sp-dash-btn" ng-click="goPage('graphs', true)"></button>
<div class="sp-dash-item-subtitle">
<div class="sp-dash-description" translate="NOW_PRODUCING">Now
Producing</div>
<div class="sp-dash-value">2.1 kW</div>
</div>
<div id="now_prod_chart" style="overflow: hidden;"
data-highcharts-chart="5">
<div id="highcharts-opo2oxr-57" dir="ltr"
class="highcharts-container "
style="position: relative; overflow: hidden; width: 300px; height: 170px; text-align: left; line-height: normal; z-index: 0; -webkit-tap-highlight-color: rgba(0, 0, 0, 0); font-family: "Open Sans", sans-serif;">
<svg version="1.1" class="highcharts-root"
style="font-family:'Open Sans', sans-serif;font-size:12px;"
xmlns="http://www.w3.org/2000/svg" width="300"
height="170" viewBox="0 0 300 170">
<desc>Created with Highcharts 7.1.2</desc>
<defs>
<clipPath id="highcharts-opo2oxr-59-">
<rect x="0" y="0" width="300" height="170"
fill="none"></rect>
</clipPath>
</defs>
<rect fill="transparent" class="highcharts-background"
x="0" y="0" width="300" height="170" rx="0" ry="0">
</rect>
<rect fill="none" class="highcharts-plot-background"
x="0" y="20" width="300" height="170"></rect>
<g class="highcharts-pane-group" data-z-index="0">
<path fill="transparent"
d="M 77.75 147.5 A 72.25 72.25 0 0 1 222.249963875003 147.42775001204166 L 207.7999711000024 147.44220000963332 A 57.8 57.8 0 0 0 92.2 147.5 Z"
class="highcharts-pane " stroke="#cccccc"
stroke-width="1"></path>
</g>
<g class="highcharts-grid highcharts-yaxis-grid"
data-z-index="1">
<path fill="none" data-z-index="1"
class="highcharts-grid-line"
d="M 150 147.5 L 77.75 147.5" opacity="1">
</path>
<path fill="none" data-z-index="1"
class="highcharts-grid-line"
d="M 150 147.5 L 98.91153505927196 96.41153505927194"
opacity="1"></path>
<path fill="none" data-z-index="1"
class="highcharts-grid-line"
d="M 150 147.5 L 150 75.25" opacity="1"></path>
<path fill="none" data-z-index="1"
class="highcharts-grid-line"
d="M 150 147.5 L 201.08846494072804 96.41153505927196"
opacity="1"></path>
<path fill="none" data-z-index="1"
class="highcharts-grid-line"
d="M 150 147.5 L 222.25 147.5" opacity="1">
</path>
</g>
<rect fill="none" class="highcharts-plot-border"
data-z-index="1" x="0" y="20" width="300"
height="170"></rect>
<g class="highcharts-axis highcharts-yaxis"
data-z-index="2">
<path fill="none" class="highcharts-tick"
stroke="#ccd6eb" stroke-width="1"
d="M 77.75 147.5 L 67.75 147.5" opacity="1">
</path>
<path fill="none" class="highcharts-tick"
stroke="#ccd6eb" stroke-width="1"
d="M 98.91153505927196 96.41153505927194 L 91.84046724740648 89.34046724740647"
opacity="1"></path>
<path fill="none" class="highcharts-tick"
stroke="#ccd6eb" stroke-width="1"
d="M 150 75.25 L 150 65.25" opacity="1"></path>
<path fill="none" class="highcharts-tick"
stroke="#ccd6eb" stroke-width="1"
d="M 201.08846494072804 96.41153505927196 L 208.15953275259352 89.34046724740648"
opacity="1"></path>
<path fill="none" class="highcharts-tick"
stroke="#ccd6eb" stroke-width="1"
d="M 222.25 147.5 L 232.25 147.5" opacity="1">
</path>
<path fill="none" class="highcharts-axis-line"
data-z-index="7"
d="M 77.75 147.5 A 72.25 72.25 0 0 1 222.249963875003 147.42775001204166 M 150 147.5 A 0 0 0 0 0 150 147.5 ">
</path>
</g>
<g data-z-index="2"
class="highcharts-data-labels highcharts-series-0 highcharts-solidgauge-series highcharts-tracker"
transform="translate(0,20) scale(1 1)">
<g class="highcharts-label highcharts-data-label highcharts-data-label-color-0 highcharts-tracker"
data-z-index="1" transform="translate(106,138)">
</g>
</g>
<g class="highcharts-series-group" data-z-index="3">
<g data-z-index="0.1"
class="highcharts-series highcharts-series-0 highcharts-solidgauge-series highcharts-tracker"
transform="translate(0,20) scale(1 1)"
clip-path="url(https://monitor.us.sunpower.com/#highcharts-opo2oxr-59-)">
<path fill="rgb(105,179,66)"
d="M 77.75 127.49999999999999 A 72.25 72.25 0 0 1 130.4281223771638 57.95142628121894 L 134.34249790173104 71.86114102497515 A 57.8 57.8 0 0 0 92.2 127.5 Z"
sweep-flag="0" stroke-linecap="round"
stroke-linejoin="round"
class="highcharts-point highcharts-color-0">
</path>
</g>
<g data-z-index="0.1"
class="highcharts-markers highcharts-series-0 highcharts-solidgauge-series "
transform="translate(0,20) scale(1 1)"
clip-path="none"></g>
</g><text x="150" text-anchor="middle"
class="highcharts-title" data-z-index="4"
style="color:#333333;font-size:18px;fill:#333333;"
y="34"></text><text x="150" text-anchor="middle"
class="highcharts-subtitle" data-z-index="4"
style="color:#666666;fill:#666666;" y="34"></text>
<g class="highcharts-legend" data-z-index="7">
<rect fill="none" class="highcharts-legend-box"
rx="0" ry="0" x="0" y="0" width="8" height="8"
visibility="hidden"></rect>
<g data-z-index="1">
<g></g>
</g>
</g>
<g class="highcharts-axis-labels highcharts-yaxis-labels"
data-z-index="7"></g>
</svg>
<div class="highcharts-axis-labels highcharts-yaxis-labels"
style="position: absolute; left: 0px; top: 0px; opacity: 1;">
<span opacity="1"
style="position: absolute; font-family: "Open Sans", sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 25.75px; top: 137.5px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
<div>0 kW</div>
</span><span opacity="1"
style="position: absolute; font-family: "Open Sans", sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 53.1273px; top: 58.1273px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
<div>1.3 kW</div>
</span><span opacity="1"
style="position: absolute; font-family: "Open Sans", sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 132.5px; top: 25.25px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
<div>2.5 kW</div>
</span><span opacity="1"
style="position: absolute; font-family: "Open Sans", sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 211.873px; top: 58.1273px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
<div>3.8 kW</div>
</span><span opacity="1"
style="position: absolute; font-family: "Open Sans", sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 250.25px; top: 137.5px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
<div>5 kW</div>
</span></div>
<div class="highcharts-data-labels highcharts-series-0 highcharts-solidgauge-series highcharts-tracker"
style="position: absolute; left: 0px; top: 20px; opacity: 1; visibility: inherit;">
<div class="highcharts-label highcharts-data-label highcharts-data-label-color-0 highcharts-tracker"
style="position: absolute; left: 106px; top: 138px; opacity: 1;">
<span data-z-index="1"
style="position: absolute; font-family: "Open Sans", sans-serif; font-size: 18px; white-space: nowrap; font-weight: 600; color: transparent; margin-left: 0px; margin-top: 0px; left: 5px; top: 5px;">
<div class="production-label">2.065 kW</div>
</span></div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</sp-widget-container>
</div>
</div>
<!---->
<div class="sp-widget-item sp-energy-mix" ng-if="homeConsumptionUser"
style="top: 0px; left: 325px;">
<sp-energy-mix id="energy_mix_landscape" show-heading="true" set-dates="day">
<sp-widget-container heading="TODAY_ENERGY_MIX" show-menu="true" widgetcolor="#d9dbdc">
<!---->
<div class="widget-status-bar" ng-if="!!widgetColor"
ng-style="{'background': widgetColor}" style="background: rgb(217, 219, 220);">
</div>
<!---->
<div class="widget-with-header">
<div class="widget-content" ng-mouseleave="showOptions=false">
<div class="sp-widget-header-container"
ng-class="{'make-blue': showOptions}" ng-show="showHeader">
<h5 class="sp-title" translate="TODAY_ENERGY_MIX"
ng-class="{'make-white': showOptions}">Today's Energy Mix</h5>
<!---->
<div class="sp-dots" id="sp-dots"
ng-hide="showOptions || showMenu!=='true'"
ng-click="showOptions=true"><span>•</span> <span>•</span>
<span>•</span></div>
<div class="sp-widget-header-icons ng-hide" ng-show="showOptions">
<div class="sp-widget-title-contents"><span
class="widget-settings-icons icon-info" title="info"
ng-click="showEnergyMixHelpModal()"></span></div>
</div>
</div>
<hr class="sp-settings-title-hr" ng-class="{'hide-title-hr': showOptions}">
<div class="sp-widget-body" ng-show="showBody">
<div class="sp-widget-body-contents"><button
analytics-category="Dashboard"
analytics-event="Click_On_Energy_Mix" analytics-on="click"
class="sp-dash-btn" ng-click="goPage('graphs')"></button>
</main>
Here is my script:
import requests
import lxml
import time
from lxml import html
cookies = {
'_ga': 'xxxxxxxxxxxxxxxxxxxxxxxx',
}
headers = {
'Connection': 'keep-alive',
'Cache-Control': 'no-cache',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
'Accept': 'application/json, text/plain, */*',
'Sec-Fetch-Site': 'same-origin',
'Sec-Fetch-Mode': 'cors',
'Sec-Fetch-User': '?1',
'Sec-Fetch-Dest': 'empty',
'Referer': 'https://monitor.us.sunpower.com/',
'Accept-Language': 'en-US,en;q=0.9',
'If-Modified-Since': 'Mon, 06 Jul 2020 20:10:01 GMT',
'Origin': 'https://monitor.us.sunpower.com',
'Pragma': 'no-cache',
'authority': 'elhapi.edp.sunpower.com',
'accept': 'application/json, text/plain, */*',
'access-control-request-method': 'GET',
'access-control-request-headers': 'authorization',
'origin': 'https://monitor.us.sunpower.com',
'sec-fetch-mode': 'cors',
'sec-fetch-site': 'same-site',
'sec-fetch-dest': 'empty',
'referer': 'https://monitor.us.sunpower.com/',
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
'accept-language': 'en-US,en;q=0.9',
'content-type': 'application/json;charset=UTF-8',
'authorization': 'SP-CUSTOM 9c0f119d-685f-463c-812c-e03dd6a99b84',
}
data = '{"username":"user@email.com","password":"xxxxxxxxxxx","isPersistent":false}'
response = requests.post('https://monitor.us.sunpower.com/', headers=headers, cookies=cookies, data=data)
login_url = "https://elhapi.edp.sunpower.com/v1/elh/authenticate"
s = requests.Session()
response = s.post(login_url, data=data, headers=headers)
print(response)
url = "https://monitor.us.sunpower.com/#!/dashboard/"
result = s.get(
url,
headers = dict(referer = url)
)
tree = html.fromstring(result.content)
lxml.html.open_in_browser(tree)
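Any help with this would be greatly appreciated, thanks!
Solution
As @bigbounty commented, you should use Selenium for this. The content is loaded with JavaScript, so it does not appear in the response to a plain HTTP request. requests never executes scripts, which is why waiting with time.sleep does not change what it receives.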
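To make the difference concrete, here is a minimal sketch that runs the same lxml query against the empty shell returned to requests and against a fragment of the browser-rendered markup quoted in the question. The helper name extract_dash_values and the two inline HTML strings are illustrative assumptions; only the main tag and the sp-dash-value class come from the question.

```python
from lxml import html


def extract_dash_values(page_html):
    """Return the text of every div with class "sp-dash-value" inside <main>."""
    tree = html.fromstring(page_html)
    # Match the class as a whole word so that e.g. "sp-dash-value-x" is not picked up
    nodes = tree.xpath(
        '//main//div[contains(concat(" ", normalize-space(@class), " "), " sp-dash-value ")]'
    )
    return [node.text_content().strip() for node in nodes]


# What requests receives: the shell before any JavaScript has run
static_shell = '<html><body><main role="main" sp-main></main></body></html>'

# A shortened, hand-written stand-in for what the browser shows after rendering
rendered = (
    '<html><body><main role="main" sp-main="">'
    '<div class="sp-dash-item-subtitle"><div class="sp-dash-value">2.1 kW</div></div>'
    '</main></body></html>'
)

print(extract_dash_values(static_shell))  # []
print(extract_dash_values(rendered))      # ['2.1 kW']
```

The same function can be applied to the page source that a browser driver returns once the dashboard has rendered, so the lxml part of the original script can stay as it is.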
Before using Selenium, you have to install a webdriver and configure it. There are many tutorials online that show how to do this, such as this one: http://jonathansoma.com/lede/foundations-2018/classes/selenium/selenium-windows-install/
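import requests
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# login_url, data and headers are the ones defined in the question's script
session = requests.Session()
response = session.post(login_url, data=data, headers=headers)
# you may have to specify your webdriver path depending on how you installed it
chrome_options = Options()
chrome_options.add_argument('--headless')  # or '--start-maximized' to see the window open
# create a webdriver object
driver = webdriver.Chrome(options=chrome_options)
# Selenium only accepts cookies for the domain of the page that is currently loaded, so open the site first
driver.get(yourURL)
# load the cookies from the session onto the driver
for c in session.cookies:
    cookie = {'name': c.name, 'value': c.value, 'path': c.path}
    if c.expires is not None:  # session cookies have no expiry
        cookie['expiry'] = c.expires
    driver.add_cookie(cookie)
# get the url again with the driver, this time with the cookies set
driver.get(yourURL)
page_source = driver.page_source  # the HTML, including the parts rendered by JavaScript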
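The cookie hand-off is the step most likely to fail, because cookies held by a requests session can have expires set to None, and a driver may reject an expiry of None. The sketch below isolates that conversion in a small helper so it can be checked without launching a browser. The name to_selenium_cookie and the sample cookie values are hypothetical; SimpleNamespace objects stand in for the entries of session.cookies.

```python
from types import SimpleNamespace


def to_selenium_cookie(cookie):
    """Build a dict suitable for driver.add_cookie() from a cookiejar-style cookie."""
    result = {
        'name': cookie.name,
        'value': cookie.value,
        'path': cookie.path or '/',  # fall back to the root path if none is set
    }
    # Session cookies have expires=None; omit the key instead of passing None along
    if cookie.expires is not None:
        result['expiry'] = int(cookie.expires)
    return result


# Stand-ins for entries of session.cookies (made-up values)
session_cookie = SimpleNamespace(name='sid', value='abc', path='/', expires=None)
persistent_cookie = SimpleNamespace(name='_ga', value='xyz', path='/', expires=1700000000)

print(to_selenium_cookie(session_cookie))
# {'name': 'sid', 'value': 'abc', 'path': '/'}
print(to_selenium_cookie(persistent_cookie))
# {'name': '_ga', 'value': 'xyz', 'path': '/', 'expiry': 1700000000}
```

In the loop over session.cookies, each result would be passed to driver.add_cookie after the driver has already opened a page on the target domain. Whether the dashboard then shows as logged in depends on how the site stores its login state, which the question does not say.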