Requests and lxml - logging in and scraping data - no content displayed

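Problem description

I can log in successfully and get a 200 status response. However, when I try to scrape the data with lxml, there is nothing inside the main HTML tag: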

<main role="main" sp-main></main>

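If I log in through a browser, all of the content, including the data I want to extract, is inside main. The content does take a while to load, maybe 5 seconds or so. I tried adding a time.sleep(x) before pulling in the dashboard content, but when I fetch the page with my script, main still has nothing in it.

(Much more content is loaded; I just didn't want to paste all of it.)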

<main role="main" sp-main=""><button class="sp-fixed-btn sp-feedback" ui-sref="feedback"
            ng-show="isAuthenticated &amp;&amp; !kiosk">
            <div class="icon-chat-outline"></div>
            <div translate="CHROME_FEEDBACK">feedback</div>
        </button> <button class="sp-fixed-btn sp-help" ui-sref="help" ng-hide="kiosk">
            <div class="icon-question"></div>
            <div translate="CHROME_GETHELP">help</div>
        </button>
        <div class="sp-view sp-view-on" ng-class="{ 'sp-view-on': isAuthenticated }">
            <div class="sp-loader animate-loader" ng-class="{ 'animate-loader': !showLoader }">
                <div class="sp-loader-dots searching-ellipsis remove-dots" ng-class="{ 'remove-dots': !showLoader }">
                    <span>•</span> <span>•</span> <span>•</span></div>
                <!---->
            </div>
            <!---->
            <div ui-view="" class="sp-animate">
                <article page-spinner="dashboard">
                    <div class="sp-dash-container" style="width: 965px;">
                        <!---->
                        <div class="sp-widget-item">
                            <div sp-current-production-gauge="">
                                <sp-widget-container heading="CURRENT_PRODUCTION" show-menu="true"
                                    widgetcolor="#d9dbdc">
                                    <!---->
                                    <div class="widget-status-bar" ng-if="!!widgetColor"
                                        ng-style="{'background': widgetColor}" style="background: rgb(217, 219, 220);">
                                    </div>
                                    <!---->
                                    <div class="widget-with-header">
                                        <div class="widget-content" ng-mouseleave="showOptions=false">
                                            <div class="sp-widget-header-container"
                                                ng-class="{'make-blue': showOptions}" ng-show="showHeader">
                                                <h5 class="sp-title" translate="CURRENT_PRODUCTION"
                                                    ng-class="{'make-white': showOptions}">Current Power</h5>
                                                <!---->
                                                <div class="sp-dots" id="sp-dots"
                                                    ng-hide="showOptions || showMenu!=='true'"
                                                    ng-click="showOptions=true"><span>•</span> <span>•</span>
                                                    <span>•</span></div>
                                                <div class="sp-widget-header-icons ng-hide" ng-show="showOptions">
                                                    <div class="sp-widget-title-contents"><span
                                                            class="widget-settings-icons icon-info" title="info"
                                                            ng-click="showCurrentPowerHelpModal()"></span></div>
                                                </div>
                                            </div>
                                            <hr class="sp-settings-title-hr" ng-class="{'hide-title-hr': showOptions}">
                                            <div class="sp-widget-body" ng-show="showBody">
                                                <div class="sp-widget-body-contents"><button
                                                        analytics-category="Dashboard"
                                                        analytics-event="Click_On_Current_Power" analytics-on="click"
                                                        class="sp-dash-btn" ng-click="goPage('graphs', true)"></button>
                                                    <div class="sp-dash-item-subtitle">
                                                        <div class="sp-dash-description" translate="NOW_PRODUCING">Now
                                                            Producing</div>
                                                        <div class="sp-dash-value">2.1 kW</div>
                                                    </div>
                                                    <div id="now_prod_chart" style="overflow: hidden;"
                                                        data-highcharts-chart="5">
                                                        <div id="highcharts-opo2oxr-57" dir="ltr"
                                                            class="highcharts-container "
                                                            style="position: relative; overflow: hidden; width: 300px; height: 170px; text-align: left; line-height: normal; z-index: 0; -webkit-tap-highlight-color: rgba(0, 0, 0, 0); font-family: &quot;Open Sans&quot;, sans-serif;">
                                                            <svg version="1.1" class="highcharts-root"
                                                                style="font-family:'Open Sans', sans-serif;font-size:12px;"
                                                                xmlns="http://www.w3.org/2000/svg" width="300"
                                                                height="170" viewBox="0 0 300 170">
                                                                <desc>Created with Highcharts 7.1.2</desc>
                                                                <defs>
                                                                    <clipPath id="highcharts-opo2oxr-59-">
                                                                        <rect x="0" y="0" width="300" height="170"
                                                                            fill="none"></rect>
                                                                    </clipPath>
                                                                </defs>
                                                                <rect fill="transparent" class="highcharts-background"
                                                                    x="0" y="0" width="300" height="170" rx="0" ry="0">
                                                                </rect>
                                                                <rect fill="none" class="highcharts-plot-background"
                                                                    x="0" y="20" width="300" height="170"></rect>
                                                                <g class="highcharts-pane-group" data-z-index="0">
                                                                    <path fill="transparent"
                                                                        d="M 77.75 147.5 A 72.25 72.25 0 0 1 222.249963875003 147.42775001204166 L 207.7999711000024 147.44220000963332 A 57.8 57.8 0 0 0 92.2 147.5 Z"
                                                                        class="highcharts-pane " stroke="#cccccc"
                                                                        stroke-width="1"></path>
                                                                </g>
                                                                <g class="highcharts-grid highcharts-yaxis-grid"
                                                                    data-z-index="1">
                                                                    <path fill="none" data-z-index="1"
                                                                        class="highcharts-grid-line"
                                                                        d="M 150 147.5 L 77.75 147.5" opacity="1">
                                                                    </path>
                                                                    <path fill="none" data-z-index="1"
                                                                        class="highcharts-grid-line"
                                                                        d="M 150 147.5 L 98.91153505927196 96.41153505927194"
                                                                        opacity="1"></path>
                                                                    <path fill="none" data-z-index="1"
                                                                        class="highcharts-grid-line"
                                                                        d="M 150 147.5 L 150 75.25" opacity="1"></path>
                                                                    <path fill="none" data-z-index="1"
                                                                        class="highcharts-grid-line"
                                                                        d="M 150 147.5 L 201.08846494072804 96.41153505927196"
                                                                        opacity="1"></path>
                                                                    <path fill="none" data-z-index="1"
                                                                        class="highcharts-grid-line"
                                                                        d="M 150 147.5 L 222.25 147.5" opacity="1">
                                                                    </path>
                                                                </g>
                                                                <rect fill="none" class="highcharts-plot-border"
                                                                    data-z-index="1" x="0" y="20" width="300"
                                                                    height="170"></rect>
                                                                <g class="highcharts-axis highcharts-yaxis"
                                                                    data-z-index="2">
                                                                    <path fill="none" class="highcharts-tick"
                                                                        stroke="#ccd6eb" stroke-width="1"
                                                                        d="M 77.75 147.5 L 67.75 147.5" opacity="1">
                                                                    </path>
                                                                    <path fill="none" class="highcharts-tick"
                                                                        stroke="#ccd6eb" stroke-width="1"
                                                                        d="M 98.91153505927196 96.41153505927194 L 91.84046724740648 89.34046724740647"
                                                                        opacity="1"></path>
                                                                    <path fill="none" class="highcharts-tick"
                                                                        stroke="#ccd6eb" stroke-width="1"
                                                                        d="M 150 75.25 L 150 65.25" opacity="1"></path>
                                                                    <path fill="none" class="highcharts-tick"
                                                                        stroke="#ccd6eb" stroke-width="1"
                                                                        d="M 201.08846494072804 96.41153505927196 L 208.15953275259352 89.34046724740648"
                                                                        opacity="1"></path>
                                                                    <path fill="none" class="highcharts-tick"
                                                                        stroke="#ccd6eb" stroke-width="1"
                                                                        d="M 222.25 147.5 L 232.25 147.5" opacity="1">
                                                                    </path>
                                                                    <path fill="none" class="highcharts-axis-line"
                                                                        data-z-index="7"
                                                                        d="M 77.75 147.5 A 72.25 72.25 0 0 1 222.249963875003 147.42775001204166 M 150 147.5 A 0 0 0 0 0 150 147.5 ">
                                                                    </path>
                                                                </g>
                                                                <g data-z-index="2"
                                                                    class="highcharts-data-labels highcharts-series-0 highcharts-solidgauge-series  highcharts-tracker"
                                                                    transform="translate(0,20) scale(1 1)">
                                                                    <g class="highcharts-label highcharts-data-label highcharts-data-label-color-0 highcharts-tracker"
                                                                        data-z-index="1" transform="translate(106,138)">
                                                                    </g>
                                                                </g>
                                                                <g class="highcharts-series-group" data-z-index="3">
                                                                    <g data-z-index="0.1"
                                                                        class="highcharts-series highcharts-series-0 highcharts-solidgauge-series  highcharts-tracker"
                                                                        transform="translate(0,20) scale(1 1)"
                                                                        clip-path="url(https://monitor.us.sunpower.com/#highcharts-opo2oxr-59-)">
                                                                        <path fill="rgb(105,179,66)"
                                                                            d="M 77.75 127.49999999999999 A 72.25 72.25 0 0 1 130.4281223771638 57.95142628121894 L 134.34249790173104 71.86114102497515 A 57.8 57.8 0 0 0 92.2 127.5 Z"
                                                                            sweep-flag="0" stroke-linecap="round"
                                                                            stroke-linejoin="round"
                                                                            class="highcharts-point highcharts-color-0">
                                                                        </path>
                                                                    </g>
                                                                    <g data-z-index="0.1"
                                                                        class="highcharts-markers highcharts-series-0 highcharts-solidgauge-series "
                                                                        transform="translate(0,20) scale(1 1)"
                                                                        clip-path="none"></g>
                                                                </g><text x="150" text-anchor="middle"
                                                                    class="highcharts-title" data-z-index="4"
                                                                    style="color:#333333;font-size:18px;fill:#333333;"
                                                                    y="34"></text><text x="150" text-anchor="middle"
                                                                    class="highcharts-subtitle" data-z-index="4"
                                                                    style="color:#666666;fill:#666666;" y="34"></text>
                                                                <g class="highcharts-legend" data-z-index="7">
                                                                    <rect fill="none" class="highcharts-legend-box"
                                                                        rx="0" ry="0" x="0" y="0" width="8" height="8"
                                                                        visibility="hidden"></rect>
                                                                    <g data-z-index="1">
                                                                        <g></g>
                                                                    </g>
                                                                </g>
                                                                <g class="highcharts-axis-labels highcharts-yaxis-labels"
                                                                    data-z-index="7"></g>
                                                            </svg>
                                                            <div class="highcharts-axis-labels highcharts-yaxis-labels"
                                                                style="position: absolute; left: 0px; top: 0px; opacity: 1;">
                                                                <span opacity="1"
                                                                    style="position: absolute; font-family: &quot;Open Sans&quot;, sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 25.75px; top: 137.5px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
                                                                    <div>0 kW</div>
                                                                </span><span opacity="1"
                                                                    style="position: absolute; font-family: &quot;Open Sans&quot;, sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 53.1273px; top: 58.1273px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
                                                                    <div>1.3 kW</div>
                                                                </span><span opacity="1"
                                                                    style="position: absolute; font-family: &quot;Open Sans&quot;, sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 132.5px; top: 25.25px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
                                                                    <div>2.5 kW</div>
                                                                </span><span opacity="1"
                                                                    style="position: absolute; font-family: &quot;Open Sans&quot;, sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 211.873px; top: 58.1273px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
                                                                    <div>3.8 kW</div>
                                                                </span><span opacity="1"
                                                                    style="position: absolute; font-family: &quot;Open Sans&quot;, sans-serif; font-size: 0.9rem; white-space: nowrap; margin-left: 0px; margin-top: 0px; left: 250.25px; top: 137.5px; color: rgb(94, 99, 103); cursor: default; text-align: center; transform: rotate(0deg); transform-origin: 50% 12px; text-overflow: clip; opacity: 1;">
                                                                    <div>5 kW</div>
                                                                </span></div>
                                                            <div class="highcharts-data-labels highcharts-series-0 highcharts-solidgauge-series  highcharts-tracker"
                                                                style="position: absolute; left: 0px; top: 20px; opacity: 1; visibility: inherit;">
                                                                <div class="highcharts-label highcharts-data-label highcharts-data-label-color-0 highcharts-tracker"
                                                                    style="position: absolute; left: 106px; top: 138px; opacity: 1;">
                                                                    <span data-z-index="1"
                                                                        style="position: absolute; font-family: &quot;Open Sans&quot;, sans-serif; font-size: 18px; white-space: nowrap; font-weight: 600; color: transparent; margin-left: 0px; margin-top: 0px; left: 5px; top: 5px;">
                                                                        <div class="production-label">2.065 kW</div>
                                                                    </span></div>
                                                            </div>
                                                        </div>
                                                    </div>
                                                </div>
                                            </div>
                                        </div>
                                    </div>
                                </sp-widget-container>
                            </div>
                        </div>
                        <!---->
                        <div class="sp-widget-item sp-energy-mix" ng-if="homeConsumptionUser"
                            style="top: 0px; left: 325px;">
                            <sp-energy-mix id="energy_mix_landscape" show-heading="true" set-dates="day">
                                <sp-widget-container heading="TODAY_ENERGY_MIX" show-menu="true" widgetcolor="#d9dbdc">
                                    <!---->
                                    <div class="widget-status-bar" ng-if="!!widgetColor"
                                        ng-style="{'background': widgetColor}" style="background: rgb(217, 219, 220);">
                                    </div>
                                    <!---->
                                    <div class="widget-with-header">
                                        <div class="widget-content" ng-mouseleave="showOptions=false">
                                            <div class="sp-widget-header-container"
                                                ng-class="{'make-blue': showOptions}" ng-show="showHeader">
                                                <h5 class="sp-title" translate="TODAY_ENERGY_MIX"
                                                    ng-class="{'make-white': showOptions}">Today's Energy Mix</h5>
                                                <!---->
                                                <div class="sp-dots" id="sp-dots"
                                                    ng-hide="showOptions || showMenu!=='true'"
                                                    ng-click="showOptions=true"><span>•</span> <span>•</span>
                                                    <span>•</span></div>
                                                <div class="sp-widget-header-icons ng-hide" ng-show="showOptions">
                                                    <div class="sp-widget-title-contents"><span
                                                            class="widget-settings-icons icon-info" title="info"
                                                            ng-click="showEnergyMixHelpModal()"></span></div>
                                                </div>
                                            </div>
                                            <hr class="sp-settings-title-hr" ng-class="{'hide-title-hr': showOptions}">
                                            <div class="sp-widget-body" ng-show="showBody">
                                                <div class="sp-widget-body-contents"><button
                                                        analytics-category="Dashboard"
                                                        analytics-event="Click_On_Energy_Mix" analytics-on="click"
                                                        class="sp-dash-btn" ng-click="goPage('graphs')"></button>
                                                    
                                                       
    </main>

Here is my script:

import requests
import lxml
import time
from lxml import html


cookies = {
    '_ga': 'xxxxxxxxxxxxxxxxxxxxxxxx',
}

headers = {
    'Connection': 'keep-alive',
    'Cache-Control': 'no-cache',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
    'Accept': 'application/json, text/plain, */*',
    'Sec-Fetch-Site': 'same-origin',
    'Sec-Fetch-Mode': 'cors',
    'Sec-Fetch-User': '?1',
    'Sec-Fetch-Dest': 'empty',
    'Referer': 'https://monitor.us.sunpower.com/',
    'Accept-Language': 'en-US,en;q=0.9',
    'If-Modified-Since': 'Mon, 06 Jul 2020 20:10:01 GMT',
    'Origin': 'https://monitor.us.sunpower.com',
    'Pragma': 'no-cache',
    'authority': 'elhapi.edp.sunpower.com',
    'accept': 'application/json, text/plain, */*',
    'access-control-request-method': 'GET',
    'access-control-request-headers': 'authorization',
    'origin': 'https://monitor.us.sunpower.com',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-site',
    'sec-fetch-dest': 'empty',
    'referer': 'https://monitor.us.sunpower.com/',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36',
    'accept-language': 'en-US,en;q=0.9',
    'content-type': 'application/json;charset=UTF-8',
    'authorization': 'SP-CUSTOM 9c0f119d-685f-463c-812c-e03dd6a99b84',
}

data = '{"username":"user@email.com","password":"xxxxxxxxxxx","isPersistent":false}'

response = requests.post('https://monitor.us.sunpower.com/', headers=headers, cookies=cookies, data=data)
    
login_url = "https://elhapi.edp.sunpower.com/v1/elh/authenticate"

s = requests.Session()
response = s.post(login_url, data=data, headers=headers)
print(response)


url = "https://monitor.us.sunpower.com/#!/dashboard/"

result = s.get(
    url, 
    headers = dict(referer = url)
)
    
tree = html.fromstring(result.content)

lxml.html.open_in_browser(tree)

Any help with this would be greatly appreciated, thanks!

Tags: python, web-scraping, request, python-requests, lxml

Solution


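As @bigbounty commented, you should use Selenium for this. The content is loaded with JavaScript, so it does not appear in the response to a plain HTTP request: requests only downloads the initial HTML and never runs the page's scripts, which is also why adding time.sleep() changes nothing.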
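
To confirm that this is what is happening, you can check whether the HTML you downloaded has anything inside main at all. The snippet below is only a sketch: main_is_empty is a helper name made up for this example, and the two strings are shortened stand-ins for result.content and for what the browser shows after rendering.

```python
from lxml import html


def main_is_empty(page_source):
    """True if the page has no <main> element, or its <main> has no children and no text."""
    tree = html.fromstring(page_source)
    mains = tree.xpath("//main")
    if not mains:
        return True
    main = mains[0]
    return len(main) == 0 and not (main.text or "").strip()


# What requests receives: the shell of the page before any JavaScript has run.
static_shell = '<html><body><main role="main" sp-main></main></body></html>'
# What the browser shows once the scripts have filled the page in (shortened).
rendered = (
    '<html><body><main role="main" sp-main="">'
    '<div class="sp-dash-value">2.1 kW</div>'
    '</main></body></html>'
)

print(main_is_empty(static_shell))
print(main_is_empty(rendered))
```

If the check reports an empty main for the page fetched by your script, the data is being filled in client-side and no amount of waiting on the requests side will make it appear.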

Before using Selenium, you have to install a webdriver and configure it. There are plenty of tutorials online that show how to do this, such as this one: http://jonathansoma.com/lede/foundations-2018/classes/selenium/selenium-windows-install/

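import requests
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# login_url, data and headers are the ones defined in the question's script
session = requests.Session()
response = session.post(login_url, data=data, headers=headers)

# you may have to specify your webdriver path depending on how you installed it
chrome_options = Options()
chrome_options.add_argument('--headless')  # or '--start-maximized' to see the window open

# create a webdriver object
driver = webdriver.Chrome(options=chrome_options)

# Selenium only accepts cookies for the domain of the page that is currently
# loaded, so open the site once before copying the cookies over
driver.get(yourURL)

# load the cookies from the session onto the driver
for c in session.cookies:
    cookie = {'name': c.name, 'value': c.value, 'path': c.path}
    if c.expires is not None:  # session cookies have no expiry
        cookie['expiry'] = c.expires
    driver.add_cookie(cookie)

# get the url again with the driver, now that the cookies are set
driver.get(yourURL)
html = driver.page_source  # the html, including what was loaded with javascript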
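
Since the question mentions that the dashboard takes around 5 seconds to fill in, page_source may still be read too early. One way to handle that is an explicit wait followed by the same lxml parsing the question already uses. This is a sketch under assumptions: it takes the sp-dash-value class from the HTML pasted in the question as the element to wait for, the 15 second timeout is arbitrary, and wait_for_dashboard and extract_dash_values are names made up for this example.

```python
from lxml import html


def wait_for_dashboard(driver, timeout=15):
    """Block until a dashboard value is in the DOM; raises TimeoutException otherwise."""
    # Imported here so the parsing helper below also works without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "div.sp-dash-value"))
    )


def extract_dash_values(page_source):
    """Return the text of every <div class="sp-dash-value"> in the rendered HTML."""
    tree = html.fromstring(page_source)
    return [
        div.text_content().strip()
        for div in tree.xpath('//div[@class="sp-dash-value"]')
    ]


# Stand-in for driver.page_source, cut down from the HTML in the question.
sample = """
<main role="main" sp-main="">
  <div class="sp-dash-item-subtitle">
    <div class="sp-dash-description" translate="NOW_PRODUCING">Now Producing</div>
    <div class="sp-dash-value">2.1 kW</div>
  </div>
</main>
"""
values = extract_dash_values(sample)
print(values)
```

With a live driver, you would call wait_for_dashboard(driver) right after the final driver.get(yourURL) and then pass driver.page_source to extract_dash_values.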
