How to delay the Python requests library to allow data to populate

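Question

I am trying to pull data from a web page that uses .aspx. I can get all of the data except one value, which seems to take some time to load after the HTML has loaded.

My code currently looks like this: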

import requests # Requesting HTML
import bs4 as bs # Parsing HTML
url_two = "https://www.walottery.com/Scratch/Explorer.aspx?id=1463"   
r_two = requests.get(url_two)
soup = bs.BeautifulSoup(r_two.text, "lxml")
print(soup.find("strong", {"class": "ticket-explorer-detail-info-printed"}))

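However, when I print the value, I get <strong class="ticket-explorer-detail-info-printed">N/A</strong>.

If you use "Inspect Element" on the web page, you can see the data change from what I pasted above to: <strong class="ticket-explorer-detail-info-printed">2,428,400</strong>.

How can I add a slight delay so that the requests library lets me get the computed value instead of "N/A"?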

Tags: python, beautifulsoup, python-requests

Solution


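A delay will not help here: requests only downloads the HTML and never runs the page's JavaScript, so the value stays "N/A" no matter how long you wait. The page is generated dynamically from JSON embedded in a script element in the HTML. You can either extract that JSON and parse it to get the data you need, or use Selenium to render the JavaScript on the page. To extract the JSON: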

import requests
import json
from bs4 import BeautifulSoup

url = 'https://www.walottery.com/Scratch/Explorer.aspx?id=1463'
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")
# Find the script element containing the JSON the web page is dynamically generated from.
anchor = "WaLottery.Scratch.data = "
s = soup.find(lambda tag: tag.name == "script" and anchor in tag.text)
# Extract the JSON: skip the 7 characters of "parse('" and stop at the closing "'),".
j = s.text[s.text.find("parse") + 7:s.text.find("'),")]
# Load the JSON.
d = json.loads(j)
# Read the data from the JSON.
for game in d['Games']:
    print(game['Id'], game['TicketsPrinted'])

Output:

1503 3,232,300
1497 2,427,400
1496 2,585,600
1493 3,467,000
1491 2,169,350
1490 2,194,350
1489 3,862,600
1488 4,832,950
1486 1,801,975
1483 2,422,200
1482 2,410,200
1481 2,450,400
1480 1,802,100
1479 1,320,300
1478 1,822,000
1476 5,236,000
1475 3,496,200
1474 3,155,000
1473 2,127,300
1472 1,112,265
1470 2,350,250
1469 3,120,050
1468 955,800
1467 2,161,550
1466 1,339,400
1465 556,000
1464 2,213,350
1463 2,428,400
1462 2,419,600
1461 2,434,600
1460 2,591,900
1459 3,887,000
1458 3,468,500
1457 2,180,300
1456 2,110,100
1455 2,089,200
1454 543,235
1453 2,421,600
1452 2,418,200
1451 2,400,800
1450 3,127,050
1449 2,167,400
1448 2,379,950
1446 4,838,700
1445 1,233,550
1444 2,456,550
1442 1,770,425
1441 3,838,700
1440 13,647,500
1439 3,255,400
1433 2,859,400
1431 3,158,450
1422 3,332,500
1415 5,192,000
1410 1,836,575
1409 3,567,270
1405 2,409,500
1391 2,162,100
1379 2,467,725
1373 3,645,075

The one you are looking at is:

 1463 2,428,400
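
If you only need that one game, the extraction and lookup can be wrapped in two small helpers. This is a minimal sketch, not the site's actual format: `extract_json_parse_argument` and `tickets_printed` are hypothetical names, `sample_script` is a made-up stand-in for the real script text, and the code assumes the JSON is passed as a single-quoted string to `JSON.parse(...)` after the anchor and contains no `')` sequence or JavaScript escapes.

```python
import json


def extract_json_parse_argument(script_text, anchor):
    """Return the parsed JSON passed to the first JSON.parse('...') after anchor.

    Assumption: the JSON is a single-quoted JavaScript string literal that
    does not itself contain the sequence "')".
    """
    start = script_text.find(anchor)
    if start == -1:
        raise ValueError("anchor not found in script text")
    open_marker = "JSON.parse('"
    begin = script_text.find(open_marker, start)
    if begin == -1:
        raise ValueError("JSON.parse(' not found after anchor")
    begin += len(open_marker)
    end = script_text.find("')", begin)
    if end == -1:
        raise ValueError("closing ') not found")
    return json.loads(script_text[begin:end])


def tickets_printed(data, game_id):
    """Return the TicketsPrinted value for one game, or None if it is absent."""
    for game in data["Games"]:
        # Compare as strings so the lookup works whether Id is an int or a str.
        if str(game["Id"]) == str(game_id):
            return game["TicketsPrinted"]
    return None


# Hypothetical sample imitating the embedded script; the real page may differ.
sample_script = (
    "WaLottery.Scratch.data = JSON.parse('"
    '{"Games": [{"Id": 1463, "TicketsPrinted": "2,428,400"}, '
    '{"Id": 1462, "TicketsPrinted": "2,419,600"}]}'
    "');"
)

data = extract_json_parse_argument(sample_script, "WaLottery.Scratch.data = ")
print(tickets_printed(data, 1463))
```

On the real page you would pass the text of the script element found above instead of `sample_script`. Raising an error when a marker is missing makes it obvious if the site changes its markup, rather than silently slicing the wrong span.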
