Data scraping - field value - problem

Problem description

I want to get current information on the number of infections from this website: https://www.gov.pl/web/koronawirus/wykaz-zarazen-koronawirusem-sars-cov-2

My code looks like this:

import requests
from bs4 import BeautifulSoup
adresURL = 'https://www.gov.pl/web/koronawirus/wykaz-zarazen-koronawirusem-sars-cov-2'
res = requests.get(adresURL)
soup = BeautifulSoup(res.text, 'html.parser')
data = soup.select('.details-property-value')
print(data)

As a result I get:

[<div class="details-property-value" tabindex="0">{{selectedRecord[commonColumns[index]] || '-'}}</div>]

Any idea how to get the value of the field? Am I missing something?

Tags: python, beautifulsoup, python-requests

Solution


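I'm guessing you are trying to scrape the table on that page. The {{...}} text you got back is an unrendered template placeholder: a browser fills it in with JavaScript, which requests does not execute. It looks like the HTML itself contains the data as JSON, though: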

import requests
from bs4 import BeautifulSoup
import json

url = "https://www.gov.pl/web/koronawirus/wykaz-zarazen-koronawirusem-sars-cov-2"

# Fetch the page and fail loudly on an HTTP error status.
response = requests.get(url)
response.raise_for_status()

soup = BeautifulSoup(response.content, "html.parser")

# The data is embedded as JSON text inside <pre id="registerData">.
data = json.loads(soup.find("pre", {"id": "registerData"}).text)
print(data)
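
To try the same idea without hitting the network, here is a minimal sketch that pulls JSON out of a <pre> element in a local HTML string. It uses the standard library's html.parser instead of BeautifulSoup so it runs with no extra packages. The SAMPLE_HTML content, the JSON keys ("description", "rows") and the helper names PreJsonExtractor and extract_embedded_json are made up for illustration; the real page's JSON layout may well differ, so inspect the parsed data before relying on any key.

```python
import json
from html.parser import HTMLParser


class PreJsonExtractor(HTMLParser):
    """Collect the text inside the first <pre> element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self._inside = False   # True while we are inside the target <pre>
        self._done = False     # True once the target <pre> has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre" and not self._done and dict(attrs).get("id") == self.target_id:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "pre" and self._inside:
            self._inside = False
            self._done = True

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_embedded_json(html, element_id="registerData"):
    """Return the parsed JSON found in <pre id=element_id>, or raise ValueError."""
    parser = PreJsonExtractor(element_id)
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks).strip()
    if not text:
        raise ValueError(f"no <pre id={element_id!r}> with content found")
    return json.loads(text)


# Hypothetical stand-in for the downloaded page: the visible field is still a
# template placeholder, while the data sits in the <pre> block as JSON.
SAMPLE_HTML = """
<html><body>
<div class="details-property-value">{{selectedRecord[commonColumns[index]] || '-'}}</div>
<pre id="registerData">{"description": "sample", "rows": [["Region A", 12], ["Region B", 7]]}</pre>
</body></html>
"""

data = extract_embedded_json(SAMPLE_HTML)
print(data["rows"])
```

Raising ValueError when the element is missing makes it obvious if the site changes its markup, rather than failing later with a less helpful error.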
