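python - BeautifulSoup: extracting attributes from an element?
Question
I searched Stack Overflow for this, but could not adapt what I found to my code. Can anyone help?
I am trying to get the "team1", "team2" and "bettext" values out of this HTML: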
<table class="sportbet_extra_list_table" id="mc-ga312004790">
<tbody>
<tr>
<td class="sportbet_extra_c0"></td>
<td class="sportbet_extra_c1"><span>
<a class="combi_1"></a>
Hvem vinder kampen? </span></td>
<td class="sportbet_extra_c2">
<div id="mc-ti312004790_1" class="js-ti312004790_1 sportbet_extra_rate_content" onclick="Bettingslip.addBet({type: 'N', team1: 'Rusland', team2: 'Saudi Arabien', bettext: 'Hvem vinder kampen?', combi_cat: 1, sub_group: 0, game: 312004790, groupId:461392, leagueId:30124, odd: 138, odd_id: 312004790, tiptext: '1', tip: 1, betstyle: 2224})">
<div class="sportbet_content_rate_left">1</div>
<div class="sportbet_content_rate_right">1,38</div>
</div>
</td>
So far, this is the code I use to extract information from sportbet_extra_list_table:
import requests
from bs4 import BeautifulSoup

REQUEST = requests.get('https://www.cashpoint.dk/en/?r=bets/xtra&group=461392&game=312004790').text
SOUP = BeautifulSoup(REQUEST, 'lxml')
# find_all to extract all
SCRAPE = SOUP.find('table', class_='sportbet_extra_list_table')
for CLEAN in SCRAPE:
    CLEANER = BeautifulSoup(str(CLEAN), 'lxml').text
    STRIP = " ".join(line.strip() for line in CLEANER.split("\n"))
    print(STRIP)
I tried adding
SOUP.find('table', class_='sportbet_extra_list_table', attrs={"onclick": "team1"})
but it did not work.
Answer
Try the following to get the output in the form you described in your post:
import json
import requests
from bs4 import BeautifulSoup

url = "https://www.cashpoint.dk/en/?r=bets/xtra&group=461392&game=312004790"
res = requests.get(url)
soup = BeautifulSoup(res.text,'lxml')

dataset = []
# Select every element inside #container_xtra whose id starts with "mc-ti"
for items in soup.select("#container_xtra [id^='mc-ti']"):
    d = {}
    # Keep only the text between "Bettingslip.addBet(" and the first ")"
    data = items.get("onclick").split("Bettingslip.addBet(")[1].split(")")[0]
    # For each key, take the text inside the first pair of single quotes that follows it
    d['team1'] = data.split("team1:")[1].split(",")[0].split("'")[1].split("'")[0]
    d['team2'] = data.split("team2:")[1].split(",")[0].split("'")[1].split("'")[0]
    d['bettext'] = data.split("bettext:")[1].split(",")[0].split("'")[1].split("'")[0]
    # Skip entries that were already collected
    if d not in dataset:
        dataset.append(d)
print(json.dumps(dataset,indent=4))
Partial output:
[
    {
        "team1": "Rusland",
        "team2": "Saudi Arabien",
        "bettext": "Hvem vinder kampen?"
    },
    {
        "team1": "Rusland",
        "team2": "Saudi Arabien",
        "bettext": "Dobbeltchance"
    },
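
The chained split calls in the answer stop at the first comma, so a value that itself contains a comma would be cut short. As a possible alternative, here is a minimal sketch that reads the same three keys with a regular expression. It runs offline against a trimmed copy of the markup from the question and uses the built-in html.parser instead of lxml. The names HTML, WANTED and extract_fields are made up for this illustration and are not part of the original answer. It also shows why the attempt in the question returned nothing: the onclick attribute is on the inner div rather than on the table, and a plain string filter has to equal the whole attribute value, whereas a compiled pattern matches on a substring.

```python
import re
from bs4 import BeautifulSoup

# Trimmed copy of the markup from the question, so the sketch runs offline.
HTML = """
<table class="sportbet_extra_list_table" id="mc-ga312004790">
<tr><td class="sportbet_extra_c2">
<div id="mc-ti312004790_1" class="sportbet_extra_rate_content"
     onclick="Bettingslip.addBet({type: 'N', team1: 'Rusland', team2: 'Saudi Arabien', bettext: 'Hvem vinder kampen?', combi_cat: 1, odd: 138})">
<div class="sportbet_content_rate_left">1</div>
</div>
</td></tr>
</table>
"""

# Hypothetical helper names, chosen for this illustration only.
WANTED = ("team1", "team2", "bettext")


def extract_fields(onclick, keys=WANTED):
    """Return {key: value} for each single-quoted `key: 'value'` pair found."""
    found = {}
    for key in keys:
        # Capture everything up to the closing quote, commas included.
        match = re.search(r"\b" + re.escape(key) + r":\s*'([^']*)'", onclick)
        if match:
            found[key] = match.group(1)
    return found


soup = BeautifulSoup(HTML, "html.parser")
# Filter on the <div> elements, using a pattern so that a substring of the
# onclick value is enough to match.
rows = [
    extract_fields(div["onclick"])
    for div in soup.find_all("div", onclick=re.compile(r"Bettingslip\.addBet"))
]
print(rows)
```

Against the live page, the same helper could be applied to each element returned by the selector in the answer; whether that page still serves this markup is not something this sketch can confirm.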