python - Python, Beautiful Soup, <br> tag

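Problem description

I've searched Stack Overflow but can't seem to find an answer to my question. How do I get the text after a <br> tag, and specific pieces of that text?

Here is my code: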

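# `container` is a BeautifulSoup element obtained earlier (not shown here).
product_review_container = container.find_all("span", {"class": "search_review_summary"})
for product_review in product_review_container:
    # The review summary is stored in the span's data-tooltip-html attribute.
    prr = product_review.get("data-tooltip-html")
    print(prr)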

Here is the output:

Very Positive<br>86% of the 1,013 user reviews for this game are positive.

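I only want the 86% and the 1,013 from this string, i.e. just the numbers. But the value isn't an int, so I don't know how to do it.

Here is where the text comes from: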

   [<span class="search_review_summary positive" data-tooltip-html="Very Positive&lt;br&gt;86% of the 1,013 user reviews for this game are positive.">
</span>]

This is the link I'm getting the information from: https://store.steampowered.com/search/?specials=1&page=1

Thanks!

Tags: python, beautifulsoup, tags

You need to use a regular expression here!

import re

string = 'Very Positive<br>86% of the 1,013 user reviews for this game are positive.'
# Raw string so that \d reaches the regex engine as written.
a = re.findall(r'(\d+%)|(\d+,\d+)', string)
print(a)

output: [('86%', ''), ('', '1,013')]
#Then a[0][0] will be 86% and a[1][1] will be 1,013

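Here \d matches any single digit character, and + means one or more of the preceding item, so \d+ matches a run of one or more digits. The | separates the two alternatives, and because each alternative is its own group, findall returns one tuple per match with an empty string for the group that did not match.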
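Since the question asks for the values as numbers, here is a minimal sketch of one way to do it. It assumes the tooltip always reads "N% of the M user reviews". The names `text`, `pattern`, `percent` and `count` are only illustrative. Named groups are used so there are no empty-string placeholders to skip over.

```python
import re

text = 'Very Positive<br>86% of the 1,013 user reviews for this game are positive.'

# Assumed wording: "<percent>% of the <count> user reviews".
# [\d,]+ accepts counts with or without thousands separators.
pattern = re.compile(r'(?P<percent>\d+)% of the (?P<count>[\d,]+) user reviews')

match = pattern.search(text)
if match:
    percent = int(match.group('percent'))
    # Remove the thousands separator before converting to int.
    count = int(match.group('count').replace(',', ''))
    print(percent, count)
```

If the wording differs (for example, a game with no reviews), `search` returns None and nothing is printed, so that case is worth handling explicitly.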

If you need a more specific regular expression, you can try it out at https://regex101.com
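The question also asks how to get the text that follows the <br>. Because the attribute value is a plain string once it has been read, one possible approach is to split it on the literal "<br>" and then pull the numbers out of the second part. The helper name `parse_review_tooltip` is hypothetical, and the sketch assumes the first two numbers after the <br> are the percentage and the review count.

```python
import re


def parse_review_tooltip(tooltip):
    """Hypothetical helper: return (summary, percent, count) or None."""
    # partition splits on the first "<br>" only; detail is '' if it is absent.
    summary, _, detail = tooltip.partition('<br>')
    # A digit followed by any mix of digits and commas, e.g. "86" or "1,013".
    numbers = re.findall(r'\d[\d,]*', detail)
    if len(numbers) < 2:
        return None
    percent, count = (int(n.replace(',', '')) for n in numbers[:2])
    return summary, percent, count


tooltip = 'Very Positive<br>86% of the 1,013 user reviews for this game are positive.'
print(parse_review_tooltip(tooltip))
```

In the loop from the question, this could be called as `parse_review_tooltip(prr)` for each span whose attribute is present.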
