How do I scrape a dictionary from a span tag?

Problem description

Result of:

soup.find('span', {'class':'js-date-picker btn--secondary btn--secondary--no-spacing'})
<span class="js-date-picker btn--secondary btn--secondary--no-spacing" data-clear="/h/?type=ln&amp;search=ethereum&amp;lang=en&amp;searchheadlines=1" data-date='{"sel":false,"latest":1599889680,"now":1599897600}' data-href="/h/?type=ln&amp;search=ethereum&amp;lang=en&amp;searchheadlines=1&amp;d=" href="javascript://">
<span class="btn--secondary__icon"><i class="far fa-calendar-alt"></i></span>
<span class="btn--secondary__label">
<span class="dtctxt"><span class="d">12 Sep</span><span class="t"> 06:48</span></span></span>
</span>

Now I want to extract {"sel":false,"latest":1599889680,"now":1599897600} from this text.

How can I do that?

Tags: python, web-scraping, beautifulsoup

Solution


Try this:

import ast

from bs4 import BeautifulSoup


html = """
<span class="js-date-picker btn--secondary btn--secondary--no-spacing" data-clear="/h/?type=ln&amp;search=ethereum&amp;lang=en&amp;searchheadlines=1" data-date='{"sel":false,"latest":1599889680,"now":1599897600}' data-href="/h/?type=ln&amp;search=ethereum&amp;lang=en&amp;searchheadlines=1&amp;d=" href="javascript://"></span>
<span class="btn--secondary__icon"><i class="far fa-calendar-alt"></i></span>
<span class="btn--secondary__label">
<span class="dtctxt"><span class="d">12 Sep</span><span class="t"> 06:48</span></span></span>
</span>
"""

soup = BeautifulSoup(html, 'html.parser').find("span", {"class": "js-date-picker btn--secondary btn--secondary--no-spacing"})
result = soup.get("data-date")
print(result)

Output:

{"sel":false,"latest":1599889680,"now":1599897600}

The value of data-date is a plain string. If needed, you can convert it to a dict object, for example:

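# The attribute holds JSON, where the boolean is spelled "false"; ast.literal_eval
# only accepts Python literals, so it is rewritten to "False" first.
data_date = ast.literal_eval(result.replace("false", "False"))
print(data_date['now'])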

Output: 1599897600
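Since the attribute value is valid JSON, the standard-library json module may be a simpler alternative to the ast.literal_eval approach above, because it handles false, true and null without any string replacement. This is a minimal sketch, not part of the original answer; it assumes result holds the same string that soup.get("data-date") returned earlier, hard-coded here so the snippet runs on its own.

```python
import json

# Assumption: this is the string returned by soup.get("data-date") above.
result = '{"sel":false,"latest":1599889680,"now":1599897600}'

# json.loads maps JSON false/true/null to Python False/True/None directly.
data_date = json.loads(result)

print(data_date["now"])
print(data_date["sel"])
```

This also avoids accidentally rewriting the text "false" if it ever appears inside a string value of the attribute.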

