Python BeautifulSoup: scraping a website with a filter

Question

I am trying to scrape https://www.forexfactory.com/calendar?day=today. My code is:

# `soup` is assumed to be a BeautifulSoup object already built from the page HTML
table = soup.find_all("tr", {"class": "calendar_row"})

forcal = []
for item in table:
    dict = {}  # note: this name shadows the built-in dict type

    dict["Currency"] = item.find_all("td", {"class": "calendar__currency"})[0].text.strip()  # currency
    dict["Event"] = item.find_all("td", {"class": "calendar__event"})[0].text.strip()  # event name
    dict["Time_Eastern"] = item.find_all("td", {"class": "calendar__time"})[0].text  # time (Eastern)
    impact = item.find_all("td", {"class": "impact"})

    # keep the first word of the impact icon's title
    for cell in impact:
        dict["Impact"] = cell.find_all("span")[0]["title"].split(" ", 1)[0]

    dict["Actual"] = item.find_all("td", {"class": "calendar__actual"})[0].text  # actual value
    dict["Forecast"] = item.find_all("td", {"class": "calendar__forecast"})[0].text  # forecast value
    dict["Previous"] = item.find_all("td", {"class": "calendar__previous"})[0].text  # previous value
    forcal.append(dict)

I am able to scrape all of the data. But how do I filter for the USD currency only? At the moment it scrapes every currency.

Thank you.

Tags: python, beautifulsoup

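Answer


You can add an if statement before assigning the values to the dictionary. Keep the forcal.append(dict) call inside the if block as well; otherwise an empty dictionary is appended for every non-USD row: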

Currency = item.find_all("td", {"class":"calendar__currency"})[0].text.strip() #Currency
if Currency == "USD":
    dict["Currency"] = Currency
    dict["Event"] = item.find_all("td",{"class":"calendar__event"})[0].text.strip()
    ...
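
For reference, here is a minimal self-contained sketch of the same idea. It runs against a small made-up HTML sample rather than the live page, so the markup, the helper names `cell_text` and `scrape_currency`, and the reduced set of columns are all assumptions for illustration; the real page's HTML may differ.

```python
from bs4 import BeautifulSoup

# Made-up sample markup that mimics the class names used in the question;
# the real page's HTML may differ.
SAMPLE_HTML = """
<table>
  <tr class="calendar_row">
    <td class="calendar__currency"> USD </td>
    <td class="calendar__event"> Example Event A </td>
    <td class="calendar__actual">1.0</td>
  </tr>
  <tr class="calendar_row">
    <td class="calendar__currency"> EUR </td>
    <td class="calendar__event"> Example Event B </td>
    <td class="calendar__actual">2.0</td>
  </tr>
</table>
"""


def cell_text(row, css_class):
    """Return the stripped text of the first <td> with css_class, or ''."""
    cell = row.find("td", class_=css_class)
    return cell.get_text(strip=True) if cell else ""


def scrape_currency(html, wanted="USD"):
    """Collect one record per calendar row whose currency equals `wanted`."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for row in soup.find_all("tr", class_="calendar_row"):
        currency = cell_text(row, "calendar__currency")
        if currency != wanted:
            continue  # skip other currencies before building the record
        records.append({
            "Currency": currency,
            "Event": cell_text(row, "calendar__event"),
            "Actual": cell_text(row, "calendar__actual"),
        })
    return records


usd_rows = scrape_currency(SAMPLE_HTML)
print(usd_rows)
```

Checking the currency first and skipping the row with continue means nothing is appended for rows that do not match, and the remaining columns can be added to the record in the same way.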
