find returns a NoneType value when used in Beautiful Soup

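Question

I am using Beautiful Soup to extract tables from a website. The find function returns a NoneType value, and I don't know how to go on to extract all of the tables into pandas DataFrames.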

import pandas as pd
import datetime as dt
import pandas_datareader as web
import matplotlib.pyplot as plt
from matplotlib import style
import matplotlib.ticker as ticker
from bs4 import BeautifulSoup
import requests

url='https://www.federalreserve.gov/monetarypolicy/bst_recenttrends_accessible.htm'
html_content=requests.get(url).content
soup = BeautifulSoup(html_content, "html.parser")

get_table = soup.find("table", class_='pubtables')
get_table_data = get_table.find_all("tr")
print(type(get_table_data))

Tags: python, pandas, beautifulsoup

Solution
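
The error in the question comes from calling `find_all` on the result of `find` without checking it: `find` returns `None` when no element matches. The following is a minimal sketch of guarding that call. The HTML string and the helper name `extract_rows` are made up for illustration and are not taken from the Federal Reserve page.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a page that contains no matching table.
SAMPLE_HTML = "<html><body><div id='content'></div></body></html>"


def extract_rows(html, table_class):
    """Return the <tr> tags of the first matching table, or [] if none matches."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", class_=table_class)
    if table is None:
        # find() returns None when nothing matches; calling find_all on None
        # would raise AttributeError, so return an empty list instead.
        return []
    return table.find_all("tr")


rows = extract_rows(SAMPLE_HTML, "pubtables")
print(len(rows))
```

An empty result here is a hint that the table is not present in the downloaded HTML at all, which is worth checking before changing the selector.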


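The data you see in the table is loaded from an external URL, so it is most likely not part of the HTML that requests downloads, which would explain why find returns None. You can use this example to load the tables into separate DataFrames: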

import requests
import pandas as pd
from bs4 import BeautifulSoup


# The XML file that the page loads its chart data from
url = 'https://www.federalreserve.gov/data.xml'
soup = BeautifulSoup(requests.get(url).content, 'html.parser')

# Build one DataFrame per <chart> element
for chart in soup.select('chart'):
    series = {}
    index = []
    for s in chart.select('series'):
        # One column per series, keyed by its description
        series[s['description']] = []
        temp_index = []
        for o in s.select('observation'):
            temp_index.append(o['index'])
            series[s['description']].append(o['value'])

        # Use the longest list of dates as the index of the DataFrame
        if len(temp_index) > len(index):
            index = temp_index

    series['index'] = index
    # Pad shorter series with 'No Data' so that all columns have equal length
    max_len = len(max(series.values(), key=len))
    for k in series:
        series[k] = series[k] + ['No Data'] * (max_len - len(series[k]))
    df = pd.DataFrame(series).set_index('index')
    print(df)
    print('-' * 80)

Prints:

          Total Assets
index                 
1-Aug-07     870261.00
8-Aug-07     865453.00
15-Aug-07    864931.00
22-Aug-07    862775.00
29-Aug-07    872873.00
...                ...
12-Aug-20   6957277.00
19-Aug-20   7010637.00
26-Aug-20   6990418.00
2-Sep-20    7017492.00
9-Sep-20    7010614.00

[685 rows x 1 columns]
--------------------------------------------------------------------------------
          Total Assets  ... Support for Specific Institutions**
index                   ...                                    
1-Aug-07     870261.00  ...                                   0
8-Aug-07     865453.00  ...                                   0
15-Aug-07    864931.00  ...                                   0
22-Aug-07    862775.00  ...                                   0
29-Aug-07    872873.00  ...                                   0
...                ...  ...                                 ...
12-Aug-20   6957277.00  ...                             No Data
19-Aug-20   7010637.00  ...                             No Data
26-Aug-20   6990418.00  ...                             No Data
2-Sep-20    7017492.00  ...                             No Data
9-Sep-20    7010614.00  ...                             No Data

[685 rows x 4 columns]
--------------------------------------------------------------------------------
          All Liquidity Facilities*  ... Term Asset-Backed Securities Loan Facility
index                                ...                                           
1-Aug-07                     235.00  ...                                          0
8-Aug-07                     255.00  ...                                          0
15-Aug-07                    264.00  ...                                          0
22-Aug-07                   2262.00  ...                                          0
29-Aug-07                   1358.00  ...                                          0
...                             ...  ...                                        ...
12-Aug-20                 116308.00  ...                                    1619.00
19-Aug-20                 112435.00  ...                                    2266.00
26-Aug-20                 107342.00  ...                                    2256.00
2-Sep-20                  103978.00  ...                                    2639.00
9-Sep-20                   85581.00  ...                                    2639.00

[685 rows x 5 columns]
--------------------------------------------------------------------------------
          Total Support to AIG***  ... Maiden Lane II LLC Maiden Lane III LLC
index                              ...                                       
1-Aug-07      0                 0  ...                  0                   0
8-Aug-07      0                 0  ...                  0                   0
15-Aug-07     0                 0  ...                  0                   0
22-Aug-07     0                 0  ...                  0                   0
29-Aug-07     0                 0  ...                  0                   0
...         ...               ...  ...                ...                 ...
12-Feb-20     0           No Data  ...                  0                   0
19-Feb-20     0           No Data  ...                  0                   0
26-Feb-20     0           No Data  ...                  0                   0
4-Mar-20      0           No Data  ...                  0                   0
11-Mar-20     0           No Data  ...                  0                   0

[659 rows x 5 columns]
--------------------------------------------------------------------------------
          Currency in Circulation  ... Treasury Balance
index                              ...                 
1-Aug-07                814159.00  ...          4769.00
8-Aug-07                814587.00  ...          4670.00
15-Aug-07               813042.00  ...          5109.00
22-Aug-07               811795.00  ...          5329.00
29-Aug-07               812431.00  ...          4924.00
...                           ...  ...              ...
12-Aug-20              2006160.00  ...       1635143.00
19-Aug-20              2009610.00  ...       1636393.00
26-Aug-20              2013933.00  ...       1607449.00
2-Sep-20               2021810.00  ...       1651823.00
9-Sep-20               2030151.00  ...       1570533.00

[685 rows x 3 columns]
--------------------------------------------------------------------------------
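
Because every value is read from an XML attribute, the columns in these frames hold strings, and the padding adds the literal text 'No Data'. The following is a minimal sketch of converting such a frame to numeric columns with a date index. The helper name `clean_frame` is hypothetical, the small frame is a hand-built stand-in that reuses a few values from the output above, and the date format `%d-%b-%y` is an assumption based on how the dates are printed.

```python
import pandas as pd


def clean_frame(df):
    """Return a copy with a DatetimeIndex and numeric columns."""
    out = df.copy()
    # Assumed format: day, abbreviated English month name, two-digit year
    out.index = pd.to_datetime(out.index, format="%d-%b-%y")
    # errors="coerce" turns anything non-numeric, such as 'No Data', into NaN
    return out.apply(pd.to_numeric, errors="coerce")


# Hand-built stand-in for one of the frames printed above
raw = pd.DataFrame(
    {
        "index": ["1-Aug-07", "8-Aug-07", "9-Sep-20"],
        "Total Assets": ["870261.00", "865453.00", "7010614.00"],
        "Support for Specific Institutions**": ["0", "0", "No Data"],
    }
).set_index("index")

cleaned = clean_frame(raw)
print(cleaned.dtypes)
```

Keeping missing values as NaN rather than text lets the columns be plotted or aggregated directly.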
