Python: wrapping lists and converting types in Beautiful Soup

Problem description

Here is my code:

from urllib.request import urlopen  # b_soup_1.py
from bs4 import BeautifulSoup

# Treasury Yield Curve web site, known to be HTML code
html = urlopen('https://www.treasury.gov/resource-center/'
               'data-chart-center/interest-rates/Pages/'
               'TextView.aspx?data=yieldYear&year=2018')

# create the BeautifulSoup object (BeautifulSoup Yield Curve)
bsyc = BeautifulSoup(html.read(), "lxml")

# save it to a file that we can edit
#fout = open('bsyc_temp.txt', 'wt', encoding='utf-8')

#fout.write(str(bsyc))

#fout.close()

# so get a list of all table tags
table_list = bsyc.findAll('table')


# to findAll as a dictionary attribute
tc_table_list = bsyc.findAll('table',
                      { "class" : "t-chart" } )

# only 1 t-chart table, so grab it
tc_table = tc_table_list[0]
# what are this table's components/children?
# tag tr means table row, containing table data
# what are the children of those rows?
# we have found the table data!
# just get the contents of each cell
print('\nthe contents of the children of the t-chart table:')
daily_yield_curves_temp = []
daily_yield_curves = []
for c in tc_table.children:
    for r in c.children:
        for i in r.contents:
            daily_yield_curves_temp.append(i)
for x in range(len(daily_yield_curves_temp) // 12):
    daily_yield_curves.append(daily_yield_curves_temp[12 * x : 12 * x + 12])

print(daily_yield_curves)

The output is:

[['Date', '1 mo', '3 mo', '6 mo', '1 yr', '2 yr', '3 yr', '5 yr', '7 yr', '10 yr', '20 yr', '30 yr'], ['01/02/18', '1.29', '1.44', '1.61', '1.83', '1.92', '2.01', '2.25', '2.38', '2.46', '2.64', '2.81'], ['01/03/18', '1.29', '1.41', '1.59', '1.81', '1.94', '2.02', '2.25', '2.37', '2.44', '2.62', '2.78'], ['01/04/18', '1.28', '1.41', '1.60', '1.82', '1.96', '2.05', '2.27', '2.38', '2.46', '2.62', '2.79'], ......]

However, I want the output to look like this:

daily_yield_curves = [
        [ … header list … ],
        [ … first data list … ],
        …
        [ … final data list … ]
    ]

['Date', '1 mo', '3 mo', '6 mo', '1 yr', '2 yr', '3 yr', '5 yr', '7 yr', '10 yr', '20 yr', '30 yr']

Next should come one list per data row, with each interest rate value converted from a string to a float:

['01/02/18', 1.29, 1.44, 1.61, 1.83, 1.92, 2.01, 2.25, 2.38, 2.46, 2.64, 2.81] ... ['09/14/18', 2.02, 2.16, 2.33, 2.56, 2.78, 2.85, 2.90, 2.96, 2.99, 3.07, 3.13]

Please help me work out how to change it.

Tags: python, list, beautifulsoup

Solution
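
One possible approach, offered as a sketch rather than a definitive answer: leave the header row alone, keep the date in each data row as a string, and pass every other cell through float(). The helper names to_float, convert_rows and format_rows are hypothetical, and mapping non-numeric cells such as 'N/A' to None is an assumption that the question does not specify. The sample rows are copied from the output shown in the question, so the sketch runs without fetching the page.

```python
def to_float(text):
    """Convert one table cell to a float.

    Assumption: cells that are not numeric (for example 'N/A')
    become None instead of raising an error.
    """
    try:
        return float(text)
    except ValueError:
        return None


def convert_rows(rows):
    """Return a new list: header unchanged, rates in data rows as floats."""
    header, *data = rows
    converted = [list(header)]
    for row in data:
        date, *rates = row
        # The date stays a string; only the rate columns are converted.
        converted.append([date] + [to_float(r) for r in rates])
    return converted


def format_rows(rows, name="daily_yield_curves"):
    """Build a string with one inner list per line inside the outer list."""
    lines = [f"{name} = ["]
    lines += [f"    {row!r}," for row in rows]
    lines.append("]")
    return "\n".join(lines)


# Sample rows in the same shape as the list built in the question.
sample = [
    ['Date', '1 mo', '3 mo', '6 mo', '1 yr', '2 yr', '3 yr',
     '5 yr', '7 yr', '10 yr', '20 yr', '30 yr'],
    ['01/02/18', '1.29', '1.44', '1.61', '1.83', '1.92', '2.01',
     '2.25', '2.38', '2.46', '2.64', '2.81'],
    ['01/03/18', '1.29', '1.41', '1.59', '1.81', '1.94', '2.02',
     '2.25', '2.37', '2.44', '2.62', '2.78'],
]

daily_yield_curves = convert_rows(sample)
print(format_rows(daily_yield_curves))
```

In the original script, the same call would be applied to the list built by the loops, for example daily_yield_curves = convert_rows(daily_yield_curves). Note that Python prints floats without trailing zeros, so a rate scraped as '1.60' is displayed as 1.6 even though the value is the same.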

