Problem description

I'd like to ask this question again. I have a list like the following.

[('2017-12-01', ['5', '6', '0', False]), 
 ('2017-12-02', ['5', '7', '0', False]), 
 ('2017-12-03', ['6', '7', '0.5', True]), 
 ('2017-12-04', ['6', '7', '0.5', True]), 
 ('2017-12-05', ['5', '6', '0.4', True]), 
 ('2018-01-01', ['5', '6', '0', False]), 
 ('2018-01-02', ['5', '6', '0', False])]

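Index 0 of each tuple is the date. I want to build a dictionary keyed by year that holds the averages of the first and second columns, e.g. {2017: [5.4, 6.6], 2018: [5, 6]}

Tags: python, list, iteration

Solution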


You can use collections.defaultdict together with statistics.mean:

from collections import defaultdict
from statistics import mean

l = [('2017-12-01', ['5', '6', '0', False]), 
     ('2017-12-02', ['5', '7', '0', False]), 
     ('2017-12-03', ['6', '7', '0.5', True]), 
     ('2017-12-04', ['6', '7', '0.5', True]), 
     ('2017-12-05', ['5', '6', '0.4', True]), 
     ('2018-01-01', ['5', '6', '0', False]), 
     ('2018-01-02', ['5', '6', '0', False])]

my_dict = defaultdict(lambda: [[], []])

for d, v in l:
    y = int(d[:4])
    my_dict[y][0].append(float(v[0]))
    my_dict[y][1].append(float(v[1]))

result = {k: [mean(e) for e in v] for k, v in my_dict.items()}
result

Output:

{2017: [5.4, 6.6], 2018: [5.0, 6.0]}
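
The reason for the `lambda: [[], []]` factory can be seen in a minimal sketch. The name `groups` and the sample values below are illustrative only and not part of the original answer. Each missing key gets its own fresh pair of lists, so the years never share state:

```python
from collections import defaultdict

# Accessing a missing key calls the factory, which builds a new pair of lists.
groups = defaultdict(lambda: [[], []])

groups[2017][0].append(5.0)  # creates the 2017 entry on first access
groups[2017][1].append(6.0)
groups[2018][0].append(7.0)  # 2018 gets its own, independent lists

print(dict(groups))  # {2017: [[5.0], [6.0]], 2018: [[7.0], []]}
```

Using a factory function rather than a shared list object is what keeps the per-year lists independent.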

Alternatively, you can use pandas.

1) First, convert the data to a pandas.DataFrame:

import pandas as pd

df = pd.DataFrame([[f, *map(float, s[:2])] for f, s in l], columns=['date', 'col0', 'col1'])
df['date']= pd.to_datetime(df['date']) 
df


2) Now you can apply pandas.DataFrame.groupby to the DataFrame to get the desired output:

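# select the numeric columns explicitly so the datetime column is not averaged;
# the orient is spelled out as 'list' because the short form 'l' is rejected by newer pandas
df.groupby(df['date'].dt.year)[['col0', 'col1']].mean().transpose().to_dict('list')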

Output:

{2017: [5.4, 6.6], 2018: [5.0, 6.0]}

Since you need a simpler approach, you can use:

# group col0 and col1 values based on the year
year_cols = {}
for date, cols in l:
    # the year is in the first 4 characters so using a slice will get the year
    # then convert to integer
    year = int(date[:4])

    col0 = cols[0]
    col1 = cols[1]

    # store the values from column 0 and column 1 based on the year
    if year in year_cols:  # the year has already been seen
        # append to the existing lists, converting to float so the average can be computed
        year_cols[year]['col0'].append(float(col0))
        year_cols[year]['col1'].append(float(col1))
    else:  # first time this year appears
        col01_data = {'col0': [float(col0)], 'col1': [float(col1)]}
        year_cols[year] = col01_data


# get the average for each year on each column 
result = {}
for year, col0_col1 in year_cols.items():
    col0 = col0_col1['col0']
    col1 = col0_col1['col1']

    # compute the average for each column
    # average formula: sum of all elements divided by the number of elements
    result[year] = [sum(col0) / len(col0), sum(col1) / len(col1)]

result

Output:

{2017: [5.4, 6.6], 2018: [5.0, 6.0]}
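
If the same calculation is needed for more than two columns, the plain-dict loop above could be wrapped in a small function. This is only a sketch under assumptions: `yearly_means`, its `n_cols` parameter, and the shortened `sample` list are hypothetical names introduced here, not part of the original answer.

```python
def yearly_means(rows, n_cols=2):
    """Return {year: [average of column 0, ..., average of column n_cols - 1]}."""
    by_year = {}
    for date, cols in rows:
        year = int(date[:4])  # dates look like 'YYYY-MM-DD'
        # setdefault creates the per-year lists the first time a year is seen
        columns = by_year.setdefault(year, [[] for _ in range(n_cols)])
        for i in range(n_cols):
            columns[i].append(float(cols[i]))
    # average = sum of the values divided by how many there are
    return {year: [sum(vals) / len(vals) for vals in columns]
            for year, columns in by_year.items()}


sample = [('2017-12-01', ['5', '6', '0', False]),
          ('2017-12-02', ['5', '7', '0', False]),
          ('2018-01-01', ['5', '6', '0', False])]

print(yearly_means(sample))  # {2017: [5.0, 6.5], 2018: [5.0, 6.0]}
```

Here `dict.setdefault` plays the role of the explicit `if year in year_cols` check, so the logic stays the same while the column count becomes a parameter.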
