Calculating the mean and standard deviation from a list of tuples

Problem description

I have a list of tuples called routers, which looks like this:

('142.104.68.167', 11.111999999999853)
('142.104.68.167', 11.369000000000142)
('142.104.68.167', 11.618999999999915)
('142.104.68.1', 16.60699999999997)
('142.104.68.1', 16.847999999999956)
('142.104.68.1', 17.097000000000207)
('192.168.9.5', 15.727999999999838)
('192.168.9.5', 16.01800000000003)
('192.168.9.5', 16.279999999999973)

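I have more entries in the list, but this should be enough for now. I want to calculate the mean and standard deviation of the values that share the same "key": for example, the mean and SD of all values whose key is 142.104.68.167, then the mean and SD of all values whose key is 142.104.68.1, and so on.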

I tried doing it this way, but the result is not correct:

final_router_list = []
for i in range(len(routers)):
    for j in range(len(routers)):
        if (routers[i][0] == routers[j][0]):
            if ((routers[i][0] not in final_router_list) and (routers[j][0] not in final_router_list)):
                final_router_list.append(routers[i][0])

sum = 0
for i in range(len(routers)):
    for j in range(len(final_router_list)):
        if (routers[i][0] == final_router_list[j]):
            sum = sum + routers[i][1]
            print(routers[i][0],"rout:",final_router_list[j],"time:",routers[i][1],"sum:",sum)

This is the output I get:

142.104.68.167 rout: 142.104.68.167 time: 11.111999999999853 sum: 11.111999999999853
142.104.68.167 rout: 142.104.68.167 time: 11.369000000000142 sum: 22.480999999999995
142.104.68.167 rout: 142.104.68.167 time: 11.618999999999915 sum: 34.09999999999991
142.104.68.1 rout: 142.104.68.1 time: 16.60699999999997 sum: 50.70699999999988
142.104.68.1 rout: 142.104.68.1 time: 16.847999999999956 sum: 67.55499999999984
142.104.68.1 rout: 142.104.68.1 time: 17.097000000000207 sum: 84.65200000000004
192.168.9.5 rout: 192.168.9.5 time: 15.727999999999838 sum: 100.37999999999988
192.168.9.5 rout: 192.168.9.5 time: 16.01800000000003 sum: 116.39799999999991
192.168.9.5 rout: 192.168.9.5 time: 16.279999999999973 sum: 132.67799999999988

What I want is:

142.104.68.167 rout: 142.104.68.167 time: 11.111999999999853 sum: 11.111999999999853
142.104.68.167 rout: 142.104.68.167 time: 11.369000000000142 sum: 22.480999999999995
142.104.68.167 rout: 142.104.68.167 time: 11.618999999999915 sum: 34.09999999999991
142.104.68.1 rout: 142.104.68.1 time: 16.60699999999997 sum: 16.60699999999997
142.104.68.1 rout: 142.104.68.1 time: 16.847999999999956 sum: 33.455
142.104.68.1 rout: 142.104.68.1 time: 17.097000000000207 sum: 50.552
192.168.9.5 rout: 192.168.9.5 time: 15.727999999999838 sum: 15.727999999999838
192.168.9.5 rout: 192.168.9.5 time: 16.01800000000003 sum: 31.746
192.168.9.5 rout: 192.168.9.5 time: 16.279999999999973 sum: 48.026

Tags: python

Solution


Your title asks for the standard deviation and mean, but your code appears to compute only a cumulative sum of the times...
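
If the per-router running total shown in the question is what you are after, one way to get it is to keep a separate total for each IP instead of a single shared `sum`. The following is a minimal sketch under that assumption. The names `running_sums` and `totals` are illustrative and not part of the original code, and only four of the sample rows are used to keep it short.

```python
from collections import defaultdict

routers = [
    ('142.104.68.167', 11.111999999999853),
    ('142.104.68.167', 11.369000000000142),
    ('142.104.68.1', 16.60699999999997),
    ('142.104.68.1', 16.847999999999956),
]

def running_sums(pairs):
    """Yield (ip, time, running_total), keeping one total per IP."""
    totals = defaultdict(float)  # every new IP starts at 0.0
    for ip, time in pairs:
        totals[ip] += time
        yield ip, time, totals[ip]

for ip, time, total in running_sums(routers):
    print(ip, "time:", time, "sum:", total)
```

Because the totals live in a dictionary keyed by IP, the rows do not need to be sorted or grouped in advance.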

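For what the title asks, there are several approaches. I'll provide a pure-Python solution. First, convert your data into a data structure better suited to what you are trying to do: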

list_of_tups = [('142.104.68.167', 11.111999999999853),
                ('142.104.68.167', 11.369000000000142),
                ('142.104.68.167', 11.618999999999915),
                ('142.104.68.1', 16.60699999999997),
                ('142.104.68.1', 16.847999999999956),
                ('142.104.68.1', 17.097000000000207),
                ('192.168.9.5', 15.727999999999838),
                ('192.168.9.5', 16.01800000000003),
                ('192.168.9.5', 16.279999999999973)]


# Group the times by IP address
data = {}
for ip, time in list_of_tups:
    data.setdefault(ip, []).append(time)

This gives a dictionary in which each IP address is a key and its times are stored in a list. From here, you can easily perform the math you want:

import statistics as stat

for ip, times in data.items():
    print(f"ip: {ip}\n  times: {times}\n  stdev: {stat.stdev(times)}\n  mean: {stat.mean(times)}\n")

Output:

ip: 142.104.68.167
  times: [11.111999999999853, 11.369000000000142, 11.618999999999915]
  stdev: 0.2535080537839964
  mean: 11.366666666666637

ip: 142.104.68.1
  times: [16.60699999999997, 16.847999999999956, 17.097000000000207]
  stdev: 0.2450108841120974
  mean: 16.85066666666671

ip: 192.168.9.5
  times: [15.727999999999838, 16.01800000000003, 16.279999999999973]
  stdev: 0.27611833212116077
  mean: 16.008666666666613
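
Two details are worth keeping in mind with this approach. `statistics.stdev` computes the sample standard deviation and raises `statistics.StatisticsError` when a key has fewer than two values, while `statistics.pstdev` computes the population standard deviation. The sketch below groups with `collections.defaultdict` and guards the single-value case. The helper name `summarize`, the `10.0.0.1` entry, and the choice to report `None` for a lone sample are assumptions made for illustration.

```python
import statistics
from collections import defaultdict

def summarize(pairs):
    """Return {ip: (mean, sample_stdev or None)} for (ip, time) pairs."""
    grouped = defaultdict(list)
    for ip, time in pairs:
        grouped[ip].append(time)
    summary = {}
    for ip, times in grouped.items():
        # stdev needs at least two data points; report None otherwise
        sd = statistics.stdev(times) if len(times) > 1 else None
        summary[ip] = (statistics.mean(times), sd)
    return summary

sample = [
    ('142.104.68.167', 11.111999999999853),
    ('142.104.68.167', 11.369000000000142),
    ('142.104.68.167', 11.618999999999915),
    ('10.0.0.1', 5.0),  # hypothetical router with a single measurement
]

for ip, (mean, sd) in summarize(sample).items():
    print(f"ip: {ip}  mean: {mean}  stdev: {sd}")
```

If the measurements are the entire population rather than a sample, swapping `statistics.stdev` for `statistics.pstdev` is the only change needed.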
