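python - How can I make this function more efficient? (cricket statistics)
Problem description
I am writing a function batting_stats that shows a batsman's batting statistics after every innings of their career. The input to the function is a list of lists of integers. The top-level list contains one entry per innings, and each inner list contains the runs scored, the balls faced, and a boolean value (1 = out, 0 = not out) indicating whether the batsman was dismissed in that innings. The function returns a list of lists of integers giving the batsman's average, strike rate, and conversion rate after each innings.
Average = runs // number of dismissals (if the player has never been dismissed, the average is the total runs scored)
Strike rate = 100 * runs // balls faced
Conversion rate = 100 * number of centuries // number of 50+ scores (if the player has no score of 50 or more, the conversion rate is zero)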
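To make the three definitions concrete, here is a minimal sketch that applies them to a set of career totals. The helper name stats_from_totals and the sample numbers are made up for illustration; they are not part of the code in the question.

```python
def stats_from_totals(runs, balls, dismissals, hundreds, fifties):
    """Apply the three formulas to cumulative career totals (hypothetical helper)."""
    # Average: total runs if never dismissed, otherwise integer division by dismissals
    average = runs if dismissals == 0 else runs // dismissals
    # Strike rate: runs per 100 balls, rounded down
    strike_rate = 100 * runs // balls
    # Conversion rate: share of 50+ scores that became centuries, as a percentage
    conversion = 0 if fifties == 0 else 100 * hundreds // fifties
    return [average, strike_rate, conversion]

# 250 runs off 400 balls, 3 dismissals, 1 century among 2 scores of 50 or more
print(stats_from_totals(250, 400, 3, 1, 2))  # [83, 62, 50]
```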
Input Format:
[[r1,b1,d1],[r2,b2,d2],...]
where r=runs, b=balls, d=dismissal
Output Format:
[[avg1,sr1,cr1],[avg2,sr2,cr2],...]
where avg=average, sr=strike rate, cr=conversion rate
For example:
>>> batting_stats([[12,24,0],[18,36,1]])
[[12,50,0],[30,50,0]]
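My code gives the expected results, but the implementation is clearly not optimal: for very large inputs I get a timeout error. How can I optimize it?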
def batting_stats(lst):
    """Compute the average, strike rate, and conversion rate of a batsman after each innings."""
    innings = len(lst)  # number of innings
    last = 1 + innings
    r_lst = [r[0] for r in lst]  # list of runs per innings
    b_lst = [b[1] for b in lst]  # list of balls faced per innings
    d_lst = [d[2] for d in lst]  # list of dismissals per innings
    c_lst = [1 if r >= 100 else 0 for r in r_lst]  # list of 100+ scores
    f_lst = [1 if r >= 50 else 0 for r in r_lst]  # list of 50+ scores
    # Keep track of sums after each innings
    rt = [sum(r_lst[:n]) for n in range(1, last)]  # runs scored
    bt = [sum(b_lst[:n]) for n in range(1, last)]  # balls faced
    dt = [sum(d_lst[:n]) for n in range(1, last)]  # dismissals
    ct = [sum(c_lst[:n]) for n in range(1, last)]  # 100+ scores
    ft = [sum(f_lst[:n]) for n in range(1, last)]  # 50+ scores
    avg_ = [rt[i] if dt[i] == 0 else rt[i] // dt[i] for i in range(innings)]  # averages after each innings
    sr_ = [100 * rt[i] // bt[i] for i in range(innings)]  # strike rates after each innings
    cr_ = [0 if ft[i] == 0 else 100 * ct[i] // ft[i] for i in range(innings)]  # conversion rates after each innings
    return [[avg_[i], sr_[i], cr_[i]] for i in range(innings)]
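Solution
I combined Daniel Mesejo's suggestion (using itertools.accumulate) with unzip from the more_itertools module to modify your function (please excuse the reformatting and renaming; it was for my own readability). I also made use of zip.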
from itertools import accumulate
from more_itertools import unzip  # third-party package: pip install more-itertools

def batting_stats2(lst):
    """Compute the average, strike rate, and conversion rate of a batsman after each innings."""
    # unzip() returns an empty tuple for empty input, so the unpacking below would fail
    if not lst:
        return []
    # Unzip reshapes the various stats into their own lists
    r_lst, b_lst, d_lst = map(list, unzip(lst))
    # Similarly, for the 100+ and 50+ scores
    # Note: int(True) = 1, int(False) = 0
    c_lst, f_lst = map(list, unzip((int(r >= 100), int(r >= 50)) for r in r_lst))
    # Accumulate the running totals after each innings
    rt = list(accumulate(r_lst))  # runs scored
    bt = list(accumulate(b_lst))  # balls faced
    dt = list(accumulate(d_lst))  # dismissals
    ct = list(accumulate(c_lst))  # 100+ scores
    ft = list(accumulate(f_lst))  # 50+ scores
    # averages after each innings
    avg_ = [run if dismiss == 0 else run // dismiss for run, dismiss in zip(rt, dt)]
    # strike rates after each innings
    sr_ = [100 * run // ball for run, ball in zip(rt, bt)]
    # conversion rates after each innings (0 when there is no 50+ score yet)
    cr_ = [0 if fifty == 0 else 100 * hundo // fifty for fifty, hundo in zip(ft, ct)]
    # The "list(x)" is because your output is nested lists. Without it, the
    # result would be a list of tuples.
    return [list(x) for x in zip(avg_, sr_, cr_)]
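The key change is how the running totals are built. Summing a fresh slice for every innings re-adds the same values over and over, so the work grows quadratically with the number of innings, while accumulate carries the previous total forward and does linear work. A small sketch with made-up runs shows that both approaches produce the same prefix sums:

```python
from itertools import accumulate

runs = [12, 18, 104, 55]  # illustrative per-innings runs

# Quadratic: every prefix is summed again from the first innings
slow = [sum(runs[:n]) for n in range(1, len(runs) + 1)]

# Linear: each running total reuses the one before it
fast = list(accumulate(runs))

print(slow)  # [12, 30, 134, 189]
print(fast)  # [12, 30, 134, 189]
```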
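I then tested it with the example you posted and got the same output, so it appears to compute the results correctly.
After that, I fed in 1,000,000 three-element lists (called stats) to time it: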
>>> %%timeit
>>> batting_stats2(stats)
2.43 s ± 35.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
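How stats was built is not shown here. One way it might be generated and timed outside IPython is sketched below; the helper names make_stats and time_once, the value ranges, and the seed are all assumptions for illustration, not what was used for the figure above.

```python
import random
import timeit

def make_stats(n, seed=0):
    """Build n hypothetical innings as [runs, balls, dismissed] lists."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n):
        balls = rng.randint(1, 200)       # at least one ball, so the strike rate never divides by zero
        runs = rng.randint(0, 2 * balls)  # arbitrary cap of two runs per ball
        stats.append([runs, balls, rng.randint(0, 1)])
    return stats

def time_once(func, data):
    """Return the wall-clock seconds taken by a single call func(data)."""
    return timeit.timeit(lambda: func(data), number=1)

stats = make_stats(1000)       # use 1_000_000 to match the input size quoted above
print(time_once(len, stats))   # len is a stand-in; pass batting_stats2 to time the function
```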
I also tried timing the original version, but gave up after waiting 10 minutes :)
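If installing more_itertools is not an option, the same running totals can be kept in plain variables during a single pass over the input. This is only a sketch of that alternative, not the approach timed above; the name batting_stats3 is made up here, and I have not benchmarked it.

```python
def batting_stats3(lst):
    """Hypothetical single-pass variant: update running totals innings by innings."""
    runs = balls = outs = hundreds = fifties = 0
    result = []
    for r, b, d in lst:
        runs += r
        balls += b
        outs += d
        hundreds += r >= 100  # a True comparison counts as 1
        fifties += r >= 50
        avg = runs if outs == 0 else runs // outs
        sr = 100 * runs // balls
        cr = 0 if fifties == 0 else 100 * hundreds // fifties
        result.append([avg, sr, cr])
    return result

print(batting_stats3([[12, 24, 0], [18, 36, 1]]))  # [[12, 50, 0], [30, 50, 0]]
```

Because nothing is unpacked up front, an empty input simply returns an empty list.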