python - Most pandas-onic way of getting statistics about the length of lists (average length, highest length, etc.) in a pandas df column
Problem description
I would like to get statistics on the list lengths in a pandas df column, such as the average length, lowest, highest, standard deviation, etc.
Example:
import pandas as pd
dfp = pd.DataFrame(
    {'trial_num': [[1, 2, 3, 1, 2, 3], [3, 4, 6, 7], [2, 2]],
     'subject': [[11, 2, 2, 2], [2, 2, 7], [4]]
    }
)
dfp
Output:
            trial_num        subject
0  [1, 2, 3, 1, 2, 3]  [11, 2, 2, 2]
1        [3, 4, 6, 7]      [2, 2, 7]
2              [2, 2]            [4]
So for this dataframe, I would like stats on the trial_num and subject columns.
So something like
trial_num
Average: 4
High: 6
Low: 2
Stdev: 2
What I have tried
I have tried
dfp.describe()
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-90-8a598dabea30> in <module>()
----> 1 dfp.describe()
6 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/algorithms.py in _value_counts_arraylike(values, dropna)
748 # TODO: handle uint8
749 f = getattr(htable, "value_count_{dtype}".format(dtype=ndtype))
--> 750 keys, counts = f(values, dropna)
751
752 mask = isna(values)
pandas/_libs/hashtable_func_helper.pxi in pandas._libs.hashtable.value_count_object()
pandas/_libs/hashtable_func_helper.pxi in pandas._libs.hashtable.value_count_object()
TypeError: unhashable type: 'list'
The only solution I can think of is to use iterrows to calculate the mean, high, and low, and then, with the mean, use iterrows again to calculate the stdev.
Solution
You can use str.len to get the length of the list in each row. Then you can call .describe on the result:
s = dfp['trial_num'].str.len()
s.describe()
trial_num
count 3.0
mean 4.0
std 2.0
min 2.0
25% 3.0
50% 4.0
75% 5.0
max 6.0
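The same idea can be extended to both columns at once. The following is a minimal sketch, assuming every cell holds a list; the names lengths, summary, and stats are illustrative and do not come from the original answer. It applies str.len column by column and then either calls describe for the full summary or agg for only the statistics named in the question.

```python
import pandas as pd

dfp = pd.DataFrame(
    {
        "trial_num": [[1, 2, 3, 1, 2, 3], [3, 4, 6, 7], [2, 2]],
        "subject": [[11, 2, 2, 2], [2, 2, 7], [4]],
    }
)

# Length of the list in every cell, computed one column at a time.
# Assumption: each cell is a list (or another sized sequence).
lengths = dfp.apply(lambda col: col.str.len())

# Full summary (count, mean, std, min, quartiles, max) for every column.
summary = lengths.describe()

# Only the statistics asked for in the question.
stats = lengths.agg(["mean", "max", "min", "std"])
print(stats)
```

Because the lengths are plain integers, no iterrows loop is needed. Note that std here is the sample standard deviation (ddof=1), which is pandas' default.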