python - Grouping a list of lists of values into equal-distance bins
Problem description
I managed to group a single list of values into equal-distance bins:
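import numpy as np

y = list(range(100))

def list_grouper(long_list, bins_number):
    # bins_number evenly spaced edges from the smallest to the largest value
    bins = np.linspace(min(long_list), max(long_list), bins_number)
    bins_idx = np.digitize(long_list, bins)
    # np.digitize returns 1 for the first bin here, so shift to 0-based indices
    bins_idx = bins_idx - 1
    return bins_idx

list_grouper(y, 20)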
>>>array([ 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3,
3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6,
6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9,
9, 9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 12, 12, 12, 12, 12,
13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 14, 15, 15, 15, 15, 15, 16,
16, 16, 16, 16, 17, 17, 17, 17, 17, 18, 18, 18, 18, 18, 19])
Is there a way to do the same for a list of lists?
list_of_lists = [list(range(20)), list(range(10, 40)), list(range(40, 60))]
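The equal-distance bins should be computed from all values across all lists (not separately for each list), while each list stays separate in the output.
For example: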
list_of_lists_output = [[0,0,0,0,1,1,1...], [5,5...], [8...]]
Update:
@Giacomo's approach almost works. It works when all lists have the same number of values. Otherwise I get one of these two errors:
list_of_lists = [list(range(20)), list(range(10, 40)), list(range(40, 60))]
list_of_lists = np.array(list_of_lists)
list_grouper2(list_of_lists, 5)
>>>ValueError: object too deep for desired array
or this one:
list_of_lists = [list(range(20)), list(range(10, 40))]
list_of_lists = np.array(list_of_lists)
list_grouper2(list_of_lists, 5)
>>>ValueError: operands could not be broadcast together with shapes (30,) (20,)
Solution
Something like this should work, but only for arrays with a well-defined (rectangular) shape:
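def list_grouper(long_list, bins_number):
    # Flatten so the bin edges span the values of every row
    temp = long_list.reshape(-1)
    bins = np.linspace(min(temp), max(temp), bins_number)
    # Digitizing the original array keeps its shape
    bins_idx = np.digitize(long_list, bins)
    bins_idx = bins_idx - 1
    return bins_idx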
list_of_lists = np.array([list(range(20)), list(range(20,40))]) #shape (2,20)
one_lists = np.array([list(range(20))])
print("List of lists:")
print(list_grouper(list_of_lists, 20))
#List of lists:
#[[ 0 0 0 1 1 2 2 3 3 4 4 5 5 6 6 7 7 8 8 9]
#[ 9 10 10 11 11 12 12 13 13 14 14 15 15 16 16 17 17 18 18 19]]
print("One list:")
print(list_grouper(one_lists, 20))
#One list:
#[[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]]
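One detail visible in the outputs above: because the `bins_number` points from `np.linspace` are used directly as bin edges, only the maximum value lands in the last index (39 is alone in bin 19). If that is not wanted, one possible variant is sketched below. It is an assumption about the desired behavior, not part of the original answer, and the name `equal_width_bin_index` is hypothetical. It builds `bins_number + 1` edges and digitizes against the interior edges only, so the maximum falls into the last full-width bin.

```python
import numpy as np

def equal_width_bin_index(values, bins_number):
    """Return 0-based indices into bins_number equal-width bins (hypothetical helper)."""
    values = np.asarray(values)
    # bins_number intervals need bins_number + 1 edges
    edges = np.linspace(values.min(), values.max(), bins_number + 1)
    # Digitizing against the interior edges only yields indices 0 .. bins_number - 1,
    # with the minimum in bin 0 and the maximum in the last bin
    return np.digitize(values, edges[1:-1])

idx = equal_width_bin_index(np.arange(100), 20)
print(np.bincount(idx))
```

Like the function above, this works on arrays of any rectangular shape, since `np.digitize` preserves the input shape.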
When the lists have different lengths, a more general solution could be:
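def list_grouper(long_list, bins_number):
    # Pool all values only to compute the shared bin edges
    temp = np.concatenate(long_list, axis=0)
    bins = np.linspace(min(temp), max(temp), bins_number)
    # Digitize each sublist separately against the shared edges
    toret = [np.digitize(i, bins) - 1 for i in long_list]
    return toret
# Keep a plain list of lists: NumPy 1.24 and later raise a ValueError when
# np.array() is given ragged rows without dtype=object
list_of_lists = [list(range(20)), list(range(20, 60)),
                 list(range(90, 100))]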
print(list_grouper(list_of_lists, 20))
#[array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3]), array([ 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6,
# 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10,
# 10, 10, 10, 10, 11, 11]), array([17, 17, 17, 17, 18, 18, 18, 18, 18, 19])]
Hope this finally works!
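As a minimal sketch of the same idea, the variant below digitizes the pooled values once and then cuts the result back into per-list pieces with `np.split`. The function name `group_ragged` is hypothetical, and the input is the ragged `list_of_lists` from the question. It should give the same indices as the list-comprehension version, under the assumption that the input is a non-empty plain Python list of non-empty lists.

```python
import numpy as np

def group_ragged(list_of_lists, bins_number):
    """Bin every sublist against edges shared by all values (hypothetical helper)."""
    lengths = [len(sub) for sub in list_of_lists]
    # Pool all values so the edges span the global minimum and maximum
    flat = np.concatenate([np.asarray(sub) for sub in list_of_lists])
    bins = np.linspace(flat.min(), flat.max(), bins_number)
    # One digitize call over everything, shifted to 0-based indices
    flat_idx = np.digitize(flat, bins) - 1
    # Cut the flat result back into pieces matching the original sublists
    split_points = np.cumsum(lengths)[:-1]
    return np.split(flat_idx, split_points)

ragged = [list(range(20)), list(range(10, 40)), list(range(40, 60))]
result = group_ragged(ragged, 5)
print([len(r) for r in result])
```

Passing a plain list of lists avoids the `np.array(...)` conversion that triggered the errors in the question's update.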