python - Pandas: how to sort by one column and cut by another

Problem description

Newbie here... I searched the pandas docs and Stack Overflow but couldn't find what I'm looking for. Thanks in advance.

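Say I want to sort a list of books alphabetically and place them on 3 different shelves so that they take up roughly the same amount of shelf space on each shelf.

I want to be able to:
1. Sort the df by title.
2. Split it into 3 by number_of_pages, to get three bins with roughly the same total number of pages (even if the number of books per bin differs).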

import pandas as pd

df = pd.DataFrame(data={"title": ['animal farm', 'cat in the hat', 'the great gatsby', 'to kill a mockingbird', 'war and peace'], "number_of_pages": [200, 20, 300, 250, 400]})
df = df.sort_values("title")
# first attempt: cut directly on the per-book page counts
df['bin'] = pd.cut(df.number_of_pages, bins=3, labels=[0,1,2])

I expected:

df
Out[34]: 
   number_of_pages                  title bin
0              200            animal farm   0
1               20         cat in the hat   0
2              300       the great gatsby   0
3              250  to kill a mockingbird   1
4              400          war and peace   2

But I got:

df
Out[34]: 
   number_of_pages                  title bin
0              200            animal farm   1
1               20         cat in the hat   0
2              300       the great gatsby   2
3              250  to kill a mockingbird   1
4              400          war and peace   2

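So I have two problems:
1. cut sorts on the column I'm cutting, instead of using the order of the sorted DF.
2. cut makes the bins hold the same number of books, rather than roughly the same number of pages.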
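For context, here is a minimal sketch of what `pd.cut` does when `bins` is an integer: it splits the range of the values themselves (minimum to maximum) into equal-width intervals and labels each row by the interval its own value falls in, so row order and running totals play no part. The variable names `pages`, `labels` and `edges` are just for illustration.

```python
import pandas as pd

pages = pd.Series([200, 20, 300, 250, 400])

# retbins=True also returns the bin edges that pd.cut computed
labels, edges = pd.cut(pages, bins=3, labels=[0, 1, 2], retbins=True)

print(edges)            # three equal-width intervals spanning the min..max page count
print(labels.tolist())  # [1, 0, 2, 1, 2], the same bins as in the output above
```

Each book is binned by its own page count, which is why 'animal farm' (200 pages) lands in the middle bin even though it comes first alphabetically.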

Tags: python, pandas, dataframe, sorting, cut

I figured it out:

I needed to add a cumulative sum before cutting:

import pandas as pd

df = pd.DataFrame(data={"title": ['animal farm', 'cat in the hat', 'the great gatsby', 'to kill a mockingbird', 'war and peace'], "number_of_pages": [200, 20, 300, 250, 400]})
df = df.sort_values("title")
# running total of pages in title order, then cut on that total
df['cum'] = df.number_of_pages.cumsum()
df['bin'] = pd.cut(df.cum, bins=3, labels=[0,1,2])
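As a quick check of the cumulative-sum approach, the sketch below rebuilds the same frame and totals the pages per bin. The `summary` name and the `groupby` step are additions for illustration, not part of the original solution.

```python
import pandas as pd

df = pd.DataFrame({
    "title": ["animal farm", "cat in the hat", "the great gatsby",
              "to kill a mockingbird", "war and peace"],
    "number_of_pages": [200, 20, 300, 250, 400],
})
df = df.sort_values("title")
df["cum"] = df["number_of_pages"].cumsum()               # 200, 220, 520, 770, 1170
df["bin"] = pd.cut(df["cum"], bins=3, labels=[0, 1, 2])  # equal-width cut on the running total

# Total pages and number of books on each shelf
summary = df.groupby("bin", observed=False)["number_of_pages"].agg(["sum", "count"])
print(df[["title", "bin"]])
print(summary)
```

The bins come out as 0, 0, 0, 1, 2, matching the expected output in the question, with page totals of 520, 250 and 400. The totals are only roughly balanced: `pd.cut` divides the span of the running total (here 200 to 1170) into three equal-width bands, so with a handful of books of very different lengths the shelves can still differ noticeably.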
