python - Sort columns based on the result of groupby
Problem description
I have a dataset like this:
import numpy as np
import pandas as pd
df = pd.DataFrame({'time': ['13:30', '9:20', '18:12', '19:00', '11:20', '13:30', '15:20', '17:12', '16:00', '8:20'],
                   'item': ["coffee", "bread", "pizza", "rice", "soup", "coffee", "bread", "pizza", "rice", "soup"]})
Here is my code:
# split the hour part from the time string
df['hour'] = df.time.apply(lambda x: int(x.split(':')[0]))
def time_period(hour):
    if hour >= 6 and hour < 11:
        return 'breakfast'
    elif hour >= 11 and hour < 15:
        return 'lunch'
    else:
        return 'dinner'
df['meal'] = df['hour'].apply(lambda x: time_period(x))
# count how often each item appears in each meal
a = df.groupby(['meal', 'item']).size()
# build one (item, count) table per meal and place them side by side
l = []
for i in np.sort(a.index.get_level_values(level=0).unique().tolist()):
    l.append(a.loc[i].reset_index().rename(columns={0: 'count'}))
b = pd.concat(l, axis=1)
# label each pair of columns with its meal name
c = [i for i in a.index.get_level_values(level=0).unique().tolist() * 2]
c = np.sort(c)
b.columns = [c, b.columns]
display(b.head(10))
I just want to sort the table by the breakfast count, but I don't know how to do it.
Solution
A cleaner way to get what you need is:
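import numpy as np
import pandas as pd
df = pd.DataFrame({'time': ['13:30', '9:20', '18:12', '19:00', '11:20', '13:30',
                            '15:20', '17:12', '16:00', '8:20'],
                   'item': ["coffee", "bread", "pizza", "rice", "soup", "coffee", "bread", "pizza", "rice", "soup"]})
# split the hour part from the time string
df['hour'] = df.time.apply(lambda x: int(x.split(':')[0]))
# 6-10 is breakfast, 11-14 is lunch, everything else is dinner
df['meal'] = np.where((df.hour >= 6) & (df.hour < 11), 'breakfast',
                      np.where((df.hour >= 11) & (df.hour < 15), 'lunch', 'dinner'))
# count items per meal, then spread the meals across the columns
df = (df.groupby(['meal', 'item']).size().rename('count')
        .to_frame().reset_index().pivot(columns=['meal']))
# put the meal name on the top column level and group the columns by meal
df.columns = df.columns.swaplevel(0, 1)
df.sort_index(axis=1, level=0, inplace=True)
# sort the rows by the breakfast count
df.sort_values(by=('breakfast', 'count'), inplace=True)
df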
  breakfast        dinner        lunch
      count   item  count   item count    item
0       1.0  bread    NaN    NaN   NaN     NaN
1       1.0   soup    NaN    NaN   NaN     NaN
2       NaN    NaN    1.0  bread   NaN     NaN
3       NaN    NaN    2.0  pizza   NaN     NaN
4       NaN    NaN    2.0   rice   NaN     NaN
5       NaN    NaN    NaN    NaN   2.0  coffee
6       NaN    NaN    NaN    NaN   1.0    soup
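If the NaN-padded layout is not a requirement, one possible alternative is a plain item-by-meal count table, which can be sorted on the breakfast column directly. The following is only a sketch under that assumption: `pd.crosstab` and the names `counts` and `by_breakfast` are not part of the answer above, and the hour boundaries are carried over from `time_period` in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'time': ['13:30', '9:20', '18:12', '19:00', '11:20',
             '13:30', '15:20', '17:12', '16:00', '8:20'],
    'item': ['coffee', 'bread', 'pizza', 'rice', 'soup',
             'coffee', 'bread', 'pizza', 'rice', 'soup'],
})

# Take the hour from each "H:MM" string.
df['hour'] = df['time'].str.split(':').str[0].astype(int)

# Same rules as time_period: 6-10 breakfast, 11-14 lunch, otherwise dinner.
# Series.between is inclusive on both ends by default.
df['meal'] = 'dinner'
df.loc[df['hour'].between(6, 10), 'meal'] = 'breakfast'
df.loc[df['hour'].between(11, 14), 'meal'] = 'lunch'

# One row per item, one column per meal; cells hold counts (0 where absent).
counts = pd.crosstab(df['item'], df['meal'])

# Sort the rows by the breakfast count, largest first.
by_breakfast = counts.sort_values('breakfast', ascending=False)
print(by_breakfast)
```

Here items that never appear at breakfast get a count of 0 instead of NaN, so the sorted column stays an integer column. The trade-off is that the table no longer lists each meal's items in separate (count, item) column pairs as in the output above.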