python - Pandas: melt rows into one column
Problem description
I have a dataframe that currently reads as follows:
df_new = pd.DataFrame({'Week':['nan',14, 14, 14, 14, 14],
'Date':['NaT','2020-04-01', '2020-04-02', '2020-04-03', '2020-04-04', '2020-04-05'],
'site 1':['entry',0, 0, 0, 0, 0],
'site 1':['exit',0, 0, 0, 0, 0],
'site 2':['entry',1, 0,50, 7, 0],
'site 2':['exit',10, 0, 7, 19, 0],
'site 3':['entry',0, 100, 14, 9, 0],
'site 3':['exit',0, 0, 7, 0, 0],
'site 4':['entry',0, 0, 0, 0, 0],
'site 4':['exit',0, 0, 0, 0, 0],
'site 5':['entry',0, 0, 0, 0, 0],
'site 5':['exit',15, 0, 25, 0, 80],
})
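However, what I want is a column indicating exit/entry for each site (the columns come from merged Excel headers).
Below is an example of what is needed (ignore the actual values, since I typed them in by hand):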
df_target = pd.DataFrame({'Week':[14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14, 14],
'Date':['2020-04-01', '2020-04-02', '2020-04-03', '2020-04-04', '2020-04-05','2020-04-01', '2020-04-02', '2020-04-03', '2020-04-04', '2020-04-05','2020-04-01', '2020-04-02', '2020-04-03', '2020-04-04', '2020-04-05'],
'site':['site 1', 'site 1', 'site 1', 'site 1', 'site 1', 'site 1', 'site 1', 'site 1', 'site 1', 'site 2', 'site 2','site 2','site 2','site 2','site 2'],
'entry/exit':['exit','exit', 'exit', 'entry', 'entry', 'entry', 'entry', 'entry', 'entry', 'exit', 'exit', 'exit', 'exit', 'entry', 'entry'],
'Value':[12 ,1, 0, 50, 7, 0, 12 ,1, 0, 50, 7, 0, 12 ,1, 0]
})
I tried
df_target = df_new.melt(id_vars=['Week','Date'], var_name="Site", value_name="Value")
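but I think I also need to somehow group by the second row, or treat it as a second header?
Solution
First create a MultiIndex in the columns from the input DataFrame: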
#if possible
#df = pd.read_csv(file, header=[0,1], index_col=[0,1])
df_new.columns = [df_new.columns, df_new.iloc[0]]
df = df_new.iloc[1:]
print (df.columns)
MultiIndex([( 'Week', 'nan'),
( 'Date', 'NaT'),
('site 1', 'exit'),
('site 2', 'exit'),
('site 3', 'exit'),
('site 4', 'exit'),
('site 5', 'exit')],
)
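Note that the printed MultiIndex lists only 'exit' for each site. A likely reason is that the sample in the question is written as a dict literal with repeated keys, and Python keeps only the last value for each repeated key. The following is a minimal sketch, not part of the original answer, that shows this and builds a small frame that keeps both labels; the names sample, rows, cols and demo, and the shortened two-site data, are assumptions made for illustration.

```python
import pandas as pd

# A dict literal keeps only the last value for a repeated key,
# so an 'entry' list written before an 'exit' list is silently dropped.
sample = {'site 1': ['entry', 0], 'site 1': ['exit', 5]}
print(sample)

# Passing rows plus an explicit column list preserves duplicate labels.
rows = [
    ['nan', 'NaT', 'entry', 'exit', 'entry', 'exit'],  # second header row
    [14, '2020-04-01', 0, 0, 1, 10],
    [14, '2020-04-02', 0, 0, 0, 0],
]
cols = ['Week', 'Date', 'site 1', 'site 1', 'site 2', 'site 2']
demo = pd.DataFrame(rows, columns=cols)

# Promote the first data row to a second header level, then drop that row.
demo.columns = pd.MultiIndex.from_arrays([demo.columns, demo.iloc[0]])
demo = demo.iloc[1:]
print(demo.columns.tolist())
```

With both labels preserved, each site contributes an 'entry' and an 'exit' column to the MultiIndex.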
Then convert the first 2 MultiIndex columns to the index, so that DataFrame.unstack can be used for the melting, together with Series.rename_axis and Series.reset_index:
df = (df.set_index(df.columns[:2].tolist())
.unstack([0,1])
.rename_axis(['site','entry/exit','Week','Date'])
.reset_index(name='Value'))
print (df)
site entry/exit Week Date Value
0 site 1 exit 14 2020-04-01 0
1 site 1 exit 14 2020-04-02 0
2 site 1 exit 14 2020-04-03 0
3 site 1 exit 14 2020-04-04 0
4 site 1 exit 14 2020-04-05 0
5 site 2 exit 14 2020-04-01 10
6 site 2 exit 14 2020-04-02 0
7 site 2 exit 14 2020-04-03 7
8 site 2 exit 14 2020-04-04 19
9 site 2 exit 14 2020-04-05 0
10 site 3 exit 14 2020-04-01 0
11 site 3 exit 14 2020-04-02 0
12 site 3 exit 14 2020-04-03 7
13 site 3 exit 14 2020-04-04 0
14 site 3 exit 14 2020-04-05 0
15 site 4 exit 14 2020-04-01 0
16 site 4 exit 14 2020-04-02 0
17 site 4 exit 14 2020-04-03 0
18 site 4 exit 14 2020-04-04 0
19 site 4 exit 14 2020-04-05 0
20 site 5 exit 14 2020-04-01 15
21 site 5 exit 14 2020-04-02 0
22 site 5 exit 14 2020-04-03 25
23 site 5 exit 14 2020-04-04 0
24 site 5 exit 14 2020-04-05 80
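For comparison, here is a small self-contained sketch of the same reshaping chain applied to a frame whose header holds both 'entry' and 'exit' for each site. The two-site data and the names columns, wide and long are hypothetical and chosen only to keep the example short; the chain itself is the one used above.

```python
import pandas as pd

# Hypothetical two-site sample with both header levels already in place.
columns = pd.MultiIndex.from_tuples([
    ('Week', 'nan'), ('Date', 'NaT'),
    ('site 1', 'entry'), ('site 1', 'exit'),
    ('site 2', 'entry'), ('site 2', 'exit'),
])
wide = pd.DataFrame(
    [[14, '2020-04-01', 0, 0, 1, 10],
     [14, '2020-04-02', 0, 0, 0, 0]],
    columns=columns,
)

# Move Week/Date into the index, unstack both index levels so that the
# result is a Series indexed by (site, entry/exit, Week, Date), then name
# the levels and turn them back into ordinary columns.
long = (wide.set_index(wide.columns[:2].tolist())
            .unstack([0, 1])
            .rename_axis(['site', 'entry/exit', 'Week', 'Date'])
            .reset_index(name='Value'))
print(long)
```

Each (site, entry/exit) pair becomes its own group of rows, one row per date, which matches the layout asked for in the question.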