python - Updating a DataFrame column loses the date index
Question
I have two DataFrames:
df1:
col2 col3 dept
date
2020-05-06 29 21 A
2020-05-07 56 12 B
2020-05-08 82 15 C
2020-05-09 13 9 D
2020-05-10 35 13 E
2020-05-11 53 87 F
2020-05-12 25 9 G
2020-05-13 23 63 H
df2:
col2 dept
date
2020-05-06 64 A
2020-05-07 41 B
2020-05-08 95 C
2020-05-09 58 D
2020-05-10 89 E
2020-05-11 37 F
2020-05-12 24 G
2020-05-13 67 H
I want to update the col2 column in df1 with the values from the col2 column in df2, so that my output looks like this:
col2 col3 dept
date
2020-05-06 64 21 A
2020-05-07 41 12 B
2020-05-08 95 15 C
2020-05-09 58 9 D
2020-05-10 89 13 E
2020-05-11 37 87 F
2020-05-12 24 9 G
2020-05-13 67 63 H
I wrote some code that looks like this:
df1=df1.set_index('dept')
df1.update(df2.set_index('dept'))
df1=df1.reset_index()
But it resets the index of df1 to integers instead of dates, so the output I get looks like this:
dept col2 col3
0 A 64 21
1 B 41 12
2 C 95 15
3 D 58 9
4 E 89 13
5 F 37 87
6 G 24 9
7 H 67 63
My full code is as follows:
import pandas as pd
import numpy as np
import datetime
from datetime import timedelta
dept=['A','B','C','D','E','F','G','H']
date_today = datetime.date.today()
days = pd.date_range(date_today, date_today + timedelta(7), freq='D')
np.random.seed(seed=1111)
data1 = np.random.randint(1, high=100, size=len(days))
data2 = np.random.randint(1, high=100, size=len(days))
df1 = pd.DataFrame({'date': days, 'dept':dept,'col2': data1, 'col3': data2})
df1 = df1.set_index('date')
print(df1)
dept=['A','B','C','D','E','F','G','H']
date_today = datetime.date.today()
days = pd.date_range(date_today, date_today + timedelta(7), freq='D')
np.random.seed(seed=1331)
data3 = np.random.randint(1, high=100, size=len(days))
df2 = pd.DataFrame({'date': days, 'dept':dept,'col2': data3})
df2 = df2.set_index('date')
print(df2)
df1=df1.set_index('dept')
df1.update(df2.set_index('dept'))
df1=df1.reset_index()
print(df1)
How can I update df1 from df2 while keeping the date index of df1?
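Answer
As I understand from your sample, you are updating df1 from df2 based on the date index and the dept column. Calling set_index('dept') on its own replaces the date index, which is why the dates are lost. Instead, you need to append dept to the existing index and then call update: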
df1 = df1.set_index('dept', append=True)
df1.update(df2.set_index('dept', append=True))  # update() works in place and returns None
df1 = df1.reset_index('dept')
Out[35]:
dept col2 col3
date
2020-05-06 A 64 21
2020-05-07 B 41 12
2020-05-08 C 95 15
2020-05-09 D 58 9
2020-05-10 E 89 13
2020-05-11 F 37 87
2020-05-12 G 24 9
2020-05-13 H 67 63
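For anyone who wants to try this end to end, here is a minimal self-contained sketch of the approach above. It assumes the fixed sample values shown in the question rather than the random data from the full script, and the variable names lost and updated are only illustrative.

```python
import pandas as pd

# Fixed sample data copied from the tables in the question (an assumption
# made for reproducibility; the original script generates random values).
dates = pd.date_range("2020-05-06", periods=8, freq="D", name="date")
dept = list("ABCDEFGH")

df1 = pd.DataFrame(
    {
        "dept": dept,
        "col2": [29, 56, 82, 13, 35, 53, 25, 23],
        "col3": [21, 12, 15, 9, 13, 87, 9, 63],
    },
    index=dates,
)
df2 = pd.DataFrame(
    {"dept": dept, "col2": [64, 41, 95, 58, 89, 37, 24, 67]},
    index=dates,
)

# What the question did: set_index('dept') replaces the date index,
# so reset_index() afterwards can only produce a default integer index.
lost = df1.set_index("dept").reset_index()

# What the answer suggests: append 'dept' as a second index level,
# so 'date' survives as the first level.
updated = df1.set_index("dept", append=True)

# update() aligns on the (date, dept) index, modifies the frame in place
# and returns None, so its result must not be assigned.
updated.update(df2.set_index("dept", append=True))

# Move only the 'dept' level back into a column; 'date' stays as the index.
updated = updated.reset_index("dept")

print(updated)
```

Because both frames are aligned on the (date, dept) pair, a row in df2 only overwrites the row in df1 that has the same date and the same department.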