python-3.x - How to fill in preceding values from the second dataframe in a pandas join
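Problem description
I want to join two dataframes and fill in any NaN values. However, df has no row for the first index value in df2, so that row comes out as NaN after the join. How can I fill it in from df?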
import pandas as pd
from datetime import datetime, timedelta
date_today = datetime.now()
# 8 consecutive daily timestamps starting now
days = pd.date_range(date_today, date_today + timedelta(7), freq='D')
data = range(len(days)-1)
# drop the 4th day so that df has a gap at today + 3 days
days = days.delete(3)
# df2 starts on the day that is missing from df
date_today = date_today + timedelta(days=3)
df = pd.DataFrame({'test': days, 'col_df': data})
df = df.set_index('test')
print(df)
days2 = pd.date_range(date_today, date_today + timedelta(7), freq='D')
data2 = range(len(days2))
df2 = pd.DataFrame({'test': days2, 'col_df22': data2})
df2 = df2.set_index('test')
print(df2)
# left join on the index: rows of df2 with no exact match in df get NaN
print(df2.join(df))
df
col_df
test
2020-12-08 15:22:00.997578 0
2020-12-09 15:22:00.997578 1
2020-12-10 15:22:00.997578 2
2020-12-12 15:22:00.997578 3
2020-12-13 15:22:00.997578 4
2020-12-14 15:22:00.997578 5
2020-12-15 15:22:00.997578 6
df2
col_df22
test
2020-12-11 15:22:00.997578 0
2020-12-12 15:22:00.997578 1
2020-12-13 15:22:00.997578 2
2020-12-14 15:22:00.997578 3
2020-12-15 15:22:00.997578 4
2020-12-16 15:22:00.997578 5
2020-12-17 15:22:00.997578 6
2020-12-18 15:22:00.997578 7
df2.join(df)
col_df22 col_df
test
2020-12-11 15:22:00.997578 0 NaN
2020-12-12 15:22:00.997578 1 3.0
2020-12-13 15:22:00.997578 2 4.0
2020-12-14 15:22:00.997578 3 5.0
2020-12-15 15:22:00.997578 4 6.0
2020-12-16 15:22:00.997578 5 NaN
2020-12-17 15:22:00.997578 6 NaN
2020-12-18 15:22:00.997578 7 NaN
What I want:
col_df22 col_df
test
2020-12-11 15:22:00.997578 0 2.0
2020-12-12 15:22:00.997578 1 3.0
2020-12-13 15:22:00.997578 2 4.0
2020-12-14 15:22:00.997578 3 5.0
2020-12-15 15:22:00.997578 4 6.0
2020-12-16 15:22:00.997578 5 6.0
2020-12-17 15:22:00.997578 6 6.0
2020-12-18 15:22:00.997578 7 6.0
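Solution
You can try merge_asof, which by default matches each row of df2 to the last row of df whose index is at or before it: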
pd.merge_asof(df2, df, left_index=True, right_index=True)
Output:
col_df22 col_df
test
2020-12-11 10:30:20.464611 0 2
2020-12-12 10:30:20.464611 1 3
2020-12-13 10:30:20.464611 2 4
2020-12-14 10:30:20.464611 3 5
2020-12-15 10:30:20.464611 4 6
2020-12-16 10:30:20.464611 5 6
2020-12-17 10:30:20.464611 6 6
2020-12-18 10:30:20.464611 7 6
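For a reproducible check, here is a minimal sketch of the same approach. It assumes a fixed start date of 2020-12-08 instead of datetime.now() (that date and the variable names asof and ffilled are illustrative, not from the original post). It also shows a possible alternative, reindexing df onto df2's index with forward fill, which should give the same col_df values for this data.

```python
import pandas as pd

# Assumption: a fixed start date replaces datetime.now() so results are repeatable.
start = pd.Timestamp("2020-12-08")

# df: 8 daily timestamps with the 4th one (2020-12-11) removed.
days = pd.date_range(start, periods=8, freq="D").delete(3)
df = pd.DataFrame({"col_df": range(len(days))}, index=days)
df.index.name = "test"

# df2: 8 daily timestamps starting on the day that is missing from df.
days2 = pd.date_range(start + pd.Timedelta(days=3), periods=8, freq="D")
df2 = pd.DataFrame({"col_df22": range(len(days2))}, index=days2)
df2.index.name = "test"

# Backward as-of merge: each df2 row takes the last df row at or before its index.
asof = pd.merge_asof(df2, df, left_index=True, right_index=True)

# Alternative sketch: forward-fill df onto df2's index, then do a plain join.
ffilled = df2.join(df.reindex(df2.index, method="ffill"))

print(asof)
print(ffilled)
```

Both variants need the indexes to be sorted. Note that a plain df2.join(df) followed by ffill() would not fill the first row, because there is no earlier row in the joined result to carry forward.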