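python - Joining on a MultiIndex
Problem description
What is the simplest way to merge two DataFrames on region and date?
I have tried join, merge, and concat. I get the errors "'<' not supported between instances of 'float' and 'str'" and "cannot handle a non-unique multi-index".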
old_df
                    old_value
region  date
England 2010-01-01          4
        2010-01-02          5
Wales   2010-01-01          4
        2010-01-02          3
...
new_df
name                new_value
region  date
England 2010-01-01         10
        2010-01-02         10
Wales   2010-01-01          9
        2010-01-02         10
...
Expected output
                    old_value  new_value
region  date
England 2010-01-01          4         10
        2010-01-02          5         10
Wales   2010-01-01          4          9
        2010-01-02          3         10
Solution
This works perfectly for me. Are you sure your date columns are actually dates? Convert them with pd.to_datetime().
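import io
import pandas as pd

# Rebuild the sample frames; "nan" stands in for the blank region cells
# shown in the question and is parsed as NaN.
df_old = pd.read_csv(io.StringIO("""
region date old_value
England 2010-01-01 4
nan 2010-01-02 5
Wales 2010-01-01 4
nan 2010-01-02 3
"""), sep=r"\s+")
df_new = pd.read_csv(io.StringIO("""
region date new_value
England 2010-01-01 10
nan 2010-01-02 10
Wales 2010-01-01 9
nan 2010-01-02 10"""), sep=r"\s+")
# Fill the blank regions from the row above so every row has a string key.
df_old["region"] = df_old["region"].ffill()
df_new["region"] = df_new["region"].ffill()
# Parse the dates so both frames share the same datetime key dtype.
df_old["date"] = pd.to_datetime(df_old["date"])
df_new["date"] = pd.to_datetime(df_new["date"])
# Index both frames on (region, date) and join on that MultiIndex.
dfj = df_old.set_index(["region", "date"]).join(df_new.set_index(["region", "date"]))
print(dfj)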
                    old_value  new_value
region  date
England 2010-01-01          4         10
        2010-01-02          5         10
Wales   2010-01-01          4          9
        2010-01-02          3         10
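
If the join still fails on your real data, it may help to check the keys for the two conditions your error messages point at. This is only a sketch under assumptions: the frame `bad` below is made-up data, not yours, and keeping the last row of each duplicate key is one possible choice that you should replace with whatever rule fits your data. A missing region is stored as NaN, which is a float, so comparing it with the string regions is a likely source of the "'<' not supported" error; repeated (region, date) pairs are a likely source of the non-unique multi-index error.

```python
import numpy as np
import pandas as pd

# Hypothetical frame reproducing both problems: a missing region
# (NaN is a float) and a repeated (region, date) key.
bad = pd.DataFrame({
    "region": ["England", np.nan, "Wales", "Wales"],
    "date": ["2010-01-01", "2010-01-02", "2010-01-01", "2010-01-01"],
    "old_value": [4, 5, 4, 3],
})

# 1. Missing keys: NaN mixed with strings cannot be ordered.
n_missing = int(bad["region"].isna().sum())

# 2. Duplicate keys: these would make the MultiIndex non-unique.
n_dupes = int(bad.duplicated(["region", "date"]).sum())

print(f"missing regions: {n_missing}, duplicate keys: {n_dupes}")

# One possible cleanup (an assumption, adapt as needed): forward-fill the
# region, parse the dates, and keep the last row for each duplicated key.
clean = bad.assign(
    region=bad["region"].ffill(),
    date=pd.to_datetime(bad["date"]),
)
clean = clean.drop_duplicates(["region", "date"], keep="last")
clean = clean.set_index(["region", "date"])
```

Once both counts are zero for each frame, the set_index(...).join(...) call from the answer should have clean, unique keys to work with.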