python - Outer join two dataframes based on index without disrupting the original order of both dataframes
问题描述
I intend to create a table that compares the contents of two dataframes. The indices are part of a process and the order has been predefined. My dataframes look like as below:
df1:
Index | Content |
---|---|
xxa | 0 |
init | 1 |
tyui | 2 |
ddas | 3 |
gsdj | 4 |
hasf | 5 |
df2:
Index | Content |
---|---|
xxa | 1 |
init | 1 |
fafa | 6 |
eafa | 7 |
gsdj | 6 |
hasf | 5 |
Intended merged output:
Index1 | Index2 | Content1 | Content2 |
---|---|---|---|
xxa | xxa | 0 | 1 |
init | init | 1 | 1 |
tyui | 2 | ||
ddas | 3 | ||
fafa | 6 | ||
eafa | 7 | ||
gsdj | gsdj | 4 | 6 |
hasf | hasf | 5 | 5 |
I have tried taking the union of the two indices, i.e.
index_all = df1.index.union(df2.index, sort=False)
However, the indices are sorted as | Index1| Index2 | Content1 | Content2 | ------| ------- | -------- | -------- | xxa | xxa | 0 | 1 | init | init | 1 | 1 | tyui | |2 | | ddas | |3 | | gsdj | gsdj |4 | 6 | hasf | hasf |5 | 5 | | fafa | | 6 | | eafa | | 7
解决方案
像这样的东西怎么样:
df = pd.DataFrame(index = pd.concat([df1, df2])["Index"].unique(), columns = ["Index1", "Index2"])
df["Index1"] = df1.set_index("Index", drop = False)
df["Index2"] = df2.set_index("Index", drop = False)
df.sort_index().reset_index(drop = True)
# Index1 Index2
#0 a a
#1 b b
#2 c NaN
#3 d NaN
#4 NaN e
#5 NaN f
#6 g g
#7 h h
推荐阅读
- python - 如何将包含输入函数的python函数调用到HTML
- firebase - 如何向 Firestore Go SDK 验证用户身份?
- list - 在 dart 列表的开头插入元素
- java - java.util.Scanner 中的 NoSuchElementException
- php - PHP sum 如何从其他php文件调用
- javascript - 为什么我的代码中出现“GET/304 --”?(vue.js,快递)
- java - 如果文件不存在则返回空响应
- python - 有没有办法修改 GluonCV 预训练对象检测模型的层
- python - 有没有办法在矩阵中找到连接?
- android - 我正在尝试使用改造 android 在数据库中插入数据