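python - How to concatenate two or more DataFrames with different column names in pandas
Question
I have hundreds of CSV files that I need to combine into a single file. I have loaded all of them as pandas DataFrames. Example DataFrames: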
df1 = pd.DataFrame({'a':['e1','e1','e1'],'x':[4,5,6],'y':[7,8,9]})
df2 = pd.DataFrame({'a':['e2','e2','e2'],'x':[13,14,15],'y':[16,17,18], 'z':[100,101,102]})
I need this output:
    a   x   y    z
0  e1   4   7
1  e1   5   8
2  e1   6   9
3  e2  13  16  100
4  e2  14  17  101
5  e2  15  18  102
or
    a   x   y    z
0  e1   4   7   na
1  e1   5   8   na
2  e1   6   9   na
3  e2  13  16  100
4  e2  14  17  101
5  e2  15  18  102
How can I do this? Thanks.
Edit:
I have about 500 CSV files, and this is the code I use to combine them into one file:
import glob
import pandas as pd
path = r'C:/Users/Miro/data hist'
all_files = glob.glob(path + "/*.csv")
li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, sep='delimiter', header=None)
    li.append(df)
frame = pd.concat(li, axis=0, ignore_index=True)
frame.to_csv("full.csv", index=False, encoding='utf-8-sig')
Answer
This should work:
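import pandas as pd
df1 = pd.DataFrame({'a':['e1','e1','e1'],'x':[4,5,6],'y':[7,8,9]})
df2 = pd.DataFrame({'a':['e2','e2','e2'],'x':[13,14,15],'y':[16,17,18], 'z':[100,101,102]})
# DataFrame.append was removed in pandas 2.0; pd.concat aligns columns by name
newdf = pd.concat([df1, df2], ignore_index=True)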
    a   x   y      z
0  e1   4   7    NaN
1  e1   5   8    NaN
2  e1   6   9    NaN
3  e2  13  16  100.0
4  e2  14  17  101.0
5  e2  15  18  102.0
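The `z` column prints as `100.0` because pandas stores the missing rows as `NaN`, which forces a float column. If integer values matter, one option is pandas' nullable integer dtype. The following is a minimal sketch, assuming the values in `z` are whole numbers; the names `combined` and `combined_int` are introduced here for illustration only.

```python
import pandas as pd

df1 = pd.DataFrame({'a': ['e1', 'e1', 'e1'], 'x': [4, 5, 6], 'y': [7, 8, 9]})
df2 = pd.DataFrame({'a': ['e2', 'e2', 'e2'], 'x': [13, 14, 15],
                    'y': [16, 17, 18], 'z': [100, 101, 102]})

# Columns are matched by name; rows from df1 get NaN in the missing 'z' column.
combined = pd.concat([df1, df2], ignore_index=True)

# Assumption: 'z' only holds whole numbers, so it can be cast to the
# nullable integer dtype, which shows missing values as <NA>.
combined_int = combined.astype({'z': 'Int64'})
print(combined_int)
```

The cast is optional; it only changes how the column is stored and displayed, not which rows are missing.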
Or, if you really want the string "na" instead of NaN, you can do:
newdf = pd.concat([df1, df2], ignore_index=True).fillna("na")
    a   x   y      z
0  e1   4   7     na
1  e1   5   8     na
2  e1   6   9     na
3  e2  13  16  100.0
4  e2  14  17  101.0
5  e2  15  18  102.0
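Filling with a string turns `z` into a mixed object column. If the "na" marker is only needed in the output file, a possible alternative is to leave the DataFrame untouched and let `to_csv` write the marker through its `na_rep` parameter. This is a sketch under that assumption; it writes to an in-memory buffer (`buffer` and `csv_text` are illustrative names) instead of a real file.

```python
import io

import pandas as pd

df1 = pd.DataFrame({'a': ['e1', 'e1', 'e1'], 'x': [4, 5, 6], 'y': [7, 8, 9]})
df2 = pd.DataFrame({'a': ['e2', 'e2', 'e2'], 'x': [13, 14, 15],
                    'y': [16, 17, 18], 'z': [100, 101, 102]})

combined = pd.concat([df1, df2], ignore_index=True)

# na_rep controls how missing values are written; the DataFrame itself
# keeps NaN, so 'z' stays a numeric column.
buffer = io.StringIO()
combined.to_csv(buffer, index=False, na_rep="na")
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a file path such as "full.csv" instead of the buffer would write the same text to disk.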
To make this work with the code from your edited question:
import glob
import pandas as pd
path = r'C:/Users/Miro/data hist'
all_files = glob.glob(path + "/*.csv")
# Collect the frames in a list and concatenate once;
# DataFrame.append was removed in pandas 2.0
li = []
for filename in all_files:
    df = pd.read_csv(filename, index_col=None, sep='delimiter', header=None)
    li.append(df)
frame = pd.concat(li, axis=0, ignore_index=True)
frame.to_csv("full.csv", index=False, encoding='utf-8-sig')
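Columns are only matched by name if `read_csv` actually sees the names. The sketch below assumes, which the question does not state, that each CSV file is comma-separated and starts with a header row; in that case the default `read_csv` settings are enough. It uses two small throwaway files in a temporary directory (`one.csv` and `two.csv` are made-up names) so that it runs anywhere.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Two illustrative files with different column sets.
    pd.DataFrame({'a': ['e1'], 'x': [4], 'y': [7]}).to_csv(
        os.path.join(tmp_dir, 'one.csv'), index=False)
    pd.DataFrame({'a': ['e2'], 'x': [13], 'y': [16], 'z': [100]}).to_csv(
        os.path.join(tmp_dir, 'two.csv'), index=False)

    # Sort the paths so the row order is reproducible.
    all_files = sorted(glob.glob(os.path.join(tmp_dir, '*.csv')))

    # The first row of each file is read as the header (the default).
    frames = [pd.read_csv(filename) for filename in all_files]

    # Columns missing from a file are filled with NaN for that file's rows.
    frame = pd.concat(frames, ignore_index=True)

print(frame)
```

Building the list first and calling `pd.concat` once also avoids copying the growing result on every iteration, which matters with several hundred files.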