Merging 2 csv files by columns fails with a string-related error?

Problem description

I am trying to merge 2 csv files by columns.
Both of my csv files have names ending in '_4.csv', and the merged csv should end up looking like this:

    0-10       ,83.72,66.76,86.98  ,0-10       ,83.72,66.76,86.98
    11-20      ,15.01,31.12,12.04  ,11-20      ,15.01,31.12,12.04
    21-30      ,1.14,2.05,0.94     ,21-30      ,1.14,2.05,0.94
    31-40      ,0.13,0.07,0.03     ,31-40      ,0.13,0.07,0.03
    over 40    ,0.0,0.0,0.0        ,over 40    ,0.0,0.0,0.0
    UHF case   ,0.0,0.0,0.0        ,UHF case   ,0.0,0.0,0.0

My code:

    #combine 2 csv into 1 by columns
    files_in_dir = [f for f in os.listdir(os.getcwd()) if f.endswith('_4.csv')]
    temp_data = []
    for filenames in files_in_dir:
        temp_data.append(np.loadtxt(filenames,dtype='str'))
    temp_data = np.array(temp_data)
    np.savetxt('_mix.csv',temp_data.transpose(),fmt='%s',delimiter=',')

But it fails with this error:

    temp_data.append(np.loadtxt(filenames,dtype='str'))
    for x in read_data(_loadtxt_chunksize):
    raise ValueError("Wrong number of columns at line %d"
    ValueError: Wrong number of columns at line 2

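I am not sure whether this is because the first column contains strings rather than numeric values.
Does anyone know how to fix it? Many thanks.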

Tags: python, pandas, csv

Solution


I think you are looking for the join method. Suppose we have two .csv files that each contain a table like this:

    0-10       ,83.72,66.76,86.98
    11-20      ,15.01,31.12,12.04
    21-30      ,1.14,2.05,0.94
    31-40      ,0.13,0.07,0.03
    over 40    ,0.0,0.0,0.0
    UHF case   ,0.0,0.0,0.0

Assuming both files have a similar structure, we will load a single file named data.csv twice to stand in for the two of them:

    import pandas as pd

    # Assumes there are no headers
    df1 = pd.read_csv("data.csv", header=None)
    df2 = pd.read_csv("data.csv", header=None)

    # By default, DataFrame columns are labelled 0, 1, 2, 3.
    # Relabel the columns of the second frame so they do not clash with the first:
    # `df2` will now have columns labelled 4, 5, 6, 7.
    offset = len(df1.columns)
    df2.columns = range(offset, offset + len(df2.columns))

    # join places the two frames side by side, aligned on the row index
    print(df1.join(df2))

Sample output:

                 0      1      2      3            4      5      6      7
    0  0-10         83.72  66.76  86.98  0-10         83.72  66.76  86.98
    1  11-20        15.01  31.12  12.04  11-20        15.01  31.12  12.04
    2  21-30         1.14   2.05   0.94  21-30         1.14   2.05   0.94
    3  31-40         0.13   0.07   0.03  31-40         0.13   0.07   0.03
    4  over 40       0.00   0.00   0.00  over 40       0.00   0.00   0.00
    5  UHF case      0.00   0.00   0.00  UHF case      0.00   0.00   0.00
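
To apply the same idea to every file ending in '_4.csv' and write the result to '_mix.csv', as the question's code attempts, something along these lines might work. This is only a sketch: the function name merge_csv_by_columns is made up for illustration, it uses pd.concat with axis=1 instead of join so that any number of files can be combined, and it assumes all files have no header row and the same number of rows. As a side note, np.loadtxt splits fields on whitespace unless a delimiter is passed, which may be related to the column-count error for labels such as "over 40"; pd.read_csv splits on commas by default.

```python
import glob

import pandas as pd


def merge_csv_by_columns(pattern, out_path):
    """Read every CSV matching `pattern` and write them side by side.

    Hypothetical helper for illustration; assumes header-less files
    that all have the same number of rows.
    """
    # Sort the paths so the column order is reproducible between runs.
    paths = sorted(glob.glob(pattern))
    # header=None: no header row; dtype=str keeps every field as text.
    frames = [pd.read_csv(p, header=None, dtype=str) for p in paths]
    # axis=1 puts the frames next to each other, aligned on the row index;
    # ignore_index=True relabels the resulting columns 0..n-1.
    merged = pd.concat(frames, axis=1, ignore_index=True)
    # Write plain rows without the index or the numeric column labels.
    merged.to_csv(out_path, header=False, index=False)
    return merged
```

It could then be called as merge_csv_by_columns("*_4.csv", "_mix.csv") from the directory that holds the files. Reading everything as text avoids reformatting the numbers, at the cost of not being able to do arithmetic on them without converting first.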
