Unable to merge two DataFrames on an object-dtype column

Problem description

merged_df = file1.merge(file1, file2, on="WineType")

ValueError: unknown type str160

file2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4898 entries, 0 to 4897
Data columns (total 13 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   fixed acidity         4898 non-null   float64
 1   volatile acidity      4898 non-null   float64
 2   citric acid           4898 non-null   float64
 3   residual sugar        4898 non-null   float64
 4   chlorides             4898 non-null   float64
 5   free sulfur dioxide   4898 non-null   float64
 6   total sulfur dioxide  4898 non-null   float64
 7   density               4898 non-null   float64
 8   pH                    4898 non-null   float64
 9   sulphates             4898 non-null   float64
 10  alcohol               4898 non-null   float64
 11  quality               4898 non-null   int64
 12  WineType              4898 non-null   object
dtypes: float64(11), int64(1), object(1)
memory usage: 497.6+ KB

file1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1599 entries, 0 to 1598
Data columns (total 13 columns):
 #   Column                Non-Null Count  Dtype
---  ------                --------------  -----
 0   fixed acidity         1599 non-null   float64
 1   volatile acidity      1599 non-null   float64
 2   citric acid           1599 non-null   float64
 3   residual sugar        1599 non-null   float64
 4   chlorides             1599 non-null   float64
 5   free sulfur dioxide   1599 non-null   float64
 6   total sulfur dioxide  1599 non-null   float64
 7   density               1599 non-null   float64
 8   pH                    1599 non-null   float64
 9   sulphates             1599 non-null   float64
 10  alcohol               1599 non-null   float64
 11  quality               1599 non-null   int64
 12  WineType              1599 non-null   object
dtypes: float64(11), int64(1), object(1)
memory usage: 162.5+ KB

Could you guide me on how to merge the two DataFrames on their common column into a single file? Thanks in advance.

Tags: python, pandas, merge

Solution


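You are trying to merge on a column of object dtype, and pandas may not know how to match the values in these fields correctly. One thing to try is converting the column to str in both DataFrames, for example: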

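# Cast the merge key to str in both DataFrames
file1['WineType'] = file1['WineType'].astype(str)
file2['WineType'] = file2['WineType'].astype(str)
# Call merge on file1 and pass file2 as the right-hand frame;
# passing file1 again would push file2 into the `how` argument
merged_df = file1.merge(file2, on="WineType")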

The merge should then succeed.
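As a rough, self-contained illustration of the cast-then-merge approach above, here is a minimal sketch. The two small DataFrames are hypothetical stand-ins for file1 and file2 (the values are invented), and the `suffixes` argument is added only to keep the overlapping column names apart.

```python
import pandas as pd

# Hypothetical stand-ins for file1 and file2; the values are invented
file1 = pd.DataFrame({"alcohol": [9.4, 9.8], "WineType": ["red", "red"]})
file2 = pd.DataFrame({"alcohol": [8.8, 9.5], "WineType": ["white", "red"]})

# Cast the merge key to str in both frames
file1["WineType"] = file1["WineType"].astype(str)
file2["WineType"] = file2["WineType"].astype(str)

# Call merge on the left frame and pass only the right frame positionally
merged_df = file1.merge(file2, on="WineType", suffixes=("_file1", "_file2"))
print(merged_df)
```

Because WineType repeats across many rows, an inner merge pairs every matching row of one frame with every matching row of the other, so on the real data the result can be far larger than either input.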

You can also try .astype('category'). For more on converting column types, see: Changing a column's type from string to float in Pandas
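The category variant might look like the sketch below. The sample frames and the `wine_types` name are hypothetical, and sharing one `CategoricalDtype` between both columns is an assumption made here so that the two keys have the same dtype; it is not something the original answer specifies.

```python
import pandas as pd

# Hypothetical stand-ins for file1 and file2; the values are invented
file1 = pd.DataFrame({"quality": [5, 6], "WineType": ["red", "red"]})
file2 = pd.DataFrame({"quality": [6, 7], "WineType": ["red", "white"]})

# One shared categorical dtype (assumed category list) for both key columns
wine_types = pd.CategoricalDtype(categories=["red", "white"])
file1["WineType"] = file1["WineType"].astype(wine_types)
file2["WineType"] = file2["WineType"].astype(wine_types)

# Merge on the categorical key
merged_df = file1.merge(file2, on="WineType", suffixes=("_file1", "_file2"))
print(merged_df)
```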

