python - How do I convert a column from object to numeric and then find the mean of each row in pandas? E.g. convert '<17,500, >=15,000' to 16250 (the mean)
Problem description
data['family_income'].value_counts()
>=35,000 2517
<27,500, >=25,000 1227
<30,000, >=27,500 994
<25,000, >=22,500 833
<20,000, >=17,500 683
<12,500, >=10,000 677
<17,500, >=15,000 634
<15,000, >=12,500 629
<22,500, >=20,000 590
<10,000, >= 8,000 563
< 8,000, >= 4,000 402
< 4,000 278
Unknown 128
The data column that should show the mean value instead of a range:
data['family_income']
0 <17,500, >=15,000
1 <27,500, >=25,000
2 <30,000, >=27,500
3 <15,000, >=12,500
4 <30,000, >=27,500
...
10150 <30,000, >=27,500
10151 <25,000, >=22,500
10152 >=35,000
10153 <10,000, >= 8,000
10154 <27,500, >=25,000
Name: family_income, Length: 10155, dtype: object
Desired output: the mean of each range as the estimated value
0 16250
1 26250
2 28750
...
10152 35000
10153 9000
10154 26250
data['family_income']=data['family_income'].str.replace(',', ' ').str.replace('<',' ')
data[['income1','income2']] = data['family_income'].apply(lambda x: pd.Series(str(x).split(">=")))
data['income1']=pd.to_numeric(data['income1'], errors='coerce')
data['income1']
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
..
10150 NaN
10151 NaN
10152 NaN
10153 NaN
10154 NaN
Name: income1, Length: 10155, dtype: float64
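In this case the conversion from object to numeric does not seem to work, because every value comes back as NaN. So how do I convert the column to a numeric data type and find the mean as an estimate?
Solution
You can use the following snippet: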
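A likely reason for the all-NaN result, sketched on two sample values: replacing the thousands comma with a space leaves strings such as ' 17 500 ', which pd.to_numeric cannot parse, and splitting on '>=' leaves an empty first piece for categories like '>=35,000'. The variable names below (raw, cleaned, first_piece) are illustrative and not part of the original post.

```python
import pandas as pd

raw = pd.Series(['<17,500, >=15,000', '>=35,000'])

# Same cleaning as in the question: commas and '<' are replaced by spaces
cleaned = raw.str.replace(',', ' ', regex=False).str.replace('<', ' ', regex=False)

# First piece of each string after splitting on '>='
first_piece = cleaned.str.split('>=').str[0]

# ' 17 500  ' contains an inner space and '' is empty, so both become NaN
as_number = pd.to_numeric(first_piece, errors='coerce')

# Removing the comma instead of replacing it with a space keeps the digits together
fixed = pd.to_numeric(pd.Series(['17,500']).str.replace(',', '', regex=False))
```

Deleting the comma rather than substituting a space is enough to make a single bound parseable; the two bounds still have to be separated before averaging.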
# Importing Dependencies
import pandas as pd
import string
# Replicating Your Data
data = ['<17,500, >=15,000', '<27,500, >=25,000', '< 4,000 ', '>=35,000']
df = pd.DataFrame(data, columns = ['family_income'])
# Removing punctuation from family_income column
df['family_income'] = df['family_income'].apply(lambda x: x.translate(str.maketrans('', '', string.punctuation)))
# Splitting each range into two columns A and B on whitespace (n must be passed by keyword in pandas 2.x)
df[['A', 'B']] = df['family_income'].str.split(n=1, expand=True)
# Converting cols A and B to float
df[['A', 'B']] = df[['A', 'B']].apply(pd.to_numeric)
# Creating mean column from A and B
df['mean'] = df[['A', 'B']].mean(axis=1)
# Input DataFrame
family_income
0 <17,500, >=15,000
1 <27,500, >=25,000
2 < 4,000
3 >=35,000
# Result DataFrame
mean
0 16250.0
1 26250.0
2 4000.0
3 35000.0
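As an alternative sketch, not taken from the answer above, the bounds can be pulled out with a regular expression, which also copes with the 'Unknown' category and with ranges that contain extra spaces such as '< 8,000, >= 4,000'. The names bounds and income_mean are illustrative. Open-ended categories ('< 4,000', '>=35,000') have only one bound, so their "mean" is assumed to be that bound, as in the desired output of the question.

```python
import pandas as pd

# Sample covering the category shapes shown in the question
s = pd.Series(
    ['<17,500, >=15,000', '< 8,000, >= 4,000', '< 4,000', '>=35,000', 'Unknown'],
    name='family_income',
)

# Extract every number (digits with optional thousands separators) from each string.
# extractall returns one row per match, indexed by (original row, match number).
bounds = (
    s.str.extractall(r'(\d+(?:,\d{3})*)')[0]
     .str.replace(',', '', regex=False)
     .astype(float)
)

# Average the bounds per original row; rows without any number (e.g. 'Unknown')
# are restored as NaN by reindexing against the original index.
income_mean = bounds.groupby(level=0).mean().reindex(s.index)

print(income_mean)
```

On the real data this would be applied to data['family_income'] instead of s. Whether NaN is the right value for 'Unknown' depends on how missing incomes should be treated later.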