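python - How can I improve the performance of the following code on a pandas DataFrame? Please mention the order of complexity if possible
Problem description
The code below works fine, but I would like to improve its performance.
Can this be done with indexing, or is there some other way?
I am trying to collapse 40 one-hot encoded fields into a single column.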
def soil_typ(row):
    if row['Soil_Type1'] == 1:
        return 1
    elif row['Soil_Type2'] == 1:
        return 2
    elif row['Soil_Type3'] == 1:
        return 3
    elif row['Soil_Type4'] == 1:
        return 4
    elif row['Soil_Type5'] == 1:
        return 5
    elif row['Soil_Type6'] == 1:
        return 6
    elif row['Soil_Type7'] == 1:
        return 7
    elif row['Soil_Type8'] == 1:
        return 8
    elif row['Soil_Type9'] == 1:
        return 9
    elif row['Soil_Type10'] == 1:
        return 10
    elif row['Soil_Type11'] == 1:
        return 11
    elif row['Soil_Type12'] == 1:
        return 12
    elif row['Soil_Type13'] == 1:
        return 13
    elif row['Soil_Type14'] == 1:
        return 14
    elif row['Soil_Type15'] == 1:
        return 15
    elif row['Soil_Type16'] == 1:
        return 16
    elif row['Soil_Type17'] == 1:
        return 17
    elif row['Soil_Type18'] == 1:
        return 18
    elif row['Soil_Type19'] == 1:
        return 19
    elif row['Soil_Type20'] == 1:
        return 20
    elif row['Soil_Type21'] == 1:
        return 21
    elif row['Soil_Type22'] == 1:
        return 22
    elif row['Soil_Type23'] == 1:
        return 23
    elif row['Soil_Type24'] == 1:
        return 24
    elif row['Soil_Type25'] == 1:
        return 25
    elif row['Soil_Type26'] == 1:
        return 26
    elif row['Soil_Type27'] == 1:
        return 27
    elif row['Soil_Type28'] == 1:
        return 28
    elif row['Soil_Type29'] == 1:
        return 29
    elif row['Soil_Type30'] == 1:
        return 30
    elif row['Soil_Type31'] == 1:
        return 31
    elif row['Soil_Type32'] == 1:
        return 32
    elif row['Soil_Type33'] == 1:
        return 33
    elif row['Soil_Type34'] == 1:
        return 34
    elif row['Soil_Type35'] == 1:
        return 35
    elif row['Soil_Type36'] == 1:
        return 36
    elif row['Soil_Type37'] == 1:
        return 37
    elif row['Soil_Type38'] == 1:
        return 38
    elif row['Soil_Type39'] == 1:
        return 39
    elif row['Soil_Type40'] == 1:
        return 40
    else:
        return 0
After that, I apply this function to create a new variable, as follows:
data_train['Soil'] = [soil_typ(row_[1]) for row_ in data_train.iterrows()]
The dataset contains nearly 1.5 million records.
The code above runs, but I would like to explore how to improve its performance.
Solution
There is no need to repeat so much identical code here. The steps are explained in the comments after "#".
n = 40  # number of one-hot Soil_Type columns

def soil_typ(row):
    for x in range(1, n + 1):    # columns are numbered 1..n, so start at 1, not 0
        y = 'Soil_Type%s' % x    # build the column label from the integer
        if row[y] == 1:          # 1 == True in Python, so comparing with True
                                 # would behave the same for 0/1 values
            return x             # the first column set to 1 decides the result
    return 0                     # no column is set for this row
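The loop above shortens the code, but it still runs once per row in Python. As for the order of complexity: both the if/elif chain and the loop are O(rows × n) in the worst case, so any speed-up has to come from the constant factor. One way to get that is to move the work into NumPy, sketched below. The function name `soil_from_one_hot` and its parameters are hypothetical; the sketch assumes the columns are named Soil_Type1 through Soil_Type40 and hold only 0/1 values, and it keeps the "first column set to 1 wins" behaviour of the original chain.

```python
import numpy as np
import pandas as pd


def soil_from_one_hot(df, n=40, prefix="Soil_Type"):
    """Collapse one-hot columns prefix1..prefixN into one integer Series.

    Hypothetical helper: returns the 1-based number of the first column
    equal to 1 in each row, or 0 when no column is set.
    """
    cols = [f"{prefix}{i}" for i in range(1, n + 1)]
    flags = df[cols].to_numpy() == 1        # boolean matrix, shape (rows, n)
    first = flags.argmax(axis=1) + 1        # argmax gives the first True per row
    # argmax returns 0 for an all-False row, so mask those rows explicitly
    result = np.where(flags.any(axis=1), first, 0)
    return pd.Series(result, index=df.index)


# Small demonstration with three columns instead of forty
demo = pd.DataFrame({
    "Soil_Type1": [1, 0, 0, 0],
    "Soil_Type2": [0, 1, 0, 0],
    "Soil_Type3": [0, 0, 1, 0],
})
print(soil_from_one_hot(demo, n=3).tolist())  # [1, 2, 3, 0]
```

On the question's data this would be used as data_train['Soil'] = soil_from_one_hot(data_train). It is still O(rows × n), but the comparison runs in compiled code over the whole array, and it avoids building a Series for every row as iterrows() does.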