python-3.x - Lambda Function To Categorise Terms Causing NameError
Problem description
I'm trying to tag some keywords by topic, using regex patterns that each keyword might match. Ideally this will append a "Category" column to the dataframe containing either the tag the keyword falls into or "other" if none was found.
The data I'm trying to tag basically looks like the following:
| Keyword | Volume |
|:-----------|------------:|
| audi specs | 4000 |
| bmw width | 170 |
| a45 bhp | 30 |
| a1 length | 210 |
| alfa co2 | 10 |
And the code I've got currently is:
import pandas as pd
import numpy as np
import re
from IPython.display import display
df = pd.read_csv("make-model-keywords.csv")
df = pd.DataFrame(df, columns=['Keyword', 'Volume','Keyword Difficulty','CPC (USD)', 'SERP Features'])
tags = [
    {
        "name": "Dimensions",
        "regex": "dimension|width|height|length|size"
    },
    {
        "name": "MPG",
        "regex": "mpg|co2|emission|consumption|running|economy|fuel"
    },
    {
        "name": "Specs",
        "regex": "spec|specification|torque|bhp|weight|rpm|62|mph|kmh"
    }
]
def basic_tagging(string, tags):
    for tag in tags:
        if re.match(tag['regex'], row['Keyword']):
            return tag['name']
        else:
            return "other"
df['Category'] = df.apply(lambda x: basic_tagging(x['Keyword'], tags), axis=1)
However, it's giving me the following error:
---------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-71-31890ef48022> in <module>()
----> 1 df['Category'] = df.apply(lambda row: basic_tagging(row['Keyword'], tags), axis=1)
2 df.head()
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
6012 args=args,
6013 kwds=kwds)
-> 6014 return op.get_result()
6015
6016 def applymap(self, func):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\apply.py in get_result(self)
140 return self.apply_raw()
141
--> 142 return self.apply_standard()
143
144 def apply_empty_result(self):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
246
247 # compute the result using the series generator
--> 248 self.apply_series_generator()
249
250 # wrap results
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
275 try:
276 for i, v in enumerate(series_gen):
--> 277 results[i] = self.f(v)
278 keys.append(v.name)
279 except Exception as e:
<ipython-input-71-31890ef48022> in <lambda>(row)
----> 1 df['Category'] = df.apply(lambda row: basic_tagging(row['Keyword'], tags), axis=1)
2 df.head()
<ipython-input-68-1867110ca579> in basic_tagging(string, tags)
1 def basic_tagging(string, tags):
2 for tag in tags:
----> 3 if re.match(tag['regex'], row['Keyword']):
4 return tag['name']
5 else:
NameError: ("name 'row' is not defined", 'occurred at index 0')
Is there something patently obvious that I'm missing?
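Solution
The NameError occurs because basic_tagging receives its argument as string, but the body refers to row, which is not defined inside the function. Change your function to take the row directly: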
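def basic_tagging(row):
    for tag in tags:
        # re.search finds the pattern anywhere in the keyword; re.match only checks the start
        if re.search(tag['regex'], row['Keyword']):
            return tag['name']
    # No tag matched after checking all of them
    return "other"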
Then:
df['Category'] = df.apply(basic_tagging, axis=1)
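As an alternative, here is a minimal sketch that keeps the two-argument signature from the question and simply uses the `string` parameter instead of the undefined `row`. It assumes the intent is to find a term anywhere in the keyword, so it uses `re.search` rather than `re.match`, and it returns "other" only after every tag has been tried. The keyword list is taken from the sample table, plus one hypothetical entry ("audi a3") added to show the fallback.

```python
import re

# Tag definitions copied from the question.
tags = [
    {"name": "Dimensions", "regex": "dimension|width|height|length|size"},
    {"name": "MPG", "regex": "mpg|co2|emission|consumption|running|economy|fuel"},
    {"name": "Specs", "regex": "spec|specification|torque|bhp|weight|rpm|62|mph|kmh"},
]


def basic_tagging(string, tags):
    """Return the name of the first tag whose regex occurs in `string`, else "other"."""
    for tag in tags:
        # Use the `string` parameter; `row` does not exist inside this function.
        # re.search scans the whole string, whereas re.match only tests the start.
        if re.search(tag["regex"], string):
            return tag["name"]
    # Reached only after every tag has been tried without a hit.
    return "other"


# Sample keywords from the question; "audi a3" is a hypothetical extra
# that matches none of the patterns.
keywords = ["audi specs", "bmw width", "a45 bhp", "a1 length", "alfa co2", "audi a3"]
categories = [basic_tagging(k, tags) for k in keywords]
print(categories)
```

With this version the function works on a single column, so on the dataframe it could be applied as `df['Category'] = df['Keyword'].apply(basic_tagging, args=(tags,))`, which avoids passing whole rows with `axis=1`.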