Calling a function on a column in a DataFrame

Problem description

I have successfully imported these libraries

import pandas as pd
from underthesea import word_tokenize

and loaded the data

df=pd.read_csv("/content/gdrive/MyDrive/BigData/BookingReview.csv")
print(df)

This is an image of the data

I tried to apply the function word_tokenize to the column named "Review"

df['Review']=df['Review'].apply(word_tokenize)

but it raised this error

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-80-386271f3d8ee> in <module>()
----> 1 df['Review']=df['Review'].apply(word_tokenize)

3 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in apply(self, func, convert_dtype, args, **kwds)
   4211             else:
   4212                 values = self.astype(object)._values
-> 4213                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   4214 
   4215         if len(mapped) and isinstance(mapped[0], Series):

pandas/_libs/lib.pyx in pandas._libs.lib.map_infer()

/usr/local/lib/python3.7/dist-packages/underthesea/pipeline/word_tokenize/__init__.py in word_tokenize(sentence, format)
     32     'Bác_sĩ bây_giờ có_thể thản_nhiên báo_tin bệnh_nhân bị ung_thư'
     33     """
---> 34     tokens = tokenize(sentence)
     35     crf_model = CRFModel.instance()
     36     output = crf_model.predict(tokens, format)

/usr/local/lib/python3.7/dist-packages/underthesea/pipeline/word_tokenize/regex_tokenize.py in tokenize(text, format, tag)
    233     :return: tokenize text
    234     """
--> 235     text = Text(text)
    236     text = text.replace("\t", " ")
    237     matches = [m for m in re.finditer(patterns, text)]

/usr/local/lib/python3.7/dist-packages/underthesea/feature_engineering/text.py in Text(text)
      9     """
     10     if not is_unicode(text):
---> 11         text = text.decode("utf-8")
     12     text = unicodedata.normalize("NFC", text)
     13     return text

AttributeError: 'float' object has no attribute 'decode'

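I am using the function above to split text into words for sentiment analysis. I have searched for an answer to my problem but still don't understand why this error occurs. Any help is greatly appreciated. Thank you all!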

Tags: python, pandas, nlp, tokenize, data-preprocessing

Solution
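The traceback ends with `text.decode("utf-8")` being called on a float, so at least one value in `Review` is not a string. A likely cause is an empty cell in the CSV: pandas reads missing cells as NaN, and NaN is a float. The sketch below is a guess at that situation, not a confirmed diagnosis of the actual file. It builds a small made-up CSV in memory, and it uses a hypothetical stand-in `simple_tokenize` so that it runs without underthesea installed. It first lists the rows whose `Review` is not a string, then shows two possible fixes.

```python
import io

import pandas as pd


def simple_tokenize(text):
    # Hypothetical stand-in for underthesea's word_tokenize, used only so
    # this sketch runs without that library. It also expects a str.
    return text.split()


# Made-up sample data with one empty Review cell (row with Id 2).
csv_text = "Id,Review\n1,Phòng sạch sẽ\n2,\n3,Nhân viên thân thiện\n"
df = pd.read_csv(io.StringIO(csv_text))

# Diagnose: find rows whose Review value is not a string (NaN is a float).
non_string_rows = df[~df["Review"].map(lambda v: isinstance(v, str))]
print(non_string_rows)

# Option 1: drop the rows with a missing Review, then tokenize the rest.
dropped = df.dropna(subset=["Review"]).copy()
dropped["Review"] = dropped["Review"].apply(simple_tokenize)

# Option 2: keep every row and treat a missing Review as an empty string.
filled = df.copy()
filled["Review"] = filled["Review"].fillna("").astype(str).apply(simple_tokenize)
```

On the real data, replace `simple_tokenize` with `word_tokenize` and apply the same steps to the DataFrame loaded from `BookingReview.csv`. Dropping rows suits sentiment analysis when an empty review carries no signal. Filling with an empty string keeps row positions aligned with the other columns, though how `word_tokenize` handles an empty string is worth checking separately.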

