Data preprocessing TypeError: 'in <string>' requires string as left operand, not list

Problem description

I am writing a function that performs text preprocessing on a dataset:

def text_transform(text):
    text = text.lower()
    text = nltk.word_tokenize(text)
    
    x = []
    for i in text:
        if i.isalnum():
            x.append(x)
            
    text = x[:] 
    x.clear() 
    
    for j in text:
        if j not in stopwords.words('english') and j not in string.punctuation:
            x.append(j)
    return x

I get an error in the stop-word filtering part:

 TypeError: 'in <string>' requires string as left operand, not list
 ---------------------------------------------------------------------------
 TypeError                                 Traceback (most recent call last)
 <ipython-input-52-7f819487e6f8> in <module>
 ----> 1 text_transform('Hello How are you ?')
 
 <ipython-input-51-4ace2423bd95> in text_transform(text)
      12 
      13     for j in text:
 ---> 14         if j not in stop and j not in string.punctuation:
      15             x.append(j)
      16     return x
 
 TypeError: 'in <string>' requires string as left operand, not list

Tags: python, string, list, typeerror, data-preprocessing

Solution


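Don't append the list to itself. The line below puts x inside x, so the second loop iterates over lists rather than strings, and `j not in string.punctuation` fails because string.punctuation is a str: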

x.append(x)
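
To see why that single line produces the reported message, here is a minimal sketch that reproduces the error with the standard library only. The names `item` and `message` are illustrative and do not appear in the original code.

```python
import string

x = []
x.append(x)      # x now holds a reference to itself: a list inside a list
item = x[0]      # what the second loop would see as j

try:
    # A substring test against a str needs a str on the left-hand side.
    item not in string.punctuation
except TypeError as exc:
    message = str(exc)
    print(message)
```

The stop-word check on the same line of the original function does not fail, because testing whether a list is a member of another list is legal; only the substring test against `string.punctuation` raises.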

Append only the string:

x.append(i)

Better still, use a list comprehension:

x = [i for i in text if i.isalnum()]
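
Putting the fix together, the whole function might look like the sketch below. To keep it runnable without NLTK data, it assumes two stand-ins that are not in the original post: `simple_tokenize` (a rough substitute for `nltk.word_tokenize`) and a tiny `STOP_WORDS` set (a substitute for `set(stopwords.words('english'))`). Both can be swapped for the NLTK versions through the function's parameters.

```python
import re

# Assumption: a small stand-in stop-word set so the sketch runs without NLTK.
# With NLTK installed this would be set(stopwords.words('english')).
STOP_WORDS = {"how", "are", "you", "is", "the", "a", "an", "and"}


def simple_tokenize(text):
    # Hypothetical stand-in for nltk.word_tokenize: runs of word characters,
    # or single non-space symbols such as "?".
    return re.findall(r"\w+|[^\w\s]", text)


def text_transform(text, tokenize=simple_tokenize, stop_words=STOP_WORDS):
    tokens = tokenize(text.lower())
    # Keep alphanumeric tokens only; every element is a string, never the list.
    tokens = [tok for tok in tokens if tok.isalnum()]
    # Drop stop words; a set gives constant-time membership checks.
    return [tok for tok in tokens if tok not in stop_words]


print(text_transform("Hello How are you ?"))
```

The separate `string.punctuation` check is left out here because a token that passes `isalnum()` cannot consist of punctuation. Building the stop-word set once, rather than calling `stopwords.words('english')` on every loop iteration as the original does, also avoids rebuilding the list for each token.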
