How do I set a new column in Python based on the value returned by a function?

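Problem description

I am doing some text mining in Python. I want to create a new column that is set to 1 if my search function returns true and to 0 if it returns false.

I have tried various if statements, but I cannot get it to work.

A simplified version of what I am doing is below: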

import pandas as pd
import nltk
nltk.download('punkt')

df = pd.DataFrame (
        {
        'student number' : [1,2,3,4,5],
        'answer' : [ 'Yes, she is correct.', 'Yes', 'no', 'north east', 'No its North East']
        # I know there's an apostrophe missing
        }
)       
print(df)

# change all text to lower case
df['answer'] = df['answer'].str.lower()

# split the answer into individual words
df['text'] = df['answer'].apply(nltk.word_tokenize)

# Return every token list in `sentence` that contains all of the given words
def check(sentence, words):
    res = []
    for substring in sentence:
        k = [w for w in words if w in substring]
        if len(k) == len(words):
            res.append(substring)
    return res

# Driver code 
sentence = df['text'] 
words = ['no','north','east'] 
print(check(sentence, words))

Tags: python-3.x

Solution


I think this is what you want:

df['New'] = df['answer'].isin(words)*1
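
One caveat with the line above: `isin` compares each whole value in `answer` against the list, so it flags only answers that are exactly equal to one of the words. The sketch below shows this on the lower-cased sample answers from the question. The name `exact_match` is made up for illustration.

```python
import pandas as pd

# Lower-cased sample answers from the question
df = pd.DataFrame({
    'answer': ['yes, she is correct.', 'yes', 'no', 'north east', 'no its north east'],
})
words = ['no', 'north', 'east']

# isin tests whether the whole cell equals one of the words;
# multiplying the boolean result by 1 converts True/False to 1/0
exact_match = df['answer'].isin(words) * 1
print(exact_match.tolist())  # [0, 0, 1, 0, 0]
```

Only the answer that is exactly 'no' is flagged. The answer containing all three words is not, so this may not match what the question asks for.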

This worked for me:

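# Create the column first, then set matching rows with .loc
# (chained assignment such as df['NEW'][i] = 1 is unreliable in pandas)
df['NEW'] = 0
for i in df.index:
    if set(words) <= set(df['text'][i]):
        df.loc[i, 'NEW'] = 1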

If you use this approach, you do not need the function.
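
The same subset test can also be written without an explicit loop, by applying a small helper to each row and casting the boolean result to an integer. This is only a sketch under two assumptions: the helper name `contains_all` is made up for illustration, and a plain whitespace split stands in for `nltk.word_tokenize` so the example runs without downloading NLTK data.

```python
import pandas as pd

df = pd.DataFrame({
    'student number': [1, 2, 3, 4, 5],
    'answer': ['Yes, she is correct.', 'Yes', 'no', 'north east', 'No its North East'],
})
words = ['no', 'north', 'east']

# Assumption: whitespace split is used here in place of nltk.word_tokenize
df['text'] = df['answer'].str.lower().str.split()

def contains_all(tokens, required):
    """Return True if every required word appears in the token list."""
    return set(required).issubset(tokens)

# apply() yields True/False per row; astype(int) turns that into 1/0
df['NEW'] = df['text'].apply(lambda tokens: contains_all(tokens, words)).astype(int)
print(df[['answer', 'NEW']])
```

With the sample data, only the last answer contains all three words, so only that row gets 1.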

