pandas - How do you apply the Porter stemmer to a pandas df?
Question
Hi everyone. I'm having trouble stemming everything in a pandas df. This is what I'm trying:
df['txt'] = pos_tag(word_tokenize(df['txt']))
The error returned is:
TypeError: expected string or bytes-like object
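For context, this error happens because tokenizers like word_tokenize expect a single string, and passing a whole Series fails inside the underlying regex machinery. A minimal sketch of the same failure, using only the standard library's re module as a stand-in tokenizer (the sample sentence is illustrative):

```python
import re
import pandas as pd

df = pd.DataFrame({"txt": ["I am the greatest. I am liked"]})

# Passing the whole Series to a plain-string function fails the same way
# word_tokenize does: the regex layer wants a str, not a Series.
try:
    re.findall(r"\w+", df["txt"])
except TypeError as err:
    print(err)  # e.g. "expected string or bytes-like object"

# Applying the function element-wise works, because each element is a str.
tokens = df["txt"].apply(lambda s: re.findall(r"\w+", s))
print(tokens[0])
```

The fix is the same for any per-string NLTK function: map it over the column with apply rather than calling it on the column itself.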
Solution
You haven't shared your data, and pos_tag is not defined, but from your title I assume it's actually porter_stemmer you're referring to. Now, suppose you have the following dataframe:
id txt
0 1 I am the greatest. I am liked
1 2 You are the best. You are loved.
2 3 3
3 4 Why is that so? Chocolates.
4 5 It tried me!
5 5 do it! He retrieves the dogs!
6 6 Why not ? He rocketed to the stars.
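For reproducibility, the frame above can be built directly, with the values copied from the printout:

```python
import pandas as pd

# Sample dataframe, values taken from the printout above.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 5, 6],
    "txt": [
        "I am the greatest. I am liked",
        "You are the best. You are loved.",
        "3",
        "Why is that so? Chocolates.",
        "It tried me!",
        "do it! He retrieves the dogs!",
        "Why not ? He rocketed to the stars.",
    ],
})
print(df)
```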
Then tokenize and stem in two steps:
import nltk
import pandas as pd
from nltk.stem import PorterStemmer

porter_stemmer = PorterStemmer()

# Tokenize each row, then stem every token in the resulting list.
df['tokenized_sentence'] = df.apply(lambda row: nltk.word_tokenize(row['txt']), axis=1)
df['stem'] = df['tokenized_sentence'].apply(lambda x: [porter_stemmer.stem(y) for y in x])
which returns:
id txt \
0 1 I am the greatest. I am liked
1 2 You are the best. You are loved.
2 3 3
3 4 Why is that so? Chocolates.
4 5 It tried me!
5 5 do it! He retrieves the dogs!
6 6 Why not ? He rocketed to the stars.
tokenized_sentence \
0 [I, am, the, greatest, ., I, am, liked]
1 [You, are, the, best, ., You, are, loved, .]
2 [3]
3 [Why, is, that, so, ?, Chocolates, .]
4 [It, tried, me, !]
5 [do, it, !, He, retrieves, the, dogs, !]
6 [Why, not, ?, He, rocketed, to, the, stars, .]
stem
0 [I, am, the, greatest, ., I, am, like]
1 [you, are, the, best, ., you, are, love, .]
2 [3]
3 [why, is, that, so, ?, chocolate, .]
4 [It, tri, me, !]
5 [do, it, !, He, retriev, the, dog, !]
6 [why, not, ?, He, rocket, to, the, star, .]
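If you want the stems back as a plain string column rather than lists of tokens, you can join each list afterwards. A minimal sketch; the whitespace str.split tokenizer (used so this runs without the punkt data download) and the stemmed_txt column name are illustrative, not part of the answer above:

```python
import pandas as pd
from nltk.stem import PorterStemmer

porter_stemmer = PorterStemmer()

df = pd.DataFrame({"txt": ["You are the best. You are loved."]})

# A plain whitespace split stands in for nltk.word_tokenize here,
# so punctuation stays attached to its word.
df["stem"] = df["txt"].str.split().apply(
    lambda toks: [porter_stemmer.stem(t) for t in toks]
)

# Join each list of stems back into a single string.
df["stemmed_txt"] = df["stem"].str.join(" ")
print(df["stemmed_txt"][0])
```

Note that word_tokenize separates punctuation into its own tokens, so the two-step version in the answer gives cleaner stems than this whitespace split.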