Why do I get an error from DocumentTermMatrix in R even after using content_transformer(tolower) in tm_map?

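Problem description

I have gone through many answers here and tried all the suggestions given on Stack Overflow, but nothing seems to work for me. Is there a particular order of steps to follow before creating a document-term matrix with the tm package in R?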

email_corpus <- VCorpus(VectorSource(df2$final_text))

email_corpus_clean <- tm_map(email_corpus, content_transformer(tolower))

# remove special characters
for (j in seq(email_corpus_clean)) {
  email_corpus_clean[[j]] <- gsub("\n", " ", email_corpus_clean[[j]])
  email_corpus_clean[[j]] <- gsub("\r", " ", email_corpus_clean[[j]])
  email_corpus_clean[[j]] <- gsub(">>", " ", email_corpus_clean[[j]])
}

email_corpus_clean <- tm_map(email_corpus_clean, removeNumbers)

myStopWords <- c("said", "from", "what")

email_corpus_clean <- tm_map(email_corpus_clean, removeWords, c(stopwords("english"), myStopWords))

email_corpus_clean <- tm_map(email_corpus_clean, removePunctuation)

email_corpus_clean <- tm_map(email_corpus_clean, stemDocument)

email_corpus_clean <- tm_map(email_corpus_clean, stripWhitespace)

# This is the line where I get the error
email_dtm <- DocumentTermMatrix(email_corpus_clean)   # creating document term matrix


# This is the error:

Error in UseMethod("meta", x) : 
no applicable method for 'meta' applied to an object of class "character"

Tags: r, text-mining, tm

Solution
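The error most likely comes from the `for` loop: `gsub()` returns a plain character vector, and assigning that back with `email_corpus_clean[[j]] <-` replaces each `PlainTextDocument` in the corpus with a bare string. `DocumentTermMatrix()` then calls `meta()` on those elements and fails. Below is a minimal sketch of one way around this, assuming the same tm pipeline as in the question. The helper name `replace_with_space` and the two sample strings are illustrative and not part of the original post; the question's data comes from `df2$final_text`.

```r
library(tm)

# Illustrative input (assumption); the question uses df2$final_text instead.
texts <- c("Hello >> World 123\nsecond line",
           "He said: running\r\nfrom 42 places")
email_corpus <- VCorpus(VectorSource(texts))

# Hypothetical helper: content_transformer() applies gsub() to the document
# content only, so each element stays a PlainTextDocument with its metadata.
replace_with_space <- content_transformer(
  function(x, pattern) gsub(pattern, " ", x)
)

email_corpus_clean <- tm_map(email_corpus, content_transformer(tolower))

# Replaces the for loop: newlines, carriage returns and ">>" become spaces.
email_corpus_clean <- tm_map(email_corpus_clean, replace_with_space, "\n|\r|>>")

email_corpus_clean <- tm_map(email_corpus_clean, removeNumbers)

myStopWords <- c("said", "from", "what")
email_corpus_clean <- tm_map(email_corpus_clean, removeWords,
                             c(stopwords("english"), myStopWords))

email_corpus_clean <- tm_map(email_corpus_clean, removePunctuation)

# stemDocument() needs the SnowballC package to be installed.
email_corpus_clean <- tm_map(email_corpus_clean, stemDocument)
email_corpus_clean <- tm_map(email_corpus_clean, stripWhitespace)

# Every element is still a document object, so this no longer fails on meta().
email_dtm <- DocumentTermMatrix(email_corpus_clean)
```

On the ordering question: the sequence in the question is reasonable. Lowercasing before `removeWords` matters because the `stopwords("english")` list is lower case and the match is case sensitive, and removing stop words before `removePunctuation` keeps entries that contain apostrophes, such as "don't", matchable.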

