Error in term frequency analysis (TF-IDF) in R

Question

I am trying to run the following code on the Jane Austen novels from the janeaustenr package:

library(dplyr)
library(janeaustenr)
library(tidytext)

book_words <- austen_books() %>%
 unnest_tokens(word, text) %>%
 count(book, word, sort = TRUE)

Running it produces this error message:

Error in count(., book, word, sort = TRUE) : 
  unused argument (sort = TRUE)

What do I need to change in the code to make it work?

Tags: r, text, tf-idf, tidytext

Answer


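Most likely count from dplyr is being masked by a function of the same name from another package that was loaded later. Calling it with an explicit namespace, dplyr::count, avoids the problem: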

austen_books() %>%
  unnest_tokens(word, text) %>% 
  dplyr::count(book, word, sort = TRUE)
# A tibble: 40,379 × 3
   book              word      n
   <fct>             <chr> <int>
 1 Mansfield Park    the    6206
 2 Mansfield Park    to     5475
 3 Mansfield Park    and    5438
 4 Emma              to     5239
 5 Emma              the    5201
 6 Emma              and    4896
 7 Mansfield Park    of     4778
 8 Pride & Prejudice the    4331
 9 Emma              of     4291
10 Pride & Prejudice to     4162
# … with 40,369 more rows

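For example, if plyr is loaded after dplyr, plyr's functions mask the dplyr functions that share their names, count among them. plyr::count has no sort argument, so the same call fails with the error from the question: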

> austen_books() %>%
+   unnest_tokens(word, text) %>% 
+   plyr::count(book, word, sort = TRUE)
Error in plyr::count(., book, word, sort = TRUE) : 
  unused argument (sort = TRUE)
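
To confirm which package a bare function name resolves to in the current session, base R offers a few checks. The sketch below is illustrative only: which_package is a hypothetical helper name introduced here, not a function from dplyr or plyr, and it is demonstrated on median from the stats package so that it runs without any extra packages installed.

```r
# Hypothetical helper (not part of dplyr or plyr): report the package
# whose namespace owns the function that a bare name currently resolves to.
which_package <- function(fn_name) {
  fn <- get(fn_name, mode = "function")  # first function found on the search path
  environmentName(environment(fn))       # name of the namespace that defines it
}

# Base-R demonstration that needs no additional packages:
print(which_package("median"))

# In the session from the question, the equivalent checks would be:
#   which_package("count")    # a result of "plyr" means dplyr::count is masked
#   find("count")             # every attached package that provides count
#   conflicts(detail = TRUE)  # all masked names, grouped by search-path entry
```

If the check reports plyr, either keep the dplyr:: prefix as in the answer above, or attach plyr before dplyr so that dplyr's version is the one found first.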
