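r - How to return the rows of one data frame that partially match rows in another data frame (string matching)
Problem description
I want to return all rows of list2 that contain a string from list1.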
list1 <- tibble(name = c("the setosa is pretty", "the versicolor is the best", "the mazda is not a flower"))
list2 <- tibble(name = c("the setosa is pretty and the best flower", "the versicolor is the best and a red flower", "the mazda is a great car"))
For example, the code should return "the setosa is pretty and the best flower" from list2, because it contains the phrase "the setosa is pretty" from list1. I tried:
grepl(list1$name, list2$name)
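But I get the following warning: "Warning message: In grepl(commonPhrasesNPSLessthan6$value, dfNPSLessthan6$nps_comment) : argument 'pattern' has length > 1 and only the first element will be used".
I would appreciate some help! Thanks!
Edit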
list1 <- structure(list(value = c("it would not let me", "to go back and change",
"i was not able to", "there is no way to", "to pay for a credit"
), n = c(15L, 14L, 12L, 11L, 9L)), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
list2 <- structure(list(comment = c("it would not let me go back and change things",
"There is no way to back up without starting allover.", "Could not link blah blah account. ",
"i really just want to speak to someone - and, now that I'm at the very end of the process-",
"i felt that some of the information that was asked to provide wasn't necessary",
"i was not able to to go back and make changes")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame")
)
Solution
Edit, based on the new data:
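library(dplyr)  # provides %>% and filter()
# Join all phrases into one regular expression ("a|b|c") and keep matching rows
list2 %>%
  filter(stringr::str_detect(comment, paste0(list1$value, collapse = "|")))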
# A tibble: 2 x 1
comment
<chr>
1 it would not let me go back and change things
2 i was not able to to go back and make changes
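Note that the comment starting with "There is no way to" is not returned, because the match is case-sensitive and the phrase in list1 is lower-case. If that row should be kept as well, one possible variant is sketched below. It assumes case-insensitive matching is wanted (the question does not say so), and the names `phrases`, `comments`, `pattern` and `matched` are chosen here for illustration only. The phrases are still interpreted as regular expressions, so any that contained characters such as `.` or `(` would need escaping first.

```r
library(dplyr)
library(stringr)

# Phrases and comments copied from the question's edit
phrases <- c("it would not let me", "to go back and change",
             "i was not able to", "there is no way to", "to pay for a credit")
comments <- tibble(comment = c(
  "it would not let me go back and change things",
  "There is no way to back up without starting allover.",
  "Could not link blah blah account. ",
  "i really just want to speak to someone - and, now that I'm at the very end of the process-",
  "i felt that some of the information that was asked to provide wasn't necessary",
  "i was not able to to go back and make changes"
))

# One alternation pattern, matched without regard to case (an assumption)
pattern <- regex(paste0(phrases, collapse = "|"), ignore_case = TRUE)

# Keep every comment that contains at least one of the phrases
matched <- comments %>%
  filter(str_detect(comment, pattern))

print(matched)
```

With these data, `matched` holds three rows: the two shown above plus the "There is no way to ..." comment.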
Original
A stringr option:
list2[stringr::str_detect(list2$name,list1$name),]
# A tibble: 2 x 1
name
<chr>
1 the setosa is pretty and the best flower
2 the versicolor is the best and a red flower
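One caveat with this call: `str_detect()` is vectorised over both arguments, so it compares the first string with the first pattern, the second with the second, and so on. It gives the expected rows here because the two tibbles have the same length and their rows happen to line up. The sketch below illustrates the difference by reversing the patterns; the variable names (`targets`, `patterns`, `pairwise`, `any_match`) are made up for this illustration, and `fixed()` is used on the assumption that the phrases should be matched literally.

```r
library(stringr)

# Strings from the original question; the patterns are deliberately reversed
targets <- c("the setosa is pretty and the best flower",
             "the versicolor is the best and a red flower",
             "the mazda is a great car")
patterns <- rev(c("the setosa is pretty",
                  "the versicolor is the best",
                  "the mazda is not a flower"))

# Element-wise comparison: targets[i] is tested against patterns[i] only
pairwise <- str_detect(targets, patterns)

# Order-independent check: does each target contain ANY of the patterns?
any_match <- vapply(
  targets,
  function(s) any(str_detect(s, fixed(patterns))),
  logical(1),
  USE.NAMES = FALSE
)

print(pairwise)   # FALSE TRUE FALSE
print(any_match)  # TRUE TRUE FALSE
```

Subsetting with `any_match` returns the setosa and versicolor rows regardless of how the patterns are ordered or how many there are.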
A base-only solution:
list2[lengths(lapply(list1$name,grep,list2$name))>0,]
# A tibble: 2 x 1
name
<chr>
1 the setosa is pretty and the best flower
2 the versicolor is the best and a red flower
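A point to watch with this expression: `lengths(lapply(list1$name, grep, list2$name)) > 0` yields one logical value per pattern in list1, not one per row of list2. It selects the right rows in this example only because both inputs have three elements in matching order. A sketch of a base R variant that produces one value per row is shown below, using the data from the question's edit (5 phrases, 6 comments). The names `phrases`, `comments`, `per_pattern` and `per_row` are illustrative, and `fixed = TRUE` assumes literal matching is intended.

```r
# Phrases and comments copied from the question's edit
phrases <- c("it would not let me", "to go back and change",
             "i was not able to", "there is no way to", "to pay for a credit")
comments <- c(
  "it would not let me go back and change things",
  "There is no way to back up without starting allover.",
  "Could not link blah blah account. ",
  "i really just want to speak to someone - and, now that I'm at the very end of the process-",
  "i felt that some of the information that was asked to provide wasn't necessary",
  "i was not able to to go back and make changes"
)

# One value per PATTERN: did this phrase occur anywhere in the comments?
per_pattern <- lengths(lapply(phrases, grep, comments)) > 0

# One value per ROW: grepl() returns a logical vector over the comments for
# each phrase, and Reduce() combines those vectors with element-wise OR
per_row <- Reduce(`|`, lapply(phrases, grepl, x = comments, fixed = TRUE))

print(length(per_pattern))  # 5, one per phrase
print(which(per_row))       # 1 6

# Subset with the per-row vector
print(comments[per_row])
```

For a data frame or tibble, the same `per_row` vector can be used as the row index, for example `list2[per_row, ]`.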