r - Searching for words/phrases in a column in R
Problem description
My data looks like this:
> head(df)
ID Comment
1 1 I ate dinner.
2 2 We had a three-course meal.
3 3 Brad came to dinner with us.
4 4 He loves fish tacos.
5 5 In the end, we all felt like we ate too much. Code 5.16
6 6 We all agreed; it was a magnificent evening.72 points.
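I want to create two new columns, one called A and one called B.
I want column A to equal 1 if any of the following words/phrases appear: dinner, evening, we ate.
I want column B to equal 1 if any of the following words/phrases appear: in the end, all, Brad, 5.16.
How can I do this? Note that I need exact matches.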
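Solution
We can use grepl in base R. The \\b word-boundary anchors restrict matches to whole words or phrases, and the unary + converts the logical result to 0/1: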
df$A <- +(grepl("\\b(dinner|evening|we|ate)\\b", df$Comment))
df$B <- +(grepl("\\b(in the end|all|Brad|5\\.16)\\b", df$Comment))
-output
df
ID Comment A B
1 1 I ate dinner. 1 0
2 2 We had a three-course meal. 0 0
3 3 Brad came to dinner with us. 1 1
4 4 He loves fish tacos. 0 0
5 5 In the end, we all felt like we ate too much. Code 5.16 1 1
6 6 We all agreed; it was a magnificent evening.72 points. 1 1
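To see why the \\b anchors matter for the "exact match" requirement, here is a minimal sketch on a few made-up strings (the vector x and the names whole and loose are illustrative, not part of the original data). It also shows that grepl is case-sensitive by default, which is why "In the end" in row 5 does not match the lowercase term on its own.

```r
# Illustrative strings (assumed, not taken from df)
x <- c("We all agreed.", "A small tally.", "Code 5.16", "Version 15.167")

# With word boundaries: only whole-word occurrences match
whole <- grepl("\\b(all|5\\.16)\\b", x)

# Without word boundaries: "small", "tally" and "15.167" also match
loose <- grepl("(all|5\\.16)", x)

# Matching is case-sensitive unless ignore.case = TRUE is passed
case_sensitive   <- grepl("\\bin the end\\b", "In the end, we ate.")
case_insensitive <- grepl("\\bin the end\\b", "In the end, we ate.",
                          ignore.case = TRUE)

print(data.frame(x, whole, loose))
```

Whether ignore.case = TRUE is appropriate depends on how strictly "exact" is meant; the answer above keeps the default case-sensitive behavior.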
Note: the patterns can also be created with paste
v1 <- c("dinner", "evening", "we", "ate")
v2 <- c("in the end", "all", "Brad", "5\\.16")  # escape the dot so it only matches a literal "."
pat1 <- paste0("\\b(", paste(v1, collapse = "|"), ")\\b")
pat2 <- paste0("\\b(", paste(v2, collapse = "|"), ")\\b")
df$A <- +(grepl(pat1, df$Comment))
df$B <- +(grepl(pat2, df$Comment))
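When the terms come from a vector, regex metacharacters such as "." need to be escaped, otherwise "5.16" would also match strings like "5x16". The following is a rough sketch under that assumption; escape_regex and build_pattern are hypothetical helper names introduced here (the escaping expression follows the idiom used by Hmisc::escapeRegex), not functions from the original answer. It also keeps a multi-word phrase such as "we ate" as a single alternative.

```r
# Hypothetical helper: backslash-escape common regex metacharacters
escape_regex <- function(x) {
  gsub("([.|()\\^{}+$*?]|\\[|\\])", "\\\\\\1", x)
}

# Hypothetical helper: whole-word alternation pattern from literal terms
build_pattern <- function(terms) {
  paste0("\\b(", paste(escape_regex(terms), collapse = "|"), ")\\b")
}

# Terms written literally; "we ate" stays one phrase
pat_a <- build_pattern(c("dinner", "evening", "we ate"))
pat_b <- build_pattern(c("in the end", "all", "Brad", "5.16"))

# Illustrative check: the escaped dot no longer matches an arbitrary character
escaped_hits   <- grepl(pat_b, c("Code 5.16", "Code 5x16"))
unescaped_hits <- grepl("\\b(5.16)\\b", c("Code 5.16", "Code 5x16"))
```

With these patterns, df$A <- +(grepl(pat_a, df$Comment)) and df$B <- +(grepl(pat_b, df$Comment)) can be used in the same way as above.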
data
df <- structure(list(ID = 1:6, Comment = c("I ate dinner.", "We had a three-course meal.",
"Brad came to dinner with us.", "He loves fish tacos.", "In the end, we all felt like we ate too much. Code 5.16",
"We all agreed; it was a magnificent evening.72 points.")),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))