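r - R - Using dplyr mutate with purrr for string manipulation
Problem description
I have two tibbles, each containing a list of strings. I need to compare the strings in one tibble against the strings in the other and create a new column based on the comparison.
A small example follows: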
## Tibble 1 - the 'master'
structure(list(terms = c("This", "is", "a", "stri", "of", "areas",
"times", "two", "to", "see", "what", "will", "be", "in", "the",
"magic", "will", "rally", "for", "a", "cry", "from", "the", "deepest",
"part", "of", "the", "ocean", "com", "en", "au", "us"), rank = c("A",
"B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N",
"O", "P", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K",
"L", "M", "N", "O", "P"), id = 1:32), row.names = c(NA, -32L), class = c("tbl_df",
"tbl", "data.frame"))
## Tibble 2 - the 'comparison'
structure(list(conds = c("this.com", "two.org", "magic.edu",
"cry/en/org", "magic.com"), ind = structure(c(2L, 1L, 5L, 3L,
4L), .Label = c("bad", "good", "Indifferent", "Maybe", "Ugly"
), class = "factor")), row.names = c(NA, -5L), class = c("tbl_df",
"tbl", "data.frame"))
Ideally, the output would be the 'master' tibble with a new column holding the ind value selected by the string comparison.
Attempt so far:
terms <- terms %>% mutate(
  test = ifelse(
    sapply(lapply(terms, grepl, condition_str$conds), any) == TRUE,
    condition_str$ind,
    'NA'))
terms
Result
# A tibble: 32 x 4
   terms rank     id test
   <chr> <chr> <int> <chr>
 1 This  A         1 NA
 2 is    B         2 1
 3 a     C         3 5
 4 stri  D         4 NA
 5 of    E         5 NA
 6 areas F         6 NA
 7 times G         7 NA
 8 two   H         8 5
 9 to    I         9 NA
10 see   J        10 NA
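This gives me a result, but the integer codes of the factor levels are passed through rather than the level labels. It also fails on the larger dataset I am working with.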
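The integer codes most likely come from `ifelse()`, which takes its values from the `yes` argument but its attributes from `test`, so a factor loses its labels on the way through. In the attempt above, `condition_str$ind` has only 5 elements and is recycled across the 32 rows, so the value in each row depends on the row position rather than on which condition matched. A minimal sketch reproducing the first effect in isolation (the names `ind` and `hit` and their values are made up for illustration):

```r
# Illustrative values only; `ind` stands in for the factor column.
ind <- factor(c("good", "bad", "Ugly"), levels = c("bad", "good", "Ugly"))
hit <- c(TRUE, FALSE, TRUE)

# ifelse() drops the factor attributes, so the underlying integer codes leak out.
codes <- ifelse(hit, ind, NA)

# Converting to character before the comparison keeps the labels.
labels <- ifelse(hit, as.character(ind), NA_character_)
```

Here `codes` holds integers while `labels` holds the strings "good" and "Ugly" with a missing value in between.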
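Questions:
Is there a purrr solution using stringr or stringi? My problem may lie in my string matching.
Is there a way to incorporate fixed = TRUE into the grepl call?
Is there a way to get the factor level labels into the mutated column?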
Thanks for any help.
James
Solution
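One possible approach, offered as a sketch rather than a definitive answer: look up each term individually with `purrr::map_chr()`, match it literally with `stringr::fixed()`, and return the label of the first condition that contains it. The helper `first_label()` is a hypothetical name introduced here, the tibbles are cut-down versions of the ones in the question, and "first match wins" is an assumption about how ties such as magic.edu and magic.com should be resolved.

```r
library(dplyr)
library(purrr)
library(stringr)

# Cut-down versions of the two tibbles from the question.
terms <- tibble(
  terms = c("This", "is", "stri", "two", "magic", "cry"),
  id    = 1:6
)
condition_str <- tibble(
  conds = c("this.com", "two.org", "magic.edu", "cry/en/org", "magic.com"),
  ind   = factor(c("good", "bad", "Ugly", "Indifferent", "Maybe"))
)

# Hypothetical helper: label of the first condition that contains `term`
# as a literal substring, or NA when no condition matches.
first_label <- function(term, conds, labels) {
  hit <- str_detect(conds, fixed(term))
  if (any(hit)) labels[which(hit)[1]] else NA_character_
}

result <- terms %>%
  mutate(test = map_chr(
    terms,                                    # the column, one term at a time
    first_label,
    conds  = condition_str$conds,
    labels = as.character(condition_str$ind)  # labels, not integer codes
  ))
```

Because the lookup runs once per term, nothing is recycled, and converting `ind` to character up front means the labels reach the new column. Matching is by substring and case-sensitive, so "is" matches this.com while "This" does not; `fixed(term, ignore_case = TRUE)` relaxes the case rule. In base R, the same literal matching is available through `grepl(term, conds, fixed = TRUE)`.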