r - 在两个数据框中寻找共同的条目
问题描述
df1:
state group species
1 CA 2 cat, dog, chicken, mouse
2 CA 1 cat
3 NV 1 dog, chicken
4 NV 2 chicken
5 WA 1 chicken, rat, mouse, lion
6 WA 2 dog, cat
7 WA 3 dog, chicken
8 WA 4 cat, chicken
df2:
state special_species
1 CA cat
2 CA chicken
3 CA mouse
4 WA cat
5 WA chicken
6 NV dog
我有兴趣确定哪些 special_speciesdf2
存在于df1
. 我想要一个具有state
、group
和special_species
.
预期输出:
state group special_species
CA 2 cat, chicken, mouse
CA 1 cat
NV 1 dog
NV 2 NA
WA 1 chicken
WA 2 cat
WA 3 chicken
WA 4 cat, chicken
解决方案
这比我想象的要难。我认为以下工作,但希望有人能想出更漂亮的东西。
首先,我们制作一些数据来处理(请以后自己做),如果其他人想要尝试,我会包括在内:
library(tidyverse)
df1 <- tribble(
~state, ~group, ~species,
"CA", 2, "cat, dog, chicken, mouse",
"CA", 1, "cat",
"NV", 1, "dog, chicken",
"NV", 2, "chicken",
"WA", 1, "chicken, rat, mouse, lion",
"WA", 2, "dog, cat",
"WA", 3, "dog, chicken",
"WA", 4, "cat, chicken")
df2 <- tribble(
~state, ~special_species,
"CA", "cat",
"CA", "chicken",
"CA", "mouse",
"WA", "cat",
"WA", "chicken",
"NV", "dog")
那么解决方案是:
df1 %>%
separate_rows(species) %>%
full_join(df2, on = "state") %>%
filter(species == special_species) %>%
group_by(state, group) %>%
summarise(species = paste(special_species, collapse = ", ")) %>%
full_join(df1, by = c("state" = "state", "group" = "group")) %>%
select(state, group, special_species = species.x) %>%
arrange(state)
#> Joining, by = "state"
#> # A tibble: 8 x 3
#> # Groups: state [3]
#> state group special_species
#> <chr> <dbl> <chr>
#> 1 CA 1 cat
#> 2 CA 2 cat, chicken, mouse
#> 3 NV 1 dog
#> 4 NV 2 <NA>
#> 5 WA 1 chicken
#> 6 WA 2 cat
#> 7 WA 3 chicken
#> 8 WA 4 cat, chicken
如果您接受格式略有不同的所需输出,则可以显着简化代码,例如,以下是正确的保存NA
:
df1 %>%
separate_rows(species) %>%
full_join(df2, on = "state") %>%
filter(species == special_species) %>%
group_by(state, group) %>%
summarise(species = paste(special_species, collapse = ", "))
#> Joining, by = "state"
#> # A tibble: 7 x 3
#> # Groups: state [3]
#> state group species
#> <chr> <dbl> <chr>
#> 1 CA 1 cat
#> 2 CA 2 cat, chicken, mouse
#> 3 NV 1 dog
#> 4 WA 1 chicken
#> 5 WA 2 cat
#> 6 WA 3 chicken
#> 7 WA 4 cat, chicken
由reprex 包(v0.3.0)于 2019 年 12 月 3 日创建
推荐阅读
- zap - 提取在 Jenkins 代理上的容器中运行的 ZAP 报告(基于 docker)
- vue.js - 使用 vue 和 axios 从 API 数据中获取每个对象
- javascript - 只执行一次“onscroll”功能
- reactjs - 扩展 React.Component 与扩展组件
- r - 在 Mac Catallina 上安装 RMySQL 包
- javascript - 在一行上重命名导入数组的键(很像解构)
- javascript - 我无法在 React 中将道具从父母传递给孩子
- linux - 如何修复 fglrun:符号查找错误:/u2/acct/d84-lc/obj/4js/lib/userextension.so:未定义符号:fglcapi_deccvasc
- conda - conda build 是否支持构建软件包的调试版本?
- sql - 在 Impala 中选择列值只有数字的行