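r - How to merge two data frames and keep only the columns whose contents differ?
Problem description
I have two data frames with the same number of rows but different columns. The column names also differ, yet some columns may have the same contents.
For example, df1: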
df1<- data.frame("a"=c("0","1","0","1","0","0","0"),
"b"=c("1","1","1","1","1","0","0"),
"c"=c("1","1","0","0","1","0","0"),
"d"=c("1","1","1","1","1","1","1"))
df2:
df2<- data.frame("e"=c("1","1","0","1","0","0","0"),
"f"=c("1","1","1","1","1","0","0"),
"g"=c("0","0","0","0","1","0","0"),
"h"=c("0","0","0","0","1","1","1"))
As you can see, column "b" of df1 and column "f" of df2 are equal. The result I want is therefore a new data frame like this:
df3 <- data.frame("a"=c("0","1","0","1","0","0","0"),
"c"=c("1","1","0","0","1","0","0"),
"d"=c("1","1","1","1","1","1","1"),
"e"=c("1","1","0","1","0","0","0"),
"g"=c("0","0","0","0","1","0","0"),
"h"=c("0","0","0","0","1","1","1"))
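Note: columns "b" and "f" (which have the same contents) do not appear in the new df3. I searched online but could not find an example. I think the main difficulty is that the merge has to work on column contents rather than on column names.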
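To make "equal by content" concrete, here is a minimal sketch that assumes an exact match is what is wanted (same values, same order, same type). It uses base R's identical() on plain vectors copied from the columns above; the vector names a, b, e and f are only stand-ins for df1$a, df1$b, df2$e and df2$f.

```r
# Stand-ins for the columns in the question (values copied from df1 and df2)
b <- c("1", "1", "1", "1", "1", "0", "0")  # df1$b
f <- c("1", "1", "1", "1", "1", "0", "0")  # df2$f
a <- c("0", "1", "0", "1", "0", "0", "0")  # df1$a
e <- c("1", "1", "0", "1", "0", "0", "0")  # df2$e

# identical() compares whole objects and returns a single TRUE or FALSE
same_bf <- identical(b, f)  # TRUE: every element matches
same_ae <- identical(a, e)  # FALSE: the first element differs
```

Note that identical() ignores the column names entirely, which is why it fits a merge that is driven by contents.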
Solution
Here is a more tidyverse-oriented solution.
library(dplyr)
library(tidyr)
library(tibble)  # provides rownames_to_column()
# based on Ronak's sapply approach: compare every column of df1 with every column of df2
matches <- as.data.frame(sapply(df1, function(x) sapply(df2, function(y) identical(x, y)))) %>%
  rownames_to_column(var = "df2") %>%       # the row names are the df2 column names
  pivot_longer(-df2, names_to = "df1") %>%  # one row per (df2, df1) column pair
  filter(value)                             # keep only the matching pairs
# programmatically build the list of column names to remove
vars_remove <- c(matches$df1, matches$df2)  # names on both sides of each match
df1 %>% bind_cols(df2) %>%
  select(-any_of(vars_remove))
a c d e g h
1 0 1 1 1 0 0
2 1 1 1 1 0 0
3 0 0 1 0 0 0
4 1 0 1 1 0 0
5 0 1 1 0 1 1
6 0 0 1 0 0 1
7 0 0 1 0 0 1
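For comparison, the same idea can be sketched in base R without any packages. This is not the answer's method, only an illustrative alternative: the helper name drop_shared_columns is hypothetical, and it assumes that "same contents" means an exact identical() match between whole columns.

```r
df1 <- data.frame(a = c("0", "1", "0", "1", "0", "0", "0"),
                  b = c("1", "1", "1", "1", "1", "0", "0"),
                  c = c("1", "1", "0", "0", "1", "0", "0"),
                  d = c("1", "1", "1", "1", "1", "1", "1"))
df2 <- data.frame(e = c("1", "1", "0", "1", "0", "0", "0"),
                  f = c("1", "1", "1", "1", "1", "0", "0"),
                  g = c("0", "0", "0", "0", "1", "0", "0"),
                  h = c("0", "0", "0", "0", "1", "1", "1"))

# Hypothetical helper: bind x and y column-wise, dropping every column
# that has an identical counterpart in the other data frame.
drop_shared_columns <- function(x, y) {
  # TRUE for each column of x that is identical to some column of y
  x_dup <- vapply(x, function(cx) any(vapply(y, identical, logical(1), cx)),
                  logical(1))
  # TRUE for each column of y that is identical to some column of x
  y_dup <- vapply(y, function(cy) any(vapply(x, identical, logical(1), cy)),
                  logical(1))
  # drop = FALSE keeps a data frame even when only one column survives
  cbind(x[, !x_dup, drop = FALSE], y[, !y_dup, drop = FALSE])
}

df3 <- drop_shared_columns(df1, df2)
names(df3)
```

vapply() is used instead of sapply() so that the result is always a logical vector, even when one of the data frames has a single column. Like the tidyverse version, this sketch assumes the two data frames have no column names in common.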