Validating values in a column between two data frames in R based on a condition

Problem description

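I have two data frames, nndf and tndf. I have to match the first two columns between nndf and tndf. If there is a match, I have to check whether the values in the third column are the same and update a third data frame. The problem is that nndf is longer than tndf.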

nndf <- data.frame("var1" = c("ABC","ABC","DEF", "FED","DGS"), "var2" = c("xyz","abc","def","dsf","dsf"), "var3" = c(1234.21,3432.12,0.12,1232.44,873.00))

tndf <- data.frame("var1" = c("ABC","ABC","DEF"), "var2" = c("xyz","abc","def"), "var3" = c(1234.21,3432.12,0.11))

ndf <- data.frame("var1" = c("ABC","ABC"), "var2" = c("xyz","abc"))

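I want to populate the result in the third data frame. That data frame takes the values of the first two columns that nndf and tndf have in common and, wherever they match, checks whether the third column is the same (for example 1234.21 and 3432.12). If the values are the same, it should return TRUE and populate the column. The desired output is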

var1   var2    var3
ABC    xyz     TRUE (indicating 1234.21 and 1234.21 in first two df are same)
ABC    abc     TRUE
DEF    def     FALSE (indicating 0.12 is not equal to 0.11)

I tried using a for loop with an if condition. However, it iterates over every row multiple times while populating the result.

Tags: r

Solution


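We can do an inner_join on the first two columns and then compare the values in the two third columns. Because both data frames have a var3 column, the join returns them as var3.x (from nndf) and var3.y (from tndf).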

library(dplyr)

inner_join(nndf, tndf, by = c("var1", "var2")) %>%
   mutate(var3 = var3.x == var3.y) %>%
   dplyr::select(var1, var2, var3)


#  var1 var2  var3
#1  ABC  xyz  TRUE
#2  ABC  abc  TRUE
#3  DEF  def FALSE

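Or similarly in base R (note that merge sorts the result by the key columns by default, so the row order can differ from the dplyr output)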

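df1 <- merge(nndf, tndf, by = c("var1", "var2"))  # inner join on the two key columns
df1$var3 <- df1$var3.x == df1$var3.y               # TRUE where the third-column values match
df1[c("var1", "var2", "var3")]                     # keep only the keys and the comparison result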
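
One caveat, offered as a sketch rather than as part of the original answer: == on doubles is an exact comparison, so values that come out of arithmetic or different import paths may differ in the last bits and compare as FALSE even though they print the same. If that could apply to your data, comparing within a tolerance may be safer. The names tol and res and the value 1e-8 below are assumptions chosen for illustration, not something from the question.

```r
# Same example data as in the question
nndf <- data.frame(var1 = c("ABC", "ABC", "DEF", "FED", "DGS"),
                   var2 = c("xyz", "abc", "def", "dsf", "dsf"),
                   var3 = c(1234.21, 3432.12, 0.12, 1232.44, 873.00))
tndf <- data.frame(var1 = c("ABC", "ABC", "DEF"),
                   var2 = c("xyz", "abc", "def"),
                   var3 = c(1234.21, 3432.12, 0.11))

# Hypothetical tolerance; pick one that suits the scale of your data
tol <- 1e-8

# Inner join on the two key columns; the var3 columns become var3.x / var3.y
res <- merge(nndf, tndf, by = c("var1", "var2"))

# TRUE when the two third-column values differ by less than tol
res$var3 <- abs(res$var3.x - res$var3.y) < tol

# Keep only the key columns and the comparison result
res <- res[c("var1", "var2", "var3")]
res
```

With the example data this gives the same TRUE/FALSE pattern as the exact comparison, since 0.12 and 0.11 differ by far more than the tolerance.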
