How to compare each row of dataframe 1 with each of dataframe 2 and sum up if the values are the same?

Problem description

Suppose you have two dataframes of the same size, consisting of entries that are either TRUE, FALSE or NA:

df1 <- data.frame(a = c(T, F, F, F, T), b=c(F, NA, NA, F, T))
View(df1)

df2 <- data.frame(a = c(T, T, T, F, NA), b=c(T, NA, T, F, T))
View(df2)

Now you want to see how many entries in each row of df1 are exactly the same as in df2. The result should look like this:

SumOfSimilarValuesPerRow <- data.frame(Result = c(1, 0, 0, 2, 1))
View(SumOfSimilarValuesPerRow)

So first you probably need to compare each entry in every row of df1 with the corresponding entry in df2 and then sum up the result.

I tried it with a double loop, but I keep getting the error

missing value where TRUE/FALSE needed

when trying the following:

for (i in 1:5) {
  for (j in 1:2) {
    if(df1[i, j] == df2[i, j]) {
      print("OK")
    } 
  }
}

I haven't tried to sum up the result yet, because I already struggled to compare every entry.

Does anyone know how that would work in an easy-to-understand fashion?

Any help would be appreciated a lot!

Tags: r, loops, dataframe, matrix, compare

Solution


You can do this simply as

rowSums(df1 == df2, na.rm = TRUE)

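== compares the two dataframes element by element, which works for your problem because df1 and df2 have the same dimensions and structure. Any comparison involving NA yields NA, which is also why if() in your loop fails with "missing value where TRUE/FALSE needed". na.rm = TRUE makes rowSums() skip those NA entries, so they are not counted as matches.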
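
If it helps to see the steps separately, here is a minimal sketch using the data from the question. It first stores the element-wise comparison, then counts matches per row in two ways: with rowSums(), and with a double loop like yours where the comparison is wrapped in isTRUE() so that an NA is treated as "not a match" instead of raising an error. The names same, result_vectorised and result_loop are made up for illustration.

```r
# Example data copied from the question
df1 <- data.frame(a = c(TRUE, FALSE, FALSE, FALSE, TRUE),
                  b = c(FALSE, NA, NA, FALSE, TRUE))
df2 <- data.frame(a = c(TRUE, TRUE, TRUE, FALSE, NA),
                  b = c(TRUE, NA, TRUE, FALSE, TRUE))

# Element-wise comparison: a logical matrix in which every
# comparison involving NA is itself NA
same <- df1 == df2

# Vectorised count of TRUE entries per row; na.rm = TRUE skips the NAs
result_vectorised <- rowSums(same, na.rm = TRUE)

# Loop version (hypothetical variable names): isTRUE() returns FALSE
# for NA, so if() never receives a missing value
result_loop <- integer(nrow(df1))
for (i in seq_len(nrow(df1))) {
  for (j in seq_len(ncol(df1))) {
    if (isTRUE(df1[i, j] == df2[i, j])) {
      result_loop[i] <- result_loop[i] + 1L
    }
  }
}
```

Both approaches should give the counts 1, 0, 0, 2, 1 from the question. The vectorised form is shorter and avoids indexing the dataframe cell by cell; the loop is only there to show where the original attempt went wrong.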

