r - How to compare each row of dataframe 1 with each of dataframe 2 and sum up if the values are the same?
Problem description
Suppose you have two data frames of the same size, whose entries are each TRUE, FALSE, or NA:
df1 <- data.frame(a = c(T, F, F, F, T), b=c(F, NA, NA, F, T))
View(df1)
df2 <- data.frame(a = c(T, T, T, F, NA), b=c(T, NA, T, F, T))
View(df2)
Now you want to see how many entries in each row of df1 are exactly the same as in df2. The result should look like this:
SumOfSimilarValuesPerRow <- data.frame(Result = c(1, 0, 0, 2, 1))
View(SumOfSimilarValuesPerRow)
So first you probably need to compare each entry in every row of df1 with the corresponding entry in df2 and then sum up the matches per row.
I tried it with a double loop, but I keep getting the error
missing value where TRUE/FALSE needed
when trying the following:
for (i in 1:5) {
  for (j in 1:2) {
    if (df1[i, j] == df2[i, j]) {
      print("OK")
    }
  }
}
I haven't tried to sum up the result yet, because I already struggled to compare every entry.
Does anyone know how that would work in an easy-to-understand fashion?
Any help would be appreciated a lot!
Solution
You can do this simply with:
rowSums(df1 == df2, na.rm = TRUE)
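== does an element-by-element comparison, which works for the requirements of your problem because your df1 and df2 have the same size (and structure). Any comparison involving NA yields NA rather than TRUE or FALSE, which is why if() fails in your loop; na.rm = TRUE drops those NA results before the rows are summed.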
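
To make the NA handling concrete, here is a minimal sketch using the data from the question. It runs the vectorized version and, for comparison, a corrected double loop that wraps the comparison in isTRUE() so that NA counts as "not a match". The names vectorized and looped are only illustrative and do not appear in the original post.

```r
# Data from the question, written with TRUE/FALSE instead of T/F
df1 <- data.frame(a = c(TRUE, FALSE, FALSE, FALSE, TRUE),
                  b = c(FALSE, NA, NA, FALSE, TRUE))
df2 <- data.frame(a = c(TRUE, TRUE, TRUE, FALSE, NA),
                  b = c(TRUE, NA, TRUE, FALSE, TRUE))

# Vectorized: df1 == df2 is a logical matrix that contains NA wherever
# either side is NA; na.rm = TRUE leaves those cells out of the sums.
vectorized <- rowSums(df1 == df2, na.rm = TRUE)

# Loop version: isTRUE() returns FALSE for NA, so if() never sees NA.
looped <- integer(nrow(df1))
for (i in seq_len(nrow(df1))) {
  for (j in seq_len(ncol(df1))) {
    if (isTRUE(df1[i, j] == df2[i, j])) {
      looped[i] <- looped[i] + 1L
    }
  }
}

print(unname(vectorized))  # 1 0 0 2 1
print(looped)              # 1 0 0 2 1
```

Both approaches reproduce the expected result from the question. The loop is shown only to explain the error; the rowSums() call is shorter and avoids hard-coding the dimensions.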